Genetic investigation of isolated lactic acid bacteria and their bacteriophage

Dragica Damnjanovic

A Thesis submitted as fulfilment of the requirement for the degree of

DOCTOR OF PHILOSOPHY

School of Biotechnology and Biomolecular Sciences

University of New South Wales

August 2019

ACKNOWLEDGEMENTS

I completed this PhD thesis under the mentorship of my supervisor Dr Wallace Bridge, who gave me the opportunity to conduct this research in his lab and provided guidance and valuable feedback on my research and thesis writing. I am very grateful for all the advice and support that I received over the course of this study.

My sincere gratitude also goes to Dr Melissa Harvey for polishing this work by challenging my ideas, providing critical comments and engaging in stimulating discussions. I very much appreciate the time she spent reviewing many versions of my journal manuscripts and responses to reviewers.

I would like to thank Professor Peter White and Dr Helen Speirs, UNSW, for their useful comments during performance reviews, which helped me stay on track with experiments. Thank you also to my lab colleagues, Tereza Grigoriadis and Dr Gavin Ferguson, for their support and companionship. Thank you to a wonderful collaborator, Emma Kettle from the Westmead Institute for Medical Research, for sharing the joy of looking at phages with me and providing professional electron microscopy services. I also owe a special expression of gratitude to Dr Veljko Prodanovic, UNSW, for his invaluable assistance with statistical analysis.

The financial support that I received through The Postgraduate Research Student Support Scheme, a BABS Postgraduate Travel Fund, and from Dr Wallace Bridge, which enabled me to present at two international conferences, is much appreciated. I gained valuable knowledge that helped me to advance my doctoral research.

A big thank you to my daughters Jelena and Jana, for their love and moral support during these years, and above all, to my husband Vlado. This PhD study would not have been completed were it not for Vlado’s patience and encouragement to keep going through difficult times. I love you all dearly.

I praise You, for You have heard me and have become my Saviour. (Ps. 117:21)

LIST OF FIGURES

CHAPTER ONE

Figure 1.1 Potential phage applications 10

Figure 1.2 Transmission electron micrographs of Lc. lactis phages representing different morphotypes and lactococcal phage species 16

CHAPTER TWO

Figure 2.1 (A) BOXA2R-PCR fingerprint profiles of some of the strains used for reproducibility testing. (B) The corresponding dendrogram of the BOXA2R-PCR patterns was generated using the UPGMA cluster analysis according to the Pearson product moment correlation coefficient (expressed as a percentage value, 0 - 100%) 37

Figure 2.2 The plot which illustrates 95% Confidence Intervals (CI) of dispersion estimates for colonies and DNA templates 38

Figure 2.3 A) Genetic identification of Lc. lactis ssp lactis and ssp cremoris type strains by BOXA2R-PCR. B) The corresponding dendrogram based on the UPGMA method displaying genetic distances above the branches illustrates the genetic relatedness of Lc. lactis isolates 40

Figure 2.4 Genetic comparison of the citrate-positive and citrate-negative isolates from the selective agar medium by BOXA2R-PCR 41

Figures 2.5 – 6 BOXA2R-PCR fingerprint profiles of the strains isolated in this study from retail dairy products 43

Figure 2.7 A) BOXA2R-PCR fingerprints of the representatives of the Gram-positive cocci LAB genera. B) The corresponding dendrogram displaying the genetic distances between isolates was obtained using the UPGMA method 44

Figure 2.8 Schematic representation of the common LAB genera identification by BOXA2R-PCR 49

CHAPTER THREE

Figure 3.1 BOXA1R-PCR fingerprint patterns of E. coli phage strains 65

Figure 3.2 Reproducibility testing of the BOXA2R-PCR using the DNA of lactococcal phages 66

Figure 3.3 BOXA1R-PCR reproducibility testing 67

Figure 3.4 Reproducibility testing of the BOXA1R-PCR using commercially isolated DNA 67

Figure 3.5 Reproducibility testing of the BOXA2R-PCR using the DNA of the Ø712 67

Figure 3.6 Reproducibility (n= 3) testing of the BOXA2R-PCR 68

Figure 3.7 Reproducibility testing of the BOXA2R-PCR 69

Figure 3.8 BOXA1R-PCR fingerprint profiles generated using different template sources 70

Figure 3.9 BOXA2R-PCR fingerprint profiles generated using different template sources 70

Figure 3.10 BOXA1R-PCR profiles of individual Lc. lactis phages next to their respective bacterial hosts 71

CHAPTER FOUR

Figure 4.1 Plaque morphology of phage 15(Mo9) on its host Lc. lactis ssp cremoris Mo9 89

Figure 4.2 Multiplex PCR applied on the purified DNA of the Lc. lactis phage 15(Mo9) showing 89

Figure 4.3 Electron micrograph of phage 15(Mo9) negatively stained with (A) 2% (w/v) uranyl acetate (UA) and (B) 2.5% (w/v) phosphotungstic acid (PTA) 91

Figure 4.4 One step growth curve depicting the infection of Lc. lactis ssp cremoris Mo9 by phage 15(Mo9) at a multiplicity of infection of 0.001 92

Figure 4.5 Plaque morphology of phage 15(Mo9) on its primary host Lc. lactis ssp cremoris Mo9 compared to that on the secondary hosts: Lc. lactis ssp lactis Ni301, Lc. lactis ssp cremoris AM1 and Lc. lactis ssp lactis biovar diacetylactis FD11 and FD11R 95

Figure 4.6 Multiplex-PCR for CWPS typing of lactococcal strains susceptible to the phage 15(Mo9). The amplification of the rmlB gene, conserved in all CWPS clusters, represented the internal positive control 96

Figure 4.7 The application of the PCR designed to classify phages of P335-species into sub-groups I, II, III or IV based on their RBP-encoding genes to phage 15(Mo9) 96

Figure 4.8 Read length distribution and quality of the Nanopore reads assessed using NanoPlot 98

Figure 4.9 Comparative genomics of phage 15(Mo9) and the reference Lactococcus phage c2 99

Figure 4.10 Phage 15(Mo9) genome structure 101

Figure 4.11 SDS-PAGE analysis of 15(Mo9) structural proteins and their comparison with c2 structural protein profiles 103

LIST OF TABLES

CHAPTER TWO

Table 2.1 The list of dairy products used for strain isolation 29

Table 2.2 The list of strains used in this study 30-32

Table 2.3 The list of strains used for the reproducibility study 35

Table 2.4 SPSS output of the reproducibility analysis of the BOXA2R-PCR fingerprints 39

CHAPTER THREE

Table 3.1 Bacteriophage samples isolated from samples 60

Table 3.2 Bacteriophage samples provided by external sources 61

Table 3.3 PCR amplification conditions for the primers, BOXA1R and BOXA2R 62

CHAPTER FOUR

Table 4.1 Host range analysis of the phage 15(Mo9) on Lc. lactis strains 93 – 94

Table of Contents

CHAPTER 1 ...... 1

1. BACKGROUND ...... 2

2. LITERATURE REVIEW ...... 3

2.1 STARTER CULTURES ...... 3

2.1.1 Starter cultures used in cheese production ...... 5

2.1.2 Molecular methods of identification of microorganisms in cheese and dairy products ...... 6

2.2 BACTERIOPHAGES ...... 8

2.2.1 Discovery ...... 8

2.2.2 Distribution and importance ...... 8

2.2.3 Phage life cycle ...... 10

2.2.4 Modes of evolution ...... 12

2.3 DAIRY PHAGES ...... 13

2.3.1 Effect of phages in the dairy industry ...... 13

2.3.2 The sources of phage in the dairy industry ...... 14

2.3.3 Phage control measures ...... 14

2.3.4 Lactococcal phage types and prevalence ...... 15

2.3.5 Lactococcal phage characteristics ...... 16

2.4 PHAGE CHARACTERISATION METHODS ...... 18

2.4.1 Phenotypic characterisation ...... 18

2.4.2 Genetic characterisation ...... 19

2.5 PHAGE GENOME SEQUENCING ...... 20

2.6 REPETITIVE-PCR ...... 22

3. OBJECTIVE/AIMs ...... 24

CHAPTER 2 ...... 25

ABSTRACT ...... 26

1. BACKGROUND ...... 27

2. MATERIALS and METHODS ...... 28

2.1 Strains ...... 28

2.2 Methods ...... 33

Strain isolation...... 33

Colony-PCR with BOXA2-R primer...... 33

Reproducibility testing...... 34

Statistical analysis...... 35

16S rRNA sequencing...... 36

ITS sequencing...... 36

3. RESULTS ...... 37

3.1 Application of rep-PCR fingerprinting using BOXA2R primer (BOXA2R-PCR) to dairy bacteria...... 37

3.2 Reproducibility of the BOXA2R-PCR method ...... 38

3.3 Lc. lactis subspecies differentiation by BOXA2R-PCR ...... 39

3.3.1 Investigation of the potential of BOXA2R-PCR to detect biovar diacetylactis ...... 40

3.4 Application of BOXA2R-PCR for screening isolates from Australian dairy samples ...... 42

3.4.1 Strain isolation ...... 42

3.4.2 BOXA2R-PCR strain fingerprinting ...... 42

3.4.3 Discrimination of Gram-positive LAB cocci by BOXA2R-PCR ...... 43

4. DISCUSSION ...... 45

5. CONCLUSIONS ...... 49

CHAPTER 3 ...... 56

ABSTRACT ...... 57

1. BACKGROUND ...... 57

2. MATERIALS and METHODS ...... 59

2.1.1 Bacteria ...... 59

2.1.2 Bacteriophages ...... 59

2.2.1 Phage propagation, amplification, and titering...... 62

2.2.2 Phage purification and DNA isolation...... 62

2.2.3 Method validation ...... 63

2.2.4 Statistical analysis...... 63

2.2.5 Sequencing of the PCR bands...... 63

2.2.6 Validation of the phage origin of rep-PCR products...... 64

3. RESULTS ...... 64

3.1 Repetitive-PCR for phage fingerprinting ...... 64

3.2 BOXA-PCR method validation ...... 65

3.2.1 Influence of the DNA isolation method on the phage profile ...... 65

3.2.2 Influence of the cultivation temperature on the phage profile ...... 67

3.2.3 Influence of the propagating host on the phage profile ...... 68

3.3 Application of BOXA-PCR genotyping for screening phage isolates ...... 69

3.3.1 Comparison of the rep-PCR patterns of phage and their hosts ...... 71

3.4 Sequencing of the BOXA-PCR fragments ...... 72

4. DISCUSSION ...... 73

5. CONCLUSION ...... 75

CHAPTER 4 ...... 81

1. BACKGROUND ...... 81

2. MATERIALS and METHODS ...... 82

2.1 Materials ...... 82

2.1.2 Bacteriophages ...... 82

2.2 Methods ...... 83

2.2.1 Multiplex PCR for phage typing ...... 83

2.2.2 Transmission electron microscopy (TEM)...... 83

2.2.3 Phage assays...... 83

2.2.4 Host range analysis ...... 84

2.2.5 Cell wall polysaccharide (CWPS)-typing of lactococcal strains ...... 84

2.2.6 PCR for the classification of P335 phages and sequencing of the generated amplicons ...... 84

2.2.7 Illumina MiSeq phage 15(Mo9) sequencing and genome analysis ...... 84

2.2.8 Nanopore sequencing and bioinformatic analysis ...... 85

2.2.9 SDS-PAGE and mass spectrometry of phage structural proteins ...... 86

3. RESULTS ...... 86

3.1 PHENOTYPIC AND GENETIC CHARACTERISATION of 15(Mo9) ...... 86

3.1.1 Plaque morphology...... 86

3.1.2 Phage species determination ...... 87

3.1.3 Morphology study by electron microscopy ...... 88

3.1.4 One step growth curve ...... 89

3.1.5 Host range analysis ...... 90

3.1.6 CWPS typing of lactococcal strains by multiplex-PCR ...... 93

3.1.7 PCR for the classification of P335 phages ...... 94

3.1.8 P335-specific PCR product sequencing ...... 95

3.2 GENOMIC CHARACTERISATION of 15(Mo9)...... 95

3.2.1 Analysis of phage genome sequenced on Illumina MiSeq...... 95

3.2.2 Analysis of phage genome sequenced on MinION nanopore device ...... 96

3.2.3 Comparison of Illumina MiSeq and Nanopore sequencing results ...... 97

3.2.4 Final draft genome characteristics of phage 15(Mo9)...... 98

3.2.5 SDS-PAGE and mass spectrometry analysis ...... 101

4. DISCUSSION ...... 102

5. CONCLUSION ...... 105

CHAPTER 5 ...... 114

CONCLUSION AND FUTURE REMARKS ...... 114

BIBLIOGRAPHY ...... 117

APPENDICES ...... 141

CHAPTER 1

INTRODUCTION

1. BACKGROUND

Large-scale commercial fermented dairy products are manufactured using lactic acid bacteria (LAB) starter cultures. These cultures must meet specified performance criteria and contribute desirable organoleptic properties to the final products. Modern starter cultures are commonly produced by specialised companies which supply their customers with defined starter and adjunct strain combinations that have been customised for particular applications. The standardisation of manufacturing processes ensures product quality, safety and consistency (1). However, the regime of strict strain selection processes in the modern era has led to a relatively small pool of genetically distinct strains being used industrially worldwide (2). Conversely, many small-scale traditional products that involve the use of autochthonous cultures naturally present in raw milk, the environment or processing equipment are renowned for their unique and distinctive organoleptic characteristics (3, 4). These artisan products are regarded by starter culture companies as sources of new strains that could potentially be used to improve product or process properties (5).

An essential step in the strain selection process is the taxonomic identification of new isolates. This is commonly achieved by the application of molecular analytical techniques, with different methods being used for genus and species identification and for strain differentiation. The selection of suitable candidates for commercial starter production is a resource-intensive process requiring the screening of large numbers of strains.

The infection of LAB starter cultures by bacterial viruses (bacteriophages) represents a substantial risk for manufacturers. Despite numerous measures taken to control their proliferation, bacteriophage (phage) infection remains the principal cause of slow LAB starter activity (6). This places pressure on starter culture manufacturers to establish programs to select or develop new phage-resistant strains. These new strains must be resistant to phage yet still maintain the characteristics required for manufacture and application. A rapid, simple technique for phage fingerprinting would be useful both for the development of phage-resistant LAB and for monitoring phage populations in the factory environment. It could also further the understanding of phage evolution and assist in the establishment of approaches to controlling their proliferation.

Restriction of phage DNA is currently the predominant technique for the genetic characterisation of phage. This methodology has certain limitations because some phage genomes are resistant to digestion (7, 8). Insufficient resolution of selected restriction sites requires the testing of multiple restriction enzymes, which increases the method’s complexity, labour intensity, and analysis time.

The professional experience of the candidate in the starter culture industry has provided first-hand awareness of how access to a fast and reliable method for the fingerprinting of both bacterial and phage isolates would be of substantial potential value to the operations of a starter culture company. The rationale for carrying out the research presented in this PhD study was to explore the development of simple genetic tools that would be of practical benefit to industry. Moreover, the availability of such rapid genetic analysis methods could be useful in all bacterial fermentation-based industries and would likely be of value to microbiological research in general.

The performance of Lactococcus (Lc.) lactis can be significantly impaired by phage action. The three phage types most relevant for Lc. lactis are 936, P335 and c2 (9, 10). These three phage groups are of unrelated DNA homology and mostly exchange DNA within the same group (11). For example, prolate c2-type phages can readily recombine in the natural environment with high efficiencies (12). Lytic phages of the P335 species exchange genetic material with the chromosomal DNA of their hosts, in addition to DNA exchanges between the lytic and temperate members of the species (13, 14). Given that recombination between the c2 and P335 phage groups is unlikely, the detection of a single phage that was typed as both c2- and P335-type was regarded as out of the ordinary, which motivated its subsequent characterisation.

2. LITERATURE REVIEW

2.1 STARTER CULTURES

Starter cultures are microorganisms used for the production of various fermented food and feed products and contribute nutritional, organoleptic, and preservation properties. The most well-known cultures employed in these processes are diverse species and strains of lactic acid bacteria (LAB). The primary role of dairy starter cultures is to acidify milk through the fermentation of lactose to lactic acid and the digestion of milk proteins, but they also determine the functional and technological characteristics of the final product (15). Through their proteolytic and lipolytic activity, dairy cultures can create distinctive flavours and aromas. Strains that produce exopolysaccharides are used to achieve desired and firmness. Together, the low pH due to lactic acid and the production of bacteriocins, such as nisin, inhibit many spoilage and pathogenic bacteria. This exerts a preservation function that enhances food safety and extends shelf-lives (16). Different LAB species can improve the nutritional value of food, such as through the synthesis of B-group vitamins (17) or the biologically beneficial functional lipid CLA (conjugated linoleic acid) (18). The addition of probiotics to dairy products may lead to health improvements via the release of bioactive peptides with antihypertensive effects (19), the treatment of lactose maldigestion (20) and the lowering of cholesterol levels (21).

Distinctive flavour, taste and texture attributes are imparted to dairy products by specific combinations of the strains in a starter culture. Many types of cheese, sour cream, and lactic butter are produced using the mesophilic Lc. lactis and Leuconostoc (L.) mesenteroides species (22). The thermophilic starters, Str. thermophilus and different Lactobacillus species, including Lactobacillus (Lb.) casei, Lb. rhamnosus, Lb. helveticus, Lb. plantarum and Lb. delbrueckii subsp. bulgaricus, are used in the manufacture of yoghurt, fermented milks and certain types of cheeses, such as Italian and Swiss cheese varieties and pizza/mozzarella cheese (23). The use of non-conventional bacteria, such as Propionibacterium freudenreichii and Brevibacterium linens, as well as the yeast and mold species Debaryomyces hansenii, Geotrichum candidum, Penicillium (P.) roqueforti and P. camemberti, can contribute to typical organoleptic characteristics due to their unique metabolic activity (23). An authoritative catalogue of microorganisms with a documented safe history of use in fermented food products was updated by the International Dairy Federation (IDF) and the European Food and Feed Cultures Association (EFFCA) in 2012, and currently contains 265 bacterial species and 70 species of yeast and molds (24).

2.1.1 Starter cultures used in cheese production

Historically, “back-slopping” was used for cheese production (16, 23) and involved batch to batch inoculation. Such starter cultures are undefined mixtures of LAB strains and species plus any non-LAB microorganisms. These mixed cultures evolved over time with fermentation success being highly variable, especially if the dominant strains developed undesirable flavours or became prone to phage attack (16).

Today, industrially manufactured dairy products typically use defined starter cultures under controlled conditions to achieve production consistency, standardised performance characteristics and product quality (25). The most commonly used acidifying strains in cheese-making are Lc. lactis ssp lactis and Lc. lactis ssp cremoris. Flavour metabolites, such as diacetyl, are produced by Lc. lactis ssp lactis biovar diacetylactis and Leuconostoc sp. (26). The selection of strains to be used in large-scale industrial production is restricted to those with Generally Recognized as Safe (GRAS) status (27). The European equivalent of this US food ingredient classification system (28) is the Qualified Presumption of Safety (QPS) list of microorganisms developed by the European Food Safety Authority (EFSA) in 2007 to specifically address the safety of food cultures (29).

Commercial starter culture-producing companies supply the dairy industry with frozen bulk starter cultures (30) or highly concentrated (10^11 - 10^12 cfu/g) freeze-dried cultures that are suitable for direct-to-vat inoculation into milk. These starters are often tailored for specific applications or according to particular requirements for processing conditions and can consist of two to six strains (16, 30). However, these highly defined processes are sometimes perceived by consumers as delivering products with bland flavours (31, 32). This can be a consequence of multiple factors relating to the industrialization of the dairy industry. These include an increasingly limited pool of selected Lc. lactis strains with suitable technological characteristics used worldwide (33), which eliminates indigenous microbiota and their associated enzymes that play key roles in flavour development (34), or the loss of certain plasmid-encoded metabolic traits (32).

Traditional cheeses are still made with natural starter cultures of undefined composition using specific manufacturing methods and practices, traditional equipment and raw or minimally processed milk from different animals (4, 35). Such artisanal cheeses produced in particular geographic regions may be granted a Protected Designation of Origin (PDO) qualification (3). The complex natural microbiota adds more pronounced and sensorially diverse flavours than can be achieved with pasteurised milk cheeses, but their quality can be subject to variations caused by internal population dynamics and interactions (36). The metabolic products of taxonomically diverse indigenous microorganisms present in raw milk cheeses not only contribute to a more intense flavour than that of cheeses produced at large scale but may also provide health benefits such as improvement of enteric microbiota, asthma and allergy protection and better immunity against food pathogens (27).

Species of Enterococcus, such as Ec. faecalis, Ec. faecium and Ec. durans, dominate among the natural isolates from raw milk cheese (4, 35, 37), where their presence can be at a level comparable to Lactococcus sp. populations (38). The Enterococcus genus has an ambiguous status in regard to safety, as some strains may be useful dairy starters while others are considered dangerous for human health (39), which is why this is the only LAB genus excluded by default from the QPS list (29).

2.1.2 Molecular methods of identification of microorganisms in cheese and dairy products

Culture-independent techniques based on direct DNA or RNA analysis have been developed in order to reveal the actual diversity of cheese microbiota developed through the various stages of ripening (40). DNA-based technologies include denaturing gradient gel electrophoresis (DGGE) (41), terminal restriction fragment length polymorphism (T-RFLP) (42), single-strand conformation polymorphism (SSCP), and temporal temperature gradient gel electrophoresis (TTGE) (reviewed in (43)). The techniques that target RNA molecules include reverse transcription PCR (RT-PCR) (44) and suppression subtractive hybridization (SSH) (45). The benefits of culture-independent techniques include the ability to follow the dynamics and the structure of the whole microbial community, to identify the metabolically active organisms, and to detect those that are viable yet difficult to cultivate (reviewed in (46)). Conversely, culture-dependent methods are employed when the objective is the isolation of new starter cultures, probiotics or other biotechnologically relevant strains.

Due to the importance of Lc. lactis subsp. lactis and subsp. cremoris to the dairy industry, it is critical that this species is accurately identified genetically and that the two subspecies are genetically distinguished. The initial taxonomic identification of Lc. lactis species is achieved by the commonly applied method of amplifying and sequencing 16S rRNA genes (40, 47, 48). PCR-based methods for sub-species classification exploit sequence polymorphisms of particular genes, such as those encoding glutamate decarboxylase (gadB) (49) and peptidoglycan hydrolase (autolysin) (acmA) (50), or specific regions that differ between the subspecies lactis and cremoris genomes, such as that between the rrnB and rrnC operons (51) or a histidine biosynthesis operon (48, 52).

These methods, however, require the subsequent restriction endonuclease digestion or partial amplified ribosomal DNA restriction analysis (partial ARDRA) of the obtained amplicons (53) for further subspeciation. The DNA digestion of some bacterial strains may be prevented when a base in the recognition site is modified, usually as a result of DNA methylation. Additionally, the use of a single restriction enzyme may not result in sufficiently discriminative digestion profiles, which would then require testing of additional enzymes to select one that is suitable for generating strain-identifying fragments. As a result, Lc. lactis species and subspecies determination has had to be performed in two phases, with each using two sets of methods (48, 54). Additionally, this approach is usually inadequate for differentiation at a strain level.
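The two limitations described above can be illustrated with a minimal in silico digest. The sketch below is not part of the methods used in this study; the two toy amplicons are hypothetical, and only the EcoRI (G^AATTC) and HindIII (A^AGCTT) recognition sites are real. It shows one enzyme giving identical fragment patterns for two different amplicons, a second enzyme separating them, and a modified (for example, methylated) site preventing digestion.

```python
def digest(sequence, site, cut_offset=0, blocked_positions=()):
    """Return fragment lengths after cutting a linear sequence at each
    occurrence of `site`. `cut_offset` is the cut position within the site.
    Sites starting at an index listed in `blocked_positions` are treated as
    modified (e.g. methylated) and are left uncut."""
    sequence, site = sequence.upper(), site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        if start not in blocked_positions:
            cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical amplicons: identical except for one base in the HindIII site.
amplicon_a = "TTTT" + "GAATTC" + "CCCCCCCC" + "AAGCTT" + "GGGG"
amplicon_b = "TTTT" + "GAATTC" + "CCCCCCCC" + "AAGCTA" + "GGGG"

ECORI = ("GAATTC", 1)    # G^AATTC
HINDIII = ("AAGCTT", 1)  # A^AGCTT

print(digest(amplicon_a, *ECORI), digest(amplicon_b, *ECORI))      # [5, 23] [5, 23]
print(digest(amplicon_a, *HINDIII), digest(amplicon_b, *HINDIII))  # [19, 9] [28]
# A modified EcoRI site (index 4) is not cut, so the amplicon stays intact.
print(digest(amplicon_a, *ECORI, blocked_positions={4}))           # [28]
```

In this toy case EcoRI alone cannot distinguish the two amplicons, which is the situation that forces additional enzymes to be tested.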

Another issue is that methods used for strain-level differentiation do not offer any genus or species information (38). Several genotypic methods are currently in use for strain differentiation: restriction endonuclease digestion of chromosomal DNA in conjunction with pulsed-field gel electrophoresis (PFGE) (53) or amplified fragment length polymorphism (AFLP) (55); random amplified polymorphic DNA (RAPD) (38, 56, 57); repetitive sequence-based PCR (rep-PCR) (4, 58) and multilocus sequence typing (MLST) (58). The use of one genetic method often requires verification by another, either due to an insufficient discriminatory ability of a single method or for an improved informative aspect of composite fingerprint analysis. Commonly a combination of methods is used, such as: (GTG)5 and AFLP (59); (GTG)5 and SDS-PAGE protein fingerprinting (60); PFGE and MLST (53). Additionally, Next Generation Sequencing technologies have greatly advanced in recent years and whole genome sequencing has become a more accessible option for rapid identification and selection of starter strains for the dairy industry based on gene-trait matching (61).
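Fingerprint patterns produced by methods such as rep-PCR are usually compared numerically before strains are grouped. As an illustration only, the following sketch computes Pearson correlation between lane intensity profiles and clusters them by UPGMA (average linkage). The profiles are hypothetical, and the code is a simplified stand-in for dedicated gel analysis software rather than the procedure used in this thesis.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product moment correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def upgma(profiles):
    """Cluster profiles (dict: name -> intensities) by UPGMA.
    Distance is 1 - Pearson r. Returns the merges in order as
    (cluster_1, cluster_2, similarity in percent)."""
    dist = {frozenset(pair): 1 - pearson(profiles[pair[0]], profiles[pair[1]])
            for pair in combinations(profiles, 2)}

    def average_distance(c1, c2):
        # UPGMA: arithmetic mean of all member-to-member distances.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    active = [(name,) for name in profiles]
    merges = []
    while len(active) > 1:
        c1, c2 = min(combinations(active, 2),
                     key=lambda pair: average_distance(*pair))
        similarity = round(100 * (1 - average_distance(c1, c2)), 1)
        merges.append((c1, c2, similarity))
        active = [c for c in active if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical densitometric profiles (band intensity per gel position).
profiles = {
    "A": [0, 10, 0, 8, 0, 5],
    "B": [0, 9, 0, 8, 1, 5],
    "C": [7, 0, 6, 0, 9, 0],
}
for merge in upgma(profiles):
    print(merge)
```

With these toy values, profiles A and B join first at a high similarity and C joins last. Profiles with opposite banding patterns can give a negative correlation, which software reporting similarity on a 0 - 100% scale would need to handle separately.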

2.2 BACTERIOPHAGES

2.2.1 Discovery

Bacteriophages (phages) are viruses that infect bacteria. They were discovered independently by Frederick Twort in 1915 and then by Felix d’Herelle in 1917 (62). Due to their strong antibacterial activity and specificity, soon after this discovery phages were used to treat human infectious diseases using a clinical approach called ‘phage therapy’, mostly in the countries of the former Soviet Union (63, 64). With the recent recognition of the seriousness of the problem of bacterial resistance to antibiotics, interest in phage therapy has been renewed in Western medicine (65). Their potential for applications in human medicine, animal health, and food safety is being increasingly explored (reviewed in (64)).

2.2.2 Distribution and importance

Bacterial viruses are considered the most abundant biological entities in our biosphere (66, 67). With the aid of high-throughput sequencing methodologies and large-scale metagenomic studies of viral communities in a wide range of ecological environments, the total number of bacterial and archaeal viruses has been estimated to be in the range of >10^30 (68, 69). Moreover, a metagenomic approach has revealed a large diversity of previously unknown phage genomes (70). The most extensively studied ecosystems where phages thrive include freshwater (71), marine (72), terrestrial (72), air (73) and the human gut (reviewed in (74)).

The 16S rRNA gene present in bacteria and archaea has proved useful for their identification and enabled insight into their diversity and population dynamics. Phages do not possess a universal phylogenetic marker analogous to the 16S rRNA gene, since there is no single gene that is common to all tailed viral genomes (67). Although certain genes are conserved within particular viral types or families, such as the DNA polymerase gene in T7-like Podophages (75) or terminases in most Caudovirales members (76), they have limited application potential in viral diversity and evolution studies. It has been suggested that, despite the ubiquity of phage in the environment and the high viral diversity of various ecosystems on a local scale, the overall phage genomic pool is relatively limited due to the high global distribution in biomes of identical or near-identical DNA fragments (66, 75).

Phages play major roles in biogeochemical cycling (77), in managing microbial populations (78), and in reprogramming bacterial metabolism through horizontal transfer of phage-encoded genes (79). Furthermore, their association with human health, such as the provision of non-host derived immunity (80) or their influence on periodontal (81) or intestinal (82) health status, has recently been uncovered.

The increased understanding of phage abundance and their co-existence with bacteria in various environments has offered new fields and topics for research. These include the overlooked non-tailed viral predators of marine bacteria and archaea (83), the highly conserved and dominating phage, crAssphage, in the human gut of people across populations (84), therapeutic applications (85) and application of CRISPR-Cas genetic engineering to target the plasmid-located antibiotic resistance determinants of bacterial pathogens (86).

Phages can be used in various biotechnological applications, such as phage therapy, pathogen detection or biofilm degradation (see Figure 1.1).

Figure 1.1 Potential phage applications. Reprinted from: De Smet J, Hendrix H, Blasdel BG, Danis-Wlodarczyk K, Lavigne R. Nature Reviews Microbiology 2017, 15:517-530, with CCC RightsLink permission.

2.2.3 Phage life cycle

Bacteriophage infection commences when the phage particle binds to specific receptors on the surface of a bacterial cell, mainly via electrostatic forces (87). The baseplate located at the bottom of the tail undergoes a conformational change that allows the phage genetic material packed in the capsid to be released through the tip of the tail tube into the cytoplasm of the bacterium. Virulent phages then engage in a lytic life cycle, where they replicate by taking over the host cell machinery, and the subsequently formed capsids, packaged with phage genetic material and assembled into virions, are released by cell lysis. The number of particles released per infected cell represents the burst size. Alternatively, temperate phages enter a lysogenic cycle following infection by integrating their genome into the bacterial chromosome. The inserted phage genome remains within a lysogenic bacterium and replicates with bacterial cell division. Upon exposure of the bacterial host to stress conditions, such as nutrient limitation, changes in pH or temperature, or exposure to antibiotics or DNA-damaging agents (88), the prophage can be excised from the bacterial genome and enter a lytic cycle. The availability of viable bacterial cells favours the lytic phage life cycle, whereas lysogeny increases the phage’s chances of further infections and reproduction when viable bacterial cells become scarce. It has recently been reported that a mechanism for switching between lytic and lysogenic phases involves inter-virus communication via a small peptide molecule (89).
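Burst size is commonly estimated from a one-step growth experiment as the number of phage released per infected cell. The sketch below shows one conventional way of doing this arithmetic; the function name, its parameters and the titres used are hypothetical and are not taken from the experiments reported in this thesis.

```python
def burst_size(latent_titre, plateau_titre, free_phage_titre=0.0):
    """Estimate burst size from one-step growth curve titres (all in pfu/mL).

    latent_titre:     plaque count during the latent period, i.e. infected
                      cells plus any unadsorbed free phage
    plateau_titre:    plaque count after lysis is complete
    free_phage_titre: unadsorbed phage measured during the latent period
    """
    infected_cells = latent_titre - free_phage_titre
    if infected_cells <= 0:
        raise ValueError("no infected cells: check the titres supplied")
    released_phage = plateau_titre - free_phage_titre
    return released_phage / infected_cells


# Hypothetical titres: 1e4 pfu/mL in the latent period, 5e5 pfu/mL at plateau.
print(burst_size(1e4, 5e5))        # 50.0
# Correcting for 2e3 pfu/mL of unadsorbed phage raises the estimate.
print(burst_size(1e4, 5e5, 2e3))   # 62.25
```

Subtracting the unadsorbed phage matters most at low adsorption efficiency, where free phage would otherwise be counted as infected cells.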

Apart from these two well-studied phage propagation pathways, alternative types of phage-host interactions, such as pseudolysogeny, phage “carrier state” and chronic infection, have been described (90-93). Pseudolysogeny usually occurs when hosts experience nutrient-limited conditions (such as starvation) and is therefore particularly relevant to natural environments. The most comprehensive definition of pseudolysogeny describes it as “the stage of stalled development of a bacteriophage in a host cell without either multiplication of the phage genome (as in lytic development) or its replication synchronized with the cell cycle and stable maintenance in the cell line (as in lysogenization), which proceeds with no viral genome degradation, thus allowing the subsequent restart of virus development” (92).

Lytic phage infection of the sensitive portion of an otherwise resistant bacterial culture can result in an association in which the phage and bacteria reach an equilibrium. This phenomenon is known as a phage “carrier state” (93, 94).

Additionally, chronic host infection by some phages, such as filamentous phages and particular archaeal viruses, involves the continuous shedding of viral particles without lysis of the host (67).

2.2.4 Modes of evolution

Comparative analysis of phage genomes reveals two seemingly opposite insights: the remarkable diversity of nucleotide sequences and, at the same time, considerable underlying similarities in overall gene organisation (68). For example, two unrelated phages typically share little or no common nucleotide sequence, yet they can display a nearly identical order for some of their structural genes (95). Conversely, sequence-related genes have been identified between distinct families of tailed phages infecting bacteria of a wide phylogenetic range, indicating that these genes have a shared and conserved ancient ancestry (96).

One model proposes that phages evolved by modular evolution (97), where modules represent clusters of genes with a particular function, such as DNA replication, DNA packaging, head and tail morphogenesis, transcription, lysis or lysogeny. Each phage genome contains a unique combination of modules, which can be exchanged by homologous recombination between phage genomes (69) and lead to the reassortment of existing gene modules (68). The boundary between the modules is sharp and evidently discontinuous, pointing to a distinct evolutionary origin of the linker sequences (69).

The horizontal transfer of genes has been recognized as the foremost mode of phage evolution (76, 98). It has frequently been observed between diverse phages that share the same environment and have overlapping host ranges (98). Genetic exchanges most likely happen when two phages infect the same bacterial cell (coinfecting phages) or when a single phage infects a cell that already has one or more prophages within its genome (96). Lytic phages are less prone to recombination than temperate phages. Hypervariable regions of phage genomes are often related to adaptation to a particular host or to an environmental niche (99), while DNA replication, metabolism and morphogenesis genes are much more conserved and precisely structurally organised (98). Phage genomes are arranged in clusters of functionally related genes, and these clusters are often exchanged in blocks during recombination (100). Further exchanges may happen within modules or within genes (101). Additionally, late and early gene clusters of different evolutionary origins can also readily combine (95).

The exchanges that involve regions without sequence similarity or with minimal sequence similarity are termed nonhomologous or illegitimate recombination (102).

Non-homologous recombination events between different phage genomes occur randomly and, if within coding sequences, will mostly result in non-productive exchanges and non-viable progeny (103). Hence, these processes result in rigorous selection for functional phages in the surviving population (104). Recombination events that are likely to be productive probably occur in intergenic regions, where they do not interfere with functional modules (102). The detection of genes with a bacterial origin in phage genomes is also not unusual. Phages can exchange and acquire DNA fragments that originate from the host bacterial chromosome, plasmids (98), foreign DNA from the environment (102) and even viruses (105).

With such extensive genetic exchanges, phage genomes may end up consisting of genes of disparate origin, which renders the characterization of phylogenetic relationships and the subsequent phylogenetic phage placements difficult to establish and reconcile (76, 98). A limited number of morphological characteristics (106) and nucleotide sequence similarity between phage genomes cannot be taken as the exclusive measure of their relatedness (95). The propositions for consideration of a single structural module, such as head or tail genes (107), a phage proteomic tree (108) or for placing a phage into more than one group to account for their pervasive mosaicism and complexity (106) represent challenges for the official ICTV taxonomy.

2.3 DAIRY PHAGES

2.3.1 Effect of phages in the dairy industry

The dairy industry has long been burdened with phage problems. It has been estimated that between 0.1% and 10% of dairy production batches have detectable phages (109). Depending on the level of phage infection, the consequences can range from a slow-down of fermentation to its complete halt in extreme cases (110, 111). Acidification delays may result in inferior product quality due to changes in properties such as texture, taste and flavour. A relatively high pH may compromise the product’s microbiological safety due to potential contamination with undesirable microbiota (110, 112). Complete acidification failures can lead to substantial financial losses for the affected dairy producers (113, 114). Phage infection is a particularly serious risk to cheese production due to the broad range of phages capable of infecting the main starter culture, Lc. lactis (33, 115, 116).

2.3.2 The sources of phage in the dairy industry

Slow acid production caused by phages was first recognised by Whitehead and Cox in 1935, and tracing the origin of phages has since been the subject of many research studies (112, 116, 117). Phages can gain entry into the factory in raw milk, where they exist naturally at titres of 10^1 – 10^4 pfu/ml (116, 118, 119). It is presumed that wild Lc. lactis strains colonising plants used for livestock feed serve as hosts for phage proliferation in raw milk (116). When milk from different farms is combined for processing and fermentation, phage biodiversity and numbers can increase (114). Pasteurisation is not equally efficient at removing all phages. For example, lactococcal phages of the 936 species have been shown to remain viable not only after pasteurisation (116), but also at higher temperatures, surviving 90°C for 20 min or 97°C for 5 min (120). The reutilisation of milk products, such as concentrates, whey permeates and milk powders, as ingredients in cheese production to increase product yield (121) or to improve texture (122) can contribute to the spread of thermo-resistant phages (123).

Drained cheese whey is the natural environment for phages (124), where they can reach titres as high as 10^9 pfu/ml (120). From contaminated surfaces and through whey splashes, phages can readily spread throughout a dairy plant (6, 125) via aerosols (73).

Another source of virulent phage emergence and phage colonisation in dairy plants is lysogenic starter cultures (6, 112). It is well known that lysogeny is widespread among lactococcal strains (126). If a lysogenic starter strain is used for cheese making, the prophage can be induced either spontaneously (127) or by the manufacturing conditions (128). The progeny of such phages can then enter the lytic infection cycle in starter strains other than their lysogenic host (11).

2.3.3 Phage control measures

Despite years of effort by dairy scientists, it has proven difficult to completely eliminate phages from the dairy environment (6, 129). To address this, industry has focused on risk mitigation strategies, such as cleaning and disinfection of milk processing equipment, lines and surfaces; worker hygiene; improved factory design; and adequate air filtration and ventilation (6, 109). In addition to the introduction of good manufacturing practices, a sound dairy management program also includes the use of strain rotation systems, which involve the sequential use of strains based on phage-host relationships (130). The strains for rotation are selected from phage-unrelated strain groups or are composed of strains with natural phage resistance mechanisms (30). However, the continual rotation of multiple strains also has the potential to drive phage co-evolution and diversification and may lead to the emergence of hybrid “super-phages” with extended host ranges (12, 131). The design of a lactococcal strain with incorporated phage defense systems (e.g. R/M or Abi) through conjugation or in vitro self-cloning (132, 133) has been shown to be a useful solution for controlling phages (134).

An additional measure can include the use of Phage Inhibitory Medium (PIM) for strain propagation (6). These types of media are based on the inclusion of phosphates to bind calcium (or other soluble divalent cations, such as magnesium or manganese) and thus prevent phage infection. However, although calcium ions may be beneficial for phage replication, many lactococcal phage species do not require calcium for adsorption; hence, phage inhibitory media are often ineffective (135). These media formulations may also negatively influence starter growth rates and acidification abilities (136).

2.3.4 Lactococcal phage types and prevalence

The International Committee on Taxonomy of Viruses (ICTV) classifies the lactococcal phages into the Caudovirales order and further into the families Siphoviridae, which possess long non-contractile tails, and Podoviridae, which have short tails (ICTV Taxonomy Release #34: 2018b https://ictv.global/taxonomy). Following the revision of the original classification of lactococcal phages into 12 groups (137), there are currently ten recognised lactococcal phage species based on the analysis of the available completely sequenced genomes, electron microscopy and Southern hybridization studies (10). Apart from the three main groups, 936, c2 and P335 (10), the following species belong to the Siphoviridae family: Q54 (138), P087 (139), 949 (140), 1358 (141) and 1706 (142). The remaining two species, KSY1 (143) and asccphi28/P034 (144), are identified as members of the Podoviridae family. The Myoviridae phages, characterized by long contractile tails consisting of a sheath and a central tube, have rarely been isolated on lactococcal hosts (145, 146).

Of most relevance to the dairy industry are three phage groups: 936, P335 and c2, as they are responsible for most dairy fermentation failures worldwide (109, 147-150). In particular, the most frequently isolated phage over the years and in dairy plants of various countries is the 936-type phage (119, 147, 149, 151-153). It is reasoned that these three phage species dominate within manufacturing plants with large-scale cheese production due to the intensive use of a limited number of specifically selected industrial starter cultures (10, 154). In contrast, in cheese produced on a small scale using traditional methods, the so-called rare phage groups, such as 949 and P087, appear to be dominant (1).

Figure 1.2 | Transmission electron micrographs of Lactococcus lactis phages representing different morphotypes and lactococcal phage species. Reprinted from Atamer S, Samtlebe M, Neve H, Heller KJ, Hinrichs J. Front Microb 2013, 4:191 under the Creative Commons Attribution 4.0 International License.

2.3.5 Lactococcal phage characteristics

The 936 and c2 groups consist of obligatory lytic phages, while the P335 species contains both temperate and lytic phages (155). The 936 and P335 types possess small isometric capsids (morphotype B1, Bradley), while the c2 species is characterized by a prolate-shaped head (morphotype B2) (137). Early research identified a lack of DNA homology and serological relatedness between lactococcal phage species (143, 151, 156, 157). Different morphologies are characterised by different phage genome sizes, where DNA length correlates with the size of the phage head (112). The prolate-headed phages have the smallest genome size of 20 – 22 kb; the genome length of 936-type phages ranges from 26 – 32 kb and that of the P335 phage species from 33 – 48 kb (143, 158). The 936 and c2 phage groups have a high level of intraspecies genetic conservation (154), with most differences between their members having arisen from point mutations and small insertions or deletions (159). Vertical evolution was particularly evident for 936-like phages, which phylogenetically clustered with their hosts (160). Their evolution is believed to have occurred through the exchange of DNA modules with other phages of the same species (150). The homologous recombination between 936 and c2, as identified in the formation of the phage Q54, is a rare event and probably occurred during host co-infection (138).

A greater diversity is encountered among the P335 types (161), which represent a polythetic species where virulent phages and prophages easily recombine (10). P335 phages have a narrow host range and a possible affinity towards the lactis subspecies (150). Members of the P335 species share modules, yet they do not have a single common attribute (161, 162). They are very prone to homologous genetic recombination and genome reshuffling (13). A lytic P335-type phage was observed to acquire a large fragment of chromosomal DNA and evolve into a new lytic phage (162). The recombination events that involve resident prophages in bacterial strains and lytic P335 phages are seen as contributors to the emergence of the evolved virulent types (13, 14, 155), though this link has not been unambiguously confirmed (163, 164).

On the basis of the observed gene order similarity to that of coliphage λ, particularly for 936 and P335-type phages, it has been proposed that lactococcal phages be assigned to a λ supergroup within the Siphoviridae family (95, 107).

The most probable route for the appearance of rare phages has been through exchanges of genetic material between phages and bacteria interacting in diverse environments (142, 143).

2.4 PHAGE CHARACTERISATION METHODS

2.4.1 Phenotypic characterisation

An essential step in characterising phages is their visualization through the use of transmission electron microscopy (TEM). The technique is based on negative staining of a phage suspension, commonly with uranyl acetate or phosphotungstic acid (165), and does not require specific reagents for particular viral samples (166).

The most significant contributions to revealing the morphological structure of bacteriophages have been made by two researchers, Bradley (167), and later Ackermann, who analysed TEM images of thousands of phages (168). The visualization revealed the fundamental phage structure of a nucleic acid core surrounded by a protein coat, which together form the infective particle, termed a virion. The images demonstrated that phages follow strict geometry and rules of symmetry (167). The observations of phage structural characteristics, such as the size and shape of the head, the tail and the presence or absence of appendages, such as collars, whiskers and fibres (169), together with the determination of the nucleic acid types (170), formed the basis for the first phage classification proposals (167, 171). TEM analysis is still an essential tool in phage taxonomy, but new approaches to the phage classification system aim to incorporate contemporary genomic and proteomic tools (172, 173).

Modern electron microscopy techniques, such as immunoelectron microscopy and cryo-electron tomography (cryo-ET), have facilitated the advancement of research into the viral structural components and the elucidation of viral replication processes and morphogenesis (166). For example, cryo-ET has enabled visualisation at nanometer scale level of a phage piercing the thick Gram-positive cell wall of Bacillus subtilis to initiate infection (174) as well as the extensive conformational changes that the baseplate of T4 is subjected to during adsorption (175). Cryo-ET has also revealed completely new features, such as the nucleus-like shell that encapsulates the replication machinery and the replicating DNA of phage 201ø2-1 during a lytic infection of Pseudomonas chlororaphis (176). This nucleus-like structure is positioned at the center of the bacterial cell by cytoskeletal filaments formed by a phage-encoded protein with a structural homology to tubulin (177) and is proposed to be protected from bacterial defense systems (70).

Further development of three-dimensional electron microscopy is expected to provide insight into how viral infection modifies cellular structures and to clarify the role of different viral proteins in the infection process. This knowledge may further aid in the design and development of targeted antiviral drugs (178).

2.4.2 Genetic characterisation

Restriction digestion

Currently, the most widely used method for genetic profiling of isolated phages and for the estimation of their genome size involves restriction enzyme digestion of their DNA. The three most prominent restriction enzymes applied to phage that infect various bacterial species include HindIII, EcoRI and EcoRV (117, 162, 179). However, phage DNA is often resistant to restriction digestion due to the scarcity of recognition sequences or the modification of phage DNA (7, 157, 162). The modification of bases at or near recognition sites by DNA methylases can induce DNA conformational changes or improper protein-DNA contact at a catalytic site, which inhibits hydrolysis (180).

It is usual to test several restriction enzymes to select the most suitable for adequate digestion of the DNA of the phages of interest. Some phages have developed anti-restriction mechanisms against the corresponding restriction enzymes in their bacterial hosts (181). Accordingly, the DNA of many phages, particularly when highly modified, can be difficult to cleave with many enzymes (7, 117) or can result in very slow and incomplete digestion (180). In contrast, digestion by numerous enzymes, such as has been observed for phages of Lb. bulgaricus and some Lc. lactis, seems to be rare (182).
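
The fragment pattern expected from a candidate enzyme can be previewed in silico when a sequence is available. The Python sketch below is illustrative only: it assumes a linear, unmodified genome and a complete digest, and the toy sequence and function names (`digest`, `estimate_genome_size`) are hypothetical. The recognition sequences are the standard ones for HindIII (A^AGCTT), EcoRI (G^AATTC) and EcoRV (GAT^ATC). Summing the fragment lengths mirrors the way genome size is estimated from gel bands, and an enzyme without a recognition site returns a single uncut fragment.

```python
# Standard recognition sequences with the cut position on the top strand,
# given as an offset from the first base of the site.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "EcoRV": ("GATATC", 3),    # GAT^ATC
}


def digest(seq, site, cut_offset):
    """Return fragment lengths (bp) for a complete digest of a linear molecule."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def estimate_genome_size(fragments):
    """Genome size estimate: the sum of all fragment lengths."""
    return sum(fragments)


# Hypothetical 82 bp toy sequence with one EcoRI site and one HindIII site.
toy = "ACGT" * 5 + "GAATTC" + "TTGCA" * 4 + "AAGCTT" + "CCGTA" * 6

profiles = {name: digest(toy, site, off) for name, (site, off) in ENZYMES.items()}
for name, fragments in profiles.items():
    print(name, fragments, estimate_genome_size(fragments))
```

In this toy case EcoRV has no site, so its profile is one uncut fragment, which is also what a resistant or heavily modified genome would look like on a gel.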

It has been observed that restriction digestion patterns may vary with experimental conditions, which limits data reproducibility and comparison (160). An improvement in the restriction fragment length polymorphism (RFLP) method has been achieved by incorporating a fluorescent primer into each amplified restriction fragment so that the fragments can be visualised during capillary electrophoresis (fRFLP) (183). While this approach enables digitisation of restriction digestion profiles and their easier interlaboratory comparison, it has certain drawbacks, which include a lack of detection of minor host-induced mutations and the limitation of being restricted to comparing bands of less than 1000 bp in size (183).

Random amplified polymorphic DNA (RAPD)

Random amplified polymorphic DNA (RAPD)-PCR has been used to explore the genetic diversity of phages infecting various bacterial and algal species (8, 184-187). Apart from the genotyping of individually isolated phages, it has been applied to studying genetic changes within entire viral communities (188-190). However, this approach requires a high degree of protocol optimisation to achieve good results (184, 186, 189). The first challenge to the method is the selection of a suitable set of primers. The use of a single primer may not be sufficiently discriminative or reproducible due to the insufficient resolving power of typical sequence-specific decamer primers (184). Commonly, the fingerprint patterns from three different primers are pooled (185, 186, 191), though up to ten pooled primers have been reported (192). For optimal RAPD-PCR performance, the settings of each particular analysis need to be specifically adjusted.
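
The pooling of fingerprints from several primers can be sketched in a few lines of Python. This is a minimal illustration and not the procedure of any cited study: the primer labels, the band sizes and the choice of the Dice coefficient as the similarity measure are all assumptions made for the example, and band sizes are taken to be already binned so that matching bands have identical values.

```python
def pooled_bands(fingerprints):
    """Pool per-primer band sets into one set of (primer, size_bp) pairs."""
    return {(primer, size) for primer, sizes in fingerprints.items() for size in sizes}


def dice_similarity(a, b):
    """Dice coefficient: 2 * shared bands / total bands in both pooled profiles."""
    pa, pb = pooled_bands(a), pooled_bands(b)
    if not pa and not pb:
        return 1.0  # two empty profiles are treated as identical
    return 2 * len(pa & pb) / (len(pa) + len(pb))


# Hypothetical band sizes (bp) for two phages typed with three primers.
phage_a = {"P1": {300, 750, 1200}, "P2": {500, 900}, "P3": {400}}
phage_b = {"P1": {300, 1200}, "P2": {500, 1100}, "P3": {400, 650}}

similarity = dice_similarity(phage_a, phage_b)
print(round(similarity, 3))
```

Keeping the primer label with each band prevents bands of the same size from different primers being counted as shared.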

Multiplex PCR methods

Multiplex PCR systems offer great utility to the dairy industry for the detection and classification of the three main lactococcal phage species, 936, P335 and c2, as well as Str. thermophilus and Lb. delbrueckii phages (9, 193). These rapid and sensitive tests are based on the generation of specific PCR amplification products for each phage species. They can detect phage in milk, yoghurt, cheese whey, phage lysates and starter cultures (P335 prophage type). This makes them useful for phage monitoring, milk treatment decisions and the selection of appropriate phage-insensitive starter culture rotations (193).

2.5 PHAGE GENOME SEQUENCING

The decreasing cost of whole genome sequencing and the development of advanced sequencing technologies have enabled a rapid increase in the number of new phage sequencing projects. In the last three years, the number of fully sequenced phage genomes in the NCBI genome database has doubled (194).

The quality of these finished genomes has been reported to vary (195). However, when a phage is intended for use in therapeutic applications and requires regulatory approval, its characterization by sequencing, assembly, annotation and manual checks is subjected to the utmost rigor (194).

Genome sequencing projects commence with the selection of a sequencing platform. The choice of platform is dependent on the size of a genome to be sequenced, sequencing error rates, the presence of repeat sequences and the extent of genome duplications (196). Different sequencing platforms differ in biochemistries and have their own advantages and disadvantages, in terms of accuracy, yield of high-quality sequences, cost and run time (speed) (197). Generally, short read technologies are characterised by higher accuracy and deeper coverage, while long read sequencing technologies are more likely to produce errors that may require correction with short reads (194).

From the first generation sequencing platforms that delivered longer read outputs, such as traditional Sanger sequencing (~1 kb sequence reads) and Roche 454 (up to 800 bp), the trend has moved towards short read technologies, such as Illumina (usually 2 x 150 bp) (196). The third generation Pacific Biosciences (PacBio) platform produces long reads (3000 – 15 000 bp) and is less expensive, but is characterised by high error rates (198). Although not without its challenges related to the reliable detection of different nucleotides (Agah 2016), Nanopore single-molecule sequencing, which has been termed the 4G technology, provides cost and speed advantages (199, 200). These include real time data acquisition and analysis of long reads generated from the direct sequencing of RNA and DNA (201). It can also detect additional features of genomic DNA, such as epigenetic modifications and strand breaks (202). The MinION sequencer device, which was introduced in 2014 by Oxford Nanopore Technologies, sequences individual, long native DNA strands. It is portable and convenient to use and, with improvements in subsequent data analysis, offers significant benefits for research and clinical applications (201-203).
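
The link between read length, read number and depth of coverage can be made concrete with the standard approximation C = N x L / G, where N is the number of reads, L the read length and G the genome length. The Python sketch below is a back-of-the-envelope illustration: the 30 kb genome (within the range quoted above for 936-type phages), the 100-fold target depth and the function names are assumptions chosen for the example, and the formula ignores uneven coverage and reads lost to quality filtering.

```python
import math


def mean_coverage(n_reads, read_len_bp, genome_len_bp):
    """Expected mean depth of coverage, C = N * L / G."""
    return n_reads * read_len_bp / genome_len_bp


def reads_needed(target_coverage, read_len_bp, genome_len_bp):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_len_bp / read_len_bp)


# Assumed example: a 30 kb phage genome sequenced with 2 x 150 bp read pairs,
# treating each pair as 300 bp of sequence.
genome_len = 30_000
pair_len = 2 * 150
pairs = reads_needed(100, pair_len, genome_len)
print(pairs, mean_coverage(pairs, pair_len, genome_len))
```

Because phage genomes are small, even a small fraction of a short read run gives a deep mean coverage under this approximation.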

Sequencing data analysis incorporates the following steps: an initial quality check of the filtered data, followed by an alignment of sequences to a reference genome or de novo assembly, and structural and functional genome annotation (204, 205). In parallel with the advancement of high-throughput sequencing technologies, bioinformatics algorithms, software and integrated data analysis pipelines are being developed with the aim of improving the accuracy of homology searches and taxonomic assignments (206, 207).

Although phage genomes are typically small in size, specific challenges have been experienced in sequencing them, such as the determination of the type and location of genome ends (208). The smallest sequenced phage genome is that of Leuconostoc phage L5 (2435 bp) and the largest is that of Pseudomonas phage 201phi2−1 (316,674 bp) (69). The analysis of phage genome sequencing data is additionally challenging as many viral proteins, including those that are prophage-encoded, are listed in databases as belonging to microbial genomes (209).

The majority of molecular knowledge and the adopted paradigms stem from research on model system phages, mostly those of E. coli (76). However, it is recognized that the growing number and diversity of sequenced cultured phage genomes can contribute new knowledge about viral genes, their function, and their influence on the metabolic processes of their bacterial hosts (210). Additionally, phage re-sequencing provides the possibility to understand their genome evolution (211).

The nucleotide sequences of 142 completely sequenced lactococcal phage genomes are currently deposited in the NCBI repository database (https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/find-data/virus).

Sequencing efforts in the dairy field have significantly contributed to the understanding of phage origin, diversity and the relationships between various phages, and have provided clues to their evolutionary history. For example, the sequencing of the lactococcal phage Q54 revealed an unusual genomic organisation that resulted from homologous recombination between 936- and c2-like phages (138). This would be thought to be an unlikely event for these two groups of phages as they both lack temperate members (11). In another instance, the characterisation of the genome of the lytic Lc. lactis phage P087 disclosed a mosaic structure that also contained modules with identities to an Enterococcus faecalis prophage (139). Further, although displaying a c2-like morphotype, Lc. lactis phage asccφ28 appeared more related to the Strep. pneumoniae phage Cp-1 and the φ29-like phages of Bacillus subtilis than to other Lc. lactis phages, as inferred through similarities in their gene arrangement, the mode of DNA replication, and protein sequence homology (144).

2.6 REPETITIVE-PCR

Repetitive-PCR is based on the amplification of unique single-copy DNA regions located between successive repetitive elements by using these elements as primer binding sites for a PCR reaction (212).

Bacterial DNA repetitive sequences, which have a proposed role in regulating gene expression, were originally identified in the genomes of Escherichia coli and Salmonella typhimurium and were termed Repetitive Extragenic Palindromic (REP) sequences (213, 214). Subsequently, other short repetitive polynucleotide sequence patterns were discovered, such as Enterobacterial Repetitive Intergenic Consensus (ERIC) sequences (215), BOX sequences (216) and tandemly repeated trinucleotide sequences (GTG)5 (217). The most recently described class of noncoding repetitive sequences are those associated with CRISPR, which, through interaction with cas genes, provide acquired resistance to bacteriophage infection (218).

The BOX repetitive DNA elements were discovered in Gram-positive Streptococcus pneumoniae (216). Unlike other repetitive sequence types, the organization of the BOX element is modular and consists of various combinations of three subunits, boxA, boxB and boxC (216). Using moderate stringency, only the boxA subsequences have a high degree of conservation among diverse bacteria (219). Complementary oligonucleotide primers BOXA1, BOXA1R and BOXA2R were designed based on the consensus sequences of the boxA subunit (216). The inverted orientation of several consecutive repeats in the genome with respect to each other renders the use of a single primer sufficient to simultaneously amplify multiple DNA fragments of different sizes (220). It has subsequently been shown that BOXA-based primers could be used for DNA fingerprinting of isolates from both Gram-positive and Gram-negative bacterial species (219). BOXA1R-PCR and BOXA2R-PCR bacterial strain typing has been applied in epidemiological (221, 222), ecological (223) and industrial application studies (224). It has also previously been used as a complement to RAPD for phage characterization (185), but its use as a sole method for phage genotyping has not been reported.
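
The way a single primer yields several amplicons from inverted repeats can be illustrated with a short in silico sketch in Python. It is a simplification under stated assumptions: the primer (`GATTACA`) and the template are hypothetical toy sequences rather than a real BOX primer or genome, only exact matches are considered (real rep-PCR tolerates mismatches at moderate stringency), and every primer site paired with a downstream inverted site within `max_len` is reported as a potential product.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq, sub):
    """Start positions of every (possibly overlapping) occurrence of sub in seq."""
    hits = []
    pos = seq.find(sub)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(sub, pos + 1)
    return hits


def single_primer_amplicons(template, primer, max_len=3000):
    """Sizes (bp) of products bounded by the primer and its inverted copy."""
    forward = find_all(template, primer)          # primer extends rightwards from here
    inverted = find_all(template, revcomp(primer))  # primer extends leftwards from here
    sizes = []
    for i in forward:
        for j in inverted:
            size = j + len(primer) - i
            if j > i and size <= max_len:
                sizes.append(size)
    return sorted(sizes)


# Hypothetical template with two forward and two inverted primer sites.
primer = "GATTACA"
template = (primer + "C" * 50 + revcomp(primer) + "C" * 20
            + primer + "C" * 100 + revcomp(primer))

amplicons = single_primer_amplicons(template, primer)
print(amplicons)
```

Each distinct spacing between a repeat and a downstream inverted repeat contributes one band, which is why the number and size of bands can differ between genomes.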

3. OBJECTIVE/AIMS

The objective of this dissertation was to contribute knowledge that would be of relevance and practical benefit in the dairy field.

Three main aims of this thesis were:

1) to develop a simple and rapid genetic method that could provide speciation, subspeciation and strain differentiation of Lc. lactis in a single reaction;

2) to investigate whether repetitive-PCR typing is applicable to bacteriophages, such that amplified viral DNA produces specific and stable fingerprints that could be used for phage genotyping;

3) to perform phenotypic and genomic characterization of the lactococcal bacteriophage vB_LacS_15(Mo9).

CHAPTER 2

APPLICATION OF COLONY BOXA2R-PCR FOR THE DIFFERENTIATION AND IDENTIFICATION OF LACTIC ACID COCCI

(reproduced word for word from the article published in Food Microbiology, https://doi.org/10.1016/j.fm.2019.02.011, Copyright © 2019 Elsevier B.V.)

ABSTRACT

Repetitive-PCR (rep-PCR) is a well-established genetic method for bacterial strain fingerprinting that is used mostly with REP, ERIC, (GTG)5, BOXA1R and occasionally BOXA2R repetitive primers. In this study, it was demonstrated that BOXA2R-PCR could effectively discriminate between Lactococcus lactis, Leuconostoc mesenteroides and Streptococcus thermophilus; differentiate Lactococcus lactis strains and subspeciate them into lactis and cremoris in a single reaction; and generate unique strain fingerprints of various lactic acid bacteria (LAB) species commonly isolated from fermented dairy products, including occasional spoilage bacteria and yeasts. Furthermore, using direct colony PCR, a reproducible and rapid method was developed for the differentiation and identification of lactic acid cocci. The simplicity and speed of this microbial identification method have potential practical value for dairy microbiologists, which was demonstrated through a microbiota investigation of select Australian retail dairy products.

Highlights

Direct colony BOXA2R-PCR:

- subspeciates Lactococcus lactis into lactis and cremoris genotypes and differentiates individual strains in a single reaction;

- fingerprints strains isolated from LAB fermented food products, including occasional contaminant bacteria and yeasts;

- discriminates Gram-positive coccal LAB: Lactococcus lactis, Leuconostoc mesenteroides and Streptococcus thermophilus from each other and from the Enterococcus genus.

Key words: Repetitive-PCR, BOXA2R, Colony-PCR, Lactic acid bacteria, Dairy yeasts

1. BACKGROUND

Various fermented food and feed products and probiotics contain diverse species and strains of LAB, which contribute to their nutritional, organoleptic, preservation or health-promoting properties. Industrially manufactured dairy products typically use defined starter cultures, while artisanal products, such as the traditional raw milk cheeses (1, 2), represent a rich source of complex natural microbiota. To genetically analyze the composition of complex bacterial communities, a variety of culture-independent techniques have been developed, including PCR-DGGE, SSCP and LH-PCR (3).

Lactococcus lactis subsp. lactis and subsp. cremoris are used as starter cultures for fermented dairy products, in particular cheese. Species identification, subspecies determination and strain differentiation of Lactococcus lactis, the main starter culture used in cheese production, usually involve two stages and two sets of methods (4, 5). Presumptive lactococci are initially identified by 16S rRNA analysis (5). The amplification of the 16S rRNA genes followed by partial DNA restriction analysis (partial ARDRA) of the amplicons has previously been used to identify isolates as belonging to Lactococcus lactis (6). PCR-based methods with subsequent restriction endonuclease digestion of the products have been used to detect species- and subspecies-specific sequence polymorphisms in the gadB (glutamate decarboxylase) gene (7), the histidine biosynthesis operon (5) or the 16S rRNA gene (8). If the strain DNA is resistant to digestion with a particular restriction enzyme or if digestion profiles of amplicons with one enzyme are identical, then more than one enzyme will be required. However, these methods do not provide any differential information at the strain level. Conversely, application of a typing technique that is highly discriminatory at the strain level does not provide any genus or species information (9). Restriction fragment length polymorphism (RFLP) typing by PFGE has been used to evaluate strain diversity and relatedness (6). The use of multiple enzymes in RFLP-PFGE increases the technique’s discriminatory power but renders the method technically complicated and time-consuming (6). MLST analysis of the genetic diversity of Lactococcus lactis based on analysis of partial nucleotide sequences of seven house-keeping protein encoding genes (5, 6, 10) has consistently clustered strains into lactis and cremoris genotypes. However, the disadvantage of this method is “the substantial costs for reagents, equipment and labour to complete the requisite amplification, sequencing, editing and the concatenating of the multiple housekeeping genes” (11).


RAPD and repetitive-PCR are two genetic methods that enable fast and inexpensive fingerprinting. The RAPD method using the 10-mer arbitrary primer has been assessed as generally not appropriate for species identification (12) with the major issue being that it is not sufficiently discriminatory for Lc. lactis ssp cremoris (13). It also requires the use of more than one primer (12, 14) and displays a limited capability to distinguish between genetically related strains due to the low number of amplicons obtained with most RAPD primers (9).

The repetitive PCR (rep-PCR) fingerprinting technique enables the amplification of DNA fragments of different sizes located between interspersed repetitive sequences and was originally developed to distinguish strains of diverse Gram-negative species (15). The BOX family of repetitive DNA elements identified in Gram-positive Streptococcus pneumoniae consists of different combinations of three subunits, boxA, boxB, and boxC (16, 17). Only the boxA-like subunit sequences appear to be highly evolutionarily conserved among both Gram-positive and Gram-negative bacteria (17). BOXA1R and BOXA2R primers are based on the boxA sequence and have been used in rep-PCR amplification of DNA from many bacterial species (17).

This work aimed to investigate the applicability of the repetitive-PCR method using the single oligonucleotide primer BOXA2R for the genetic identification of species, subspecies and strains of Lactococcus lactis. Our interest in exploring the use of the BOXA2R primer was based on an initial evaluation of several repetitive primers, which indicated that the BOXA2R primer produced the most informative fingerprints for lactic acid cocci. Cultures from various Australian dairy products were then isolated to test its usefulness for fingerprinting of diverse dairy microbial isolates.

2. MATERIALS and METHODS

2.1 Strains

Thirteen dairy products purchased from retail stores in Sydney, NSW, Australia in the first half of 2016 (see Table 2.1) were analyzed. Out of one hundred and six strains isolated from these products and initially screened, forty-four unique isolates were included in this study: thirty-four LAB, four non-LAB bacteria and six yeasts. Additionally, nine Lactococcus lactis ssp lactis biovar diacetylactis and six Leuconostoc mesenteroides strains were isolated from the mesophilic starter type culture Flora Danica (FD). Reference strains and Lc. lactis laboratory strains were sourced from our internal culture collection (see Table 2.2). The total number of tested strains was eighty-one.

Table 2.1 The list of dairy products used for strain isolation

| PRODUCTS | Type of product | Made in |
|---|---|---|
| Bega Farmers' Tasty Cheese | cheese | Australia |
| Lemnos Cheese Fetta | cheese | Australia |
| Riverina Full Cream Australian Fetta cheese | cheese | Australia |
| Danish Fetta Cheese | cheese | Australia |
| Henry Willig's Cow Cheese with Herbs | cheese | Holland |
| Bulla Lite’n Healthy Plain Yoghurt | yoghurt | Australia |
| CHTAURA Natural Set Yoghurt | yoghurt | Australia |
| Bekaa Natural Set Yoghurt | yoghurt | Australia |
| Green Valley Dairy Yoghurt | yoghurt | Australia |
| "Paprika u pavlaci" (Bell peppers stuffed with ) | sour cream | |
| Weight Watchers Extralight Sour Cream | sour cream | Australia |
| Dairy Farmers Buttermilk | buttermilk | Australia |
| Sharma's kitchen Mango | yoghurt | Australia |


Table 2.2 The list of strains used in this study

| STRAIN NAME | SPECIES | SOURCE | Identified by |
|---|---|---|---|
| ATCC19435T | Lc. lactis ssp. lactis | Dr W. Bridge | |
| ATCC19257T | Lc. lactis ssp. cremoris | Dr W. Bridge | |
| CSIRO4202T | Leucon. mesenteroides ssp. cremoris | Dr W. Bridge | |
| CSIRO4301T | Leucon. mesenteroides ssp. dextranicum | Dr W. Bridge | |
| ATCC8293T | Leucon. mesenteroides ssp. mesenteroides | Dr W. Bridge | |
| UK712 | Lc. lactis ssp cremoris | Dr W. Bridge | |
| MG1363 | Lc. lactis ssp cremoris | Dr W. Bridge | |
| E8(NZ) | Lc. lactis ssp cremoris | Dr W. Bridge | |
| ML8 | Lc. lactis ssp lactis | Dr W. Bridge | |
| C2 | Lc. lactis ssp cremoris | Dr W. Bridge | |
| SL894 | Lc. lactis ssp lactis | Dr W. Bridge | |
| AM1 | Lc. lactis ssp cremoris | Dr W. Bridge | |
| AM2 | Lc. lactis ssp cremoris | Dr W. Bridge | |
| FG2 | Lc. lactis ssp cremoris | Dr W. Bridge | |
| 18 | Lc. lactis ssp lactis | Dr W. Bridge | |
| 19 | Lc. lactis ssp lactis | Dr W. Bridge | |
| ET2 | Enterococcus faecium | Dr W. Bridge | 16S rRNA sequencing |
| ET5 | Enterococcus durans/faecium | Dr W. Bridge | 16S rRNA sequencing |
| ET4 | Enterococcus faecalis | Dr W. Bridge | 16S rRNA sequencing |
| BA1 | Lactobacillus paracasei | this study | 16S rRNA sequencing |
| BA2 | Ochrobactrum anthropi | this study | 16S rRNA sequencing |
| BA3 | | this study | 16S rRNA sequencing |
| BA4 | Staphylococcus sp. | this study | 16S rRNA sequencing |
| BA5 | Lc. lactis ssp cremoris | this study | 16S rRNA sequencing |
| BA6 | Lc. lactis ssp lactis | this study | BOXA2R-PCR/KMK |
| BA7 | Strep. thermophilus | this study | BOXA2R-PCR |
| BA8 | Lactobacillus paracasei | this study | 16S rRNA sequencing |
| BA9 | Staphylococcus warnerii | this study | 16S rRNA sequencing |
| BA10 | Staphylococcus epidermidis | this study | 16S rRNA sequencing |
| BA11 | Streptococcus thermophilus | this study | BOXA2R-PCR |
| BA12 | Streptococcus thermophilus | this study | 16S rRNA sequencing |
| BA13 | Lactobacillus fermentum | this study | 16S rRNA sequencing |
| BA14 | Lactobacillus paracasei | this study | 16S rRNA sequencing |
| BA15 | Lactobacillus casei | this study | 16S rRNA sequencing |
| BA16 | Lactobacillus plantarum | this study | 16S rRNA sequencing |
| BA17 | Lactobacillus fermentum | this study | 16S rRNA sequencing |
| BA18 | Streptococcus thermophilus | this study | 16S rRNA sequencing |
| BA19 | Lb. delbrueckii ssp. bulgaricus | this study | 16S rRNA sequencing |
| BA20 | Lb. delbrueckii ssp. bulgaricus | this study | 16S rRNA sequencing |
| BA21 | Enterococcus faecalis | this study | 16S rRNA sequencing |
| BA22 | Enterococcus sp. | this study | 16S rRNA sequencing |
| BA23 | Str. thermophilus | this study | BOXA2R-PCR |
| BA24 | Str. thermophilus | this study | BOXA2R-PCR |
| BA25 | Str. thermophilus | this study | BOXA2R-PCR |
| BA27 | Str. thermophilus | this study | BOXA2R-PCR |
| BA28 | Str. thermophilus | this study | BOXA2R-PCR |
| BA29 | Str. thermophilus | this study | BOXA2R-PCR |
| BA30 | Leucon. mesenteroides | this study | BOXA2R-PCR |
| BA31 | Enterococcus faecalis | this study | 16S rRNA sequencing |
| BA32 | Lc. lactis ssp. cremoris | this study | BOXA2R-PCR |
| BA33 | Lc. lactis ssp. cremoris | this study | BOXA2R-PCR |
| BA34 | Lc. lactis ssp lactis | this study | BOXA2R-PCR/KMK |
| BA35 | Lactobacillus plantarum | this study | 16S rRNA sequencing |
| BA36 | Lc. lactis ssp. lactis | this study | 16S rRNA sequencing |
| BA37 | Lc. lactis ssp lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| BA38 | Streptococcus thermophilus | this study | BOXA2R-PCR |
| BA39 | Enterococcus faecalis | this study | 16S rRNA sequencing |
| KV1 | Kluyveromyces lactis | this study | ITS1 sequencing |
| KV2 | Candida parapsilosis | this study | ITS1 sequencing |
| KV3 | Candida zeylanoides | this study | ITS1 sequencing |
| KV4 | Kluyveromyces marxianus | this study | ITS1 sequencing |
| KV5 | Pichia cactophila | this study | ITS1 sequencing |
| KV6 | Kluyveromyces lactis | this study | ITS1 sequencing |
| FD1 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD2 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD3 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD5 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD6 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD9 | Lc. lactis ssp. lactis | this study | BOXA2R-PCR/KMK |
| FD8 | Leucon. mesenteroides | this study | BOXA2R-PCR |
| FDL4(=L5) | Leucon. mesenteroides | this study | BOXA2R-PCR |
| FDL5 | Leucon. mesenteroides | this study | BOXA2R-PCR |
| FDL10 | Leucon. mesenteroides | this study | BOXA2R-PCR |
| FDL12 | Leucon. mesenteroides | this study | 16S rRNA sequencing |
| FD11 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD13 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |
| FD15 | Lc. lactis ssp. lactis biovar diacetylactis | this study | BOXA2R-PCR/KMK |

T – Type strain
Strain synonyms:
ATCC19435T - DSMZ 20481; NCDO 604
ATCC19257T - DSMZ 20069; NCDO 607; HP
CSIRO4202T - ATCC19254; DSMZ20346; NCDO543
CSIRO4301T - ATCC19255; DSMZ20484; NCDO529
ATCC8293T - DSMZ20343; NCDO523
UK712 - NCDO712

2.2 Methods

Strain isolation. Dairy samples (10 g or 10 ml) were homogenized in 90 ml sterile 2% (w/v) trisodium citrate pre-warmed to 45°C (cheese and sour cream samples) or 0.9% (w/v) sodium chloride (buttermilk and yoghurt samples). Serial 10-fold dilutions of the suspensions in 0.9% (w/v) sodium chloride were thoroughly mixed by vortexing, and 0.1 ml of the appropriate successive dilutions was spread-plated onto the following media: M17 agar (Oxoid) with 0.5% (w/v) lactose (LM17), MRS agar (Oxoid) and KMK citrate agar (18). LM17 plates were incubated anaerobically for 24-48 h at 30°C and 37°C, and aerobically at 45°C; MRS plates were incubated anaerobically at 37°C and KMK plates at 30°C. Within two days following the incubation, the plates were inspected for the presence of microbial growth. Morphologically distinct colonies were purified by re-streaking, Gram-stained and stocked as broth cultures (M17 with 0.5% (w/v) glucose or MRS broth) with the addition of 15% (v/v) glycerol or in autoclaved 9.5% (w/v) milk. Yeast colonies, which were occasionally observed and picked from LM17 or MRS agar media, were plated on Sabouraud dextrose agar (Oxoid) and stocked in 30% (v/v) glycerol. All stocks were stored at -80°C.
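The plating scheme above fixes the arithmetic needed to convert a colony count into a viable count per gram of product. The following is a minimal sketch of that conversion, assuming the initial 10 g in 90 ml homogenate is treated as a 10^-1 dilution; the function name and the example colony count are hypothetical and are not taken from this study.

```python
def cfu_per_gram(colonies, dilution_factor, plated_volume_ml=0.1):
    """Estimate CFU per g (or ml) of the original sample from one countable plate.

    dilution_factor is the total dilution of the plated suspension relative to
    the sample, e.g. 1e-1 for the initial 10 g in 90 ml homogenate (assumption:
    10 g of product is treated as equivalent to 10 ml).
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("count must be non-negative; dilution and volume must be positive")
    return colonies / (dilution_factor * plated_volume_ml)


# Hypothetical plate: 52 colonies after spreading 0.1 ml of the 10^-3 dilution.
estimate = cfu_per_gram(colonies=52, dilution_factor=1e-3)
print(f"{estimate:.1e} CFU/g")
```

Only plates within a countable range would normally be used for such an estimate; that range is not stated in the methods and is left to the reader.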

Colony-PCR with BOXA2R primer. PCR reactions were performed in puReTaq™ Ready-To-Go™ Polymerase Chain Reaction (PCR) Beads (GE Healthcare). This PCR premix contained a final concentration of ~2.5 units puReTaq DNA polymerase, 200 µM of each dNTP in 10 mM Tris-HCl (pH 9), 50 mM potassium chloride and 1.5 mM magnesium chloride. To each reaction, DNAse- and RNAse-free water (MP Biomedicals) was added to re-suspend the bead, followed by addition of 50 pmol of the single BOXA2R primer (Sigma Oligos), which has the oligonucleotide sequence 5’-ACGTGGTTTGAAGAGATTTTCG-3’ (17). Colonies of representative morphologies were picked directly from a plate with a sterile inoculation loop (blunt side) and mixed into the final 25 µl volume of the PCR mixture. Care was taken to avoid colony overloading in order to generate optimally resolved fragments and clear fingerprints. The finding that vegetative cells can be substituted for genomic DNA as a template for PCR has been reported earlier (7, 15). PCR amplifications were performed in an automated thermal cycler (Eppendorf), with the lid preheated to 105°C before the start of the reaction. The negative control reaction had water substituted for the template DNA. The PCR cycling program included an initial denaturation step (95°C, 7 min) and then 35 cycles of denaturation (90°C, 30 s), annealing (40°C, 1 min), and extension (65°C, 8 min), followed by one final extension step at 65°C for 16 min (19).
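Because the extension step is long, the cycling program dominates the turnaround time of the method. The sketch below encodes the thermal profile exactly as stated above and sums the programmed hold times; the data layout and function name are illustrative assumptions, and ramping time between temperatures is ignored.

```python
# Thermal profile as stated in the methods: (step name, temperature in °C, minutes)
INITIAL = ("initial denaturation", 95, 7.0)
CYCLE = [
    ("denaturation", 90, 0.5),
    ("annealing", 40, 1.0),
    ("extension", 65, 8.0),
]
N_CYCLES = 35
FINAL = ("final extension", 65, 16.0)


def total_hold_minutes(initial, cycle, n_cycles, final):
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(minutes for _, _, minutes in cycle)
    return initial[2] + n_cycles * per_cycle + final[2]


total = total_hold_minutes(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total:.1f} min of hold time (~{total / 60:.1f} h)")
```

Under these assumptions the program holds for 355.5 min, i.e. close to six hours before ramping is counted.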

Following amplification, 2.5 µl loading buffer was added to each PCR tube. The amplification products (5 µl) were electrophoresed on 15 x 20 cm 1.5% (w/v) molecular grade agarose (Bio-Rad) gels in 1 x TAE (Tris-acetate, EDTA, pH 8.0) at a constant 100 V for 3 h. The amplicons were assessed against the molecular size marker HyperLadder I (M I) (Bioline) or HyperLadder II (M II) (Bioline). After staining with GelRed 3 x staining solution in water (Biotium) for 30 min, the gels were photographed on a UV-transilluminator.
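Sizing an amplicon against a ladder is commonly done by interpolating the logarithm of fragment size against migration distance. The sketch below illustrates that general approach only; it is not the procedure used by any particular software in this study, and the migration distances are hypothetical values chosen for illustration.

```python
import math


def estimate_size_bp(distance_mm, ladder):
    """Interpolate log10(size) linearly between the two flanking ladder bands.

    ladder: list of (migration_distance_mm, size_bp) pairs with distinct distances.
    """
    pts = sorted(ladder)  # increasing migration distance, i.e. decreasing size
    if not pts[0][0] <= distance_mm <= pts[-1][0]:
        raise ValueError("band lies outside the calibrated ladder range")
    for (d1, s1), (d2, s2) in zip(pts, pts[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size


# Hypothetical calibration: the distances are invented for illustration.
ladder = [(20.0, 2000), (30.0, 1200), (40.0, 800), (65.0, 300), (78.0, 200), (100.0, 100)]
print(round(estimate_size_bp(89.0, ladder)))
```

Extrapolation beyond the ladder is deliberately refused, since the log-linear relationship is only an approximation within the calibrated range.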

BOXA2R-PCR profiles were analyzed by visual inspection and by using the gel image analysis software PyElph 1.4 (20). Visual assessment of relatedness between individual isolates was carried out by firstly looking for the strongest and most distinctive bands of each isolate, followed by the observation of the presence (or absence) of the less intensive bands, their position and size relative to the marker. A phylogenetic tree generated by PyElph software was based on the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis applied on the computed distance matrix (20).
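The clustering step can be illustrated with a small, self-contained sketch. It scores pairs of band profiles with a Dice-type distance and then applies UPGMA (average linkage). This is an assumption-laden illustration rather than PyElph's own computation: the band-matching tolerance, the Dice coefficient and the simplified profiles (loosely based on band sizes quoted later in the results, plus one invented 2500 bp band) are all choices made here for clarity.

```python
from itertools import combinations


def shared_bands(a, b, tol=0.05):
    """Count bands of a that pair one-to-one with a band of b within a relative tolerance."""
    unused = sorted(b)
    shared = 0
    for size in sorted(a):
        for other in unused:
            if abs(size - other) <= tol * max(size, other):
                unused.remove(other)  # each band of b can be matched only once
                shared += 1
                break
    return shared


def dice_distance(a, b, tol=0.05):
    """1 - Dice similarity, where similarity = 2 * shared / (len(a) + len(b))."""
    return 1.0 - 2.0 * shared_bands(a, b, tol) / (len(a) + len(b))


def upgma(profiles, tol=0.05):
    """Average-linkage agglomeration; returns the tree as nested tuples of names."""
    trees = {name: name for name in profiles}   # cluster id -> subtree
    sizes = {name: 1 for name in profiles}      # cluster id -> number of leaves
    dist = {frozenset(p): dice_distance(profiles[p[0]], profiles[p[1]], tol)
            for p in combinations(profiles, 2)}
    while len(trees) > 1:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        a, b = sorted(pair)
        merged = f"({a},{b})"
        for other in trees:
            if other in pair:
                continue
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # size-weighted mean keeps every leaf pair equally weighted (UPGMA)
            dist[frozenset((merged, other))] = (
                (sizes[a] * d_a + sizes[b] * d_b) / (sizes[a] + sizes[b]))
        del dist[pair]
        trees[merged] = (trees.pop(a), trees.pop(b))
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(trees.values()))


# Simplified, partly hypothetical profiles (band sizes in bp).
profiles = {
    "lactis_1": [200, 800, 3800],
    "lactis_2": [205, 790],
    "cremoris_1": [450, 1100],
    "cremoris_2": [455, 1100, 2500],
}
tree = upgma(profiles)
print(tree)
```

Real fingerprints contain more bands than these toy profiles, so the tolerance and the similarity coefficient would need to be chosen with the gel resolution in mind.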

Reproducibility testing. Reproducibility of the BOXA2R-PCR method was tested on a total of 20 strains of target and non-target species, including ATCC type strains and wild isolates (see Table 2.3). Bacterial DNA used in this experiment was isolated using the two commercial products: Illustra Bacteria GenomicPrep Mini Spin Kit (GE Healthcare) (referred to as DNA1) and DNeasy Ultraclean Microbial Kit (Qiagen) (referred to as DNA2). The colonies were randomly picked from the agar plates on which the strains were cultivated. Leuconostoc mesenteroides strains ATCC19254, ATCC19255 and ATCC8293 were grown on two agar media: MRS and MRS with 20 µg/ml of vancomycin (MRSV). Three replicates per template for each strain were used for PCR amplifications and were run on individual gels. Yeast DNA was isolated according to the protocol described by Hoffman (21). The quantity of DNA used in the reproducibility investigation was 50 – 100 ng.


Table 2.3 The list of strains used for the reproducibility study

| STRAIN NAME | SPECIES | SYNONYM |
|---|---|---|
| ATCC19435T | Lactococcus lactis ssp. lactis | DSMZ 20481; NCDO 604 |
| ATCC19257T | Lactococcus lactis ssp. cremoris | DSMZ 20069; NCDO 607; HP |
| ATCC19258T | Streptococcus salivarius ssp thermophilus | DSMZ 20617; NCDO 573 |
| ATCC19254T | Leuconostoc mesenteroides ssp. cremoris | DSMZ20346; NCDO543 |
| ATCC8293T | Leuconostoc mesenteroides ssp. mesenteroides | DSMZ20343; NCDO523 |
| ATCC25302T | Lactobacillus paracasei subsp. paracasei | DSMZ5622; NCDO 151 |
| ATCC393T | Lactobacillus casei | DSMZ20011; NCDO 161 |
| ATCC43921T | Lactococcus garvieae ssp garvieae | DSMZ20684; ATCC 49156; NCDO 2155 |
| C2 | Lactococcus lactis ssp. cremoris | LMG8523; CECT916 |
| Mo9 | Lc. lactis ssp cremoris | |
| BA6 | Lc. lactis ssp lactis | |
| BA37 | Lc. lactis ssp lactis | |
| FD15 | Lc. lactis ssp. lactis biovar diacetylactis | |
| BA18 | Streptococcus thermophilus | |
| BA40 | Streptococcus thermophilus | |
| BA21 | Enterococcus faecalis | |
| BA31 | Enterococcus sp. | |
| BA9 | Staphylococcus warnerii | |
| KV2 | Candida parapsilosis | |
| KV4 | Kluyveromyces marxianus | |

Statistical analysis. Gel images of the BOXA2R-PCR fingerprints were processed with the PyElph software (20). The generated dendrograms were based on a clustering method applied on the distance matrix computed with the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) algorithm (20).

High reproducibility of the BOXA2R-PCR method resulted in difficulties when choosing the correct statistical model. The replicates were either very similar or identical (zero variability in the outcome), and this had a significant impact on the intraclass correlation coefficient calculation. Due to the results type and distribution-free data, the non-parametric Mann-Whitney test was used for the statistical analysis in IBM SPSS Statistics Version 25. The assumptions of this test (random sampling from the population, independence within the samples and mutual independence, and an ordinal measurement scale) were judged as satisfied. The effects of the two templates used, i.e. the isolated DNA and the intact colony, on the reproducibility of the fingerprint profiles were compared. Data points were tested at p < 0.05.
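For readers unfamiliar with the test, the following sketch shows how a Mann-Whitney U statistic and its tie-corrected normal approximation can be computed by hand. It is a generic illustration, not a reproduction of the SPSS analysis: the two samples are invented dispersion values, deliberately containing many ties, and the function name is hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U for sample x, z) using midranks and the tie-corrected normal approximation."""
    pooled = sorted(x + y)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u1 = sum(ranks[v] for v in x) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:  # every observation identical: the statistic carries no information
        return u1, 0.0
    return u1, (u1 - n1 * n2 / 2) / math.sqrt(var)


# Invented dispersion values for illustration only (not data from this study).
dna = [0.00, 0.00, 0.01, 0.02, 0.00, 0.03]
colony = [0.00, 0.02, 0.03, 0.04, 0.01, 0.05]
u, z = mann_whitney_u(dna, colony)
print(u, round(z, 3))
```

The guard for zero variance reflects the situation described above, in which identical replicates leave nothing for a variance-based statistic to work with.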

16S rRNA sequencing. 16S rRNA sequencing was applied to selected isolates from dairy samples (see Table 2.2), firstly to corroborate the BOXA2R-PCR speciation inferred from co-migrating fingerprint profiles, and secondly to identify the species of strains with unfamiliar profiles. The sample preparation method involved PCR amplification of a gene coding for 16S rRNA with the 16-1A (5’-GTCGGAATCGCTAGTAATCG-3’) and 23-1B (5’-GGGTTCCCCCATTCGGA-3’) primer pair following the recommended PCR cycling profile (22). The 16S rRNA PCR of bacterial strains was performed using either the isolated DNA or a single colony as a template for the reaction. Gel electrophoresis of the PCR products was carried out in a 1% (w/v) mini-agarose gel (7.5 x 10 cm) run in 0.5 x TBE (Tris-base, boric acid, EDTA, pH 8.0) buffer at 80 V for 1 h. The size of the amplified fragment was measured against the molecular size marker, and the concentration and purity were determined by Nanodrop. 16S rRNA gene target Sanger sequencing of the amplicons was performed in the Ramaciotti Center for Genomics using the 16-1A primer. Bacterial identification was achieved by the use of BLAST to compare the obtained sequences with sequences in the GenBank database.

ITS sequencing. The isolated yeasts were identified by the molecular method based on amplification of the Internal Transcribed Spacer (ITS) region employing the primer pair ITS1 (5’-TCCGTAGGTGAACCTGCGG-3’)/ITS4 (5’- TCCTCCGCTTATTGATATGC-3’) following the recommended PCR cycling conditions (23). ITS-PCR was applied directly to a yeast colony. PCR amplicons were sequenced with the ITS1 primer in the Ramaciotti Center for Genomics. The identification of isolates was performed using the best BLAST hits from the NCBI database.
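The primer sequences quoted in the methods can be summarised with a few simple descriptors. The sketch below reports length, GC content and a rough melting temperature from the Wallace rule, Tm = 2(A+T) + 4(G+C); this rule is only a coarse approximation for short oligonucleotides, and the helper names are hypothetical.

```python
# Primer sequences (5' to 3') as quoted in the methods of this chapter.
PRIMERS = {
    "BOXA2R": "ACGTGGTTTGAAGAGATTTTCG",
    "16-1A": "GTCGGAATCGCTAGTAATCG",
    "23-1B": "GGGTTCCCCCATTCGGA",
    "ITS1": "TCCGTAGGTGAACCTGCGG",
    "ITS4": "TCCTCCGCTTATTGATATGC",
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return seq.count("G") + seq.count("C")


def wallace_tm(seq):
    """Rough melting temperature in °C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    gc = gc_count(seq)
    return 4 * gc + 2 * (len(seq) - gc)


for name, seq in PRIMERS.items():
    gc_percent = 100 * gc_count(seq) / len(seq)
    print(f"{name}: {len(seq)} nt, {gc_percent:.1f}% GC, Wallace Tm ~{wallace_tm(seq)} °C")
```

These values are descriptive only and do not replace the annealing temperatures recommended in the cited protocols.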


3. RESULTS

3.1 Application of rep-PCR fingerprinting using BOXA2R primer (BOXA2R-PCR) to dairy bacteria

To investigate if BOXA2R-PCR is applicable for fingerprinting of dairy bacteria, a set of cultures including type strains and wild isolates was tested (Figure 2.1) and resulted in well resolved, distinguishable fingerprint patterns. Phylogenetic analysis showed that the isolates belonging to the species Lactococcus lactis, Leuconostoc mesenteroides and Streptococcus thermophilus grouped together into separate clusters (Figure 2.1, A and B), which indicated the method could be a useful tool for their delineation. Lb. casei and Lb. paracasei ssp. paracasei were all positioned within the same cluster. Enterococcus faecalis was placed outside the Lactococcus species cluster and Staphylococcus warnerii appeared as an outlier to the Strep. thermophilus species.

It was observed that Lc. garvieae ssp garvieae ATCC43921 was placed in the same group as Lc. lactis. The most pronounced difference between the profiles of Lc. lactis ssp lactis isolates and Lc. garvieae ssp garvieae ATCC43921, as inferred from BOXA2R-PCR fingerprints, was the presence of an obvious band of ~300 bp in Lc. garvieae (see Figure 2.1, A), which was not present in any of the Lc. lactis ssp lactis strains tested in the course of this study.

Figure 2.1 (A) BOXA2R-PCR fingerprint profiles of some of the strains used for reproducibility testing. HyperLadder I (M I); Lactococcus lactis ssp. lactis ATCC19435; Lactococcus garvieae ssp garvieae ATCC43921; Lactococcus lactis ssp. cremoris ATCC19257; Streptococcus salivarius ssp thermophilus ATCC19258; Leuconostoc mesenteroides ssp. cremoris ATCC19254; Leuconostoc mesenteroides ssp. mesenteroides ATCC8293; Lactobacillus paracasei subsp. paracasei ATCC25302; Lactobacillus casei ATCC393; Lc. lactis ssp lactis BA37; Strep. thermophilus BA18; Lc. lactis ssp. cremoris Mo9; Enterococcus faecalis BA21; Staphylococcus warnerii BA9; Leuconostoc mesenteroides ssp. dextranicum ATCC19255. (B) The corresponding dendrogram of the BOXA2R-PCR patterns was generated using the UPGMA cluster analysis according to the Pearson product moment correlation coefficient (expressed as a percentage value, 0 - 100%).

3.2 Reproducibility of the BOXA2R-PCR method

The reproducibility of the method was explored using both isolated DNA and colony templates from 20 strains of target and non-target species, including ATCC type strains and wild isolates (see Table 2.3). It was observed that replicates from the same DNA source produced stable fingerprints exclusive to each different strain (data not shown). Profiles obtained from individual colonies were slightly more variable than those sub-sampled from DNA purified using commercial kits; the observed reproducibility was 98% for the DNA templates and 97% for colonies (see Figure 2.2).

Figure 2.2 The plot illustrates 95% Confidence Intervals (CI) of dispersion estimates for colonies and DNA templates

Statistical analysis was performed in SPSS using the non-parametric Mann-Whitney test. There was no significant difference (p=0.118) in dispersion (variability) between the replicates generated by application of the BOXA2R-PCR to the DNA vs intact colonies (see Table 2.4).

Table 2.4. SPSS output of the reproducibility analysis of the BOXA2R-PCR fingerprints

Test Statistics (a)

| Statistic | Value |
|---|---|
| Mann-Whitney U | 3565.000 |
| Wilcoxon W | 6050.000 |
| Z | -1.566 |
| Asymp. Sig. (2-tailed) | .117 |
| Exact Sig. (2-tailed) | .118 |
| Exact Sig. (1-tailed) | .060 |
| Point Probability | .000 |

a. Grouping Variable: Source of DNA

3.3 Lactococcus lactis subspecies differentiation by BOXA2R-PCR

BOXA2R-PCR fingerprinting of Lactococcus lactis strains revealed an evident distinction between the subspecies lactis and cremoris based on the presence of subspecies-specific fragments. The number of Lc. lactis strains and isolates from various sources used for defining the common band pattern of the two subspecies was approximately 100, and included those preliminarily screened (data not shown) in addition to those presented in this work. Lc. lactis ssp lactis type strain ATCC19435 and the strains 18 and 19 displayed a characteristic profile with three amplified fragments of ~200 bp, ~800 bp and ~3800 bp, while two other ssp lactis strains, ML8 and SL894, lacked the ~3800 bp band (see Figure 2.3 A). The most pronounced band of the cremoris subspecies was ~1.1 kb, and the only other band common to all cremoris strains was ~450 bp. While some bands were unique to individual isolates and others were shared between the subspecies or within the subspecies, the key difference between them was the presence or absence of the ~200 bp band. This defining band is present in a lactis genotype, including biovar diacetylactis, and absent in a cremoris genotype (Figure 2.3 A). Two pairs of strains, 18 and 19, and AM1 and AM2, displayed identical patterns (Figure 2.3 A).
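The band rule described above can be written down as a simple decision procedure. The sketch below is an illustration of that rule only, applicable to isolates already assigned to Lc. lactis; the 10% size tolerance and the function names are assumptions introduced here, not parameters reported in this study.

```python
def has_band(bands, target_bp, tol=0.10):
    """True if any observed band lies within a relative tolerance of the target size."""
    return any(abs(b - target_bp) <= tol * target_bp for b in bands)


def call_subspecies(bands):
    """Apply the ~200 bp rule to the band sizes (bp) of a presumptive Lc. lactis isolate."""
    if has_band(bands, 200):
        return "lactis genotype (including biovar diacetylactis)"
    if has_band(bands, 1100) and has_band(bands, 450):
        return "cremoris genotype"
    return "undetermined"


print(call_subspecies([200, 800, 3800]))  # profile of the ssp lactis type strain
print(call_subspecies([450, 1100]))       # bands common to the cremoris strains
```

A profile lacking both signatures is reported as undetermined rather than forced into either subspecies, which mirrors the need for confirmation by an independent method.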


Figure 2.3 A) Genetic identification of Lc. lactis ssp lactis and ssp cremoris type strains by BOXA2R-PCR. 1. HyperLadder I; 2. Lactococcus lactis ssp lactis ATCC19435T; 3. Lactococcus lactis ssp cremoris ATCC19257T; 4. Lc. lactis ssp cremoris NCDO712; 5. Lc. lactis ssp cremoris MG1363; 6. Lc. lactis ssp cremoris E8NZ; 7. Lc. lactis ssp lactis ML8; 8. Lc. lactis ssp cremoris C2; 9. Lc. lactis ssp lactis SL894; 10. Lc. lactis ssp cremoris AM1; 11. Lc. lactis ssp cremoris AM2; 12. Lc. lactis ssp cremoris FG2; 13. Lc. lactis ssp lactis 18; 14. Lc. lactis ssp lactis 19; 15. HyperLadder II (marker bands labelled at 2000, 1200, 800, 300, 200 and 100 bp). B) The corresponding dendrogram based on the UPGMA method displaying genetic distances above the branches illustrates the genetic relatedness of Lc. lactis isolates

Two main clades were identified: group lactis included all Lc. lactis ssp lactis isolates, and group cremoris included all Lc. lactis ssp cremoris isolates (Figure 2.3 B). Within the cremoris cluster, the three differentiated separate sub-clusters were AM1 and AM2; NCDO712-related group of strains (NCDO712, C2, MG1363) and the type strain ATCC19257(HP) and FG2, with the strain E8 being an outlier.

3.3.1 Investigation of the potential of BOXA2R-PCR to detect biovar diacetylactis

The potential of BOXA2R-PCR to distinguish biovar diacetylactis strains from Lc. lactis strains that are unable to metabolize citrate was assessed on selected citrate-positive and citrate-negative isolates (see Figure 2.4).


Based on their BOXA2R-PCR profile, citrate negatives were identified as leuconostocs (FDL4 and FDL10) or lactococci (either lactis or cremoris).


Figure 2.4 Genetic comparison of the citrate-positive and citrate-negative isolates from the selective agar medium by BOXA2R-PCR. 1.HyperLadder I (M I); 2. Negative control; 3. Lc. lactis ssp lactis biovar diacetylactis FD3; 4. Lactobacillus plantarum BA35; 5. Lc. lactis ssp lactis BA36; 6. Lc. lactis ssp lactis biovar diacetylactis BA37; 7. Lc. lactis ssp lactis biovar diacetylactis FD11; 8. Leuconostoc sp. FDL4; 9. Leuconostoc sp. FDL10; 10. Lc. lactis ssp lactis FD9 ; 11. Lc. lactis ssp cremoris BA32 ; 12. Lc. lactis ssp lactis biovar diacetylactis FD2; 13. Lc. lactis ssp lactis biovar diacetylactis FD1

The correlation between the citrate fermenting phenotype of the Lc. lactis ssp lactis biovar diacetylactis and a specific genetic pattern could not be observed. There were no typical PCR fragments detected which could unambiguously differentiate CIT+ and CIT- isolates among all the tested strains and could therefore exclusively point to the biovar diacetylactis.

One citrate-positive isolate, BA35, that displayed a profile uncharacteristic for Lc. lactis was identified by 16S rRNA sequencing as Lactobacillus plantarum. This suggests that BOXA2R-PCR fingerprinting can be useful to discern a citrate-fermenting bacterium that does not belong to lactococci or Leuconostoc. This was further corroborated when other citrate-positive organisms, such as Lb. plantarum BA16, Ent. faecalis ET4 and Staph. epidermidis BA10 (data not shown), were subjected to BOXA2R-PCR and could be readily separated from Lc. lactis ssp lactis biovar diacetylactis based on their fingerprint patterns.


3.4 Application of BOXA2R-PCR for screening isolates from Australian dairy samples

3.4.1 Strain isolation

The purpose of the strain isolation process was to collect isolates for testing of the BOXA2R-PCR method, hence the origin of the isolated bacterial and yeast strains was intentionally unspecified, and their codes were anonymized.

To test the potential of BOXA2R-PCR as a screening method, fingerprints of the new isolates were first inspected for common patterns. The identity of single isolates and their speciation was confirmed by 16S rRNA gene sequencing. The majority of isolated bacterial strains belonged to common LAB species (see Table 2.2). Enterococcus sp. isolates were identified in two of the analyzed dairy products, with one containing Ec. faecalis BA21 (10⁴ CFU/g) and the other containing Enterococcus sp. BA22 (10⁸ CFU/g) as a predominant strain. The 16S rRNA sequencing returned an ambiguous identification result showing sequence homology for both Ec. faecalis and Streptococcus salivarius.

Three dairy products contained non-LAB species including Staphylococcus epidermidis, Staphylococcus warneri, Staphylococcus sp. and Ochrobactrum anthropi. Additionally, six yeast strains belonging to the following species were isolated from five different dairy products: Kluyveromyces marxianus, Kluyveromyces lactis, Pichia cactophila, Candida zeylanoides and Candida parapsilosis.

3.4.2 BOXA2R-PCR strain fingerprinting

To investigate its value as a fingerprinting tool, BOXA2R-PCR was tested on the isolates from dairy products. Distinguishing fingerprints were generated for all isolates tested in this study (see Figures 2.5 and 2.6). The size of the obtained PCR products ranged from approximately 200 to 5000 bp. The smallest number of bands (2-3) was detected in Staphylococcus sp. BA4 and Staph. epidermidis BA10 (see Figure 2.5), while the LAB genera usually displayed from 5 to 15 bands. The individual bacterial isolates were easily differentiated by visual comparison, with each strain displaying a unique and distinctive BOXA2R-PCR profile. Interestingly, BOXA2R-PCR fingerprinting of yeast genomes also yielded different fingerprint patterns (Figure 2.5, isolate KV1; Figure 2.6, isolates KV3-KV6).


Figures 2.5 and 2.6 BOXA2R-PCR fingerprint profiles of the strains isolated in this study from retail dairy products. The amplification products (5 µl) were electrophoresed on 1.5% (w/v) molecular grade agarose (Bio-Rad) gels in 1 x TAE (Tris-acetate, EDTA, pH 8.0) at a constant 100 V for 3 h.

Figure 2.5 HyperLadder I (M I); Negative control; Lb. paracasei BA1; Ochrobactrum anthropi BA2; Lb. casei BA3; Staphylococcus sp. BA4; Lc. lactis ssp cremoris BA5; Lc. lactis ssp lactis BA6; Strep. thermophilus BA7; Kluyveromyces lactis KV1; Lb. paracasei BA8; Staph. warnerii BA9; Staph. epidermidis BA10; Strep. thermophilus BA11; Strep. thermophilus BA12

Figure 2.6 Candida zeylanoides KV3; Kluyveromyces marxianus KV4; Pichia cactophila KV5; Kluyveromyces lactis KV6; Leucon. mesenteroides ssp mesenteroides ATCC8293; Leucon. mesen. ssp. cremoris CSIRO4301; Leucon sp. FDL5; Ent. faecalis BA21; Ent. durans/faecium ET5; Enterococcus sp. BA22; Strep. thermophilus BA23; Strep. thermophilus BA24; Strep. thermophilus BA25; 14. HyperLadder I (M I); 15. Negative control.

3.4.3 Discrimination of Gram-positive LAB cocci by BOXA2R-PCR

BOXA2R-PCR fingerprints of various LAB isolates were further inspected for common banding patterns, with a particular focus on the Gram-positive LAB cocci Lactococcus lactis, Strep. thermophilus, Leuconostoc mesenteroides and Enterococcus, since these coccus- or coccoid-shaped LAB are not always easily differentiated phenotypically.

Specific patterns for Strep. thermophilus, Leucon. mesenteroides and Lc. lactis ssp lactis and ssp cremoris were generated, placing them in four separate clusters on the phylogenetic tree (see Figure 2.7), while no typical profile was observed for enterococci (see Figures 2.6 and 2.7). Strep. thermophilus isolates clustered together, producing two characteristic intensive bands of ~300 bp and ~1600 bp, common to all tested strains. Two less intensive bands of ~390 bp and ~1 kb were present in most strains, while the presence or absence of other fragments was variable among isolates. Leucon. mesenteroides strains also grouped together in a clearly identifiable clade. Their fingerprints were characterized by distinctive bands of ~380 bp and ~1200 bp. The characteristic profiles and the clear separation of the Leuconostoc and Strep. thermophilus isolates are also visually observable (Figure 2.7, from the left: isolates 5-7 and 2-4, respectively). Lc. lactis ssp lactis and Lc. lactis ssp cremoris strains each grouped together into separate clusters (Figure 2.7, isolates 14-15 and 12-13, respectively). To confirm that the speciation of the natural isolates deduced from the BOXA2R-PCR profiles was correct, several isolates per species were subjected to 16S rRNA sequencing (see Table 2.2). The BOXA2R-PCR species assignment of Strep. thermophilus, Leucon. mesenteroides and Lc. lactis ssp lactis and ssp cremoris isolates matched the 16S rRNA findings.

The profiles of Enterococcus sp. isolates looked very different to each other, which was presumably the result of the different enterococcal species they belonged to. Ent. faecalis ET4 and Ent. durans/faecium ET5 clustered in the same group, but were evidently phylogenetically distant. These two strains, as well as BA31, seemed phylogenetically closer to the Strep. thermophilus clade, whereas the Ent. faecium ET2 was placed just out of the lactis clade.


Figure 2.7 A) BOXA2R-PCR fingerprints of the representatives of the Gram-positive cocci LAB genera. 1.HyperLadder I; 2 – 4. Strep. thermophilus: BA28, BA27, BA29; 5 - 6. Leuconostoc sp: FDL12, BA30; 7. Leuconostoc mesenteroides ssp cremoris CSIRO4202T; 8. Ent. faecium ET2; 9. Enterococcus faecalis BA31; 10. Enterococcus durans/faecium ET5 ; 11. Enterococcus faecalis ET4; 12 – 13. Lc. lactis ssp cremoris BA32, BA33; 14. Lc. lactis ssp lactis biovar diacetylactis FD6; 15. Lc. lactis ssp lactis BA34. B) The corresponding dendrogram displaying the genetic distances between isolates was obtained using the UPGMA method.

4. DISCUSSION

Repetitive-PCR using a single primer or a combination of primers has been previously applied in studies of genetic diversity in lactic acid bacteria isolated from raw milk and traditional fermented dairy products as well as for the typing of Lactococcus lactis strains (1, 2, 4, 24). Successful differentiation was dependent on the primer used. Rep-PCR performed with the primer pair Rep-1R-Dt/REP2-D was deemed inadequate as a sole method for the accurate identification of Lactococcus, Enterococcus and Streptococcus cheese isolates as it was not possible to assign them to a species (1). Discrimination between Lc. lactis subspecies was unsuccessful using the primer LL-Rep1 (14). Repetitive-PCR using the (GTG)5 oligonucleotide primer could not distinguish isolates of the Lc. lactis subspecies (2, 4) and could not properly cluster isolates without the application of an additional method (25). Rep-PCR characterization of Lactococcus sp. strains with the BOXA1R primer was ambiguous (24). The use of rep-PCR employing LcRep-H, which is a mixture of three designed repetitive primers, proved successful for the genotyping of Lc. lactis at both subspecies and strain level (4). It was demonstrated in this current work that genotyping of Lactococcus lactis strains using the single primer BOXA2R in the repetitive-PCR resulted in distinctive and informative fingerprint profiles and that Lc. lactis ssp lactis and ssp cremoris could be clearly segregated, which suggests that BOXA2R has potential for Lactococcus lactis subspecies-specific typing.

It was not possible to explicitly identify biovar diacetylactis among the Lc. lactis strains without further testing for this phenotypic trait. Difficulties in distinguishing Lc. lactis ssp lactis from Lc. lactis ssp lactis biovar diacetylactis strains by genetic methods have been reported earlier: they were indistinguishable by rep-PCR genomic fingerprinting using the (GTG)5 primer (10) or the LcREP-H primer (4), and by AFLP (26) or MLSA (10).

However, BOXA2R-PCR fingerprinting appeared useful for determining whether an unknown citrate-fermenting bacterium belongs to a genus or species other than the common dairy bacteria that can metabolize citrate and produce diacetyl and other flavor compounds, such as lactococci and Leuconostoc.

The value of BOXA2R-PCR as a strain fingerprinting tool in dairy microbiology was demonstrated on isolates from various Australian dairy products. The majority belonged to Lactococcus lactis, Streptococcus thermophilus and Enterococcus sp., plus different Lactobacillus species. Some Enterococcus species are human pathogens (27), whereas Ent. faecalis, Ent. faecium and Ent. durans are often encountered in raw milk and artisanal cheeses (1, 2, 28). While these LAB species are commonly present in dairy products, the detection of Staphylococcus, and in particular Ochrobactrum anthropi, in some dairy products is uncommon. Staphylococci have been isolated from a range of , including meat, cheese and milk, but no case of illness related to their consumption in dairy products has been reported (29). Ochrobactrum anthropi, a Gram-negative rod-shaped bacterium phylogenetically closely related to the pathogen Brucella abortus (30), is normally a saprophyte that inhabits soil and can colonize a variety of habitats including plants, animals and humans (31). The finding of these contaminant species in the sampled products should be considered undesirable from food safety and quality perspectives.

The role of dairy yeasts in the products from which they were isolated can only be a matter of speculation. Some yeast strains, such as Kl. marxianus and Pichia cactophila, can contribute to the flavor and aroma of dairy products (32), while others may be linked to product spoilage. Candida zeylanoides and Candida parapsilosis are among the most common non-starter yeast species isolated from milk, brines and cheeses. These yeasts could be introduced into the dairy ecosystem from various potential sources, including environmental inputs, workers or equipment (33).

A further advantage of BOXA2R-PCR is its potential to separate the Gram-positive coccal LAB Lactococcus lactis, Leuconostoc mesenteroides and Streptococcus thermophilus from each other and from Enterococcus species. Strains of these LAB species can have overlapping phenotypic characteristics, which makes their correct speciation difficult when employing physiological and biochemical methods (1, 13). For example, Enterococcus sp. can grow over a wide temperature range (15-45°C) and can be misidentified as either a Lactococcus or a Strep. thermophilus. Some lactococci, particularly from artisanal products, can grow under conditions not specified by the traditional key for species identification, such as in 6.5% NaCl and at 10°C and 45°C (1). Leuconostoc sp. may grow on media used for the isolation of Lactococcus sp. and can metabolize citrate, like Lc. lactis ssp lactis biovar diacetylactis and some species of Enterococcus (Ent. faecalis). The Enterococcus isolates tested in this work were highly heterogeneous, hence BOXA2R-PCR did not produce any characteristic, recognizable profile for them. The observation that Lc. garvieae ssp garvieae ATCC43921 clustered phylogenetically with Lc. lactis was not surprising, considering that these two species are phenotypically similar (34) and have a close genetic relationship, as inferred from the maximum likelihood tree based on the concatenated amino acid sequences of Lactococcus core genes (35). The similarity of BOXA2R-PCR patterns between these two species should be considered when exploring the diversity of any artisanal products that are commonly associated with the presence of Lc. garvieae (21). For these products, positive genetic discrimination of Lc. garvieae isolates would require a species-specific PCR identification test, such as one that targets the 16S rRNA gene (36) or the 16S-23S ribosomal RNA (rRNA) internal transcribed spacer region (37).

BOXA2R-PCR typing correlated well with previous genotyping data obtained by other methods for reference and other well-researched lactococcal strains, both in terms of subspecies assignment and strain relatedness. For example, strain NCDO712, which was renamed C2 when deposited in CSIRO, Australia, and its plasmid-free derivative MG1363 belong to the same group of closely related strains having a lactis phenotype but a cremoris genotype (38). This is in agreement with the BOXA2R-PCR results, which classified them as cremoris subspecies yet resolved subtle differences between the two strains (see Figure 2.4). Further, according to their PFGE patterns with SmaI, Lc. lactis ssp cremoris strains E8, HP and SK11 (the phage-resistant derivative of AM1) were representatives of related groups of strains (38). HP and FG2 are closely related strains with some chromosomal rearrangements, which cluster together but separately from SK11 and AM2 (38); AM1 and AM2 display identical fingerprints, pointing to a clonal relationship. The BOXA2R-PCR profiles of strains E8, HP and AM1 were likewise related but distinct, demonstrating good correlation between the PFGE and BOXA2R typing results. These observations are also in agreement with analysis performed by the high-resolution AFLP protocol, which showed a clonal relationship between AM1 and AM2 and their placement with HP into a separate sub-cluster within the cremoris genotype (26).

Figure 2.8 Schematic representation of the common LAB genera identification by BOXA2R-PCR

Although it has long been established that repetitive-PCR using the single primer BOXA2R can produce distinguishing fingerprint profiles for many different bacterial genera (17), the ERIC and REP repetitive primers have been more widely used. In our preliminary screening experiments, which investigated the genotyping potential of various repetitive primers for application to the dairy microbiota, the latter two did not appear suitable, as they were not sufficiently informative. In relation to dairy bacteria, BOXA2R has been employed for genotyping of Enterococcus faecium (28), Leuconostoc species (39), Lactococcus lactis and Lactobacillus plantarum (40). BOXA2R-PCR fingerprinting of LAB isolates in this study was performed following the PCR profile of Malathum et al. (19), which was originally applied to the typing of clinical isolates of Enterococcus faecalis at the subspecies level. The method involved a lower annealing temperature (40°C vs 52°C) and a higher number of PCR cycles (35 vs 30) compared to the profile used by Koeuth et al. (17). The annealing temperature of 40°C was recommended as compensation for the lower GC content of the oligonucleotide primer BOXA2R (17). It is presumed that the lowered specificity of the PCR reaction under these conditions enabled fingerprinting of a broad range of dairy-associated microorganisms, including sporadically detected Gram-positive and Gram-negative contaminants.

This study demonstrated that colony PCR can be a valid alternative to PCR on extracted DNA, as no qualitative differences in BOXA2R-PCR fingerprint profiles were observed between the two sources of DNA. Rep-PCR on minimally processed (15) or unprocessed colonies (41) has been reported earlier to yield genomic patterns indistinguishable from those produced with purified DNA. Although colony templates occasionally resulted in minor variations in band intensity, presumably due to loading of a non-standardized amount of DNA, the authenticity of the profiles was not compromised. High reproducibility of the BOXA2R-PCR method was achieved across both templates (95-100%), with no statistically significant difference between them (P = 0.118).
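To make the notion of profile reproducibility concrete, the sketch below scores two band profiles with a Dice coefficient based on band presence only, so that differences in band intensity do not affect the result. It is a hedged illustration: the similarity measure actually computed by the gel analysis software is not specified in this section, and the choice of Dice, the 5% matching tolerance, the function names and the example profiles are all assumptions.

```python
def match_bands(profile_a, profile_b, tolerance=0.05):
    """Count bands in profile_a that pair with a distinct band in profile_b."""
    unused = sorted(profile_b)
    shared = 0
    for band in sorted(profile_a):
        for candidate in unused:
            if abs(candidate - band) <= tolerance * band:
                unused.remove(candidate)  # each band may be matched only once
                shared += 1
                break
    return shared


def dice_similarity(profile_a, profile_b, tolerance=0.05):
    """Dice coefficient in percent: 2 * shared / (bands in a + bands in b)."""
    total = len(profile_a) + len(profile_b)
    if total == 0:
        return 100.0
    return 100.0 * 2 * match_bands(profile_a, profile_b, tolerance) / total


# Hypothetical band sizes (bp) for one strain amplified from a colony and from extracted DNA
colony_profile = [300, 390, 1000, 1600]
dna_profile = [305, 390, 1600]
print(round(dice_similarity(colony_profile, dna_profile), 1))
```

Scoring presence rather than intensity mirrors the observation above that intensity variations from non-standardized template amounts did not change the pattern itself.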

In summary, this work has demonstrated that BOXA2R has potential use as a single primer for repetitive-PCR identification of dairy bacteria, in particular the most common dairy cocci. Further work will explore the general use of the method to differentiate and classify bacteria.

5. CONCLUSIONS

The choice of BOXA2R as a single primer in repetitive-PCR genomic fingerprinting offers several benefits in genotyping the microorganisms associated with dairy products: it generates unique and distinctive fingerprint profiles of individual bacterial and yeast strains; it enables classification of Lc. lactis subspecies into a lactis or cremoris genotype and the discrimination of Lc. lactis strains, including closely related ones, in a single step (Lc. lactis subspecies and strain specificity); it is applicable for the presumptive assignment of Gram-positive cocci commonly encountered in dairy products to Lactococcus lactis, Streptococcus thermophilus and Leuconostoc mesenteroides; and colonies containing intact cells are a sufficient source of template DNA.

BOXA2R-PCR can be particularly useful for the preliminary screening of large numbers of unknown dairy isolates, for example in strain selection processes or rationalization of culture collections, facilitating fast identification of commonly present LAB species, which minimizes the number of strains required to undergo further phenotypic characterization or complex genome studies.

The current methodology for the identification of catalase-negative Gram-positive bacteria isolated from a dairy environment may require the use of several genus- and species-specific PCR reactions (42). The use of BOXA2R-PCR as an alternative strategy provides a fast and simple approach for strain screening and preliminary identification. The applied molecular protocol involves: 1) plating a sample on agar medium; 2) performing BOXA2R-PCR on morphologically different colonies or a chosen number of purified colonies, depending on the sample's expected complexity; 3) genetic identification of Lactococcus lactis, Strep. thermophilus and Leuconostoc mesenteroides; and 4) 16S rRNA gene sequencing of purified colonies that have an unfamiliar profile (see Figure 2.8).

Direct application of the method to vegetative cells instead of extracted DNA (colony-PCR) significantly reduces the time and cost of the screening process. In addition to the speed and cost advantages shared with other PCR methods, the BOXA2R-PCR fingerprint patterns are informative, and the method is easy to use and interpret, has high resolving power, and is reproducible under defined conditions.

APPENDIX A. Statistical analysis of the BOXA2R-PCR for bacterial fingerprinting

REFERENCES

1. Callon C, Millet L, Montel MC. Diversity of lactic acid bacteria isolated from AOC . J Dairy Res. 2004;71(2):231-44.
2. Terzic-Vidojevic A, Vukasinovic M, Veljovic K, Ostojic M, Topisirovic L. Characterization of microflora in homemade semi-hard white Zlatar cheese. Int J Food Microbiol. 2007;114(1):36-42.
3. Ndoye B, Rasolofo EA, LaPointe G, Roy D. A review of the molecular approaches to investigate the diversity and activity of cheese microbiota. Dairy Sci Technol. 2011;91(5):495.
4. Odamaki T, Yonezawa S, Sugahara H, Xiao JZ, Yaeshima T, Iwatsuki K. A one step genotypic identification of Lactococcus lactis subspecies at the species/strain levels. Syst Appl Microbiol. 2011;34(6):429-34.
5. Cavanagh D, Casey A, Altermann E, Cotter PD, Fitzgerald GF, McAuliffe O. Evaluation of Lactococcus lactis isolates from nondairy sources with potential dairy applications reveals extensive phenotype-genotype disparity and implications for a revised species. Appl Environ Microbiol. 2015;81(12):3961-72.
6. Fernandez E, Alegria A, Delgado S, Martin MC, Mayo B. Comparative phenotypic and molecular genetic profiling of wild Lactococcus lactis subsp. lactis strains of the L. lactis subsp. lactis and L. lactis subsp. cremoris genotypes, isolated from starter-free cheeses made of raw milk. Appl Environ Microbiol. 2011;77(15):5324-35.
7. Nomura M, Kobayashi M, Okamoto T. Rapid PCR-based method which can determine both phenotype and genotype of Lactococcus lactis subspecies. Appl Environ Microbiol. 2002;68(5):2209-13.
8. Pu ZY, Dobos M, Limsowtin GK, Powell IB. Integrated polymerase chain reaction-based procedures for the detection and identification of species and subspecies of the Gram-positive bacterial genus Lactococcus. J Appl Microbiol. 2002;93(2):353-61.
9. Delgado S, Mayo B. Phenotypic and genetic diversity of Lactococcus lactis and Enterococcus spp. strains isolated from Northern starter-free farmhouse cheeses. Int J Food Microbiol. 2004;90(3):309-19.
10. Rademaker JL, Herbet H, Starrenburg MJ, Naser SM, Gevers D, Kelly WJ, et al. Diversity analysis of dairy and nondairy Lactococcus lactis isolates, using a novel multilocus sequence analysis scheme and (GTG)5-PCR fingerprinting. Appl Environ Microbiol. 2007;73(22):7128-37.
11. LiPuma J, Spilker T. Method for bacterial species identification and strain typing. 2015. US patent 20150376685A1.
12. Psoni L, Kotzamanidis C, Yiangou M, Tzanetakis N, Litopoulou-Tzanetaki E. Genotypic and phenotypic diversity of Lactococcus lactis isolates from Batzos, a Greek PDO raw cheese. Int J Food Microbiol. 2007;114(2):211-20.
13. Corroler D, Mangin I, Desmasures N, Gueguen M. An ecological study of lactococci isolated from raw milk in the cheese registered designation of origin area. Appl Environ Microbiol. 1998;64(12):4729-35.
14. Prodelalova J, Spanova A, Rittich B. Application of PCR, rep-PCR and RAPD techniques for typing of Lactococcus lactis strains. Folia Microbiol. 2005;50(2):150-4.
15. Versalovic J, Schneider M, de Bruijn FJ, Lupski JR. Genomic fingerprinting of bacteria using repetitive sequence-based polymerase chain reaction. Methods Mol Cell Biol. 1994;5:25-40.
16. Martin B, Humbert O, Camara M, Guenzi E, Walker J, Mitchell T, et al. A highly conserved repeated DNA element located in the chromosome of Streptococcus pneumoniae. Nucleic Acids Res. 1992;20(13):3479-83.
17. Koeuth T, Versalovic J, Lupski JR. Differential subsequence conservation of interspersed repetitive Streptococcus pneumoniae BOX elements in diverse bacteria. Genome Res. 1995;5(4):408-18.
18. Kempler GM, McKay LL. Improved medium for detection of citrate-fermenting Streptococcus lactis subsp. diacetylactis. Appl Environ Microbiol. 1980;39(4):926-7.
19. Malathum K, Singh KV, Weinstock GM, Murray BE. Repetitive sequence-based PCR versus pulsed-field gel electrophoresis for typing of Enterococcus faecalis at the subspecies level. J Clin Microbiol. 1998;36(1):211-5.
20. Pavel AB, Vasile CI. PyElph - a software tool for gel images analysis and phylogenetics. BMC Bioinformatics. 2012;13.
21. Hoffman CS, Winston F. A ten-minute DNA preparation from yeast efficiently releases autonomous plasmids for transformation of Escherichia coli. Gene. 1987;57:267-72.
22. Tilsala-Timisjärvi A, Alatossava T. Development of oligonucleotide primers from the 16S-23S rRNA intergenic sequences for identifying different dairy and lactic acid bacteria by PCR. Int J Food Microbiol. 1997;35(1):49-56.
23. White TJ, Bruns TD, Lee SB, Taylor JW. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In PCR Protocols: A guide to methods and applications. 1990:315-22.
24. Mohammed M, Abd El-Aziz H, Omran N, Anwar S, Awad S, El-Soda M. Rep-PCR characterization and biochemical selection of lactic acid bacteria isolated from the Delta area of . Int J Food Microbiol. 2009;128(3):417-23.
25. Zamfir M, Vancanneyt M, Makras L, Vaningelgem F, Lefebvre K, Pot B, et al. Biodiversity of lactic acid bacteria in Romanian dairy products. Syst Appl Microbiol. 2006;29(6):487-95.
26. Kutahya OE, Starrenburg MJ, Rademaker JL, Klaassen CH, van Hylckama Vlieg JE, Smid EJ, et al. High-resolution amplified fragment length polymorphism typing of Lactococcus lactis strains enables identification of genetic markers for subspecies-related phenotypes. Appl Environ Microbiol. 2011;77(15):5192-8.
27. Ogier JC, Serror P. Safety assessment of dairy microorganisms: the Enterococcus genus. Int J Food Microbiol. 2008;126(3):291-301.
28. Edalatian MR, Najafi MBH, Mortazavi SA, Alegria A, Nassiri MR, Bassami MR, et al. Microbial diversity of the traditional Iranian cheeses Lighvan and Koozeh, as revealed by polyphasic culturing and culture-independent approaches. Dairy Sci Technol. 2012;92(1):75-90.
29. Irlinger F. Safety assessment of dairy microorganisms: coagulase-negative staphylococci. Int J Food Microbiol. 2008;126(3):302-10.
30. Scholz HC, Pfeffer M, Witte A, Neubauer H, Al Dahouk S, Wernery U, et al. Specific detection and differentiation of Ochrobactrum anthropi, Ochrobactrum intermedium and Brucella spp. by a multi-primer PCR that targets the recA gene. J Med Microbiol. 2008;57(1):64-71.
31. Chain PSG, Lang DM, Comerci DJ, Malfatti SA, Vergez LM, et al. Genome of Ochrobactrum anthropi ATCC 49188(T), a versatile opportunistic pathogen and symbiont of several eukaryotic hosts. J Bacteriol. 2011;193(16):4274-5.
32. Celinska E, Bonikowski R, Bialas W, Dobrowolska A, Sloma B, Borkowska M, et al. Pichia cactophila and Kluyveromyces lactis are highly efficient microbial cell factories of natural amino acid-derived aroma compounds. Molecules. 2018;23(1).
33. Banjara N, Suhr MJ, Hallen-Adams HE. Diversity of yeast and mold species from a variety of cheese types. Curr Microbiol. 2015;70(6):792-800.
34. Fortina MG, Ricci G, Acquati A, Zeppa G, Gandini A, Manachini PL. Genetic characterization of some lactic acid bacteria occurring in an artisanal protected denomination origin (PDO) Italian cheese, the Toma piemontese. Food Microbiol. 2003;20(4):397-404.
35. Yu J, Song YQ, Ren Y, Qing YT, Liu WJ, Sun ZH. Genome-level comparisons provide insight into the phylogeny and metabolic diversity of species within the genus Lactococcus. BMC Microbiol. 2017;17:1-10.
36. Zlotkin A, Eldar A, Ghittino C, Bercovier H. Identification of Lactococcus garvieae by PCR. J Clin Microbiol. 1998;36(4):983-5.
37. Dang HT, Park HK, Myung SC, Kim W. Development of a novel PCR assay based on the 16S-23S rRNA internal transcribed spacer region for the detection of Lactococcus garvieae. J Fish Dis. 2012;35(7):481-7.
38. Kelly WJ, Ward LJ, Leahy SC. Chromosomal diversity in Lactococcus lactis and the origin of dairy starter cultures. Genome Biol Evol. 2010;2:729-44.
39. Alegria A, Delgado S, Florez AB, Mayo B. Identification, typing, and functional characterization of Leuconostoc spp. strains from traditional, starter-free cheeses. Dairy Sci Technol. 2013;93(6):657-73.
40. Alegria A, Fernandez ME, Delgado S, Mayo B. Microbial characterisation and stability of a farmhouse natural fermented milk from Spain. Int J Dairy Technol. 2010;63(3):423-30.
41. Louws FJ, Fulbright DW, Stephens CT, Debruijn FJ. Specific genomic fingerprints of phytopathogenic Xanthomonas and Pseudomonas pathovars and strains generated with repetitive sequences and PCR. Appl Environ Microbiol. 1994;60(7):2286-95.
42. Wullschleger S, Lacroix C, Bonfoh B, Sissoko-Thiam A, Hugenschmidt S, Romanens E, et al. Analysis of lactic acid bacteria communities and their seasonal variations in a spontaneously fermented dairy product (Malian fene) by applying a cultivation/genotype-based binary model. Int Dairy J. 2013;29(1):28-35.

CHAPTER 3

BACTERIOPHAGE GENOTYPING USING BOXA REPETITIVE-PCR

ABSTRACT

Repetitive-PCR (rep-PCR) using BOXA1R and BOXA2R as single primers was investigated for its potential to genotype bacteriophage. Previously, this technique has been used primarily for the discrimination of bacterial strains. Reproducible DNA fingerprint patterns for various phage types were generated using either of the two primers. The similarity index of replicates ranged from 89.4% to 100% for BOXA2R-PCR, and from 90% to 100% for BOXA1R-PCR. Neither the method of DNA isolation (p=0.08) nor phage propagation at two different temperatures (p=0.527) had a significant influence on the generated patterns. Rep-PCR amplification products were generated from different templates including purified phage DNA, phage lysates and phage plaques. The use of this method enabled comparisons of phage genetic profiles to establish their similarity to related or unrelated phages and their bacterial hosts. The findings suggest that repetitive-PCR could be used as a rapid and inexpensive method for the preliminary screening of phage isolates prior to their selection for more comprehensive studies. The adoption of this rapid, simple and reproducible technique could facilitate preliminary characterisation of a large number of phage isolates and the investigation of genetic relationships between phage genotypes.

1. BACKGROUND

Repetitive DNA sequences constitute a substantial component of both eukaryotic and prokaryotic genomes. In some higher plant species, they can account for up to 90% of the genomic DNA (1), while in humans DNA repeats comprise nearly half of the genome (2). The presence or absence of certain types of repeats, the diversity of their nucleotide sequences, and their size, location and copy number per genome characterize various bacterial species, even those with the smallest genomes (3). Interspersed repeats play a significant role in genomic rearrangements, such as inversions, deletions, duplications and translocations (3). The proposed functional roles of repetitive sequences involve the regulation of coding sequence expression and the formatting necessary for genome packaging; DNA repair and restructuring; genome replication and transmission to progeny cells; formation of nucleoprotein complexes; and formation of a characteristic genome system organization that allows for evolutionarily significant changes without altering coding sequences (4).

Specific families of interspersed DNA elements have been observed in many bacterial and archaeal genomes (3, 5), while bacteriophages are considered to carry few repetitive elements (5). The BOX family of repetitive DNA elements, consisting of different combinations of three subunits, boxA, boxB and boxC, was originally identified in the Gram-positive Streptococcus pneumoniae (6). Hybridization studies have shown that only boxA sequences are highly evolutionarily conserved. The outwardly facing repetitive primers BOXA1, BOXA1R and BOXA2R, which are complementary to the consensus sequences of boxA, generated complex fingerprint patterns in various Gram-positive and Gram-negative bacterial species when used as single primers in repetitive-PCR (7). BOXA-based primers have since been used for genotyping diverse bacterial species in various ecological (8), epidemiological (9) and industrial application studies (10).

The optimal performance of dairy starter cultures is challenged by the risk of lytic bacteriophage (phage) infection (11, 12) and, indeed, phage may pose a problem to any industry based on bacterial fermentation (13). In recent years, there has been renewed interest in phage from the perspective of their beneficial applications, such as phage therapy targeting pathogens (14, 15). Multiplex PCR methods have been developed to detect the phage species most relevant to the dairy industry, such as the lactococcal phage species 936, P335 and c2, and phages that infect the thermophilic starter cultures Str. thermophilus and Lb. delbrueckii (16, 17). Methods that can be applied for genetic characterization of phages include restriction digestion of genomic DNA (18); multilocus sequence typing (MLST) (19); restriction fragment length polymorphism (RFLP) or denaturing gradient gel electrophoresis (DGGE) of a particular gene (20); random amplification of polymorphic DNA (RAPD)-PCR (18, 21, 22); and genomic sequencing (23). However, these methods are not necessarily the most suitable for routine use due to time or cost constraints. Additionally, the DNA of some phages is known to be resistant to many restriction enzymes, which imposes the need to use several restriction enzymes to ensure digestion (18). RAPD-PCR and rep-PCR are fast and inexpensive genotyping methods, but RAPD-PCR has been reported to have poor reproducibility due to the use of short, arbitrary primers that target randomly distributed sequences (24), and it requires substantial optimization of conditions (18).

Our preliminary work on genetic characterization of Lactococcus lactis phages showed that restriction digestion was not successful for all of them, rendering comprehensive phage comparison unachievable. This prompted our search for an alternative method for phage differentiation. To this end, repetitive-PCR using BOXA-based primers, which we have previously applied successfully to discriminate bacterial and yeast isolates (25), was explored. The aim of this study was to investigate whether repetitive-PCR typing is applicable to bacteriophage by determining whether amplified viral DNA could produce specific and stable fingerprints that could be used for phage genotyping.

2. MATERIALS and METHODS

2.1.1 Bacteria

Lc. lactis ssp lactis Ni301 and Lc. lactis ssp cremoris Mo9 were isolated from dairy products. Lc. lactis ssp lactis biovar diacetylactis FD11 was isolated from the mesophilic starter culture Flora Danica (FD). Lc. lactis ssp cremoris HP and UK712, and Lc. lactis ssp lactis biovar diacetylactis ML8, were retrieved from our internal culture collection. Lc. lactis ssp lactis 112, C6, C10 and IL1407 were kindly provided by Dr Jasna Rakonjac, Massey University, NZ. Pseudomonas aeruginosa PAO1 (wild type) and PAO1- (free of the endogenous prophage Pf4) were kindly provided on plates by Dr Vanessa Huron, The Centre for Marine Bio-Innovation and the School of Biotechnology and Biomolecular Sciences (BABS).

The strains were grown anaerobically for 24 - 48 h at 30°C on M17 agar (Oxoid) or M17 broth supplemented with 0.5% lactose (LM17) (36). Stock cultures were prepared in 9.5% (w/v) autoclaved (121°C for 15 min) reconstituted skim milk and kept at -80°C.

2.1.2 Bacteriophages

The Lc. lactis phages used in the study were isolated from dairy whey samples (see Table 3.1) with other phage types provided by external sources (see Table 3.2). Phage preparations provided by external sources were analysed by PCR in the form they were supplied without any further modification. The phage coding system for the phages isolated in this study was based on designating the phage name followed by the name of the bacterial strain on which that phage was propagated (in brackets).

Table 3.1 Bacteriophage samples isolated from whey samples

Phage code | Host species | Titer (pfu/ml) | Phage plaque description | Year of isolation | Source
15(Mo9) | Lc. lactis ssp cremoris Mo9 | 2.00E+08 | 2 mm clear + 1 mm zone | 2016 | isolated in this study
R48(Mo9) | Lc. lactis ssp cremoris Mo9 | 1.90E+08 | 2 mm and 4 mm clear | 2016 | isolated in this study
301(Ni301) | Lc. lactis ssp lactis Ni301 | 2.20E+10 | 1 mm clear | 2016 | isolated in this study
54(FD11) | Lc. lactis ssp lactis var diacetylactis FD11 | 2.80E+09 | 3 mm clear + 2 mm margin | 2016 | isolated in this study
38(FD11) | Lc. lactis ssp lactis var diacetylactis FD11 | 6.00E+09 | 2 mm clear + 2 mm margin | 2016 | isolated in this study
BU(HP) | Lc. lactis ssp cremoris HP | 7.20E+09 | 6 mm clear | 2016 | isolated in this study
63(ML8) | Lc. lactis ssp lactis var diacetylactis ML8 | 3.70E+09 | 2 mm clear, sharp edge | 1981 | BABS, UNSW culture collection
CW(ML8) | Lc. lactis ssp lactis var diacetylactis ML8 | 1.50E+08 | 1 mm clear | 1981 | BABS, UNSW culture collection

Table 3.2 Bacteriophage samples provided by external sources

Phage code | Host species | Preparation | Provider
Pf4G | Pseudomonas aeruginosa | purified DNA | A/Prof. Scott , UNSW
DL4HV | Halorubrum sp. | purified DNA | Dr Susanne Erdmanne, Max Planck Institute for Marine Microbiology,
DM01, DM24, DM26, DM36, DM38, DM46, DM49, DM56, DM57, DM59 | Escherichia coli | purified DNA | Dr Nicola Petty, The iThree institute, UTS
T4 | Escherichia coli | phage lysate | Dr Nicola Petty, The iThree institute, UTS
CR1 | Citrobacter rodentium | phage lysate | Dr Nicola Petty, The iThree institute, UTS
Ps6, Ps15, Ps19, Ps21 | Pseudomonas sp. | phage lysate | Dr Lisa Elliott, AusPhage
Sa1, Sa7, Sa11, Sa12 | Staphylococcus aureus | phage lysate | Dr Lisa Elliott, AusPhage
923, 943 | Lactococcus lactis | phage lysate | A/Prof. Jasna Rakonjac, Massey Univ., NZ
PRD1 | E. coli | plate | Culture collection, BABS, UNSW

2.2.1 Phage propagation, amplification, and titering. The lysates of the isolated phages were prepared from single plaques picked from sensitive host lawns on double-layer LM17 agar plates containing 10 mM CaCl2 (35) and were purified twice. Following amplification and filtration through a 0.22 μm sterile Millipore filter, broth lysates with titers equal to or higher than 10^8 pfu/ml were stored in 1.5 ml cryovials at -20°C.
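For readers unfamiliar with titering, the standard plaque-assay calculation divides the plaque count by the product of the dilution and the volume plated. The Python sketch below applies it to a hypothetical plate; the plaque count, dilution and plated volume are invented for illustration and are not data from this study. Only the storage threshold is taken from the text.

```python
def titer_pfu_per_ml(plaque_count, dilution, volume_plated_ml):
    """Plaque-forming units per ml of the undiluted lysate."""
    return plaque_count / (dilution * volume_plated_ml)


STORAGE_THRESHOLD = 1e8  # pfu/ml, the minimum titer stated in the text

# Hypothetical plate: 20 plaques from 0.1 ml of a 10^-6 dilution
titer = titer_pfu_per_ml(plaque_count=20, dilution=1e-6, volume_plated_ml=0.1)
print(f"{titer:.2e} pfu/ml, meets storage threshold: {titer >= STORAGE_THRESHOLD}")
```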

2.2.2 Phage purification and DNA isolation. Crude phage lysate (1.5 ml) was treated with RNase A and DNase I (Sigma) at final concentrations of 10 µg/ml and 1 µg/ml, respectively, for 30 min at 37°C to remove bacterial nucleic acids. Following centrifugation at 22,000 × g for 10 min, the phage particles in the supernatant were PEG-precipitated and further purified as previously described (36).

Repetitive Polymerase Chain Reaction (rep-PCR). Phage genotyping was performed using puReTaq™ Ready-To-Go™ Polymerase Chain Reaction (PCR) Beads (GE Healthcare) with either BOXA1R (5’-CTACGGCAAGGCGACGCTGACG-3’) or BOXA2R (5’-ACGTGGTTTGAAGAGATTTTCG-3’) (7) as a single primer.
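The two primers differ considerably in base composition, which is relevant to the low annealing temperature used for rep-PCR. As a small worked example, the Python snippet below computes the GC content of the two primer sequences given above. The dictionary and function names are illustrative.

```python
# Primer sequences as given in the text (5' to 3')
PRIMERS = {
    "BOXA1R": "CTACGGCAAGGCGACGCTGACG",
    "BOXA2R": "ACGTGGTTTGAAGAGATTTTCG",
}


def gc_percent(sequence):
    """Percentage of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    return 100.0 * sum(base in "GC" for base in sequence) / len(sequence)


for name, sequence in PRIMERS.items():
    print(f"{name}: {len(sequence)} nt, GC = {gc_percent(sequence):.1f}%")
```

Both primers are 22 nucleotides long; BOXA1R contains 15 G or C bases (about 68%) and BOXA2R contains 9 (about 41%).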

Table 3.3 PCR amplification conditions for the primers, BOXA1R and BOXA2R

Primer name | Amplification conditions | Reference
BOXA1R | 2 min at 92°C; 35 cycles of: 30 sec at 92°C, 1 min at 40°C, 2 min at 72°C; 5 min at 72°C | Zavaglia et al., 2000 (37)
BOXA2R | 7 min at 95°C; 35 cycles of: 30 sec at 90°C, 1 min at 40°C, 8 min at 65°C; 16 min at 65°C | Malathum et al., 1998 (38)

For each reaction, 50 pmol of the single primer BOXA1R or BOXA2R and 50-100 ng of Lc. lactis phage DNA in DNase/RNase-free water, at a final volume of 25 µl, were added to a tube containing a single PCR bead. The rep-PCR method was also tested using phage lysates (10^5 pfu/ml) and plaques as DNA sources. The negative control reaction contained primer only, with water substituted for the template DNA. PCR amplifications were performed in an automated thermal cycler (Eppendorf) with an optimal cycling program set for each primer (see Table 3.3). The amplification products (5 µl) were electrophoresed on 15 x 20 cm 1.2% (w/v) molecular grade agarose (Bio-Rad) gels in 1 x TAE (Tris-acetate, EDTA, pH 8.1) at a constant 100 V for 3 h. The amplicons were assessed against the molecular size marker HyperLadder I (M I) or HyperLadder I (M II) (Bioline). After staining with GelRed 3 x staining solution in water (Biotium) for 15 min, the gels were photographed with a camera attached to a tripod mounted on a UV-transilluminator.

2.2.3 Method validation. Reproducibility of the rep-PCR method using BOXA1R and BOXA2R was tested on a set of known phages under defined conditions (see Appendices, Chapter 3, Supplementary 1, Table S1). Lc. lactis ssp lactis phage (Ø) 301, which was isolated and sequenced in this work, was included in the reproducibility study. Phage DNA was isolated following two protocols: the protocol described above (referred to as DNA1) and the QIAamp DNA Blood Mini Kit (Qiagen) (referred to as DNA2). For the preparation of DNA2, phage lysate was pre-treated with DNase I and concentrated with PEG-salt solution before the Qiagen viral DNA purification protocol was followed. At least three separate replicates from the same DNA template for each strain were used for PCR amplifications.

2.2.4 Statistical analysis. Gel images of the BOXA1R- and BOXA2R-PCR fingerprints were analyzed with the PyElph 1.4 software (39). The generated dendrograms were based on the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis applied to the computed distance matrix (39). Due to the log-normal data distribution, the statistical analysis was performed using the non-parametric Mann-Whitney test in IBM SPSS Statistics Version 25. The effects of the two DNA templates (DNA1 and DNA2) and two phage growth temperatures (30°C vs 37°C) on the reproducibility of the fingerprint profiles were tested at p < 0.05.
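To illustrate how UPGMA turns a distance matrix into a dendrogram, the sketch below implements the method in plain Python. It is not the PyElph implementation. The lane labels and distances are hypothetical, and the tree is returned as nested tuples of the form (left, right, height) purely for illustration.

```python
def upgma(labels, dist):
    """Build a UPGMA tree from a symmetric distance matrix.

    Returns nested tuples (left, right, height), where height is half the
    distance at which the two sub-clusters were joined.
    """
    # Active clusters: id -> (subtree, number of leaves)
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Pairwise distances keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    while len(clusters) > 1:
        (a, b), joined_at = min(d.items(), key=lambda kv: kv[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # leaf-count-weighted average of the two old distances.
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        del d[(a, b)]
        clusters[next_id] = ((tree_a, tree_b, joined_at / 2), n_a + n_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical distances (1 - similarity) between three fingerprint lanes.
lanes = ["rep1", "rep2", "other_phage"]
distances = [
    [0.0, 0.1, 0.5],
    [0.1, 0.0, 0.6],
    [0.5, 0.6, 0.0],
]
print(upgma(lanes, distances))
```

The two replicates join first at the smallest distance, and the third lane joins at the average of its distances to both replicates, which is the behaviour that defines UPGMA.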

2.2.5 Sequencing of the PCR bands. Randomly selected individual DNA amplicons generated with either primer were separated from a mixture of other PCR products by applying the band stab method (40). The subsequent re-amplification with the same primer in a new PCR reaction was run on a gel to verify that a single band of the same size was obtained. The size of the amplified fragment was measured against the molecular size marker, and the concentration and purity were determined by Nanodrop. After purification with GenElute PCR Clean-Up kit (Sigma), Sanger sequencing of the selected amplicons was performed in the Ramaciotti Centre for Genomics using either BOXA1R or BOXA2R as a sequencing primer. BLAST runs of the obtained nucleotide sequences against the UniProtKB database resources were used to deduce the protein sequences and their functional information (34).

2.2.6 Validation of the phage origin of rep-PCR products. Primers targeting specific phage genes were designed based on the results from BLAST analyses (see Appendices, Chapter 3, Supplementary 4, Table S6). PCR was performed using the puReTaq™Ready-To-Go™ Polymerase Chain Reaction (PCR) Beads (GE Healthcare) and the appropriate purified phage DNA as template. The sizes of PCR products were evaluated by electrophoresis in 1% (w/v) agarose gels against the molecular size marker. Selected PCR products were sequenced in the Ramaciotti Centre for Genomics for final validation using the corresponding forward primer as the sequencing primer. Sequencing results were aligned against the coding sequence of genes identified during BLAST analyses using the MAFFT algorithm in the Benchling platform (41).
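The size check described above (PCR product size compared with the size of the targeted gene) can be anticipated in silico. The sketch below is a simplified illustration that assumes exact primer matches on a linear template. The template and primer sequences are hypothetical placeholders, not the primers listed in Table S6.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def expected_product_size(template, forward, reverse):
    """Return the amplicon length for exact primer matches, or None.

    The forward primer must match the given strand, and the reverse primer
    must match the opposite strand downstream of the forward site.
    """
    template, forward = template.upper(), forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


# Hypothetical template: 2 nt flank, forward site, 20 nt insert, reverse site, 2 nt flank.
template = "TT" + "ATGCGTACCA" + "G" * 20 + "CCTTAGGACT" + "AA"
print(expected_product_size(template, "ATGCGTACCA", "AGTCCTAAGG"))
```

A real primer check would also need to tolerate mismatches and consider both orientations, which this sketch deliberately leaves out.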

3. RESULTS

3.1 Repetitive-PCR for phage fingerprinting

The potential of repetitive-PCR as a fingerprinting method was originally tested with the BOXA1R and BOXA2R primers on a set of E. coli phages. The fingerprint profiles of eight out of ten E. coli DM phages looked almost identical with BOXA1R (see Figure 3.1) and very similar with BOXA2R (results not shown), suggesting that the phages were closely related and were possibly isolated from the same source. The remaining two phages, DM46 and DM49, had profiles that differed from each other and from those of the other DM phages with both primers. The set of DM phages appeared unrelated to the E. coli ØT4 phage as well as to the murine pathogen Citrobacter rodentium ØCR1 (24). The profiles of the tested phages displayed on average eight bands ranging from 200 to 4000 bp, which indicated that the method should have good discriminatory power.


Figure 3.1 BOXA1R-PCR fingerprint patterns of E. coli phage strains. The PCR reactions were performed using purified phage DNA (ØDM01 – ØDM59) or phage lysates (ØT4 and ØCR1) as templates. The dendrogram of the BOXA1R-PCR patterns was generated using the UPGMA cluster analysis according to the Pearson product moment correlation coefficient (expressed as a percentage value, 0 - 100%). M- HyperLadder I (Bioline). NC- negative control

3.2 BOXA-PCR method validation

On the basis of these initial results the method was further investigated and validated on a group of well characterised Lc. lactis phages using both BOXA1R and BOXA2R primers (see Appendices, Chapter 3, Supplementary 1, Table S1). The reproducibility of the method was explored using phage propagated at two different temperatures (30ºC and 37ºC) with the DNA being isolated by two protocols.

The descriptive statistics and the details of the non-parametric Mann-Whitney tests are presented in the Appendices, Chapter 3, Supplementary 2.

Repetitive-PCR with both BOXA1R and BOXA2R primers resulted in reproducible DNA fingerprint profiles of the tested phages, which also included lambda, phiX174 and T4 phages (see Figures 3.2 – 7). The similarity index of the BOXA1R-PCR replicates ranged from 90.2% - 100% and of BOXA2R-PCR replicates from 89.4% - 100%. The difference in performance between these two primers was not statistically significant (p = 0.279) (see Suppl. 2. Table S2).
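For readers unfamiliar with the Mann-Whitney test used for these comparisons, the sketch below computes the U statistic and a two-sided p-value from the normal approximation with tie correction. It is an illustration only: SPSS may report an exact p-value for small samples, no continuity correction is applied here, and the similarity values are hypothetical rather than taken from Table S2.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p). Assumes the pooled values are not all identical.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2          # mean of ranks i+1 .. j
        t = j - i                          # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean = n1 * n2 / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail probability
    return u, p


# Hypothetical replicate similarity indices (%) for the two primers.
boxa1r = [90.2, 95.0, 97.5, 100.0, 100.0]
boxa2r = [89.4, 93.1, 96.0, 98.2, 100.0]
u, p = mann_whitney_u(boxa1r, boxa2r)
print(f"U = {u:.1f}, p = {p:.3f}")
```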

3.2.1 Influence of the DNA isolation method on the phage profile


Separate PCR amplifications from the same DNA source generated reproducible fingerprints with both the BOXA2R (see Figure 3.2) and BOXA1R (see Figure 3.3) primers. The replicates showed very similar (see Figure 3.2, Øsk1 and Figure 3.3, ØT4) or indistinguishable (see Figure 3.3, ØC6A and Figure 3.4, ØLambda) profiles. No statistically significant difference (p = 0.08) was observed between the fingerprint profiles of the same phage obtained with the two DNA preparation methods, phenol-chloroform extraction and the Qiagen commercial kit (see Table S4, Additional file 2).

Figure 3.2 Reproducibility testing of the BOXA2R-PCR using the DNA of lactococcal phages c2 and sk1. Both phages were propagated on Lc. lactis ssp cremoris MG1363 at 30°C. The corresponding dendrogram was generated using the UPGMA method. Three replicates of each DNA preparation, isolated by the phenol-chloroform procedure (DNA1) or the Qiagen kit (DNA2), were tested. M- HyperLadder I (Bioline). NC- negative control

Rep-PCR produced stable profiles when the same phage DNA was amplified at different times and run on separate gels (results not shown), although band migration may fluctuate between gels. An example, ØX174 amplified with BOXA1R, is presented in Figures 3.3 and 3.4. The two gels appeared different because different amounts of DNA were used in the PCR reactions, 100 ng (Fig. 3.3) vs 25 ng (Fig. 3.4). When the bands are compared against the marker, it can be seen that the banding patterns are consistent between the two gels (see Figures 3.3 and 3.4).
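Comparing bands against the marker, rather than by absolute position, is what makes profiles from separate gels comparable. One common way to do this is to interpolate fragment size from migration distance on a log scale between the two flanking marker bands. The sketch below shows the idea. The migration distances and marker sizes are hypothetical placeholders and are not the HyperLadder band sizes.

```python
import math


def estimate_size_bp(distance_mm, ladder):
    """Estimate fragment size from migration distance.

    ladder is a list of (migration_mm, size_bp) pairs with distinct distances.
    log10(size) is interpolated linearly between the flanking marker bands.
    """
    points = sorted(ladder)  # shortest migration (largest fragment) first
    for (d0, s0), (d1, s1) in zip(points, points[1:]):
        if d0 <= distance_mm <= d1:
            frac = (distance_mm - d0) / (d1 - d0)
            log_size = math.log10(s0) + frac * (math.log10(s1) - math.log10(s0))
            return 10 ** log_size
    raise ValueError("band lies outside the range covered by the ladder")


# Hypothetical marker lane: (migration in mm, fragment size in bp).
LADDER = [(10.0, 4000), (20.0, 2000), (30.0, 1000), (40.0, 500)]
print(round(estimate_size_bp(25.0, LADDER)))
```

Because each gel carries its own marker lane, sizes estimated this way can be compared even when the absolute migration distances differ between runs.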


Figure 3.3 BOXA1R-PCR reproducibility testing. The templates used in PCR reactions included phiX174 RF1 DNA (Thermo Fisher), ØT4 lysate and DNA of ØC6A amplified on Lc. lactis ssp lactis C6. The corresponding dendrogram was generated using the UPGMA method. M- HyperLadder I (Bioline). NC- negative control

Figure 3.4 Reproducibility testing of the BOXA1R-PCR using commercially isolated DNA. The reactions were performed using 30 ng DNA of the ØLambda (Øλ) (Thermo Fisher) and 25 ng of ØX174 (phiX174, Thermo Fisher) per reaction. The corresponding dendrogram was generated using the UPGMA method. M- HyperLadder I (Bioline). NC- negative control

3.2.2 Influence of the propagation temperature on the phage profile

Minor variations were observed in the fingerprint patterns of the same phage cultivated and amplified at two temperatures, 30°C and 37°C; however, the differences were not significant (p = 0.527) (see Table S3, Additional file 2). The variability in band intensity, which was occasionally seen, was not associated with the incubation temperature, as there was no clear separation between the 30°C and 37°C clusters (see Figure 3.5).


Figure 3.5 Reproducibility testing of the BOXA2R-PCR using the DNA of Ø712. The phage was propagated on Lc. lactis ssp cremoris C2 at 30°C and 37°C. The corresponding dendrogram was based on the UPGMA method. Three replicates of each DNA preparation, isolated by the phenol-chloroform procedure (DNA1) or the Qiagen kit (DNA2), were tested. M- HyperLadder I (Bioline). NC- negative control

3.2.3 Influence of the propagating host on the phage profile

The host on which a phage was amplified slightly altered the phage profile, as seen in the BOXA2R-PCR profiles of Øsk1 amplified on two bacterial hosts, Lc. lactis ssp cremoris LMO230 and MG1363 (see Figure 3.6), and of ØP087 amplified on Lc. lactis ssp lactis strains ML8 and C10 (see Figure 3.7). Slightly differing bands were seen in the ~2 kbp region, where Øsk1(MG1363) displayed a single, pronounced band whereas Øsk1(LMO230) displayed two smaller bands instead (96% similarity). The differences in the ØP087 profiles were detected in the ~800 – 1000 bp range, which resulted in 90% similarity between the amplifications from phage propagated on the two hosts.
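Similarity values such as those quoted above depend on how bands in two lanes are matched. The sketch below illustrates one band-based measure, the Dice coefficient with a relative size tolerance. The choice of coefficient, the 5% tolerance and the band sizes are all assumptions made for illustration. The sketch does not reproduce the similarity calculation used by PyElph and will not necessarily return the percentages reported here.

```python
def match_bands(a, b, tolerance=0.05):
    """Count bands in lane a that pair one-to-one with a band in lane b.

    Two bands match when their sizes differ by at most `tolerance` times the
    larger of the two. Matching is greedy, from the smallest band upwards.
    """
    remaining = sorted(b)
    matched = 0
    for size in sorted(a):
        for candidate in remaining:
            if abs(size - candidate) <= tolerance * max(size, candidate):
                remaining.remove(candidate)  # each band can be used only once
                matched += 1
                break
    return matched


def dice_similarity(a, b, tolerance=0.05):
    """Dice coefficient: 2 x shared bands / (bands in a + bands in b)."""
    if not a and not b:
        return 1.0
    return 2 * match_bands(a, b, tolerance) / (len(a) + len(b))


# Hypothetical band sizes (bp) for one phage propagated on two hosts.
host_1 = [2000, 1200, 800, 400]
host_2 = [2000, 1200, 800, 300]
print(f"{100 * dice_similarity(host_1, host_2):.0f}% similarity")
```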

Figure 3.6 Reproducibility (n = 3) testing of the BOXA2R-PCR. Øc2 was propagated on Lc. lactis ssp cremoris MG1363 and Øsk1 was propagated on Lc. lactis ssp cremoris LMO230 and MG1363 at 30°C. PCR amplifications were performed using DNA1. Three replicates of the phage Lambda were generated with 12 ng of DNA (Thermo Fisher). The corresponding dendrogram was generated using the UPGMA method. M- HyperLadder I (Bioline). NC- negative control


Figure 3.7 Reproducibility testing of the BOXA2R-PCR. Three replicates of ØP087 DNA1 propagated on Lc. lactis ssp lactis biovar diacetylactis ML8 and Lc. lactis ssp lactis C10 at 30°C and four replicates of Ø712 propagated on Lc. lactis ssp cremoris UK712 are displayed. The corresponding dendrogram was generated using the UPGMA method. M- HyperLadder I (Bioline). NC- negative control

Similarly, a one-band difference between the profiles of the same phage propagated on two different hosts was also observed with BOXA1R. For example, the profiles of ØC6A amplified on Lc. lactis ssp lactis C6 and C10 differed by a ~580 bp band, which was visible only for the C6 host (results not shown).

3.3 Application of BOXA-PCR genotyping for screening phage isolates

Repetitive-PCR using either BOXA1R or BOXA2R repetitive primer was further tested on a number of ecological isolates with distinctive fingerprint profiles being generated for all tested phage (see Figures 3.8 – 10).

The amplification of Staph. aureus phages with the BOXA1R primer produced the lowest number (2 – 5) of PCR fragments (see Figure 3.8). All four isolates appeared highly related: the profiles of Sa1 and Sa7 were identical, as were those of Sa11 and Sa12, whereas the similarity between these two pairs was 90%. In contrast to the Staph. aureus phages, the four Pseudomonas phages (Ps6, Ps15, Ps19 and Ps21) produced four unique fingerprint profiles. The Ps6, Ps15 and Ps19 phages all shared one band of ~1.5 kb, whereas all other bands were exclusive to each phage isolate.


Figure 3.8 BOXA1R-PCR fingerprint profiles generated using different template sources. The purified DNA of the Halorubrum phage DL4HV and Ps. aeruginosa phage Pf4G; lysates of the Ps. aeruginosa phages Ps6, Ps15, Ps19 and Ps21; Staph. aureus phages Sa1, Sa7, Sa11 and Sa12; and a plaque of the entero phage PRD1 were used. BOXA1R-PCR profiles of the two bacterial strains were also included: Ps. aeruginosa PAO1 (wild type) and PAO1- (prophage-free mutant). The corresponding dendrogram was generated using the UPGMA method. M- HyperLadder I (Bioline). NC- negative control

Figure 3.9 BOXA2R-PCR fingerprint profiles generated using different template sources. The purified DNAs of the Halorubrum phage DL4HV and Ps. aeruginosa phage Pf4G; lysates of the Ps. aeruginosa phages Ps15 and Ps19; and Staph. aureus phages Sa7 and Sa12 were used. Two bacterial strains were also included: Ps. aeruginosa PAO1 (wild type) and PAO1- (prophage-free mutant). The corresponding dendrogram was based on the UPGMA method. M- HyperLadder I (Bioline). M II- HyperLadder II (Bioline). NC- negative control

The PCR profiles of DNA extracted from the purified particles of the lytic ØPf4G obtained using the two boxA primers were compared to those of the wild-type Ps. aeruginosa PAO1 and of the PAO1 mutant created by the removal of the complete filamentous Pf4 prophage genome (PAO1-) (27). While the band patterns of both bacterial strains were similar with either boxA primer, the phage Pf4G displayed a very different profile to these bacteria and yielded DNA fragments of substantially different sizes (see Figures 3.8 and 3.9). This indicated that the superinfective ØPf4G may have undergone significant DNA rearrangement compared to its prophage/temperate form.

Successful amplification of the purified DNA of an archaeal phage, DL4HV, which infects Halorubrum sp. and some other Halobacteriaceae, was also achieved using the BOXA-PCR method (see Figures 3.8 and 3.9).

3.3.1 Comparison of the rep-PCR patterns of phage and their hosts

The primer BOXA1R was further used to compare phage fingerprints to those of their host bacteria (see Figure 3.10). The phage band complement ranged from a single band, as in ØBU(HP), to many amplicons of larger size than those seen in bacteria, as in Ø301(Ni301). Only two P335-type phages infecting Lc. lactis ssp lactis ML8, Ø63 and ØCW, shared amplicons of the same size (1 kb and 1.5 kb) with their host. These could have been genuine phage sequences or could indicate interference from contaminating bacterial DNA, but this was not investigated further. All other Lc. lactis phages displayed noticeably different fingerprint profiles to their hosts.


Figure 3.10 BOXA1R-PCR profiles of individual Lc. lactis phages next to their respective bacterial hosts. Lc. lactis ssp lactis biovar diacetylactis FD11 (Ø38 and Ø54) and ML8 (Ø63 and ØCW); Lc. lactis ssp lactis Ni301 (Ø301) and 112 (Ø923 and Ø943), and Lc. lactis ssp cremoris HP (ØBU) and Mo9 (Ø15 and ØR48). The corresponding dendrogram was generated using the UPGMA method. M I- HyperLadder I (Bioline). M II- HyperLadder II (Bioline). NC- negative control

3.4 Sequencing of the BOXA-PCR fragments

To investigate the molecular nature of the DNA fragments amplified with the BOXA primers, eight PCR products randomly selected from fingerprints of the different phage isolates were sequenced with either the BOXA1R or BOXA2R primer (see Appendices, Chapter 3, Supplementary 3, Table S5). This work was designed to obtain preliminary information on the identity of genes contained in the rep-PCR products. Annotation of the sequences resulted in the identification of encoded proteins that are clearly linked to the phages they originated from, although with varying levels of homology (37% - 89%) (see Suppl. 3, Table S5). The lower levels of homology may have resulted from technical issues or from alternative binding sites of the BOXA primers to the rep-PCR products, thereby producing overlapping sequences that are difficult to deconvolute. To further validate the presence of the identified genes in the phage DNA, a series of primers were designed to target the exact coding sequences returned by the BLAST analyses (Suppl. 3, Table S6). PCR products were successfully amplified from the corresponding phage DNA using the appropriate pair of primers, and the size of the products matched the size of the targeted genes (Suppl. 3, Fig. S4). Selected PCR products were sequenced, with the results confirming that the PCR products were the genes identified in the BLAST results (Suppl. 3, Figures S5 – S8).

The comparisons of results obtained by sequencing selected fragments using both the boxA repetitive primers and primers specifically designed for particular annotated genes are presented in Appendices, Chapter 3, Suppl. 3, Table S6.

4. DISCUSSION

Current phage genotyping methods have certain disadvantages. Molecular approaches, such as Multilocus Sequence Typing (MLST) or denaturing gradient gel electrophoresis (DGGE), which use markers to assess the diversity of the phage genome or of a particular gene respectively (20), require a priori genetic information (19). Restriction digestion of genomic DNA and DNA/DNA hybridizations are considered time-consuming and often require large quantities (µg) of pure DNA (21). Additionally, the DNA of some phages can be resistant to restriction enzyme digestion, which may be due to a scarcity of cleavage sites (28), a base modification within the recognition sequence, genome methylation or other antirestriction mechanisms (18). While MLST can distinguish phages with the same RFLP pattern (18), it may not be universally applicable for fingerprinting all phage types. For example, it has proved suitable for phylogenetic analysis of the 936-like phages of Lc. lactis, but could not be used to generate amplification products from the c2 and P335 phage groups (18). Whole-genome phage sequencing is associated with substantial cost and technical difficulties (22), and may be impractical for routine use.

In terms of speed and simplicity, the repetitive-PCR described in this work is comparable to Random Amplified Polymorphic DNA (RAPD)-PCR. Both methods allow for a whole genome comparison. Rep-PCR is commonly applied to bacteria but has been reported as a complement to RAPD for phage characterization (29). Although the RAPD-PCR method has been used to examine the genetic diversity of various bacterial and algal phages, it has been associated with poor reproducibility (31) and hence requires extensive optimisation (21, 31). This involves selection of a suitable decamer primer (22, 28) followed by optimization of PCR conditions, such as primer and MgCl2 concentration, annealing temperature and standardization of template concentrations (28, 32). Furthermore, to increase the sensitivity of the RAPD assay, the pooling of patterns from at least two amplifications with different random primers has often been applied (29, 32). RAPD cannot discriminate closely related phages, and the amplification of the phage and host DNA has not been possible under the same RAPD-PCR assay conditions (22). Our findings demonstrated that the use of BOXA-PCR under defined conditions could achieve better discrimination and higher reproducibility with fewer optimisation steps, and would be more broadly applicable than RAPD-PCR for use in phage genotyping.

BOXA-PCR also offers advantages over the existing multiplex PCR systems (16, 17) for the detection of the most relevant dairy phage types. While these systems can readily identify lactococcal 936, P335 and c2 phage species as well as phages infecting Str. thermophilus and Lb. delbrueckii, they do not enable the identification and differentiation of the individual phage.

Reproducibility of the rep-PCR method was demonstrated using both BOXA1R and BOXA2R primers on a set of selected known phages amplified on different hosts and/or amplified in the same host after cultivation at different temperatures (30°C and 37°C). Consistent fingerprint profiles were obtained from phage DNA purified by two protocols (p = 0.08). The amplification of a phage on different hosts resulted in some small but obvious differences. These variations likely reflect the small genetic differences between the host strains, such as in the case of Lc. lactis ssp cremoris MG1363 and LMO230, the plasmid-free and prophage-cured derivatives of their closely related parent strains NCDO712 and C2 (33).

Amplification products were generated from different phage preparations, such as purified phage DNA (ØDL4HV, ØPf4G, ØDM01 – ØDM59 and Lactococcus phages), phage lysates (Pseudomonas sp. and Staph. aureus phages, ØT4, ØCR1) or a single plaque inoculated into a PCR reaction directly from the agar plate (ØPRD1). These findings suggest the rep-PCR method would have utility and convenience for routine phage typing. However, appropriate controls should be included to account for possible host DNA contamination when unpurified phage preparations are used, such as lysate that has not been pre-treated with DNAse or a directly picked plaque.

Repetitive-PCR also appears useful for application to phages that can be difficult to characterize by other genetic methods. For example, the DNA of the CR1 phage has previously been shown to be refractory to both restriction endonuclease digestion and genome sequencing, which has prevented its genomic characterization (26). Lc. lactis ssp lactis Ø15(Mo9) and some other lactococcal phages investigated in this study were also resistant to cutting by several restriction enzymes; hence the direct determination of genetic relationships between various lactococcal phages by restriction enzyme digestion was not possible.

Repetitive-PCR was tested on a range of phages that infect a variety of bacteria, for example E. coli, Pseudomonas sp., Staphylococcus sp. and Lc. lactis, and also the haloarchaeon Halorubrum. Both the BOXA1R and BOXA2R primers used in the PCR amplification of phage DNA produced distinctive individual fingerprint profiles for all tested isolates. This broad application may be due to the lower GC content of these primers and the associated optimal annealing temperature of 40°C (7). This decreased stringency may have enabled the amplification of various phage types. It is possible that other repetitive primers, for example ERIC primers for enterococcal phages, could be more suitable for the genotyping of other bacteriophage types.
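The GC content referred to above can be computed directly from the two primer sequences given in the Methods. The sketch below is a simple illustration of that calculation. The function name is an assumption, and the calculation says nothing about annealing behaviour under the low-stringency conditions used here.

```python
# Primer sequences as listed in the Methods (5' to 3').
PRIMERS = {
    "BOXA1R": "CTACGGCAAGGCGACGCTGACG",
    "BOXA2R": "ACGTGGTTTGAAGAGATTTTCG",
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {100 * gc_fraction(seq):.1f}%")
```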

The sequencing of PCR fragments produced with the boxA primers verified the phage origin of the amplified DNA. The majority of the genes identified during the BLAST analyses against the UniProtKB database (34) encoded phage-associated proteins involved in different viral functions, such as the Replication protein A in ØX174, the Replication protein P in ØLambda, or baseplate proteins (ØT4 and Ø301). Phage-originating genes identified from the nucleotide sequences might facilitate future work to identify and locate the repeats and elucidate the potential repetitive primer binding sites on the target genomes.

5. CONCLUSION

The findings of this study suggest that repetitive-PCR could be a useful high resolution method for bacteriophage genotyping. Repetitive-PCR, preferably using the BOX-specific primers BOXA1R and BOXA2R, is applicable to the genetic identification of an isolated phage and to the investigation of genetic relationships between phage genotypes. The technique could well be used in a phage management program where knowledge of the genetic profiles of circulating and emerging phages could be advantageous for the selection of the most appropriate strategies to address contamination problems. It could also be used to complement other phage genotyping methods, such as RFLP, and would be particularly useful for analyzing phages which appear refractory to digestion with restriction endonucleases. Given the abundance of phage in the environment (20), a rapid, inexpensive method for the genetic characterization of both phage and their bacterial hosts could be useful, especially when preliminary genotyping of large numbers of phage isolates is required for screening purposes.


SUPPLEMENTARY FILES in APPENDICES

Supplementary 1. List of phages used in the reproducibility study. This data outlines the phages used in the study, which strain they were propagated on, and the culture conditions.

Supplementary 2. Statistical analysis of the BOXA-PCR for phage fingerprinting. This file provides the details of the statistical analysis of the reproducibility testing.

Supplementary 3. Sequenced phage fragments. This file provides the list of sequenced phage fragments and details relating to their analysis, including: the primers used for the amplification of selected phage genes; results from the second round of PCR amplifications and BLAST analyses that used gene-specific primers; and a figure showing PCR amplifications of the selected phage DNA fragments using the forward and reverse primer pair corresponding to the annotated genes in each phage.

Supplementary 4. Sanger sequencing of the BOXA-PCR fragments. This file provides the alignment of the sequencing of the PCR products to the sequences of the corresponding proteins.

ACKNOWLEDGEMENTS

The authors would like to express their gratitude to Prof. Peter White and A/Prof. Scott Rice, School of Biotechnology and Biomolecular Sciences, UNSW, Sydney; Dr Nicola Petty, The iThree Institute, University of Technology, Sydney; Dr Lisa Elliott, AusPhage Pty Ltd, Queensland, A/Prof. Jasna Rakonjac, Massey University, NZ and Dr Susanne Erdmanne, Max Planck Institute for Marine Microbiology, Germany, for their provision of phage samples used in this study.

REFERENCES

1. Mehrotra S, Goyal V. Repetitive sequences in plant nuclear DNA: types, distribution, evolution and function. Genom Proteom Bioinf. 2014;12(4):164-71.
2. Treangen TJ, Salzberg SL. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nature Rev Genet. 2012;13:36-46.
3. Smirnov GB. Repeats in bacterial genome: evolutionary considerations. Mol Gen Microbiol+. 2010;25(2):56-65.
4. Shapiro JA, von Sternberg R. Why repetitive DNA is essential to genome function. Biol Rev. 2005;80(2):227-50.
5. Treangen TJ, Abraham AL, Touchon M, Rocha EPC. Genesis, effects and fates of repeats in prokaryotic genomes. Fems Microbiol Rev. 2009;33(3):539-71.
6. Martin B, Humbert O, Camara M, Guenzi E, Walker J, Mitchell T, et al. A highly conserved repeated DNA element located in the chromosome of Streptococcus pneumoniae. Nucleic Acid Res. 1992;20(13):3479-83.
7. Koeuth T, Versalovic J, Lupski JR. Differential subsequence conservation of interspersed repetitive Streptococcus pneumoniae BOX elements in diverse bacteria. Genome Res. 1995;5(4):408-18.
8. Alegria A, Fernandez ME, Delgado S, Mayo B. Microbial characterisation and stability of a farmhouse natural fermented milk from Spain. Int J Dairy Technol. 2010;63(3):423-30.
9. Wolska K, Kot B, Jakubczak A, Rymuza K. BOX-PCR is an adequate tool for typing of clinical Pseudomonas aeruginosa isolates. Folia Histochem Cyto. 2011;49(4):734-8.
10. Koc M, Cokmus C, Cihan AC. The genotypic diversity and lipase production of some thermophilic bacilli from different genera. Braz J Microbiol. 2015;46(4):1065-76.
11. Marcó MB, Moineau S, Quiberoni A. Bacteriophages and dairy . Bacteriophage. 2012;2(3):149-58.
12. Mahony J, Moscarelli A, Kelleher P, Lugli GA, Ventura M, Settanni L, van Sinderen D. Phage biodiversity in artisanal cheese wheys reflects the complexity of the fermentation process. Viruses. 2017;9(3).
13. Wunsche L. Importance of bacteriophages in fermentation processes. Acta Biotechnol. 1989;9(5):395-419.
14. Jeon J, Park JH, Yong D. Efficacy of bacteriophage treatment against carbapenem-resistant Acinetobacter baumannii in Galleria mellonella larvae and a mouse model of acute pneumonia. BMC Microbiol. 2019;19(1):70.
15. Yang H, Liang L, Lin S, Jia S. Isolation and characterization of a virulent bacteriophage AB1 of Acinetobacter baumannii. BMC Microbiol. 2010;10:131.
16. Labrie S, Moineau S. Multiplex PCR for detection and identification of lactococcal bacteriophages. Appl Environ Microbiol. 2000;66:987-994.
17. del Rio B, Binetti AG, Martín MC, Fernández M, Magadán AH, Alvarez MA. Multiplex PCR for the detection and identification of dairy bacteriophages in milk. Food Microbiol. 2007;24:75-81.


18. Barrangou R, Yoon SS, Breidt F, Jr., Fleming HP, Klaenhammer TR. Characterization of six Leuconostoc fallax bacteriophages isolated from an industrial sauerkraut fermentation. Appl Environ Microbiol. 2002;68(11):5452-8.
19. Moisan M, Moineau S. Multilocus sequence typing scheme for the characterization of 936-like phages infecting Lactococcus lactis. Appl Environ Microbiol. 2012;78(13):4646-53.
20. Clokie MR, Millard AD, Letarov AV, Heaphy S. Phages in nature. Bacteriophage. 2011;1(1):31-45.
21. Comeau AM, Short S, Suttle CA. The use of degenerate-primed random amplification of polymorphic DNA (DP-RAPD) for strain-typing and inferring the genetic similarity among closely related viruses. J Virol Methods. 2004;118(2):95-100.
22. Gutiérrez D, Martin-Platero AM, Rodríguez A, Martínez-Bueno M, García P, Martínez B. Typing of bacteriophages by randomly amplified polymorphic DNA (RAPD)-PCR to assess genetic diversity. FEMS Microbiol Letters. 2011;322:90-7.
23. Klumpp J, Fouts DE, Sozhamannan S. Next generation sequencing technologies and the changing landscape of phage genomics. Bacteriophage. 2012;2(3):190-9.
24. Welsh J, McClelland M. Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res. 1990;18(24):7213-8.
25. Damnjanovic D, Harvey M, Bridge WJ. Application of colony BOXA2R-PCR for the differentiation and identification of lactic acid cocci. Food Microb. 2019;82:277-86.
26. Petty NK, Toribi AL, Goulding D, Foulds I, Thomson N, Dougan G, et al. A generalized transducing phage for the murine pathogen Citrobacter rodentium. Microbiol-Sgm. 2007;153:2984-8.
27. Rice SA, Tan CH, Mikkelsen PJ, Kung V, Woo J, Tay M, et al. The biofilm life cycle and virulence of Pseudomonas aeruginosa are dependent on a filamentous prophage. ISME J. 2009;3(3):271-82.
28. Prevots F, Mata M, Ritzenthaler P. Taxonomic differentiation of 101 lactococcal bacteriophages and characterization of bacteriophages with unusually large genomes. Appl Environ Microbiol. 1990;56(7):2180-5.
29. Dini C, de Urraza PJ. Isolation and selection of coliphages as potential biocontrol agents of enterohemorrhagic and Shiga toxin-producing E. coli (EHEC and STEC) in . J Appl Microbiol. 2010;109(3):873-87.
30. Doria F, Napoli C, Costantini A, Berta G, Saiz JC, García-Moruno E. Development of a new method for detection and identification of Oenococcus oeni bacteriophages based on endolysin gene sequence and randomly amplified polymorphic DNA. Appl Environ Microbiol. 2013;79(16):4799-805.
31. Hopkins KL, Hilton AC. Optimization of random amplification of polymorphic DNA analysis for molecular subtyping of Escherichia coli O157. Lett Appl Microbiol. 2001;32(3):126-30.
32. Winget DM, Wommack KE. Randomly amplified polymorphic DNA PCR as a tool for assessment of marine viral richness. Appl Env Microbiol. 2008;74(9):2612-8.
33. Kelly WJ, Ward LJ, Leahy SC. Chromosomal diversity in Lactococcus lactis and the origin of dairy starter cultures. Genome Biol Evol. 2010;2:729-44.
34. The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017;45(D1):D158-D69.
35. Terzaghi BE, Sandine WE. Improved medium for lactic streptococci and their bacteriophages. Appl Microbiol. 1975;29(6):807-13.
36. Mahony J, Kot W, Murphy J, Ainsworth S, Neve H, Hansen LH, et al. Investigation of the relationship between lactococcal host cell wall polysaccharide genotype and 936 phage receptor binding protein phylogeny. Appl Environ Microbiol. 2013;79(14):4385-92.
37. Zavaglia AG, de Urraza P, De Antoni G. Characterization of strains using box primers. Anaerobe. 2000;6(3):169-77.
38. Malathum K, Singh KV, Weinstock GM, Murray BE. Repetitive sequence-based PCR versus pulsed-field gel electrophoresis for typing of Enterococcus faecalis at the subspecies level. J Clin Microbiol. 1998;36(1):211-5.
39. Pavel AB, Vasile CI. PyElph - a software tool for gel images analysis and phylogenetics. BMC Bioinformatics. 2012;13.
40. Bjourson AJ, Cooper JE. Band-stab PCR: a simple technique for the purification of individual PCR products. Nucleic Acids Res. 1992;20(17):4675.
41. Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059-66.


CHAPTER 4

PHENOTYPIC AND GENOMIC CHARACTERISATION OF LACTOCOCCAL BACTERIOPHAGE 15(Mo9)

1. BACKGROUND

Lactococcus lactis bacteriophages have been the subject of extensive research over the years, primarily due to the substantial negative effect that phage infection may have on the economics of cheese production. This biotechnological problem persists largely because of the dynamic nature of phage populations in dairy factories (1), which involves phage genome recombination (2) and continuous cycles of co-evolution with bacteria (3).

Historically, the 936, P335 and c2 phage species of the Siphoviridae family were the most commonly isolated lactococcal phages from industrial dairy fermentations (4-7). They most frequently recombine with members of the same DNA homology group (8). Recombination between phages is necessary for their evolution to acquire novel features and broaden their host range with the main route being through coinfection (9). Additionally, as a consequence of host-imposed pressures, phage mutation or recombination with segments of bacterial DNA may ensue and could represent a strong evolutionary force for both phage and bacteria (10).

The International Committee on Taxonomy of Viruses (ICTV) recognizes the Ceduovirus genus of the Siphoviridae family. The phages that belong to this genus are generally characterized by small genomes (~22 kb) with few restriction sites (4). Lactococcus virus c2 (11) and Lactococcus virus bIL67 (12) are listed as two representative prototype species of this genus. Although they have diverse host ranges, their completely sequenced genomes show that they are closely related, sharing 80% nucleotide sequence identity. The least conserved region encompasses the late transcribed genes, which in phage c2 have been identified as l14, l15 and l16 and in bIL67 as ORF34, ORF35 and ORF36 (11, 12). These genes encode structural proteins and are believed to be involved in host recognition (11).

In the first stage of the infection process, C2viruses recognize host cell wall-located carbohydrate receptors to adsorb to cell surfaces and then interact with cell membrane proteins to inject their DNA (13). Phage infection protein (Pip) has been identified as a lactococcal host receptor required for infection by c2 phage species (14, 15). More recent research has identified a second lactococcal receptor, the YjaE transmembrane protein (16, 17). Members of C2viruses can be differentiated, on the basis of their ability to infect strains with either the Pip or the YjaE protein, into c2-like (requiring Pip protein) or bIL67-like (requiring YjaE protein) (13). Additionally, a clear correlation has been established between the use of Pip or YjaE and the variable region that spans the structural proteins l14-l15-l16 or ORF34-35-36, respectively, and it has been suggested that each set acts as one functional (structural) module (13).

Phage genome sequencing in recent years has enabled extensive insight into phage diversity, the mechanisms of phage evolution as well as their coevolution with bacteria (18-22). The comparative genomics of lactococcal phages of distinct species has revealed that they resulted from a combination of horizontal and vertical evolutionary events (18). In addition to the completely sequenced reference c2 and bIL67 genomes, a further eighteen fully sequenced Unclassified Ceduovirus genomes have been deposited in the NCBI database (13, 23) (Hayes S., 2019, unpublished). An increasing number of phage genomes are being sequenced using the Illumina MiSeq platform (24) while sequencing of bacteriophages using nanopore sequencing technology is still rare (22).

This study reports the phenotypic, morphological and genomic characterization of a lytic lactococcal phage vB_LacS_15(Mo9). The interest in this phage was prompted by the detection of PCR products typical of both c2 and P335 phage types during routine multiplex PCR testing for phage species determination (25). The aim of this work was to explore this observation and investigate the characteristics of this phage.

2. MATERIALS and METHODS

2.1 Materials

2.1.1 Bacteria. Bacterial strains used in this study were isolated previously from various dairy products (26) or were retrieved from our internal strain collection (see Table 4.1). They were cultivated in M17 broth supplemented with 0.5% (w/v) lactose (LM17) at 30°C anaerobically for 24 - 48 h (27). Stock cultures were prepared in 9.5% (w/v) autoclaved (121°C for 15 min) reconstituted skim milk and stored at -80°C. Bacterial DNA was isolated using the Illustra Bacteria GenomicPrep Mini Spin Kit (GE Healthcare).

2.1.2 Bacteriophages. Phage 15 was isolated from an Australian dairy whey sample in 2016 on Lc. lactis ssp cremoris Mo9 and was named 15(Mo9) (Chapter 3). The lysate was filtered through a 0.22 μm sterile Millipore filter and stored at -20°C (without glycerol). Phage amplification, enumeration, purification and DNA isolation were performed as described previously (Chapter 3). Following the recommended guidelines for bacteriophage naming (28), phage 15(Mo9) was given the designation vB_LacS_15(Mo9).

2.2 Methods

2.2.1 Multiplex PCR for phage typing. The lactococcal phage type was determined by multiplex PCR as described by Labrie and Moineau (25) using isolated phage DNA as template. The method was also applied to the bacterial strains Lc. lactis ssp cremoris Mo9 and AM1, Lc. lactis ssp lactis Ni301 and Lc. lactis ssp lactis biovar diacetylactis FD11R to test for the presence of integrated prophage (P335 species). Phage 15(Mo9) was additionally tested in a multiplex PCR reaction that employed three different sets of primers for the classification of phages into the 936, P335 or c2 type (29). PCR reactions without template DNA served as a negative control.

2.2.2 Transmission electron microscopy (TEM). Phage particles were concentrated from 2-5 ml of lysate (depending on the starting titer) with 10% (w/v) PEG8000 in 0.5 M NaCl followed by centrifugation at 22,000 x g for 30 min at 4°C. The pellet was washed twice with, and then resuspended in, 60 µl of 0.9% (w/v) NaCl. A drop (10 µl) was loaded onto a freshly glow-discharged formvar/pioloform carbon coated 400 mesh copper grid for 2 min to allow the phage to adsorb. Negative staining was performed with 2% (w/v) aqueous uranyl acetate (UA) or 2.5% (w/v) phosphotungstic acid (PTA) (pH 7.2). Phage morphology was examined using a Philips CM120 BioTwin transmission electron microscope at an accelerating voltage of 93 kV. Electron micrographs were recorded using a Morada camera and iTEM software (Olympus SIS).

2.2.3 Phage assays. The one step growth curve, the burst size and the latent period of phage 15(Mo9) were determined against its host Lc. lactis ssp cremoris Mo9 following the previously described method (30). For the adsorption assay, the phages were incubated with bacterial cells for 5 min in LM17 with 10 mM CaCl2 at 30°C. Phage-adsorbed bacterial cells were then pelleted by centrifugation and the unadsorbed phage virions in the supernatant were counted using the standard double-layer agar plaque test. The phage suspension in medium without bacterial cells represented the control titer. The percentage of adsorption was calculated as: [(control titer - residual titer)/control titer] x 100 (31). These assays were carried out in duplicate using the same lysate in three separate experiments.
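The adsorption formula given above can be illustrated with a short calculation. The following is a minimal sketch only; the function name and the example titers are hypothetical and were chosen to show the arithmetic, not taken from the experimental data.

```python
def percent_adsorption(control_titer: float, residual_titer: float) -> float:
    """Percentage of phage adsorbed: [(control - residual) / control] x 100."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return (control_titer - residual_titer) / control_titer * 100.0


# Hypothetical titers in pfu/ml (illustrative values, not measured data).
control = 1.0e6   # phage in medium without bacterial cells
residual = 1.0e5  # unadsorbed phage left in the supernatant
print(round(percent_adsorption(control, residual), 1))
```

With these illustrative values, nine tenths of the input phage are removed from the supernatant, which corresponds to an adsorption level of the same order as that reported for 15(Mo9) on Mo9.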


2.2.4 Host range analysis. To distinguish from possible bacteriocin action, the host range of phage 15(Mo9) was initially spot tested by applying 10 µl of the 10⁻¹ dilution on a lawn of 31 Lc. lactis strains in 0.7% (w/v) top layer LM17 agar. A plaque test was subsequently conducted on sensitive strains using serial dilutions of the phage lysate (titer 8.0 x 10⁸ pfu/ml). All lactococcal strains used in this analysis were fingerprinted by BOXA2R-PCR (26) and were confirmed to be unique isolates.

2.2.5 Cell wall polysaccharide (CWPS)-typing of lactococcal strains. The lactococcal strains that were sensitive to 15(Mo9) were classified into a CWPS type by the multiplex-PCR test based on the differences in their cell wall polysaccharide biosynthesis gene clusters (32).

2.2.6 PCR for the classification of P335 phages and sequencing of the generated amplicons. Seven pairs of primers were applied to the purified DNA (1 µl) of the phage 15(Mo9) in individual PCR reactions following the PCR cycle as described in Mahony et al. (21). The obtained PCR products were purified with GenElute PCR Clean-Up kit (Sigma) and Sanger sequenced in the Ramaciotti Centre for Genomics. The resulting nucleotide sequences were subjected to BLAST analysis against the NCBI database (33) and UniProtKB database (34).

Electrophoresis conditions, gel staining and the visualisation of amplicons in all PCR-based experiments were as described in earlier work (Chapters 2 and 3).

2.2.7 Illumina MiSeq phage 15(Mo9) sequencing and genome analysis. Phage DNA was isolated and purified by the phenol-chloroform procedure as described previously (Chapter 3). The sequencing was performed at the Ramaciotti Centre for Genomics, UNSW, Sydney on a MiSeq system (Illumina) (1x MiSeq reagent kit v2, 2x150 bp Nano Sequencing Run). The sequencing library was prepared with the 10x Nextera XT DNA Library preparation kit (Illumina). Following the quality control (>80% bases higher than Q30), the reads were trimmed using Trimmomatic v0.36 (35). The generated reads were initially assembled with three different assemblers: ABySS (36) was used with values of k ranging from 31 to 96 in increments of two; SPAdes (v3.11.1) (37) was run with both the paired and unpaired reads output from trimming, with default k-mers and with values of k ranging from 21 to 127; and VelvetOptimiser v2.2.6 (38) was run with values of k ranging from 31 to 96. The genome assemblies were assessed, and genome sizes estimated, with Jellyfish v2.2.3 (39) and CovEst v0.5.6 (40). The metrics were calculated with SLiMSuite v1.14.1 (41). Phage genome annotations were performed with Prokka4.v1.4 (42).
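The read quality criterion mentioned above (>80% of bases at Q30) can be sketched as follows. This is an illustrative calculation only, assuming Phred+33 encoded FASTQ quality strings and treating bases with a score of 30 or more as passing; the function name and the example quality strings are hypothetical, and the sequencing facility's own quality control software was not necessarily implemented in this way.

```python
def fraction_q30(quality_strings, threshold=30, offset=33):
    """Fraction of bases whose Phred score is >= threshold (Phred+33 assumed)."""
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            # Phred score = ASCII code of the character minus the offset.
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0


# Toy quality strings: 'I' encodes Q40, '?' encodes Q30 and '!' encodes Q0.
toy_qualities = ["IIII?", "!!!!!"]
print(fraction_q30(toy_qualities))
```

A run would pass the stated criterion when the returned fraction exceeds 0.8.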

Following the detection of cross-contamination with 936-type phage reads, the 15(Mo9) reads that mapped to the low-depth contigs were removed and the assembly was repeated using MEGAHIT (43). The best assembly was obtained by down-sampling reads to 100x. The Pilon-corrected sequence was re-annotated with Prokka4.v1.4 (42). Protein-encoding open reading frames (ORFs) were predicted using Prodigal (44). The translated ORF products were compared with known proteins using BLASTP (45) against the UniProt database (34). MAFFT (46) was used to align the 15(Mo9) genome to the reference Lactococcus virus c2 (NC_001706).
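Down-sampling to a target depth, as described above, amounts to keeping a fraction of the reads equal to the target depth divided by the current depth. The sketch below shows this arithmetic under stated assumptions: the read count is hypothetical, the genome size is a rounded figure of about 21 kb, and random per-read sampling stands in for whichever tool was actually used, which the text does not specify.

```python
import random


def downsample_fraction(n_reads, read_len, genome_size, target_depth):
    """Fraction of reads to keep so that mean depth is about target_depth."""
    current_depth = n_reads * read_len / genome_size
    return min(1.0, target_depth / current_depth)


def downsample(reads, fraction, seed=0):
    """Keep each read independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return [read for read in reads if rng.random() < fraction]


# Hypothetical run: 1,000,000 reads of 150 bp over a ~21 kb genome.
frac = downsample_fraction(1_000_000, 150, 21_000, 100)
print(round(frac, 4))
```

For a very small genome such as that of a phage, even a Nano run gives a depth in the thousands, so only a small percentage of the reads is retained.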

The search for tRNA genes was conducted with tRNAscan-SE v.2.0.

2.2.8 Nanopore sequencing and bioinformatic analysis. Nanopore sequencing of the phage 15(Mo9) genome was performed using a MinION sequencing device from Oxford Nanopore Technologies (ONT) at KU Leuven. Phage genomic DNA was isolated from 50 ml of phage lysate as previously described (Chapter 3). The purity of the isolated DNA was checked by NanoDrop and the concentration was measured using a Qubit fluorometer (32 ng/µl), with the DNA integrity confirmed on an agarose gel. Approximately 1 µg of the purified genomic material was used to prepare a sequencing library following a 1D ligation protocol that involved attaching sequencing adapters at the ends of the molecules to avoid DNA fragmentation. Nanopore sequencing was performed on an R9.4 flowcell. The reads were quality checked using FastQC (47) and base-calling was performed using Albacore v2.1 (ONT), followed by Porechop v0.2.1 (https://github.com/rrwick/Porechop) to remove barcode sequences.

Genome assembly. The phage 15(Mo9) genome was assembled with Canu v1.6 (https://github.com/marbl/canu), which is an assembler specifically designed for noisy single-molecule sequences (48). Read length distribution and quality of the Nanopore reads were assessed using NanoPlot (49).

Utilization of Illumina reads to polish the Nanopore assembly.

To correct the errors that may occur in Nanopore assemblies after the generation of a consensus genome (frequent errors are small indels and SNPs in homopolymer regions), the Illumina reads were used to polish the assembly and improve its accuracy for the annotation. This was achieved using seven rounds of Pilon (50), by first mapping the reads to the Nanopore assembly with the BWA aligner (51) and then correcting mismatches. No more changes were detected after the seventh round.
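The iterative polishing described above follows a simple pattern: repeat a map-and-correct round until a round no longer changes the sequence. The sketch below illustrates only this stopping rule. The `toy_polish` function is a hypothetical stand-in that fixes one ambiguous base per round; it does not call BWA or Pilon and does not reproduce their behaviour.

```python
from typing import Callable, Tuple


def polish_until_stable(
    assembly: str,
    polish_once: Callable[[str], str],
    max_rounds: int = 10,
) -> Tuple[str, int]:
    """Apply polish_once repeatedly; stop at the first round with no change.

    Returns the final sequence and the number of rounds that were run.
    """
    for round_no in range(1, max_rounds + 1):
        polished = polish_once(assembly)
        if polished == assembly:
            return assembly, round_no
        assembly = polished
    return assembly, max_rounds


def toy_polish(seq: str) -> str:
    """Hypothetical corrector: replaces the first 'N' with 'A' (one fix per round)."""
    return seq.replace("N", "A", 1)


final_seq, rounds = polish_until_stable("ACNNT", toy_polish)
print(final_seq, rounds)
```

In this toy case two rounds introduce corrections and the third round confirms that nothing further changes.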

Annotation and comparison with the reference Lactococcus phage c2

The resulting polished assembly was submitted for annotation to the RAST server (52) as a Lactococcus virus unclassified c2-like, because its significant differences from c2 (<95% identity) place it in a separate species. The blastn tool of the NCBI nucleotide database [NCBI 2016] was used to determine the genome homology of 15(Mo9) with the reference Lactococcus virus c2 (NC_001706) genome. The compared linear genome maps were visualized with EasyFig (53).

2.2.9 SDS-PAGE and mass spectrometry of phage structural proteins. Analysis of phage proteins by SDS-PAGE was performed using entire phage particles, which were concentrated and partially purified as described by Boulanger (54). Samples were mixed with 2x Laemmli loading buffer (1% SDS–2.5% β-mercaptoethanol–6.25 mM Tris-HCl (pH 6.8)–5% glycerol) and boiled for 5 min. The structural proteins were resolved in a precast gradient 4–20% Mini-Protean TGX gel (Bio-Rad) in Tris/Glycine running buffer and stained with Bio-Safe Coomassie G-250 (Bio-Rad). One protein fragment, which was unique to phage 15(Mo9) in comparison with the c2 phage structural proteins, was excised from the gel, digested with trypsin and analysed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) on a Q Exactive Plus (Thermo Fisher, Germany) instrument using the standard protocol at the Bioanalytical Mass Spectrometry Facility (BMSF) at the Mark Wainwright Analytical Centre (MWAC), UNSW, Sydney. Peak lists were generated using Mascot Daemon/Mascot Distiller (Matrix Science, London) and submitted to the database search program Mascot (version 2.5.1, Matrix Science). Search parameters were: precursor tolerance 5 ppm and product ion tolerance ± 0.05 Da; Met(O) and carboxyamidomethyl-Cys specified as variable modifications; enzyme specificity trypsin; 1 missed cleavage allowed; and the UniProt database (14-2-2018) searched.
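A precursor tolerance expressed in ppm, as in the search parameters above, scales with the mass being measured. The sketch below shows the usual way a ppm error is computed and compared with a tolerance; the function names and m/z values are hypothetical, and this is not a description of how Mascot applies the tolerance internally.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float = 5.0) -> bool:
    """True when the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical precursor at m/z 500: 0.002 off is about 4 ppm, 0.003 off is about 6 ppm.
print(within_tolerance(500.0020, 500.0))
print(within_tolerance(500.0030, 500.0))
```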

3. RESULTS

3.1 PHENOTYPIC AND GENETIC CHARACTERISATION of 15(Mo9)

3.1.1 Plaque morphology. Phage 15(Mo9) produced a uniform plaque morphology on its host Lc. lactis ssp cremoris Mo9. The plaques had a clear centre that ranged from 1 to 3 mm in diameter, with the majority being 2 mm. The clear centres were surrounded by a 1 - 2 mm diameter halo. Within 24 - 48 h an additional clearance zone was formed, which was presumably a result of the diffusion of a phage-produced soluble enzyme, such as lysin (see Figure 4.1).

Figure 4.1 Plaque morphology of phage 15(Mo9) on its host Lc. lactis ssp cremoris Mo9

3.1.2 Phage species determination. Three different primer pairs, P335A/B, c2A/B and 936A/B, were used to determine the phage species of 15(Mo9). A putative protein of unknown function of the P335-type phages was the target region for the design of the P335A/P335B primers. The conserved mcp (major capsid protein) genes of the c2-type phages and the msp (major structural protein) genes of the 936-type phages formed the basis for the design of the c2A/B and 936A/B primers (25).

Two PCR products corresponding to the c2 (474 bp) and P335 (682 bp) phage types were observed for phage 15(Mo9) (see Figure 4.2). This result was confirmed in an alternative multiplex PCR test for phage typing (29), in which the same two phage species were detected in 15(Mo9) (results not shown).

[Figure 4.2 gel image. Lane labels: Neg cnt, Ø15(Mo9), Ø936, Øc2, ØP335, Marker, Mo9 (1), Ni301, Mo9 (2), AM1, FD11, FD11R, Marker. Band sizes indicated (bp): 682, 474, 179.]

Figure 4.2 Multiplex PCR applied to the purified DNA of the Lc. lactis phage 15(Mo9) showing the presence of 474 bp and 682 bp PCR products in a single PCR amplification. Lc. lactis phages SK1 (936-like), c2 (c2-type) and BK5 (P335-type) (all retrieved from our internal collection) served as positive controls. Lc. lactis ssp cremoris Mo9 genomic DNA was tested in two separate PCR reactions. Lc. lactis ssp cremoris AM1, Lc. lactis ssp lactis Ni301 and Lc. lactis ssp lactis biovar diacetylactis FD11 and FD11R displayed the P335-like fragment. HyperLadder I (Bioline) was used as a molecular marker. The negative control contained primers and water only.

Host lactococcal strains were also tested for the presence of prophages, since prophages are often detected in sequenced bacterial genomes. The 682 bp amplicon was absent from Lc. lactis ssp cremoris Mo9, but was detected in Lc. lactis ssp cremoris AM1, Lc. lactis ssp lactis Ni301 and Lc. lactis ssp lactis biovar diacetylactis FD11 and FD11R (see Figure 4.2). Since prophage-related sequences are usually homologous to P335 phage species (25, 55), these observations indicated that Mo9 did not contain P335-like prophage DNA in its genome, while the other strains did.

3.1.3 Morphology study by electron microscopy. Phage 15(Mo9) displayed a prolate capsid morphology and striated (ring-like appearing) tails with non-contractile sheaths. Based on morphological characteristics, this phage was classified as morphotype B2 of the Siphoviridae family (56). The UA preparations revealed the small baseplate with undefined appendages and with no obvious collars (Figure 4.3, A). A somewhat different looking phage was seen on the PTA electron microscopy images. The heads were also prolate, but the short spike-like structures protruding from the tail’s tip and tail fibers (occasionally) represented unusual features for a prolate phage (see Figure 4.3, B). An arithmetic average of eleven phage measurements resulted in a mean capsid size of 42 nm by 35 nm and a tail of 77 nm long and 6 nm wide (see Appendices, Supplementary Chapter 4 - 1, Table 1).


Figure 4.3 Electron micrograph of phage 15(Mo9) negatively stained with (A) 2% (w/v) uranyl acetate (UA) and (B) 2.5% (w/v) phosphotungstic acid (PTA). The phage appeared smaller when stained with UA than with PTA, which was particularly seen with the tails, where the difference could reach 25 – 35 nm. A small spike extending from the baseplate and fibers at the distal end of the tail are indicated by arrows.

3.1.4 One step growth curve. One step growth curves were performed using phage 15(Mo9) and Lc. lactis ssp cremoris Mo9 as the host. The latent period of 20 – 25 min and the burst size of 49 ± 10 virions per infected cell represented the average values obtained in three experiments using a multiplicity of infection (MOI) of 0.001. The adsorption level of phage 15(Mo9) to its host Mo9 was 90 ± 2%.


Figure 4.4 One step growth curve depicting the infection of Lc. lactis ssp cremoris Mo9 by phage 15(Mo9) at a multiplicity of infection of 0.001

3.1.5 Host range analysis. Thirty-one Lactococcus lactis strains were tested for susceptibility to the 15(Mo9) phage and seven strains were identified as phage sensitive. Phage 15(Mo9) was able to infect two out of ten subspecies lactis strains: Ni301 and 15K; three out of thirteen subspecies cremoris: Mo9, B36 and AM1 and two out of eight tested ssp lactis biovar diacetylactis strains: FD11 and FD11R (see Table 4.1).


Table 4.1 Host range analysis of the phage 15(Mo9) on Lc. lactis strains

STRAIN  SPECIES                                       SOURCE                Efficiency of plaquing  Plaque description

UK712   Lc. lactis ssp cremoris                       Internal collection   -                       -
C2      Lc. lactis ssp cremoris                       Internal collection   -                       -
HP      Lc. lactis ssp cremoris                       Internal collection   -                       -
BK5     Lc. lactis ssp cremoris                       Internal collection   -                       -
BK5D    Lc. lactis ssp cremoris                       Internal collection   -                       -
AM1     Lc. lactis ssp cremoris                       Internal collection   0.045                   2 mm faint
E8NZ    Lc. lactis ssp cremoris                       Internal collection   -                       -
FG2     Lc. lactis ssp cremoris                       Internal collection   -                       -
ML1     Lc. lactis ssp lactis                         Internal collection   -                       -
ML8     Lc. lactis ssp lactis                         Internal collection   -                       -
761A    Lc. lactis ssp lactis                         Internal collection   -                       -
SL894   Lc. lactis ssp lactis                         Internal collection   -                       -
18      Lc. lactis ssp lactis                         Internal collection   -                       -
BA5     Lc. lactis ssp cremoris                       (26)                  -                       -
BA32    Lc. lactis ssp cremoris                       (26)                  -                       -
BA33    Lc. lactis ssp cremoris                       (26)                  -                       -
Mo9     Lc. lactis ssp cremoris                       Chapter 3             1                       2 mm with halo
B36     Lc. lactis ssp cremoris                       (26)                  0.86                    2 mm with halo
Ni301   Lc. lactis ssp lactis                         (26) (syn BA6)        8.88                    1 mm clear
15K     Lc. lactis ssp lactis                         (26) (syn BA34)       0.13                    1 mm turbid
BA36    Lc. lactis ssp lactis                         (26)                  -                       -
Ni37    Lc. lactis ssp lactis                         (26)                  -                       -
FD9     Lc. lactis ssp lactis                         (26)                  -                       -
BA37    Lc. lactis ssp. lactis biovar diacetylactis   (26)                  -                       -
BA42    Lc. lactis ssp. lactis biovar diacetylactis   (26)                  -                       -
FD11    Lc. lactis ssp. lactis biovar diacetylactis   (26)                  0.013                   6-7 mm clear
FD11R   Lc. lactis ssp. lactis biovar diacetylactis   (26)                  0.015                   3-4 mm clear with halo
FD13    Lc. lactis ssp. lactis biovar diacetylactis   (26)                  -                       -
FD15    Lc. lactis ssp. lactis biovar diacetylactis   (26)                  -                       -
FD1     Lc. lactis ssp. lactis biovar diacetylactis   (26)                  -                       -
FD3     Lc. lactis ssp. lactis biovar diacetylactis   (26)                  -                       -
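The efficiency of plaquing (EOP) values in Table 4.1 are relative titers, with the primary host Mo9 set to 1. The text does not state the formula, so the sketch below assumes the conventional definition, that is, the titer on the test strain divided by the titer on the reference host. The reference titer is the lysate titer quoted in the methods; the test titer is a hypothetical value chosen to give an EOP similar to that listed for AM1.

```python
def efficiency_of_plaquing(titer_test: float, titer_reference: float) -> float:
    """EOP = titer on the test strain / titer on the reference host (assumed definition)."""
    if titer_reference <= 0:
        raise ValueError("reference titer must be positive")
    return titer_test / titer_reference


reference_titer = 8.0e8  # pfu/ml on the primary host (lysate titer from the methods)
test_titer = 3.6e7       # pfu/ml on a secondary host (hypothetical value)
print(round(efficiency_of_plaquing(test_titer, reference_titer), 3))
```

Under this definition an EOP below 1 means that the phage plaques less efficiently on the test strain than on its primary host, and an EOP above 1 means the opposite.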


Plaque morphology. Single 15(Mo9) plaque purification on the sensitive hosts produced different plaque morphologies (see Figure 4.5). They varied from the very faint, barely visible plaques on AM1 and the small (1 mm), clear plaques on Ni301 to the large (6 - 7 mm) clear plaques on FD11. The plaques most similar to those of 15(Mo9) on Mo9 were those formed on FD11R, although the clear centres of the latter were larger (3-4 mm).

A - Ø15(Mo9) – cremoris B - Ø15(Ni301) – lactis C - Ø15(AM1) - cremoris

D - Ø15(FD11) and E - Ø15(FD11R) – biovar diacetylactis

Figure 4.5 Plaque morphology of phage 15(Mo9) on its primary host Lc. lactis ssp cremoris Mo9 compared to that on the secondary hosts: Lc. lactis ssp lactis Ni301, Lc. lactis ssp cremoris AM1 and Lc. lactis ssp lactis biovar diacetylactis FD11 and FD11R.

3.1.6 CWPS typing of lactococcal strains by multiplex-PCR. It is assumed that certain lactococcal phages can use carbohydrates on the cell surface, such as cell wall polysaccharides (CWPS), for host recognition (32). CWPS typing of lactococcal strains that were sensitive to the phage 15(Mo9) showed the presence of PCR products corresponding to all three CWPS types (see Figure 4.6). The product of 442 bp in Lc. lactis ssp cremoris strain B36 corresponded to the CWPS type A. The product of 183 bp, detected in both diacetyl-producing strains FD11 and FD11R, corresponded to the type B, and Lc. lactis ssp cremoris Mo9, the primary host strain of phage 15(Mo9), belonged to the CWPS type C. No amplicon was detected in Lc. lactis ssp lactis Ni301 and 15K (result not shown) or in Lc. lactis ssp cremoris AM1, which suggested that these strains possess an unknown CWPS type (32). The amplification of the positive control gene rmlB verified that the multiplex-PCR worked in all reactions.

[Figure 4.6 gel image. Lane labels: B36, AM1, Ni301, FD11, FD11R, Mo9, M, I, M. Band sizes (bp): 891 (internal control), 686 (Type C), 442 (Type A), 183 (Type B).]

Figure 4.6 Multiplex-PCR for CWPS typing of lactococcal strains susceptible to the phage 15(Mo9). The amplification of the rmlB gene, conserved in all CWPS clusters, represented the internal positive control.

3.1.7 PCR for the classification of P335 phages. The amplification of phage 15(Mo9) DNA using seven pairs of primers designed to classify P335 phages into sub-groups resulted in positive amplicons corresponding to the receptor binding sequences of the phages 98101 and Tuc2009 (see Figure 4.7). Both of these phages belong to the P335 subgroup II (21). The amplicon generated with 98101for/98101rev pair was of the correct size of 268 bp. The Tuc2009 primer pair amplified a ~600 bp product in 15(Mo9).

M I M 62503 98101 53801 98204 LC3 38502 Tuc2009 I M

Figure 4.7 The application of the PCR designed to classify phages of P335-species into sub-groups I, II, III or IV based on their RBP-encoding genes to phage 15(Mo9).


3.1.8 P335-specific PCR product sequencing. Sequencing of the PCR products generated from 15(Mo9) with both 98101 and Tuc2009 forward and reverse primers resulted in identity matches to the intended targets of the same size products in phages 98101 and Tuc2009 (see the details in the Appendices, Supplementary Chapter 4 – 2).

The nucleotide sequence of the 268 bp product from 15(Mo9) aligned by BLASTn at 97% identity (93% query cover, E-value 3e-98) to Lactococcus phage 98104 (KX160215.1) using the 98101 forward primer and at 95% identity using the 98101 reverse primer. In UniProtKB, the corresponding product was identified as the Baseplate/Receptor Binding protein of the 98101-98104 phage group (A0A1P8BLR2 Lactococcus phage 98102; A0A1P8BM18 Lactococcus phage 98104; A0A1P8BLK4 Lactococcus phage 98101).

In the NCBI database, the nucleotide sequence of the ~600 bp product from 15(Mo9) generated with the Tuc2009 forward primer aligned at 100% identity (93% query cover, total score 998, E-value 0.0) to Lactococcus phage 49801 (KX160205.1) and at 94.21% identity (93% query cover, total score 854, E-value 0.0) to Bacteriophage Tuc2009 (AF109874.2). The results obtained using the Tuc2009 reverse primer for sequencing were similar: 99.64% identity (92% query cover, total score 985, E-value 0.0) to Lactococcus phage 49801 (KX160205.1) and 94% identity (92% query cover, total score 846, E-value 0.0) to Bacteriophage Tuc2009 (AF109874.2). The top hit in UniProtKB corresponded to the Baseplate protein from the phage 49801 (100% identity), while the product aligned to the Minor structural protein 4 from the phage Tuc2009 at 95% identity.

3.2 GENOMIC CHARACTERISATION of 15(Mo9)

3.2.1 Analysis of phage genome sequenced on Illumina MiSeq. Following the testing of different assembly programs (see the details in the Appendices, Supplementary Chapter 4 – 3) and the perusal of subsequent phage genome annotations, the best assembly was generated by down-sampling reads to 100x, and then annotated. The assembly revealed that the linear double-stranded DNA genome of 15(Mo9) was 20900 bp long. Its overall G+C content was 35%, which is comparable to that of the Lc. lactis host bacteria (median G+C = 35.1%). Bioinformatic analysis identified 33 open reading frames and an overall genome organisation similar to that of the reference Lactococcus phage c2 (11), with leftward (CDS1 – 16) and rightward (CDS17 – 34) orientation of the putative genes.
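The overall G+C content quoted above is the percentage of G and C bases among all unambiguous bases of the genome. A minimal sketch of this calculation is given below; the function name is hypothetical and the sequence is a toy string, not part of the 15(Mo9) genome.

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt


toy_sequence = "ATGCATGCAT"  # toy example: 4 of the 10 bases are G or C
print(gc_content(toy_sequence))
```

Ambiguous bases such as N are excluded from the denominator here, which is one common convention; other tools may count them differently.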


3.2.2 Analysis of phage genome sequenced on MinION nanopore device. Nanopore sequencing of phage 15(Mo9) produced long reads in the order of 40k, many of which were of “genome-wide” size, around 21 kb (see Figure 4.8). Blasting the P335A and P335B primers against the consensus assembly of this phage did not generate any positive hits. A few longer reads were also observed, which did not correspond to the “consensus” assembly and could have occurred as by-products of the ligation approach, which can create chimeras. To investigate this, a total of 98 reads greater than 24 kbp were analyzed. Of these reads, only 7 corresponded to a P335-type phage when blasted against the P335A and P335B primers. The rest had either the c2A/c2B signature or were missing any signature. No combinations of both c2-like and P335-like signatures were detected. Further analysis of these 7 reads by blastn revealed that they were very close to Lactococcus phage 49801 (coverage 91%, identity 88%), which is a known P335 phage.
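The screening of long reads for primer signatures described above can be sketched as a length filter followed by a search for each primer on both strands. The example below is a simplification under stated assumptions: it uses exact string matching instead of BLAST, and the primer sequences, read names and length cut-off are hypothetical placeholders, not the actual P335A/B or c2A/B primers or Nanopore reads.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def has_primer(read: str, primer: str) -> bool:
    """True if the primer or its reverse complement occurs exactly in the read."""
    read, primer = read.upper(), primer.upper()
    return primer in read or reverse_complement(primer) in read


def classify_reads(reads: dict, primer_sets: dict, min_len: int) -> dict:
    """For each read longer than min_len, list the primer sets with a hit."""
    result = {}
    for name, seq in reads.items():
        if len(seq) <= min_len:
            continue
        result[name] = sorted(
            label
            for label, primers in primer_sets.items()
            if any(has_primer(seq, p) for p in primers)
        )
    return result


# Placeholder primers and toy reads (hypothetical, for illustration only).
primer_sets = {"c2": ["ACGTAC"], "P335": ["TTGACA"]}
reads = {
    "r1": "AAAACGTACAAAAA",  # carries the c2 placeholder primer
    "r2": "CCCCTGTCAACCCC",  # carries the reverse complement of the P335 placeholder
    "r3": "ACGTAC",          # too short, filtered out
    "r4": "GGGGGGGGGGGGGG",  # no signature
}
print(classify_reads(reads, primer_sets, min_len=10))
```

A read carrying both signatures would be listed with both labels, which is the kind of chimeric combination that was not observed in the analysis above.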

Figure 4.8 Read length distribution and quality of the Nanopore reads assessed using NanoPlot.

The genome of 15(Mo9) consisted of 21204 base pairs (bp) as a single linear molecule with 32 protein coding sequences (CDS). The annotation was compared with that of Lactococcus virus c2 (NCBI: txid31537) and showed a similar genome organization with <95% identity and some key differences (see Figure 4.9). These differences involved the absence from the 15(Mo9) genome of several small putative proteins from the early gene region of the c2 genome, and the apparent low level of similarity of one of the structural proteins from the late region (rightward located) between the two genomes. Based on nucleotide sequence similarity with c2 of <90%, phage 15(Mo9) was designated as Lactococcus virus unclassified c2-like.

Figure 4.9 Comparative genomics of phage 15(Mo9) and the reference Lactococcus phage c2. Conserved regions are connected by grey shading.

3.2.3 Comparison of Illumina MiSeq and Nanopore sequencing results. Different sets of bioinformatic tools were used to analyse the raw sequencing data. For the Illumina MiSeq data, ABySS was used for assembly and Prokka for annotation, and for the Nanopore data, Canu was used for assembly and RAST for annotation. These methodologies resulted in slightly different genome sizes and numbers of annotated proteins (20900 bp and 33 for Illumina MiSeq; 21204 bp and 32 for Nanopore). The putative holin gene located at the end of the ABySS-assembled contig was present as an open reading frame (ORF), but lacked a stop codon, and was consequently missing from the Prokka annotation. The pipeline used for processing the Nanopore sequencing data failed to annotate CDS10, which overlapped extensively with the ORFs surrounding it, and CDS25. Further manual inspection was undertaken to scrutinize the CDSs present in or absent from each of the primary annotations and to generate the final nonredundant draft sequence. The comparison to the known proteins in the reference Lactococcus virus c2 genome validated the final 15(Mo9) proteome. The annotated proteins were blasted (blastp) against the UniProt database to identify some of the uncharacterized/putative ones.


3.2.4 Final draft genome characteristics of phage 15(Mo9). The draft genome of phage 15(Mo9) is 21006 bp long. Of the 34 predicted protein coding sequences, 17 were putative/uncharacterized proteins and 17 had some functional annotation assigned to them (see Figure 4.10 and Table S3 in Appendices, Chapter 4, Suppl. 4).


Figure 4.10 Phage 15(Mo9) genome structure. The arrows represent CDSs with the putative functions colour coded.

CDS 1 – 15 in the early region were leftward oriented and ORFs 16 – 33 were reading to the right and were located in the late region.


Most genes were transcribed on a forward strand. The start codon ATG was used in 91.2% and TTG in 8.8% CDSs, while the most used stop codon was TAA (61.8%), followed by TGA (23.5%) and TAG (14.7%). Four predicted proteins of 15(Mo9), CDS8, 9, 10 and 16, were 100% identical to the proteins of the early region of the type c2 phage (e14, e13, the transcription regulator and e1), and two proteins (CDS24 and CDS30) fully matched the uncharacterized proteins of the type phage bIL67. Several genes of the c2 genome, namely e2, e3, e7, e9 and e10, that encode putative regulatory proteins, were absent from the 15(Mo9) annotation. Interestingly, blastp (33) showed that e10 does not show homology to any other lactococcal phage.
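Start and stop codon usage of the kind reported above can be tallied directly from the CDS nucleotide sequences. The sketch below uses synthetic CDSs rather than the real 15(Mo9) genes. The reported percentages are consistent with 31 ATG and 3 TTG starts and with 21 TAA, 8 TGA and 5 TAG stops among 34 CDSs; these counts are inferred from the percentages and are not stated in the text.

```python
from collections import Counter


def percentages(counter: Counter, total: int) -> dict:
    """Convert codon counts to percentages rounded to one decimal place."""
    return {codon: round(100.0 * n / total, 1) for codon, n in counter.items()}


def start_stop_usage(cds_list):
    """Tally the first and last codon of each CDS and return both as percentages."""
    starts = Counter(cds[:3].upper() for cds in cds_list)
    stops = Counter(cds[-3:].upper() for cds in cds_list)
    total = len(cds_list)
    return percentages(starts, total), percentages(stops, total)


# Synthetic CDSs (start codon + one placeholder codon + stop codon); counts are inferred.
starts = ["ATG"] * 31 + ["TTG"] * 3
stops = ["TAA"] * 21 + ["TGA"] * 8 + ["TAG"] * 5
synthetic_cds = [s + "GCT" + e for s, e in zip(starts, stops)]

start_pct, stop_pct = start_stop_usage(synthetic_cds)
print(start_pct)
print(stop_pct)
```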

The putative genes involved in metabolism, transcription regulation and virion structure and assembly appeared to be well conserved in prolate phages, as the homologs of these genes in the 15(Mo9) genome showed a high level of identity (>90%) to both type and unclassified c2 phages. However, the late region encompassing the putative structural genes CDS31, 32 and 33, which are the homologs of the c2 phage genes l14, l15 and l16, appeared more variable. For example, CDS32 in 15(Mo9) and its equivalent l15 showed 90% identity at the nucleotide level over just 16% query cover. The gene l15 has been proposed to determine host specificity in some prolate phages (16), while in others it has influenced neither host specificity nor adsorption efficiency (9).

Comparative genomics revealed three other structural proteins from c2, gene products L7 (major tail shaft protein), L9 and L10 (probable tape measure protein), which displayed low amino acid (aa) identity (80 – 85%) to their counterpart gene products CDS23, CDS26 and CDS27 in the 15(Mo9) genome. Additionally, CDS25 had no homolog in the phage c2 genome. All of these genes, which code for structural proteins, displayed more obvious sequence divergence. They could potentially carry determinants for host tropism of 15(Mo9) and be responsible for its broad host range. CDS27 was the largest predicted protein (706 aa), and the only one which gave the highest match to an uncharacterized protein in Lc. lactis.

The majority of the identified protein coding sequences of 15(Mo9) shared significant similarity with known proteins of various c2-like phages (unclassified Ceduovirus), such as 62606, 50504 and M6165. Neither transfer RNA (tRNA) genes nor integrase genes, either of which would be indicative of lysogenicity (57), were predicted in the 15(Mo9) genome, thus supporting its lytic nature.
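The screen for lysogeny-associated features described above can be pictured as a keyword search over the annotated product names. The sketch below is a simplified illustration under that assumption; the keyword list, the function name and both product lists are hypothetical, and a real screen would rely on dedicated tRNA and integrase prediction tools and not on string matching.

```python
# Illustrative keywords only; this is an assumption, not an exhaustive lysogeny screen.
LYSOGENY_KEYWORDS = ("integrase", "trna")


def find_lysogeny_markers(products):
    """Return annotated product names containing a lysogeny-associated keyword."""
    return [p for p in products if any(k in p.lower() for k in LYSOGENY_KEYWORDS)]


# Hypothetical annotation lists for illustration only.
lytic_like = [
    "major tail shaft protein",
    "probable tape measure protein",
    "transcription regulator",
]
temperate_like = ["site-specific integrase", "tRNA-Arg", "holin"]

print(find_lysogeny_markers(lytic_like))
print(find_lysogeny_markers(temperate_like))
```

An empty result, as for the first list, is the outcome that would be consistent with a lytic phage under this simplified screen.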

The nucleotide sequence of phage 15(Mo9) was deposited in GenBank database under the accession number MN337887.

3.2.5 SDS-PAGE and mass spectrometry analysis. Analysis of 15(Mo9) structural proteins by SDS-PAGE showed a composition similar to that of c2 (see Figure 10). However, whereas two distinctive proteins with sizes over 150 kDa, corresponding to the major head proteins (11), were clearly observed in c2, they were barely visible in 15(Mo9) (see arrow 1). Overall, seven protein fragments with molecular weights ranging from ~10 kDa to ~150 kDa were detected, along with one fragment larger than 250 kDa. Four structural proteins were detected, with estimated sizes of ~15, ~25, 45 and 60 kDa, the major one being that of ~15 kDa (see arrow 2). Additionally, that fragment was not present in the phage c2 protein complement.

[Figure: SDS-PAGE gel image. Lanes: c2, Ø15(Mo9) and marker. Marker bands (kDa): 250, 150, 100, 75, 50, 37, 25, 20, 15, 10. Arrows 1 and 2 mark the bands discussed in the text.]

Figure 4.11 SDS-PAGE analysis of 15(Mo9) structural proteins and their comparison with c2 structural protein profiles. The proteins were resolved in a polyacrylamide 4–12% Bis-Tris gel. Precision Plus Protein Dual Extra standard (Bio-Rad) was used as a molecular weight marker.

The LC-MS/MS analysis of the ~15 kDa protein fragment showed that the most dominant protein was bovine pancreatic ribonuclease, with a molecular weight of 13 kDa that corresponded to the analysed gel band. This was expected, given that the phage protein sample preparation involved the use of bovine RNase A. When the search was narrowed to a customized lactococcal phage sequence database that also included the annotated 15(Mo9) proteins, two protein hits matching the same set of peptides were detected. The top hit corresponded to the hypothetical protein NP_076757.1 (gene ID:920433) of the Lactococcus phage bIL310, which is a P335 type phage (58). The second detected match corresponded to the Lactococcus virus c2 minor structural protein NP_043552.1 (gene=l4). However, although detected with reasonable Mascot scores (29 for NP_076757.1 and 26 for NP_043552.1), both proteins were identified on the basis of only one short peptide. This represents a low level of identification of the virion proteins, which may be explained by the relatively low levels of the 15(Mo9) virion proteins compared to the dominant bovine protein, which hindered confident identification by mass spectrometry. It is also possible that the peptides derived from virion proteins exhibited low ionisation efficiency, leading to poor detection or affecting fragmentation. Thus, alternative sample preparation, such as using alternatives to trypsin during peptide generation, may be necessary to improve detection of virion proteins.

4. DISCUSSION

The research presented in this work was instigated by an unusual observation regarding the phage 15(Mo9) isolated on Lc. lactis ssp. cremoris Mo9. Following the application of the multiplex-PCR method for the identification of the three most prevalent lactococcal phage species, 936, P335 and c2, PCR products typical of both c2- and P335-type phages were detected. This finding was considered atypical since c2-type phages are strictly lytic, are conserved within their DNA homology group and are not known to recombine with the P335 species (58-59). Hence an initial assumption was made that two different phage species had been co-amplified in the original lysate. However, re-purification attempts on soft agar plates of the Mo9 host, aimed at isolating and separating phage species, consistently produced a uniform population of a single plaque morphology type.

Transmission electron microscopy images of the phage 15(Mo9) also confirmed a single phage population. The virions possessed the prolate capsid morphology and non-contractile tails characteristic of the lactococcal c2 species. The average head size (42 × 35 nm) was smaller and the tail length (77 nm) was shorter than the usual prolate phage dimensions, which have been reported to be 55-65 × 40-48 nm for heads and 80-110 nm for tails (60-62).

Host range analysis showed that seven out of thirty Lc. lactis strains were sensitive to 15(Mo9). Phages of the c2 species with both narrow (infecting 1–2 strains) and wide (6–8 strains) host spectra have been reported (4, 6, 62). In a recent phage-host survey that involved a large number of lactococcal phages and strains, c2-type phages were shown to infect on average a higher number of strains (9.1) than P335 (4.5) and 936 phages (2.5) (23). However, the lytic activity of the tested c2-type phages was predominantly observed on strains that belonged to the subspecies cremoris. The infectivity range of 15(Mo9) included genetically unrelated strains that belonged to both subspecies cremoris and lactis, including biovar diacetylactis. Cross-infection of lactococcal subspecies has previously been reported for 936-type phages (63), but not for c2-type phages.
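The host range figure quoted above is a simple proportion of sensitive strains in the tested panel. The sketch below shows one way such a summary could be computed from spot-test results, assuming the results are stored as a mapping from strain name to a lysis flag; the strain names are placeholders and only the counts (7 sensitive out of 30) follow the text.

```python
def host_range(sensitivity):
    """Return the sorted sensitive strains and the sensitive fraction of the panel."""
    sensitive = sorted(name for name, lysed in sensitivity.items() if lysed)
    fraction = len(sensitive) / len(sensitivity)
    return sensitive, fraction


# Placeholder panel: 30 strains of which 7 are sensitive; names are hypothetical.
panel = {f"strain_{i:02d}": i <= 7 for i in range(1, 31)}
sensitive, fraction = host_range(panel)
print(f"{len(sensitive)}/{len(panel)} strains sensitive ({fraction:.1%})")
```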

Infection by lactococcal phages, including C2viruses, begins with the reversible adsorption of the phage to the cell surface. This step involves binding to different carbohydrates of the lactococcal cell wall, such as rhamnose, sucrose, galactose or phosphoglycerol, which is a component of teichoic or lipoteichoic acids (13). To further explore the broad host range of 15(Mo9), the cell wall polysaccharide (CWPS) type of the sensitive strains was investigated, which revealed that they belonged to all three major CWPS types, A, B and C, as well as to the unknown or U type (32). The diversity of CWPS types was observed in the three tested cremoris strains, each of which was assigned to a different CWPS type. The primary host Mo9 possessed type C, B36 type A, and AM1 the U type. It has been shown previously that the 936 and c2 phages preferably infect strains with CWPS type C, while the P335 group phages have a preference for type A strains (23). It remains unknown whether the infection process by 15(Mo9) was initiated by its ability to recognize different polysaccharides on the cell wall surfaces of different strains or whether it recognized the same carbohydrate present in each of these strains. Either way, this observation corroborated the finding of a broad lactococcal host range for 15(Mo9).

Lysogeny in Lc. lactis is well known, with some strains containing multiple prophages in their genomes (58, 64). Lc. lactis Mo9 did not appear to be lysogenic, as it did not contain P335-species sequences. This indicates that the P335 signal in 15(Mo9) could not have resulted from recombination with prophage segments of the Mo9 genome. Sequence analysis of the consensus 15(Mo9) genome did not find any P335 signal. One possibility is that the P335 phage is present as an extrachromosomal element akin to 'the phage carrier state' (65), in which it is neither lytic nor lysogenic but persists as a phage entity at a low level.

In a further experiment performed to clarify whether the detected presence of genetic elements characteristic of the P335 phage type was genuine or an artefact, a PCR method used to classify P335 phages into one of four subgroups (I–IV) was applied to 15(Mo9). This method is based on analysis of the genomic region encoding the adhesion module, more specifically the receptor binding protein (RBP) domain of the baseplate (21). Recognition between the RBP and cell wall polysaccharides enables phage-host interaction in P335 and 936 phages (66). The RBP/baseplate region is responsible for the alteration or extension of a given host range through modular rearrangement and gene acquisition (21). The amplification products of 268 bp and ~600 bp generated from 15(Mo9) corresponded to the baseplate/receptor binding protein of Lactococcus phage 98102 and the baseplate protein of Lactococcus phage 49801 (minor structural protein 4 from Tuc2009). All of these phages belong to P335 subgroup II, the so-called classical P335 species (21). However, neither of the genes encoding these proteins appeared in the final draft genome annotation. It remains unknown whether this is a consequence of the underlying configuration of the bioinformatics gene-calling tools, which may not have revealed the full complexity of the phage genome (67).

The principle of nanopore sequencing is based on single-molecule characterization by sequencing of native nucleic acid strands (68). Nanopore sequencing of the 15(Mo9) genome was undertaken following Illumina MiSeq sequencing, in the anticipation that its specific chemistry could resolve the question of one versus two individual phage species. Nanopore sequencing revealed a single Lactococcus virus c2-like molecule, but also detected a cluster of P335 sequences, which appeared as a separate molecule. These could be of prophage origin from a heterologous host. Given the sensitive nature of PCR, these sequences were probably sufficient to generate a P335-type product alongside that of the main c2-like phage. The nanopore sequencing data excluded the possibility of a previous recombination event between the c2 and P335 phage types and supported the origin of these sequences from a separate molecule. Even though this P335-like genetic element was present at a negligible level and was evidently not integrated into the c2-like chromosome, it was still able to replicate steadily, as its genetic signature was detected in various amplifications over the course of this study. Further experiments would be required to resolve this issue. Such experiments should start with additional single plaque purification, including at different temperatures, to ensure phage purity. If a P335 signal is still obtained after further purification, investigation of the host strain Mo9 is recommended. This could include isolation of total DNA (chromosomal and extrachromosomal DNA) or colony hybridisation to help determine whether a P335-type phage exists in a carrier state or as an extrachromosomal element.

5. CONCLUSION

This study described an investigation of the lactococcal phage 15(Mo9), which presented as a c2+P335 type. Our results revealed that this phage is a novel c2-like phage (unclassified Ceduovirus) and that within the phage DNA preparation a very low level of P335 sequences was detected via nanopore sequencing. Given the sensitive nature of PCR these sequences would have been sufficient to give the initially observed P335-type product.

This work found no evidence of recombination between a c2-like and a P335-like phage, nor that the host strain Mo9 carried P335 as a prophage, either of which would have explained this dual signal. Furthermore, sequential single plaque purification and further repurification following the initial PCR results would be expected to exclude the possibility of P335 phage contamination.

Hence the origin/source of these P335 sequences remains unresolved. Some possible explanations include P335 existing in a phage carrier state or a low level contamination with a P335-type phage. An additional three phages have since been identified that show this dual species signature and may assist in resolving this issue.

Phage 15(Mo9) differs from other c2-like phages by having an extended host range that includes both subspecies cremoris and lactis, including biovar diacetylactis. Its linear genome was estimated to be 21,006 bp in size and its nucleotide sequence has been deposited in GenBank under the accession number MN337887.

SUPPLEMENTARY FILES in APPENDICES

Supplementary 1. Electron microscopy measurements

Supplementary 2. Sequencing of the PCR products generated from 15(Mo9) in PCR reactions using primers for the classification of P335 phage species

Supplementary 3. Illumina MiSeq sequencing of 15(Mo9)

Supplementary 4. The annotated products of 15(Mo9) and their similarity with known putative proteins

ACKNOWLEDGEMENTS

The author would like to thank Dr Emma Kettle, who performed the electron microscopy imaging. Electron microscopy was performed at the Westmead Scientific Platforms, which are supported by the Westmead Research Hub, the Cancer Institute New South Wales, the National Health and Medical Research Council and the Ian Potter Foundation. The Illumina phage sequencing was performed at the Ramaciotti Centre for Genomics, UNSW, Sydney. The input of Dr Richard Edwards and Tim Amos, which included assembly and annotation of the Illumina-generated phage sequences, was much appreciated. The contribution of Cedric Lood from the Laboratory of Gene Technology, KU Leuven, Belgium, who performed nanopore sequencing of the phage genome and the bioinformatic analysis, is gratefully acknowledged. Mass spectrometric analysis for this work was carried out by Dr Ling Zhong at the Bioanalytical Mass Spectrometry Facility, UNSW, and was supported in part by infrastructure funding from the New South Wales Government as part of its co-investment in the National Collaborative Research Infrastructure Strategy. The contribution of Dr Zhong is gratefully acknowledged. The author would also like to sincerely thank Dr Harald Brussow, who suggested nanopore sequencing as a strategy for resolving the issue presented in this work, for helpful discussions. The author is very grateful to Dr Melissa Harvey for her critical contributions to this chapter.

REFERENCES

1. Deveau H, Labrie SJ, Chopin MC, Moineau S. Biodiversity and classification of lactococcal phages. Appl Environ Microbiol. 2006;72(6):4338-46.

2. Labrie SJ, Moineau S. Abortive infection mechanisms and prophage sequences significantly influence the genetic makeup of emerging lytic lactococcal phages. J Bacteriol. 2007;189(4):1482-7.

3. Mahony J, van Sinderen D. Current taxonomy of phages infecting lactic acid bacteria. Front Microbiol. 2014;5:7.

4. Moineau S, Borkaev M, Holler BJ, Walker SA, Kondo JK, Vedamuthu ER, et al. Isolation and characterization of lactococcal bacteriophages from cultured buttermilk plants in the . J Dairy Sci. 1996;79(12):2104-11.

5. Bissonnette F, Labrie S, Deveau H, Lamoureux M, Moineau S. Characterization of mesophilic mixed starter cultures used for the manufacture of aged . J Dairy Sci. 2000;83(4):620-7.

6. Szczepanska AK, Hejnowicz MS, Kolakowski P, Bardowski J. Biodiversity of Lactococcus lactis bacteriophages in Polish dairy environment. Acta Biochim Pol. 2007;54(1):151-8.

7. Castro-Nallar E, Chen H, Gladman S, Moore SC, Seemann T, Powell IB, et al. Population genomics and phylogeography of an Australian dairy factory derived lytic bacteriophage. Genome Biol Evol. 2012;4(3):382-93.

8. Casjens S, Hatfull G, Hendrix R. Evolution of dsDNA tailed-bacteriophage genomes. Semin Virol. 1992;3:383–97.

9. Rakonjac J, O'Toole PW, Lubbers M. Isolation of lactococcal prolate phage-phage recombinants by an enrichment strategy reveals two novel host range determinants. J Bacteriol. 2005;187(9):3110-21.

10. Rakonjac J, Ward LJ, Schiemann AH, Gardner PP, Lubbers MW, O'Toole PW. Sequence diversity and functional conservation of the origin of replication in lactococcal prolate phages. Appl Environ Microbiol. 2003;69(9):5104-14.

11. Lubbers MW, Waterfield NR, Beresford TP, Le Page RW, Jarvis AW. Sequencing and analysis of the prolate-headed lactococcal bacteriophage c2 genome and identification of the structural genes. Appl Environ Microbiol. 1995;61(12):4348-56.

12. Schouler C, Ehrlich SD, Chopin MC. Sequence and organization of the lactococcal prolate-headed bIL67 phage genome. Microbiology. 1994;140 ( Pt 11):3061-9.

13. Millen AM, Romero DA. Genetic determinants of lactococcal C2viruses for host infection and their role in phage evolution. J Gen Virol. 2016;97(8):1998-2007.

14. Valyasevi R, Sandine WE, Geller BL. A membrane protein is required for bacteriophage c2 infection of Lactococcus lactis subsp. lactis C2. J Bacteriol. 1991;173(19):6095-100.

15. Geller BL, Ivey RG, Trempy JE, Hettinger-Smith B. Cloning of a chromosomal gene required for phage infection of Lactococcus lactis subsp. lactis C2. J Bacteriol. 1993;175(17):5510-9.

16. Stuer-Lauridsen B, Janzen T, Schnabl J, Johansen E. Identification of the host determinant of two prolate-headed phages infecting Lactococcus lactis. Virology. 2003;309(1):10-7.

17. Derkx PM, Janzen T, Sorensen KI, Christensen JE, Stuer-Lauridsen B, Johansen E. The art of strain improvement of industrial lactic acid bacteria without the use of recombinant DNA technology. Microb Cell Fact. 2014;13 Suppl 1:S5.

18. Desiere F, Mahanivong C, Hillier AJ, Chandry PS, Davidson BE, Brussow H. Comparative genomics of lactococcal phages: insight from the complete genome sequence of Lactococcus lactis phage BK5-T. Virology. 2001;283(2):240-52.

19. Ackermann HW, Kropinski AM. Curated list of prokaryote viruses with fully sequenced genomes. Res Microbiol. 2007;158(7):555-66.

20. Murphy J, Bottacini F, Mahony J, Kelleher P, Neve H, Zomer A, et al. Comparative genomics and functional analysis of the 936 group of lactococcal Siphoviridae phages. Scientific Reports. 2016;6:21345.

21. Mahony J, Oliveira J, Collins B, Hanemaaijer L, Lugli GA, Neve H, et al. Genetic and functional characterisation of the lactococcal P335 phage-host interactions. BMC Genomics. 2017;18(1):146.

22. Wandro S, Oliver A, Gallagher T, Weihe C, England W, Martiny JBH, Whiteson K. Predictable molecular adaptation of coevolving Enterococcus faecium and lytic phage EfV12-phi1. Frontiers in Microbiology. 2019;9(3192).

23. Oliveira J, Mahony J, Hanemaaijer L, Kouwen T, van Sinderen D. Biodiversity of bacteriophages infecting Lactococcus lactis starter cultures. J Dairy Sci. 2018;101(1):96-105.

24. Rihtman B, Meaden S, Clokie MR, Koskella B, Millard AD. Assessing Illumina technology for the high-throughput sequencing of bacteriophage genomes. PeerJ. 2016;4:e2055.

25. Labrie S, Moineau S. Multiplex PCR for detection and identification of lactococcal bacteriophages. Appl Environ Microb. 2000;66(3):987-94.

26. Damnjanovic D, Harvey M, Bridge WJ. Application of colony BOXA2R-PCR for the differentiation and identification of lactic acid cocci. Food Microb. 2019;82:277-86.

27. Terzaghi BE, Sandine WE. Improved medium for lactic streptococci and their bacteriophages. Appl Microbiol. 1975;29(6):807-13.

28. Kropinski AM, Prangishvili D, Lavigne R. Position paper: the creation of a rational scheme for the nomenclature of viruses of Bacteria and Archaea. Appl Environ Microbiol. 2009;11:2775–7.

29. del Rio B, Binetti AG, Martin MC, Fernandez M, Magadán A, Alvarez MA. Multiplex PCR for the detection and identification of dairy bacteriophages in milk. Food Microbiol. 2007;24:75-81.

30. Kropinski AM. Practical advice on the one-step growth curve. Methods Mol Biol. 2018;1681:41-7.

31. Sanders ME, Klaenhammer TR. Restriction and modification in group N streptococci: effect of heat on development of modified lytic bacteriophage. Appl Environ Microbiol. 1980;40(3):500-6.

32. Mahony J, Kot W, Murphy J, Ainsworth S, Neve H, Hansen LH, et al. Investigation of the relationship between lactococcal host cell wall polysaccharide genotype and 936 phage receptor binding protein phylogeny. Appl Environ Microbiol. 2013;79(14):4385-92.

33. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990;215(3):403-10.

34. The UniProt C. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017;45(D1):D158-D69.

35. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114-20.

36. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19(6):1117-23.

37. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455-77.

38. Zerbino DR. Using the Velvet de novo assembler for short-read sequencing technologies. Curr Protoc Bioinformatics. 2010;Chapter 11:Unit 11 5.

39. Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764-70.

40. Hozza M, Vinař T, Brejová B. How big is that genome? Estimating genome size and coverage from k-mer abundance spectra. In: String Processing and Information Retrieval, Springer International Publishing. 2015:199-209.

41. Edwards RJ, Palopoli N. Computational prediction of short linear motifs from protein sequences. Methods Mol Biol. 2015;1268:89-141.

42. Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30(14):2068-9.

43. Li D, Liu CM, Luo R, Sadakane K, Lam TW. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31(10):1674-6.

44. Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. Bmc Bioinformatics. 2010;11:119.

45. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389-402.

46. Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059-66.

47. Andrews S. FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc. 2010.

48. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27(5):722-36.

49. De Coster W, D'Hert S, Schultz DT, Cruts M, Van Broeckhoven C. NanoPack: visualizing and processing long-read sequencing data. Bioinformatics. 2018;34(15):2666-9.

50. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One. 2014;9(11):e112963.

51. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754-60.

52. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: rapid annotations using subsystems technology. BMC genomics. 2008;9:75-.

53. Sullivan MJ, Petty NK, Beatson SA. Easyfig: a genome comparison visualizer. Bioinformatics (Oxford, England). 2011;27(7):1009-10.

54. Boulanger P. Purification of bacteriophages and SDS-PAGE analysis of phage structural proteins from ghost particles. Methods Mol Biol. 2009;502:227-38.

55. Djordjevic GM, Klaenhammer TR. Bacteriophage-triggered defense systems: phage adaptation and design improvements. Appl Environ Microbiol. 1997;63(11):4370-6.

56. Ackermann HW. Tailed bacteriophages: the order caudovirales. Adv Virus Res. 1998;51:135-201.

57. Brussow H, Canchaya C, Hardt WD. Phages and the evolution of bacterial pathogens: from genomic rearrangements to lysogenic conversion. Microbiol Mol Biol Rev. 2004;68(3):560-602, table of contents.

58. Chopin A, Bolotin A, Sorokin A, Ehrlich SD, Chopin M. Analysis of six prophages in Lactococcus lactis IL1403: different genetic structure of temperate and virulent phage populations. Nucleic Acids Res. 2001;29(3):644-51.

59. Jarvis A, Meyer J. Electron microscopic heteroduplex study and restriction endonuclease cleavage analysis of the DNA genomes of three lactic streptococcal bacteriophages. Appl Environ Microbiol. 1986;51(3):566-71.

60. Jarvis A. Differentiation of lactic streptococcal phages into phage species by DNA-DNA homology. Appl Environ Microbiol. 1984;47(2):343-9.

61. Moineau S, Fortier J, Ackermann HW, Pandian S. Characterization of lactococcal bacteriophages from Quebec cheese plants. Can J Microb. 1992;38:875-82.

62. Raiski A, Belyasova N. Biodiversity of Lactococcus lactis bacteriophages in the Republic of . Int J Food Microbiol. 2009;130(1):1-5.

63. Murphy J, Royer B, Mahony J, Hoyles L, Heller K, Neve H, Bonestroo M, Nauta A, van Sinderen D. Biodiversity of lactococcal bacteriophages isolated from 3 Gouda-type cheese-producing plants. J Dairy Sci. 2013;96(8):4945-57.

64. Canchaya C, Proux C, Fournous G, Bruttin A, Brussow H. Prophage genomics. Microbiol Mol Biol Rev. 2003;67(2):238-76.

65. Siringan P, Connerton PL, Cummings NJ, Connerton IF. Alternative bacteriophage life cycles: the carrier state of Campylobacter jejuni. Open Biol. 2014;4:130200.

66. Spinelli S, Campanacci V, Blangy S, Moineau S, Tegoni M, Cambillau C. Modular structure of the receptor binding proteins of Lactococcus lactis phages: The RBP structure of the temperate phage TP901-1. Journal of Biological Chemistry. 2006;281(20):14256-62.

67. McNair K, Zhou C, Souza B, Edwards RA. THEA: A novel approach to gene identification in phage genomes. bioRxiv. 2018:265983.

68. Kasianowicz JJ, Brandin E, Branton D, Deamer DW. Characterization of individual polynucleotide molecules using a membrane channel. Proceedings of the National Academy of Sciences. 1996;93(24):13770-3.

CHAPTER 5

CONCLUSION AND FUTURE REMARKS

Lactic acid bacteria (LAB) encompass a diverse group of bacteria that have a long history of safe use in the production of fermented foods, such as dairy (225), meat (226), vegetables (227), cereals (228), sourdough (229) and fish (230), and beverages (231). Various methods have been used for LAB characterization. While morphological, physiological and biochemical assays are vital for the functional application of starter cultures, the reliable identification of LAB requires genotyping methods.

The first two projects of this PhD thesis resulted in the generation and validation of a practical, useful and easily implementable genotyping method applicable to both LAB and their phages. The third project described the characterisation of an atypical lytic lactococcal phage.

In the first chapter, a repetitive-PCR method using a single repetitive primer, BOXA2R, was developed. BOXA2R-PCR was shown to enable speciation, subspeciation and strain differentiation of Lc. lactis in a single reaction. The usefulness of this method was further demonstrated with the most common species of lactic acid cocci encountered in fermented dairy products: Lc. lactis, Str. thermophilus, Leuc. mesenteroides and Enterococcus sp. These species are interrelated and can have similar phenotypic characteristics, which can affect their correct taxonomic identification. BOXA2R-PCR could be used to initially identify dairy cocci isolates of Lc. lactis, Str. thermophilus or Leuc. mesenteroides at both strain and genus/species level with one quick, inexpensive test. Its application to Enterococcus species, however, was limited to strain differentiation, as the tested isolates failed to generate any genus- or species-specific bands. This is in agreement with the description of enterococci as a heterogeneous group of Gram-positive cocci with large phenotypic and genotypic strain variability (232). Repetitive PCR using the BOXA2R primer offers clear technical advantages owing to its reproducibility when a single, pure, intact colony is used as the PCR template. The methodology has the potential to offer a high-resolution, rapid and low-cost genotyping tool for use in industrial applications as well as in research studies that explore the biodiversity of the microbiota of artisanal cheeses.

Although lactobacilli were not within the scope of the current investigation, it was observed that species of the Lb. casei group appeared to share PCR products of specific sizes. Future investigation could explore the potential for BOXA2R-PCR genotyping to be applied to lactobacilli, as they form a substantial part of the bacterial communities of many fermented and probiotic products. Additionally, our findings indicated that the methodology could be evaluated in future work for its potential to recognize contaminant bacterial and yeast isolates, which can occasionally be detected in dairy products.

The second chapter of this dissertation described the successful application of the same boxA-based PCR method to phage genotyping. Repetitive-PCR using both BOXA1R and BOXA2R single primers generated distinctive and reproducible patterns when applied to Lc. lactis phages, including those with genomes that are difficult to digest with restriction enzymes. To adequately inform the development of effective biocontrol strategies that address the unavoidable phage contamination problem in the factory environment, it would be beneficial to have such a simple genotyping method to monitor the resident phage population and to identify the emergence and circulation of new phage strains and types. Although developed with a primary focus on lactococcal phages, the utility of BOXA-PCR for genotyping extended to phages of other bacterial species, as demonstrated by the fingerprinting of Staph. aureus, Ps. aeruginosa and E. coli phages isolated from different ecological niches. The most appropriate template for the PCR reaction was purified phage DNA that was free of bacterial contamination. This constraint was imposed by the observed co-amplification of bacterial sequences when present in the phage preparation.

The final chapter of the thesis investigated the phenotypic and genomic characteristics of the lytic Lc. lactis phage vB_LacS_15(Mo9), which presented with both c2 and P335 phage type signatures. Since this combination of phage types was unlikely, it raised the suspicion of possible contamination of the stock lysate. Phage 15(Mo9) was sequenced using two sequencing technologies, Illumina MiSeq and nanopore. The specific nanopore chemistry appeared clearly advantageous in this case, as it was capable of detecting the P335-like sequences and resolving that 15(Mo9) was not a chimeric phage. It was concluded that 15(Mo9) was an unclassified c2 virus, which contained a low level of P335-like sequences that appeared as a segregated molecule.

The obligately lytic prolate c2-type phages are one of the three most common species that affect Lc. lactis fermentations. However, 15(Mo9) had some features uncharacteristic of a c2-type phage. These included: a double-zone plaque morphology; EM observations of spike-like structures on the tail tip of some particles; an extended host range that spanned both subspecies cremoris and lactis, including biovar diacetylactis, which was corroborated by the observed ability to infect lactococcal strains with different types of cell wall polysaccharides; altered capsid morphology upon propagation in alternative hosts; and the generation of PCR products corresponding to the receptor binding protein sequences characteristic of the classical P335 phage group. It could not be concluded whether these phenotypic effects were influenced by the foreign, P335-type genetic element in the 15(Mo9) genome.

Further work would be required to investigate the mechanisms by which 15(Mo9) and other phages that present as a two-species type arise, which could contribute new knowledge on phage diversity and evolution.

BIBLIOGRAPHY

1. Mahony J, Moscarelli A, Kelleher P, Lugli GA, Ventura M, Settanni L, van Sinderen D. 2017. Phage biodiversity in artisanal cheese wheys reflects the complexity of the fermentation process. Viruses 9.

2. Kelly WJ, Ward LJ, Leahy SC. 2010. Chromosomal diversity in Lactococcus lactis and the origin of dairy starter cultures. Genome Biol Evol 2:729-44.

3. Fortina MG, Ricci G, Acquati A, Zeppa G, Gandini A, Manachini PL. 2003. Genetic characterization of some lactic acid bacteria occurring in an artisanal protected denomination origin (PDO) Italian cheese, the Toma piemontese. Food Microbiology 20:397-404.

4. Terzic-Vidojevic A, Vukasinovic M, Veljovic K, Ostojic M, Topisirovic L. 2007. Characterization of microflora in homemade semi-hard white Zlatar cheese. Int J Food Microbiol 114:36-42.

5. Hebert EM, De Giori GS, Raya RR. 2001. Isolation and characterization of a slowly milk-coagulating variant of Lactobacillus helveticus deficient in purine biosynthesis. Appl Environ Microbiol 67:1846-50.

6. Marcó MB, Moineau S, Quiberoni A. 2012. Bacteriophages and dairy fermentations. Bacteriophage 2:149-158.

7. Powell IB, Davidson BE. 1985. Characterization of streptococcal bacteriophage c6A. J Gen Virol 66 ( Pt 12):2737-41.

8. Barrangou R, Yoon SS, Breidt F, Jr., Fleming HP, Klaenhammer TR. 2002. Characterization of six Leuconostoc fallax bacteriophages isolated from an industrial sauerkraut fermentation. Appl Environ Microbiol 68:5452-8.

9. Labrie S, Moineau S. 2000. Multiplex PCR for detection and identification of lactococcal bacteriophages. Applied and Environmental Microbiology 66:987- 994.

10. Deveau H, Labrie SJ, Chopin MC, Moineau S. 2006. Biodiversity and classification of lactococcal phages. Appl Environ Microbiol 72:4338-46.

117

11. Chopin A, Bolotin A, Sorokin A, Ehrlich SD, Chopin M. 2001. Analysis of six prophages in Lactococcus lactis IL1403: different genetic structure of temperate and virulent phage populations. Nucleic Acids Res 29:644-51.

12. Rakonjac J, O'Toole PW, Lubbers M. 2005. Isolation of lactococcal prolate phage-phage recombinants by an enrichment strategy reveals two novel host range determinants. J Bacteriol 187:3110-21.

13. Bouchard JD, Moineau S. 2000. Homologous recombination between a lactococcal bacteriophage and the chromosome of its host strain. Virology 270:65-75.

14. Durmaz E, Klaenhammer TR. 2000. Genetic analysis of chromosomal regions of Lactococcus lactis acquired by recombinant lytic phages. Appl Environ Microbiol 66:895-903.

15. Broome MC, Powell I, Limsowtin GKY. 2011. Starter cultures: Specific properties. Encyclopedia of Dairy Sciences 1:269-275.

16. Hill C, Ross P. 1998. Starter cultures for the dairy industry. Genetic modification in the food industry, p 174-192 doi:10.1007/978-1-4615-5815-6_9.

17. Capozzi V, Russo P, Duenas MT, Lopez P, Spano G. 2012. Lactic acid bacteria producing B-group vitamins: a great potential for functional cereals products. Applied Microbiology and Biotechnology 96:1383-1394.

18. Ogawa J, Kishino S, Ando A, Sugimoto S, Mihara K, Shimizu S. 2005. Production of conjugated fatty acids by lactic acid bacteria. Journal of Bioscience and Bioengineering 100:355-364.

19. Beltran-Barrientos LM, Hernandez-Mendoza A, Torres-Llanez MJ, Gonzalez- Cordova AF, Vallejo-Cordoba B. 2016. Invited review: Fermented milk as antihypertensive functional food. Journal of Dairy Science 99:4099-4110.

20. Martini MC, Lerebours EC, Lin WJ, Harlander SK, Berrada NM, Antoine JM, Savaiano DA. 1991. Strains and species of lactic acid bacteria in fermented milks () - effect on in vivo lactose digestion. American Journal of Clinical 54:1041-1046.

118

21. Albano C, Morandi S, Silvetti T, Casiraghi MC, Manini F, Brasca M. 2018. Lactic acid bacteria with cholesterol-lowering properties for dairy applications: In vitro and in situ activity. Journal of Dairy Science 101:10807-10818.

22. Daly C. 1983. The Use of mesophilic cultures in the dairy industry. Antonie Van Leeuwenhoek Journal of Microbiology 49:297-312.

23. Parente E, Cogan T. 2004. Starter cultures: general aspects. Cheese: Chemistry, Physics and Microbiology 1.

24. Bourdichon F, Casaregola S, Farrokh C, Frisvad JC, Gerds ML, Hammes WP, Harnett J, Huys G, Laulund S, Ouwehand A, Powell IB, Prajapati JB, Seto Y, Ter Schure E, Van Boven A, Vankerckhoven V, Zgoda A, Tuijtelaars S, Hansen EB. 2012. Food fermentations: microorganisms with technological beneficial use. International Journal of Food Microbiology 154:87-97.

25. Powell I, Broome MC, Limsowtin GKY. 2011. Starter Cultures: general aspects, p 552-558, Encyclopedia of Dairy Sciences doi:10.1016/B978-0-08-100596- 5.00689-2.

26. Frantzen CA, Kot W, Pedersen TB, Ardo YM, Broadbent JR, Neve H, Hansen LH, Dal Bello F, Ostlie HM, Kleppen HP, Vogensen FK, Holo H. 2017. Genomic characterization of dairy associated leuconostoc species and diversity of Leuconostocs in undefined mixed mesophilic starter cultures. Front Microbiol 8:132.

27. Montel MC, Buchin S, Mallet A, Delbes-Paus C, Vuitton DA, Desmasures N, Berthier F. 2014. Traditional cheeses: rich and diverse microbiota with associated benefits. Int J Food Microbiol 177:136-54.

28. FDA. 2018. Generally Recognized as Safe (GRAS). https://wwwfdagov/food/food-ingredients-packaging/generally-recognized-safe- gras.

29. Laulund S, Wind A, Derkx PMF, Zuliani V. 2017. Regulatory and safety requirements for food cultures. Microorganisms 5.

30. Thunell RK SW, Bodyfelt FW. 1981. Phage-insensitive, multiple-strain starter approach to cheddar cheese making. Journal of Dairy Science 64:2270-2277. 119

31. Law BA. 2004. Controlled and accelerated : the research base for new technology. Int Dairy J 11:383-398.

32. Leroy F, De Vuyst L. 2004. Lactic acid bacteria as functional starter cultures for the food fermentation industry. Trends in Food Science & Technology 15:67-78.

33. Moineau S, Tremblay D, Labrie S. 2002. Phages of lactic acid bacteria: from genomics to industrial applications. ASM News 68:388-393.

34. Kongo JM. 2013. Lactic acid bacteria as starter cultures for cheese processing: past, present and future developments. Chapter 1. http://creativecommonsorg/licenses/by/30.

35. Callon C, Millet L, Montel MC. 2004. Diversity of lactic acid bacteria isolated from AOC Salers cheese. J Dairy Res 71:231-44.

36. Martin-Platero AM, Valdivia E, Maqueda M, Martinez-Bueno M. 2009. Characterization and safety evaluation of enterococci isolated from Spanish ' milk cheeses. Int J Food Microbiol 132:24-32.

37. Edalatian MR, Najafi MBH, Mortazavi SA, Alegria A, Nassiri MR, Bassami MR, Mayo B. 2012. Microbial diversity of the traditional Iranian cheeses Lighvan and Koozeh, as revealed by polyphasic culturing and culture-independent approaches. Dairy Science & Technology 92:75-90.

38. Delgado S, Mayo B. 2004. Phenotypic and genetic diversity of Lactococcus lactis and Enterococcus spp. strains isolated from Northern Spain starter-free farmhouse cheeses. Int J Food Microbiol 90:309-19.

39. Ogier JC, Serror P. 2008. Safety assessment of dairy microorganisms: the Enterococcus genus. Int J Food Microbiol 126:291-301.

40. Randazzo CL, Caggia C, Neviani E. 2009. Application of molecular approaches to study lactic acid bacteria in artisanal cheeses. J Microbiol Methods 78:1-9.

41. Ercolini D. 2004. PCR-DGGE fingerprinting: novel strategies for detection of microbes in food. J Microbiol Methods 56:297-314.

120

42. Arteau M LS, Roy D. 2010. Terminal-restriction fragment length polymorphism and automated ribosomal intergenic spacer analysis profiling of fungal communities in Camembert cheese. International Dairy Journal 20:545-554.

43. Quigley L, O'Sullivan O, Beresford TP, Ross RP, Fitzgerald GF, Cotter PD. 2011. Molecular approaches to analysing the microbial composition of raw milk and raw milk cheese. Int J Food Microbiol 150:81-94.

44. Dolci P, Zenato S, Pramotton R, Barmaz A, Alessandria V, Rantsiou K, Cocolin L. 2013. Cheese surface microbiota complexity: RT-PCR-DGGE, a tool for a detailed picture. Int J Food Microbiol 162:8-12.

45. Ndoye B, Lessard MH, LaPointe G, Roy D. 2011. Exploring suppression subtractive hybridization (SSH) for discriminating Lactococcus lactis ssp. cremoris SK11 and ATCC 19257 in mixed culture based on the expression of strain-specific genes. J Appl Microbiol 110:499-512.

46. Ndoye B, Rasolofo EA, LaPointe G, Roy D. 2011. A review of the molecular approaches to investigate the diversity and activity of cheese microbiota. Dairy Science & Technology 91:495.

47. Pu ZY, Dobos M, Limsowtin GK, Powell IB. 2002. Integrated polymerase chain reaction-based procedures for the detection and identification of species and subspecies of the Gram-positive bacterial genus Lactococcus. J Appl Microbiol 93:353-61.

48. Cavanagh D, Casey A, Altermann E, Cotter PD, Fitzgerald GF, McAuliffe O. 2015. Evaluation of Lactococcus lactis isolates from nondairy sources with potential dairy applications reveals extensive phenotype-genotype disparity and implications for a revised species. Appl Environ Microbiol 81:3961-72.

49. Nomura M, Kobayashi M, Okamoto T. 2002. Rapid PCR-based method which can determine both phenotype and genotype of Lactococcus lactis subspecies. Applied and Environmental Microbiology 68:2209-2213.

50. Garde S, Babin M, Gaya P, Nunez M, Medina M. 1999. PCR amplification of the gene acmA differentiates Lactococcus lactis subsp. lactis and L. lactis subsp. cremoris. Appl Environ Microbiol 65:5151-3.

121

51. Basaran P, Basaran N, Cakir I. 2001. Molecular differentiation of Lactococcus lactis subspecies lactis and cremoris strains by ribotyping and site specific- PCR. Curr Microbiol 42:45-8.

52. Beimfohr C LW, Schleifer K-H. 1997. Rapid genotypic differentiation of Lactococcus lactis subspecies and biovar. Systematic and Applied Microbiology 20:216-221.

53. Fernandez E, Alegria A, Delgado S, Martin MC, Mayo B. 2011. Comparative phenotypic and molecular genetic profiling of wild Lactococcus lactis subsp. lactis strains of the L. lactis subsp. lactis and L. lactis subsp. cremoris genotypes, isolated from starter-free cheeses made of raw milk. Appl Environ Microbiol 77:5324-35.

54. Odamaki T, Yonezawa S, Sugahara H, Xiao JZ, Yaeshima T, Iwatsuki K. 2011. A one step genotypic identification of Lactococcus lactis subspecies at the species/strain levels. Syst Appl Microbiol 34:429-34.

55. Kutahya OE, Starrenburg MJ, Rademaker JL, Klaassen CH, van Hylckama Vlieg JE, Smid EJ, Kleerebezem M. 2011. High-resolution amplified fragment length polymorphism typing of Lactococcus lactis strains enables identification of genetic markers for subspecies-related phenotypes. Appl Environ Microbiol 77:5192-8.

56. Psoni L, Kotzamanidis C, Yiangou M, Tzanetakis N, Litopoulou-Tzanetaki E. 2007. Genotypic and phenotypic diversity of Lactococcus lactis isolates from Batzos, a Greek PDO raw goat milk cheese. Int J Food Microbiol 114:211-20.

57. Prodelalova J, Spanova A, Rittich B. 2005. Application of PCR, rep-PCR and RAPD techniques for typing of Lactococcus lactis strains. Folia Microbiol (Praha) 50:150-4.

58. Rademaker JL, Herbet H, Starrenburg MJ, Naser SM, Gevers D, Kelly WJ, Hugenholtz J, Swings J, van Hylckama Vlieg JE. 2007. Diversity analysis of dairy and nondairy Lactococcus lactis isolates, using a novel multilocus sequence analysis scheme and (GTG)5-PCR fingerprinting. Appl Environ Microbiol 73:7128-37.

122

59. Van Hoorde K, Vandamme P, Huys G. 2008. Molecular identification and typing of lactic acid bacteria associated with the production of two artisanal raw milk cheeses. Dairy Science & Technology 88:445-455.

60. Zamfir M, Vancanneyt M, Makras L, Vaningelgem F, Lefebvre K, Pot B, Swings J, De Vuyst L. 2006. Biodiversity of lactic acid bacteria in Romanian dairy products. Syst Appl Microbiol 29:487-95.

61. Kelleher P, Murphy J, Mahony J, van Sinderen D. 2015. Next-generation sequencing as an approach to dairy starter selection. Dairy Sci Technol 95:545- 568.

62. Duckworth DH. 1976. "Who discovered bacteriophage?". Bacteriol Rev 40:793- 802.

63. Summers WC. 2001. Bacteriophage therapy. Annu Rev Microbiol 55:437-51.

64. Sulakvelidze A. 2013. Using lytic bacteriophages to eliminate or significantly reduce contamination of food by foodborne bacterial pathogens. J Sci Food Agric 93:3137-46.

65. Thacker PD. 2003. Set a microbe to kill a microbe: drug resistance renews interest in phage therapy. JAMA 290:3183-5.

66. Breitbart M, Rohwer F. 2005. Here a virus, there a virus, everywhere the same virus? Trends Microbiol 13:278-84.

67. Clokie MR, Millard AD, Letarov AV, Heaphy S. 2011. Phages in nature. Bacteriophage 1:31-45.

68. Brussow H, Hendrix RW. 2002. Phage genomics: small is beautiful. Cell 108:13-6.

69. Hatfull GF. 2008. Bacteriophage genomics. Curr Opin Microbiol 11:447-53.

70. Ofir G, Sorek R. 2018. Contemporary Phage Biology: From Classic Models to New Insights. Cell 172:1260-1270.

71. Bergh O, Borsheim KY, Bratbak G, Heldal M. 1989. High abundance of viruses found in aquatic environments. Nature 340:467-8. 123

72. Brum JR, Hurwitz BL, Schofield O, Ducklow HW, Sullivan MB. 2016. Seasonal time bombs: dominant temperate viruses affect Southern Ocean microbial dynamics. ISME J 10:437-49.

73. Verreault D, Gendron L, Rousseau GM, Veillette M, Masse D, Lindsley WG, Moineau S, Duchaine C. 2011. Detection of airborne lactococcal bacteriophages in cheese manufacturing plants. Appl Env Microbiol 77:491-7.

74. Shkoporov AN, Hill C. 2019. Bacteriophages of the human gut: the "known unknown" of the microbiome. Cell Host Microbe 25:195-209.

75. Breitbart M, Miyake JH, Rohwer F. 2004. Global distribution of nearly identical phage-encoded DNA sequences. FEMS Microbiol Lett 236:249-56.

76. Casjens SR. 2005. Comparative genomics and evolution of the tailed- bacteriophages. Curr Opin Microbiol 8:451-8.

77. Roux S, Brum JR, Dutilh BE, Sunagawa S, Duhaime MB, Loy A, Poulos BT, Solonenko N, Lara E, Poulain J, Pesant S, Kandels-Lewis S, Dimier C, Picheral M, Searson S, Cruaud C, Alberti A, Duarte CM, Gasol JM, Vaque D, Tara Oceans C, Bork P, Acinas SG, Wincker P, Sullivan MB. 2016. Ecogenomics and potential biogeochemical impacts of globally abundant ocean viruses. Nature 537:689-693.

78. Weinbauer MG. 2004. Ecology of prokaryotic viruses. FEMS Microbiol Rev 28:127-81.

79. Hurwitz BL, U'Ren JM. 2016. Viral metabolic reprogramming in marine ecosystems. Curr Opin Microbiol 31:161-168.

80. Barr JJ, Auro R, Furlan M, Whiteson KL, Erb ML, Pogliano J, Stotland A, Wolkowicz R, Cutting AS, Doran KS, Salamon P, Youle M, Rohwer F. 2013. Bacteriophage adhering to mucus provide a non-host-derived immunity. Proc Natl Acad Sci U S A 110:10771-6.

81. Ly M, Abeles SR, Boehm TK, Robles-Sikisaka R, Naidu M, Santiago-Rodriguez T, Pride DT. 2014. Altered oral viral ecology in association with periodontal disease. MBio 5:e01133-14.

124

82. Norman JM, Handley SA, Baldridge MT, Droit L, Liu CY, Keller BC, Kambal A, Monaco CL, Zhao G, Fleshner P, Stappenbeck TS, McGovern DP, Keshavarzian A, Mutlu EA, Sauk J, Gevers D, Xavier RJ, Wang D, Parkes M, Virgin HW. 2015. Disease-specific alterations in the enteric virome in inflammatory bowel disease. Cell 160:447-60.

83. Kauffman KM, Hussain FA, Yang J, Arevalo P, Brown JM, Chang WK, VanInsberghe D, Elsherbini J, Sharma RS, Cutler MB, Kelly L, Polz MF. 2018. A major lineage of non-tailed dsDNA viruses as unrecognized killers of marine bacteria. Nature 554:118-122.

84. Dutilh BE, Cassman N, McNair K, Sanchez SE, Silva GGZ, Boling L, Barr JJ, Speth DR, Seguritan V, Aziz RK, Felts B, Dinsdale EA, Mokili JL, Edwards RA. 2014. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nature Communications 5:4498.

85. Schooley RT, Biswas B, Gill JJ, Hernandez-Morales A, Lancaster J, Lessor L, Barr JJ, Reed SL, Rohwer F, Benler S, Segall AM, Taplitz R, Smith DM, Kerr K, Kumaraswamy M, Nizet V, Lin L, McCauley MD, Strathdee SA, Benson CA, Pope RK, Leroux BM, Picel AC, Mateczun AJ, Cilwa KE, Regeimbal JM, Estrella LA, Wolfe DM, Henry MS, Quinones J, Salka S, Bishop-Lilly KA, Young R, Hamilton T. 2017. Development and use of personalized bacteriophage- based therapeutic cocktails to treat a patient with a disseminated resistant Acinetobacter baumannii infection. Antimicrob Agents Chemother 61.

86. Yosef I, Manor M, Kiro R, Qimron U. 2015. Temperate and lytic bacteriophages programmed to sensitize and kill antibiotic-resistant bacteria. Proc Natl Acad Sci U S A 112:7267-72.

87. Harada LK, Silva EC, Campos WF, Del Fiol FS, Vila M, Dabrowska K, Krylov VN, Balcao VM. 2018. Biotechnological applications of bacteriophages: state of the art. Microbiol Res 212-213:38-58.

88. Howard-Varona C, Hargreaves KR, Abedon ST, Sullivan MB. 2017. Lysogeny in nature: mechanisms, impact and ecology of temperate phages. ISME J 11:1511-1520.

125

89. Erez Z, Steinberger-Levy I, Shamir M, Doron S, Stokar-Avihail A, Peleg Y, Melamed S, Leavitt A, Savidor A, Albeck S, Amitai G, Sorek R. 2017. Communication between viruses guides lysis-lysogeny decisions. Nature 541:488-493.

90. Lin M, Rippe RA, Niemela O, Brittenham G, Tsukamoto H. 1997. Role of iron in NF-kappa B activation and cytokine gene expression by rat hepatic macrophages. Am J Physiol 272:G1355-64.

91. Cenens W, Makumi A, Mebrhatu MT, Lavigne R, Aertsen A. 2013. Phage-host interactions during pseudolysogeny: lessons from the Pid/dgo interaction. Bacteriophage 3:e25029.

92. Los M, Wegrzyn G. 2012. Pseudolysogeny. Adv Virus Res 82:339-49.

93. Abedon ST. 2008. Disambiguating bacteriophage pseudolysogeny: an historical analysis of lysogeny, pseudolysogeny, and the phage carrier state. In: Contemporary Trends in Bacteriophage Research. Editor: Horace T Adams Chapter 11:2850307.

94. Siringan P, Connerton PL, Cummings NJ, Connerton IF. 2014. Alternative bacteriophage life cycles: the carrier state of Campylobacter jejuni. Open Biol 4:130200.

95. Brussow H, Desiere F. 2001. Comparative phage genomics and the evolution of Siphoviridae: insights from dairy phages. Mol Microbiol 39:213-22.

96. Hendrix RW, Smith MC, Burns RN, Ford ME, Hatfull GF. 1999. Evolutionary relationships among diverse bacteriophages and prophages: all the world's a phage. Proc Natl Acad Sci U S A 96:2192-7.

97. Botstein D. 1980. A theory of modular evolution for bacteriophages. Ann N Y Acad Sci 354:484-90.

98. Filee J, Bapteste E, Susko E, Krisch HM. 2006. A selective barrier to horizontal gene transfer in the T4-type bacteriophages that has preserved a core genome with the viral replication and structural genes. Mol Biol Evol 23:1688-96.

126

99. Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW. 2005. Three Prochlorococcus cyanophage genomes: signature features and ecological interpretations. PLoS Biol 3:e144.

100. Chibani-Chennoufi S, Canchaya C, Bruttin A, Brussow H. 2004. Comparative genomics of the T4-Like Escherichia coli phage JS98: implications for the evolution of T4 phages. J Bacteriol 186:8276-86.

101. Neve H, Zenz KI, Desiere F, Koch A, Heller KJ, Brussow H. 1998. Comparison of the lysogeny modules from the temperate Streptococcus thermophilus bacteriophages TP-J34 and Sfi21: implications for the modular theory of phage evolution. Virology 241:61-72.

102. Brussow H, Canchaya C, Hardt WD. 2004. Phages and the evolution of bacterial pathogens: from genomic rearrangements to lysogenic conversion. Microbiol Mol Biol Rev 68:560-602, table of contents.

103. Hendrix RW. 2002. Bacteriophages: evolution of the majority. Theor Popul Biol 61:471-80.

104. Juhala RJ, Ford ME, Duda RL, Youlton A, Hatfull GF, Hendrix RW. 2000. Genomic sequences of bacteriophages HK97 and HK022: pervasive genetic mosaicism in the lambdoid bacteriophages. J Mol Biol 299:27-51.

105. Veesler D, Cambillau C. 2011. A common evolutionary origin for tailed- bacteriophage functional modules and bacterial machineries. Microbiol Mol Biol Rev 75:423-33, first page of table of contents.

106. Lawrence JG, Hatfull GF, Hendrix RW. 2002. Imbroglios of viral taxonomy: genetic exchange and failings of phenetic approaches. J Bacteriol 184:4891- 905.

107. Proux C, van Sinderen D, Suarez J, Garcia P, Ladero V, Fitzgerald GF, Desiere F, Brussow H. 2002. The dilemma of phage taxonomy illustrated by comparative genomics of Sfi21-like Siphoviridae in lactic acid bacteria. J Bacteriol 184:6026-36.

108. Rohwer F, Edwards R. The phage proteomic tree: a genome-based taxonomy for phage. J Bacteriol. 2002;184(16):4529-35. 127

109. Moineau S, Lévesque C. 2004. Control of bacteriophages in industrial fermentations. Bacteriophages: Biology and Applications doi:10.1201/9780203491751.ch10:285-296.

110. Brussow H, Fremont M, Bruttin A, Sidoti J, Constable A, Fryder V. 1994. Detection and classification of Streptococcus thermophilus bacteriophages isolated from industrial milk fermentation. Appl Environ Microbiol 60:4537-43.

111. Coffey A, Ross RP. 2002. Bacteriophage-resistance systems in dairy starter strains: molecular analysis to application. Antonie Van Leeuwenhoek 82.

112. Jarvis AW. 1989. Bacteriophages of lactic acid bacteria. Journal of Dairy Science 72:3406-3428.

113. Josephsen J, Neve H. Bacteriophages and lactic acid bacteria. In Lactic acid bacteria: Microbiology and functional aspects. Ed. S. Salminen and A. von Wright pp 385– 436 New York: Marcel Dekker Inc. 1998.

114. Szczepankowska AK, Górecki RK, Kolakowski P, Bardowsk JK. 2013. Chapter: Lactic acid bacteria resistance to bacteriophage and prevention techniques to lower phage contamination in dairy fermentation. In book: Biochemistry, genetics and molecular biology , "Lactic Acid Bacteria - R & D for Food, Health and Livestock Purposes", Editors: Marcelino Kongo https://doi.org/10.5772/51541

115. Brüssow H KE. 2005. Phage ecology, p. 129-164. In: E. Kutter and A. Sulakvelidze (eds.), Bacteriophages: biology and application. CRC Press, Boca Raton, Florida.

116. Madera C, Monjardin C, Suarez JE. 2004. Milk contamination and resistance to processing conditions determine the fate of Lactococcus lactis bacteriophages in . Appl Environ Microbiol 70:7365-71.

117. Relano P, Mata M, Bonneau M, Ritzenthaler P. 1987. Molecular characterization and comparison of 38 virulent and temperate bacteriophages of Streptococcus lactis. J Gen Microbiol 133:3053-63.

128

118. Heap HA, Limsowtin GKY, Lawrence RC. 1978. Contribution of Streptococcus lactis strains in raw milk to phage infection in commercial cheese factories. N Z J Dairy Sci Technol 13:16-22.

119. Kleppen HP, Bang T, Nes IF, Holo H. 2011. Bacteriophages in milk fermentations: diversity fluctuations of normal and failed fermentations. International Dairy Journal 21:592-600.

120. Atamer Z, Dietrich J, Müller-Merbach M, Neve H, Heller K, Hinrichs J. 2009. Screening for and characterization of Lactococcus lactis bacteriophages with high thermal resistance. International Dairy Journal 19:228-235.

121. Brown RJ, Ernstrom CA. 1982. Incorporation of ultrafiltration concentrated whey solids into Cheddar cheese for increased yield. . J Dairy Sci 65:2391-2395.

122. Lawrence RC, Gilles J. 1987. Texture development during cheese ripening. Journal of Dairy Science 70:1748-1760.

123. Atamer Z, Neve H, Heller K, Hinrichs J. 2012. Thermal resistance of bacteriophages in the dairy industry, In book: Bacteriophages in dairy processing. Editors: A. Quiberoni, J. Reinheimer, p 195-214.

124. Bruttin A, Desiere F, d'Amico N, Guerin JP, Sidoti J, Huni B, Lucchini S, Brussow H. 1997. Molecular ecology of Streptococcus thermophilus bacteriophage infections in a cheese factory. Appl Env Microbiol 63:3144-50.

125. Garneau JE, Moineau S. 2011. Bacteriophages of lactic acid bacteria and their impact on milk fermentations. Microb Cell Fact 10 Suppl 1:S20.

126. Canchaya C, Proux C, Fournous G, Bruttin A, Brussow H. 2003. Prophage genomics. Microbiol Mol Biol Rev 67:238-76.

127. Alexeeva S, Guerra Martinez JA, Spus M, Smid EJ. 2018. Spontaneously induced prophages are abundant in a naturally evolved bacterial starter culture and deliver competitive advantage to the host. BMC Microbiol 18:120.

128. Lunde M, Aastveit AH, Blatny JM, Nes IF. 2005. Effects of diverse environmental conditions on phiLC3 prophage stability in Lactococcus lactis. Appl Environ Microbiol 71:721-7.

129

129. Samson JE, Belanger M, Moineau S. 2013. Effect of the abortive infection mechanism and type III toxin/antitoxin system AbiQ on the lytic cycle of Lactococcus lactis phages. J Bacteriol 195:3947-56.

130. Sing WD, Klaenhammer TR. 1993. A strategy for rotation of different bacteriophage defenses in a lactococcal single-strain starter culture system. Appl Environ Microbiol 59:365-72.

131. Heap HA, Lawrence R. 1977. The contribution of starter strains to the level of phage infection in a commercial cheese factory. NZJ Dairy Sci Technol 12.

132. Daly C, Fitzgerald GF, Davis R. 1996. Biotechnology of lactic acid bacteria with special reference to bacteriophage resistance. Antonie Van Leeuwenhoek 70:99-110.

133. Moineau S. 1999. Applications of phage resistance in lactic acid bacteria. Antonie Van Leeuwenhoek 76.

134. Durmaz E, Klaenhammer TR. 1995. A starter culture rotation strategy incorporating paired restriction/ modification and abortive infection bacteriophage defenses in a single Lactococcus lactis strain. Appl Environ Microbiol 61:1266-73.

135. Mahony J, Tremblay DM, Labrie SJ, Moineau S, van Sinderen D. 2015. Investigating the requirement for calcium during lactococcal phage infection. Int J Food Microbiol 201:47-51.

136. Gulstrom TJ, Pearce LE, Sandine WE, Elliker PR. 1979. Evaluation of commercial phage inhibitory media. Journal of Dairy Science 62:208-221.

137. Jarvis AW, Fitzgerald GF, Mata M, Mercenier A, Neve H, Powell IB, Ronda C, Saxelin M, Teuber M. 1991. Species and type phages of lactococcal bacteriophages. Intervirology 32:2-9.

138. Fortier LC, Bransi A, Moineau S. 2006. Genome sequence and global gene expression of Q54, a new phage species linking the 936 and c2 phage species of Lactococcus lactis. J Bacteriol 188:6101-14.

130

139. Villion M, Chopin MC, Deveau H, Ehrlich SD, Moineau S, Chopin A. 2009. P087, a lactococcal phage with a morphogenesis module similar to an Enterococcus faecalis prophage. Virology 388:49-56.

140. Samson JE, Moineau S. 2010. Characterization of Lactococcus lactis phage 949 and comparison with other lactococcal phages. Appl Environ Microbiol 76:6843-52.

141. Dupuis ME, Moineau S. 2010. Genome organization and characterization of the virulent lactococcal phage 1358 and its similarities to Listeria phages. Appl Environ Microbiol 76:1623-32.

142. Garneau JE, Tremblay DM, Moineau S. 2008. Characterization of 1706, a virulent phage from Lactococcus lactis with similarities to prophages from other Firmicutes. Virology 373:298-309.

143. Chopin A, Deveau H, Ehrlich SD, Moineau S, Chopin MC. 2007. KSY1, a lactococcal phage with a T7-like transcription. Virology 365:1-9.

144. Kotsonis SE, Powell IB, Pillidge CJ, Limsowtin GK, Hillier AJ, Davidson BE. 2008. Characterization and genomic analysis of phage asccphi28, a phage of the family Podoviridae infecting Lactococcus lactis. Appl Environ Microbiol 74:3453-60.

145. Tikhonenko AS. 1970. Ultrastructure of bacterial viruses. Plenum Press, New York, NY.

146. Keogh BP, Shimmin PD. 1974. Morphology of the bacteriophages of lactic streptococci. Appl Microbiol 27:411-5.

147. Bissonnette F, Labrie S, Deveau H, Lamoureux M, Moineau S. 2000. Characterization of mesophilic mixed starter cultures used for the manufacture of aged cheddar cheese. J Dairy Sci 83:620-7.

148. Szczepanska AK, Hejnowicz MS, Kolakowski P, Bardowski J. 2007. Biodiversity of Lactococcus lactis bacteriophages in Polish dairy environment. Acta Biochim Pol 54:151-8.

131

149. Castro-Nallar E, Chen H, Gladman S, Moore SC, Seemann T, Powell IB, Hillier A, Crandall KA, Chandry PS. 2012. Population genomics and phylogeography of an Australian dairy factory derived lytic bacteriophage. Genome Biol Evol 4:382-93.

150. Moineau S, Borkaev M, Holler BJ, Walker SA, Kondo JK, Vedamuthu ER, Vandenbergh PA. 1996. Isolation and characterization of lactococcal bacteriophages from cultured buttermilk plants in the United States. Journal of Dairy Science 79:2104-2111.

151. Prevots F, Mata M, Ritzenthaler P. 1990. Taxonomic differentiation of 101 lactococcal bacteriophages and characterization of bacteriophages with unusually large genomes. Appl Environ Microbiol 56:2180-5.

152. Raiski A, Belyasova N. 2009. Biodiversity of Lactococcus lactis bacteriophages in the Republic of Belarus. Int J Food Microbiol 130:1-5.

153. Murphy J, Mahony J, van Sinderen D. 2013. Complete genome sequence of the 936-type lactococcal bacteriophage Caseus JM1. Genome Announc 1:e0005913.

154. Mahony J, Kot W, Murphy J, Ainsworth S, Neve H, Hansen LH, Heller KJ, Sorensen SJ, Hammer K, Cambillau C, Vogensen FK, van Sinderen D. 2013. Investigation of the relationship between lactococcal host cell wall polysaccharide genotype and 936 phage receptor binding protein phylogeny. Appl Environ Microbiol 79:4385-92.

155. Labrie SJ, Moineau S. 2007. Abortive infection mechanisms and prophage sequences significantly influence the genetic makeup of emerging lytic lactococcal phages. J Bacteriol 189:1482-7.

156. Jervis AW. 1984a. Differentiation of lactic streptococcal phages into phage species by DNA-DNA homology. Appl Env Microbiol 47:343-349.

157. Coveney JA, Fitzgerald GF, Daly C. 1987. Detailed characterization and comparison of four lactic streptococcal bacteriophages based on morphology, restriction mapping, DNA homology, and structural protein analysis. Appl Environ Microbiol 53:1439-47.

132

158. Oliveira J, Mahony J, Hanemaaijer L, Kouwen T, van Sinderen D. 2018. Biodiversity of bacteriophages infecting Lactococcus lactis starter cultures. J Dairy Sci 101:96-105.

159. Crutz-Le Coq AM, Cesselin B, Commissaire J, Anba J. 2002. Sequence analysis of the lactococcal bacteriophage bIL170: insights into structural proteins and HNH endonucleases in dairy phages. Microbiology 1489(Pt 4):985-1001.

160. Moisan M, Moineau S. 2012. Multilocus sequence typing scheme for the characterization of 936-like phages infecting Lactococcus lactis. Appl Environ Microbiol 78:4646-4653.

161. Labrie SJ, Josephsen J, Neve H, Vogensen FK, Moineau S. 2008. Morphology, genome sequence, and structural proteome of type phage P335 from Lactococcus lactis. Appl Environ Microbiol 74:4636-44.

162. Moineau S, Pandian S, Klaenhammer TR. 1994. Evolution of a lytic bacteriophage via DNA acquisition from the Lactococcus lactis chromosome. Appl Environ Microbiol 60:1832-41.

163. Reyrolle J, Chopin MC, Letellier F, Novel G. 1982. Lysogenic strains of lactic acid streptococci and lytic spectra of their temperate bacteriophages. Appl Environ Microbiol 43:349-56.

164. Davidson BE, Powell IB, Hillier AJ. 1990. Temperate bacteriophages and lysogeny in lactic acid bacteria. FEMS Microbiol Rev 7:79-90.

165. Ackermann HW. 2009. Basic phage electron microscopy. Methods Mol Biol 501:113-26.

166. Goldsmith CS, Miller SE. 2009. Modern uses of electron microscopy for detection of viruses. Clin Microbiol Rev 22:552-63.

167. Bradley DE. 1967. Ultrastructure of bacteriophage and bacteriocins. Bacteriol Rev 31:230-314.

168. Ackermann HW. 2007. 5500 Phages examined in the electron microscope. Arch Virol 152:227-43.

133

169. Ackermann HW. 1998. Tailed bacteriophages: the order caudovirales. Adv Virus Res 51:135-201.

170. Lwoff A, Horne R, Tournier P. 1962. A system of viruses. Cold Spring Harb Symp Quant Biol 27:51-5.

171. Adriaenssens EM BJ. 2017. How to name and classify your phage: an informal guide. Viruses 9.

172. Lavigne R, Darius P, Summer EJ, Seto D, Mahadevan P, Nilsson AS, Ackermann HW, Kropinski AM. 2009. Classification of Myoviridae bacteriophages using protein sequence similarity. BMC Microbiol 9:224.

173. Mahony J, van Sinderen D. 2014. Current taxonomy of phages infecting lactic acid bacteria. Front Microbiol 5:7.

174. Farley MM, Tu J, Kearns DB, Molineux IJ, Liu J. 2017. Ultrastructural analysis of bacteriophage Phi29 during infection of Bacillus subtilis. J Struct Biol 197:163-171.

175. Taylor NM, Prokhorov NS, Guerrero-Ferreira RC, Shneider MM, Browning C, Goldie KN, Stahlberg H, Leiman PG. 2016. Structure of the T4 baseplate and its function in triggering sheath contraction. Nature 533:346-52.

176. Chaikeeratisak V, Nguyen K, Egan ME, Erb ML, Vavilina A, Pogliano J. 2017. The phage nucleus and tubulin spindle are conserved among large Pseudomonas phages. Cell Rep 20:1563-1571.

177. Kraemer JA, Erb ML, Waddling CA, Montabana EA, Zehr EA, Wang H, Nguyen K, Pham DS, Agard DA, Pogliano J. 2012. A phage tubulin assembles dynamic filaments by an atypical mechanism to center viral DNA within the host cell. Cell 149:1488-99.

178. Romero-Brey I, Bartenschlager R. 2015. Viral infection at high magnification: 3D electron microscopy methods to analyze the architecture of infected cells. Viruses 7:6316-45.

179. Lin NT, Chiou PY, Chang KC, Chen LK, Lai MJ. 2010. Isolation and characterization of phi AB2: a novel bacteriophage of Acinetobacter baumannii. Res Microbiol 161:308-14.

180. Huang LH, Farnet CM, Ehrlich KC, Ehrlich M. 1982. Digestion of highly modified bacteriophage DNA by restriction endonucleases. Nucl Acids Res 10:1579-91.

181. Rusinov IS, Ershova AS, Karyagina AS, Spirin SA, Alexeevski AV. 2018. Avoidance of recognition sites of restriction-modification systems is a widespread but not universal anti-restriction strategy of prokaryotic viruses. BMC Genomics 19:885.

182. Mata M, Trautwetter A, Luthaud G, Ritzenthaler P. 1986. Thirteen virulent and temperate bacteriophages of Lactobacillus bulgaricus and Lactobacillus lactis belong to a single DNA homology group. Appl Environ Microbiol 52:812-8.

183. Merabishvili M, Verhelst R, Glonti T, Chanishvili N, Krylov V, Cuvelier C, Tediashvili M, Vaneechoutte M. 2007. Digitized fluorescent RFLP analysis (fRFLP) as a universal method for comparing genomes of culturable dsDNA viruses: application to bacteriophages. Res Microbiol 158:572-581.

184. Comeau AM, Short S, Suttle CA. 2004. The use of degenerate-primed random amplification of polymorphic DNA (DP-RAPD) for strain-typing and inferring the genetic similarity among closely related viruses. J Virol Methods 118:95-100.

185. Dini C, de Urraza PJ. 2010. Isolation and selection of coliphages as potential biocontrol agents of enterohemorrhagic and Shiga toxin-producing E. coli (EHEC and STEC) in cattle. J Appl Microbiol 109:873-887.

186. Gutiérrez D, Martin-Platero AM, Rodríguez A, Martínez-Bueno M, García P, Martínez B. 2011. Typing of bacteriophages by randomly amplified polymorphic DNA (RAPD)-PCR to assess genetic diversity. FEMS Microbiol Lett 322:90-97.

187. Doria F, Napoli C, Costantini A, Berta G, Saiz JC, García-Moruno E. 2013. Development of a new method for detection and identification of Oenococcus oeni bacteriophages based on endolysin gene sequence and randomly amplified polymorphic DNA. Appl Environ Microbiol 79:4799-4805.

188. Winget DM, Wommack KE. 2008. Randomly amplified polymorphic DNA PCR as a tool for assessment of marine viral richness. Appl Environ Microbiol 74:2612-2618.

189. Winter C, Weinbauer MG. 2010. Randomly amplified polymorphic DNA reveals tight links between viruses and microbes in the bathypelagic zone of the Northwestern Mediterranean Sea. Appl Environ Microbiol 76:6724-32.

190. Srinivasiah S, Lovett J, Polson S, Bhavsar J, Ghosh D, Roy K, Fuhrmann JJ, Radosevich M, Wommack KE. 2013. Direct assessment of viral diversity in soils by random PCR amplification of polymorphic DNA. Appl Environ Microbiol 79:5450-7.

191. Cortes P, Spricigo DA, Bardina C, Llagostera M. 2015. Remarkable diversity of Salmonella bacteriophages in swine and poultry. FEMS Microbiol Lett 362:1-7.

192. Li L, Yang H, Lin S, Jia S. 2010. Classification of 17 newly isolated virulent bacteriophages of Pseudomonas aeruginosa. Can J Microbiol 56:925-33.

193. del Rio B, Binetti AG, Martin MC, Fernandez M, Magadán A, Alvarez MA. 2007. Multiplex PCR for the detection and identification of dairy bacteriophages in milk. Food Microbiology 24:75-81.

194. Philipson CW, Voegtly LJ, Lueder MR, Long KA, Rice GK, Frey KG, Biswas B, Cer RZ, Hamilton T, Bishop-Lilly KA. 2018. Characterizing phage genomes for therapeutic applications. Viruses 10.

195. Aziz RK, Ackermann HW, Petty NK, Kropinski AM. 2018. Essential steps in characterizing bacteriophages: biology, taxonomy, and genome analysis. Methods Mol Biol 1681:197-215.

196. Ekblom R, Wolf JB. 2014. A field guide to whole-genome sequencing, assembly and annotation. Evol Appl 7:1026-42.

197. Shendure JA, Porreca GJ, Church GM. 2008. Overview of DNA sequencing strategies. Curr Protoc Mol Biol Chapter 7:Unit 7 1.

198. Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, Phillippy AM. 2012. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol 30:693-700.

199. Agah S, Zheng M, Pasquali M, Kolomeisky AB. 2016. DNA sequencing by nanopores: advances and challenges. Journal of Physics D: Applied Physics 49:413001.

200. Ku CS, Roukos DH. 2013. From next-generation sequencing to nanopore sequencing technology: paving the way to personalized genomic medicine. Expert Rev Med Devices 10:1-6.

201. Tyler AD, Mataseje L, Urfano CJ, Schmidt L, Antonation KS, Mulvey MR, Corbett CR. 2018. Evaluation of Oxford Nanopore's MinION sequencing device for microbial whole genome sequencing applications. Sci Rep 8:10931.

202. Jain M, Fiddes IT, Miga KH, Olsen HE, Paten B, Akeson M. 2015. Improved data analysis for the MinION nanopore sequencer. Nat Methods 12:351-6.

203. Bates M, Polepole P, Kapata N, Loose M, O'Grady J. 2016. Application of highly portable MinION nanopore sequencing technology for the monitoring of nosocomial tuberculosis infection. Int J Mycobacteriol 5 Suppl 1:S24.

204. Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Gu Y. 2012. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics 13:341.

205. Dominguez Del Angel V, Hjerde E, Sterck L, Capella-Gutierrez S, Notredame C, Vinnere Pettersson O, Amselem J, Bouri L, Bocs S, Klopp C, Gibrat J-F, Vlasova A, Leskosek BL, Soler L, Binzer-Panchal M, Lantz H. 2018. Ten steps to get started in genome assembly and annotation. F1000Res 7: ELIXIR-148.

206. Bernardes JS, Vieira FRJ, Costa LMM, Zaverucha G. 2015. Evaluation and improvements of clustering algorithms for detecting remote homologous protein families. BMC Bioinformatics 16:34.

207. McNair K, Zhou C, Souza B, Edwards RA. 2018. THEA: A novel approach to gene identification in phage genomes. bioRxiv doi:10.1101/265983:265983.

208. Russell DA. 2018. Sequencing, assembling, and finishing complete bacteriophage genomes. Methods Mol Biol 1681:109-125.

209. Reyes A, Semenkovich NP, Whiteson K, Rohwer F, Gordon JI. 2012. Going viral: next-generation sequencing applied to phage populations in the human gut. Nat Rev Microbiol 10:607-17.

210. Rihtman B, Meaden S, Clokie MR, Koskella B, Millard AD. 2016. Assessing Illumina technology for the high-throughput sequencing of bacteriophage genomes. PeerJ 4:e2055.

211. Puxty RJ, Perez-Sepulveda B, Rihtman B, Evans DJ, Millard AD, Scanlan DJ. 2015. Spontaneous deletion of an "ORFanage" region facilitates host adaptation in a "photosynthetic" cyanophage. PLoS One 10:e0132642.

212. Versalovic J, Koeuth T, Lupski JR. 1991. Distribution of repetitive DNA sequences in eubacteria and application to fingerprinting of bacterial genomes. Nucleic Acids Res 19:6823-31.

213. Higgins CF, Ames GF, Barnes WM, Clement JM, Hofnung M. 1982. A novel intercistronic regulatory element of prokaryotic operons. Nature 298:760-762.

214. Stern MJ, Ames GF, Smith NH, Robinson EC, Higgins CF. 1984. Repetitive extragenic palindromic sequences: a major component of the bacterial genome. Cell 37:1015-26.

215. Hulton CS, Higgins CF, Sharp PM. 1991. ERIC sequences: a novel family of repetitive elements in the genomes of Escherichia coli, Salmonella typhimurium and other enterobacteria. Mol Microbiol 5:825-34.

216. Martin B, Humbert O, Camara M, Guenzi E, Walker J, Mitchell T, Andrew P, Prudhomme M, Alloing G, Hakenbeck R, et al. 1992. A highly conserved repeated DNA element located in the chromosome of Streptococcus pneumoniae. Nucleic Acids Res 20:3479-83.

217. Doll L, Moshitch S, Frankel G. 1993. Poly(GTG)5-associated profiles of Salmonella and Shigella genomic DNA. Res Microbiol 144:17-24.

218. Barrangou R, Fremaux C, Deveau H, Richards M, Boyaval P, Moineau S, Romero DA, Horvath P. 2007. CRISPR provides acquired resistance against viruses in prokaryotes. Science 315:1709-1712.

219. Koeuth T, Versalovic J, Lupski JR. 1995. Differential subsequence conservation of interspersed repetitive Streptococcus pneumoniae BOX elements in diverse bacteria. Genome Res 5:408-18.

220. Versalovic J, Schneider M, de Bruijn FJ, Lupski JR. 1994. Genomic fingerprinting of bacteria using repetitive sequence-based polymerase chain reaction. Methods in Molecular and Cellular Biology 5:25-40.

221. van Belkum A, Hermans P. 2001. BOX PCR Fingerprinting for molecular typing of Streptococcus pneumoniae. Methods Mol Med 48:159-68.

222. Wolska K, Kot B, Jakubczak A, Rymuza K. 2011. BOX-PCR is an adequate tool for typing of clinical Pseudomonas aeruginosa isolates. Folia Histochem Cytobiol 49:734-8.

223. Alegria A, Fernandez ME, Delgado S, Mayo B. 2010. Microbial characterisation and stability of a farmhouse natural fermented milk from Spain. Int J Dairy Technol 63:423-430.

224. Koc M, Cokmus C, Cihan AC. 2015. The genotypic diversity and lipase production of some thermophilic bacilli from different genera. Braz J Microbiol 46:1065-1076.

225. Kok CR, Hutkins R. 2018. Yogurt and other fermented foods as sources of health-promoting bacteria. Nutr Rev 76:4-15.

226. Bartkiene E, Bartkevics V, Mozuriene E, Lele V, Zadeike D, Juodeikiene G. 2019. The safety, technological, nutritional, and sensory challenges associated with lacto-fermentation of meat and meat products by using pure lactic acid bacteria strains and plant-lactic acid bacteria bioproducts. Front Microbiol 10:1036.

227. Zabat MA, Sano WH, Wurster JI, Cabral DJ, Belenky P. 2018. Microbial community analysis of sauerkraut fermentation reveals a stable and rapidly established community. Foods 7.

228. Mokoena MP, Mutanda T, Olaniran AO. 2016. Perspectives on the probiotic potential of lactic acid bacteria from African traditional fermented foods and beverages. Food & Nutrition Research 60:29630.

229. De Vuyst L, Van Kerrebroeck D, Leroy F. 2017. Microbial ecology and process technology of sourdough fermentation. Advances in Applied Microbiology 100:49-160.

230. Dai Z, Li Y, Wu J, Zhao Q. 2013. Diversity of lactic acid bacteria during fermentation of a traditional Chinese fish product, Chouguiyu (stinky mandarinfish). J Food Sci 78:M1778-83.

231. Faria-Oliveira F, Diniz R, Godoy Santos F, Mezadri H, Castro I, Lopes Brandão R. 2015. The Role of yeast and lactic acid bacteria in the production of fermented beverages in South America, p 107-135 doi:10.5772/60877.

232. Giraffa G. 2002. Enterococci from foods. FEMS Microbiol Rev 26:163-71.

APPENDICES

CHAPTER 2

APPENDIX A. Statistical analysis of the BOXA2R-PCR for bacterial fingerprinting

Descriptives (Values by Type)

| Statistic | Colony | Std. Error | DNA | Std. Error |
|---|---|---|---|---|
| Mean | 97.4486 | 0.45087 | 98.4052 | 0.26284 |
| 95% Confidence Interval for Mean, Lower Bound | 96.5491 | | 97.8845 | |
| 95% Confidence Interval for Mean, Upper Bound | 98.3480 | | 98.9259 | |
| 5% Trimmed Mean | 97.7635 | | 98.7278 | |
| Median | 100.0000 | | 100.0000 | |
| Variance | 14.230 | | 7.945 | |
| Std. Deviation | 3.77228 | | 2.81860 | |
| Minimum | 89.10 | | 87.70 | |
| Maximum | 100.00 | | 100.00 | |
| Range | 10.90 | | 12.30 | |
| Interquartile Range | 5.00 | | 2.90 | |
| Skewness | -1.169 | 0.287 | -1.644 | 0.226 |
| Kurtosis | -0.160 | 0.566 | 1.764 | 0.447 |
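The standard errors of the mean in the Descriptives output can be cross-checked from the reported standard deviations and group sizes (N = 70 for Colony and N = 115 for DNA, taken from the Ranks output of the Mann-Whitney test). The sketch below is only an illustrative check, assuming the usual relation SE = SD / sqrt(N); the function name is hypothetical and is not part of the SPSS workflow used in this study.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of the mean, computed as SD / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive group size")
    return std_dev / math.sqrt(n)


# SD values from the Descriptives output; N values from the Ranks output.
se_colony = standard_error(3.77228, 70)
se_dna = standard_error(2.81860, 115)

# Rounded to five decimals for comparison with the reported Std. Error column.
print(round(se_colony, 5), round(se_dna, 5))
```

Under that assumption, the computed values should agree with the reported standard errors to the precision shown in the output.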

Mann-Whitney Test

Ranks (Values by Type)

| Type | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| Colony | 70 | 86.43 | 6050.00 |
| DNA | 115 | 97.00 | 11155.00 |
| Total | 185 | | |

Test Statistics (a)

| Statistic | Values |
|---|---|
| Mann-Whitney U | 3565.000 |
| Wilcoxon W | 6050.000 |
| Z | -1.566 |
| Asymp. Sig. (2-tailed) | 0.117 |
| Exact Sig. (2-tailed) | 0.118 |
| Exact Sig. (1-tailed) | 0.060 |
| Point Probability | 0.000 |

a. Grouping Variable: Source of DNA
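The Mann-Whitney U statistics reported in these appendices follow directly from the rank sums and group sizes in the corresponding Ranks tables. The sketch below shows that relationship using only the reported values; it is an illustrative check rather than the SPSS procedure itself, the function name is hypothetical, and it does not attempt to reproduce the tie-corrected Z or the exact p-values.

```python
def mann_whitney_u(rank_sum_1: float, n1: int, rank_sum_2: float, n2: int) -> float:
    """Return the Mann-Whitney U statistic (the smaller of U1 and U2).

    U_i = R_i - n_i * (n_i + 1) / 2, where R_i is the rank sum of group i.
    """
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = rank_sum_2 - n2 * (n2 + 1) / 2
    # Consistency check: the two U values must add up to n1 * n2.
    if abs((u1 + u2) - n1 * n2) > 1e-9:
        raise ValueError("rank sums are inconsistent with the group sizes")
    return min(u1, u2)


# Rank sums and group sizes as reported in the Ranks tables.
print(mann_whitney_u(6050.0, 70, 11155.0, 115))  # Colony vs DNA (Appendix A)
print(mann_whitney_u(3292.0, 53, 3611.0, 64))    # BOXA1R vs BOXA2R (Table S2)
print(mann_whitney_u(218.0, 15, 247.0, 15))      # 30C vs 37C (Table S3)
print(mann_whitney_u(3927.5, 71, 2975.5, 46))    # DNA1 vs DNA2 (Table S4)
```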

CHAPTER 3

Supplementary 1. The phages used in the reproducibility study and their source

Table S1. List of phages used in the reproducibility study

| Phage name | Propagated on the strain | Conditions |
|---|---|---|
| Øc2^a | Lc. lactis ssp cremoris MG1363 | DNA1 and DNA2 at 30°C |
| Øc2 | Lc. lactis ssp cremoris MG1363 | DNA1 at 37°C |
| Øc2 | Lc. lactis ssp cremoris C2 | DNA1 and DNA2 at 30°C |
| Øc2 | Lc. lactis ssp cremoris C2 | DNA1 and DNA2 at 37°C |
| Øsk1 | Lc. lactis ssp cremoris MG1363 | DNA1 and DNA2 at 30°C |
| Øsk1 | Lc. lactis ssp cremoris MG1363 | DNA2 at 37°C |
| Øsk1^b | Lc. lactis ssp cremoris LMO230 | DNA1 and DNA2 at 30°C |
| ØP087 | Lc. lactis ssp lactis ML8 | DNA2 at 30°C |
| ØP087^c | Lc. lactis ssp lactis C10 | DNA2 at 30°C |
| Øc6A^a | Lc. lactis ssp lactis C6 | DNA1 and DNA2 at 30°C |
| Øc6A | Lc. lactis ssp lactis C10 | DNA1 and DNA2 at 30°C |
| Ø712 | Lc. lactis ssp cremoris C2 | DNA1 at 30°C |
| Ø712 | Lc. lactis ssp cremoris C2 | DNA1 at 37°C |
| Ø712^b | Lc. lactis ssp cremoris NCDO712 | DNA2 at 30°C |
| Ø712 | Lc. lactis ssp cremoris NCDO712 | DNA2 at 37°C |
| ØbIL67^a | Lc. lactis ssp lactis IL1407 | DNA1 and DNA2 at 30°C |
| ØT4^d | - | lysate |
| ØX174^e | - | DNA |
| ØLambda^e | - | DNA |
| Ø301^f | Lc. lactis ssp lactis Ni301 | DNA1 and DNA2 at 30°C |

DNA1 refers to the phage DNA isolated by the phenol/chloroform protocol.
DNA2 refers to the phage DNA isolated using the QIAamp DNA Blood Mini Kit (Qiagen).
a - Øc2 (host MG1363), ØbIL67 (host IL1407) and Øc6A (host C6) were provided by Dr Jasna Rakonjac, Massey University, New Zealand.
b - Øsk1 (host LMO230) and Ø712 (host UK712) were sourced from our internal UNSW culture collection.
c - ØP087 was purchased from the Félix d'Hérelle Reference Center for bacterial viruses of the Université Laval, Canada.
d - E. coli T4 phage lysate was obtained from Dr Nicola Petty, The iThree institute, University of Technology, Sydney.
e - PhiX174 RF1 DNA (0.5 µg/µl) and Lambda phage DNA (0.3 µg/µl) were purchased from Thermo Fisher Scientific, Australia.
f - Ø301 (host Ni301) was isolated in the course of this study and was sequenced in the Ramaciotti Centre for Genomics, UNSW, Sydney.

Supplementary 2. Statistical analysis of the BOXA-PCR for phage fingerprinting

BOXA1R vs BOXA2R primer

Table S2. SPSS output of the reproducibility analysis showing that there was no statistically significant difference between the replicates generated using BOXA1R vs BOXA2R primer (p=0.279)

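Descriptive Statistics

| Variable | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Values | 117 | 98.2128 | 2.93565 | 89.40 | 100.00 |
| Tests | 117 | 1.5470 | 0.49993 | 1.00 | 2.00 |

Mann-Whitney Test

Ranks (Values by Tests)

| Tests | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| BOXA1R | 53 | 62.11 | 3292.00 |
| BOXA2R | 64 | 56.42 | 3611.00 |
| Total | 117 | | |

Test Statistics

| Statistic | Values |
|---|---|
| Mann-Whitney U | 1531.000 |
| Wilcoxon W | 3611.000 |
| Z | -1.086 |
| Asymp. Sig. (2-tailed) | 0.277 |
| Exact Sig. (2-tailed) | 0.279 |
| Exact Sig. (1-tailed) | 0.140 |
| Point Probability | 0.001 |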

Figure S1. The plot illustrates 95% Confidence Intervals (CI) of dispersion estimates when using BOXA1R and BOXA2R primer

DNA1 30°C vs DNA1 37°C

Table S3. SPSS output of the reproducibility analysis of phage replicates where phage was amplified at 30°C and 37°C showing no significant difference between replicates (p = 0.527)

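Descriptive Statistics

| Variable | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Data | 30 | 97.9233 | 2.87326 | 89.40 | 100.00 |
| Vars | 30 | 1.5000 | 0.50855 | 1.00 | 2.00 |

Mann-Whitney Test

Ranks (Data by Vars)

| Vars | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| 30°C | 15 | 14.53 | 218.00 |
| 37°C | 15 | 16.47 | 247.00 |
| Total | 30 | | |

Test Statistics

| Statistic | Values |
|---|---|
| Mann-Whitney U | 98.000 |
| Wilcoxon W | 218.000 |
| Z | -0.681 |
| Asymp. Sig. (2-tailed) | 0.496 |
| Exact Sig. (2-tailed) | 0.567 (a) |
| Exact Sig. (1-tailed) | 0.527 |
| Point Probability | 0.264 |

a. Not corrected for ties.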

Figure S2. The plot illustrates 95% Confidence Intervals (CI) of dispersion estimates when using phage amplified at 30°C and 37°C as templates

DNA1 vs DNA2

Table S4. SPSS output of the reproducibility analysis using phage DNA isolated following two different protocols showing no significant difference between them (p = 0.080)

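Descriptive Statistics

| Variable | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Results | 117 | 98.2128 | 2.93565 | 89.40 | 100.00 |
| Vars | 117 | 1.3932 | 0.49055 | 1.00 | 2.00 |

Mann-Whitney Test

Ranks (Results by Vars)

| Vars | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| DNA1 | 71 | 55.32 | 3927.50 |
| DNA2 | 46 | 64.68 | 2975.50 |
| Total | 117 | | |

Test Statistics

| Statistic | Values |
|---|---|
| Mann-Whitney U | 1371.500 |
| Wilcoxon W | 3927.500 |
| Z | -1.755 |
| Asymp. Sig. (2-tailed) | 0.079 |
| Exact Sig. (2-tailed) | 0.080 |
| Exact Sig. (1-tailed) | 0.039 |
| Point Probability | 0.000 |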

Figure S3. The plot illustrates 95% Confidence Intervals (CI) of dispersion estimates when using phage DNA isolated following two different protocols

Descriptives (Results by Vars)

| Statistic | DNA1 | Std. Error | DNA2 | Std. Error |
|---|---|---|---|---|
| Mean | 97.7648 | 0.39211 | 98.9043 | 0.31031 |
| 95% Confidence Interval for Mean, Lower Bound | 96.9828 | | 98.2794 | |
| 95% Confidence Interval for Mean, Upper Bound | 98.5468 | | 99.5293 | |
| 5% Trimmed Mean | 98.1013 | | 99.1548 | |
| Median | 100.0000 | | 100.0000 | |
| Variance | 10.916 | | 4.429 | |
| Std. Deviation | 3.30394 | | 2.10459 | |
| Minimum | 89.40 | | 93.30 | |
| Maximum | 100.00 | | 100.00 | |
| Range | 10.60 | | 6.70 | |
| Interquartile Range | 4.50 | | 0.70 | |
| Skewness | -1.293 | 0.285 | -1.708 | 0.350 |
| Kurtosis | 0.571 | 0.563 | 1.616 | 0.688 |

Supplementary 3. Sequenced phage fragments

Table S5. List of sequenced phage fragments and related information (^a: Reviewed protein)

| Fragment name | Band size, gel assessment (bp) | Band size, sequencing (bp) | UniProt KB annotation | E-value | Score | Ident. | Amplification and sequencing primer |
|---|---|---|---|---|---|---|---|
| ØX174-1 | 650 | 637 | P03631 Replication-associated protein A (Enterobacteria phage phiX174)^a | 1.9e-91 | 745 | 89.1% | BOXA1R |
| ØX174-2 | 300 | 288 | P03631-2 Isoform A* of Replication-associated protein A (Enterobacteria phage phiX174)^a | 1.3e-3 | 108 | 55.3% | BOXA1R |
| ØLambda-1 | 450 | 480 | P03689 Replication protein P (Escherichia phage lambda)^a | 1e-15 | 166 | 43.2% | BOXA2R |
| ØT4-2 | 700 | 1089 | A0A376YLU8 Putative baseplate structural protein (Escherichia coli) | 1.7e-13 | 188 | 36.7% | BOXA2R |
| ØDL4HV-1 | 800 | 881 | U1P0V3 Phage integrase family (Halorubrum sp. J07HR59) | 1.6e-4 | 124 | 54.3% | BOXA1R |
| ØBU-1 | 400 | 354 | A0A126HBL7 HNH endonuclease (Lactococcus phage 936 group) | 8.5e-23 | 201 | 69.0% | BOXA2R |
| Ø301-1 | 900 | 912 | A0A1P8BM18 Baseplate/receptor binding protein (Lactococcus phage 98104) | 8.5e-28 | 198 | 61.8% | BOXA2R |
| Øc2-1 | 1200 | 1189 | Q38305 Probable tape measure protein (Lactococcus phage c2) | 5.8e-17 | 145 | 66.7% | BOXA1R |

Table S6. Primers used for the amplification of selected phage genes and results from the second round of PCR amplifications and BLAST analyses that used gene-specific primers

| Fragment | UniProt KB annotation (Top BLAST result) | Genome | CDS | Expected size (bp) | Primers for CDS amplification | PCR product size (bp) | BLAST results |
|---|---|---|---|---|---|---|---|
| ØX174-1 | P03631 Replication-associated protein A (Enterobacteria phage phiX174)^a | J02482 | AAA32570.1 | 1542 | FWD: ATGGTTCGTTCTTATTACC; REV: TCATTTTCCGCCAGCA | ~1500 | E-value: 0.0; Score: 1819; Ident.: 93.30% |
| ØLambda-1 | P03689 Replication protein P (Escherichia phage lambda)^a | J02459 | AAA96585.1 | 702 | FWD: ATGAAAAACATCGCCGC; REV: TCATACACTTGCTCCTTTC | ~700 | E-value: 3.6e-156; Score: 1133; Ident.: 100.00% |
| ØT4-2 | A0A376YLU8 Putative baseplate structural protein (Escherichia coli) | UGCD01000004 | STK06109.1 | 648 | FWD: ATGAGTAAAACAACACCGAC; REV: TTAGCCCATCGCCGAAT | ~650 | E-value: 3.9e-133; Score: 1004; Ident.: 98.50% |
| Øc2-1 | Q38305 Probable tape measure protein (Lactococcus phage c2) | L48605 | AAA92189.1 | 2121 | FWD: ATGGCTAAAGAAAAATATGTC; REV: TCAACGCTTATTCAATTTAAC | ~2000 | E-value: 0.0; Score: 1529; Ident.: 94.70% |
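The gene-specific primers in Table S6 can be sanity-checked in silico: each forward primer should begin at the start codon of the targeted CDS, and the reverse complement of each reverse primer should end in a stop codon. The sketch below illustrates that check with the primer sequences listed in the table; the helper names are hypothetical, and the assumption that every targeted CDS starts with ATG is made only for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primers_flank_complete_cds(fwd: str, rev: str, start_codons=("ATG",)) -> bool:
    """True if fwd begins with a start codon and rev anneals over a stop codon."""
    starts_ok = fwd.upper()[:3] in start_codons
    stops_ok = reverse_complement(rev)[-3:] in STOP_CODONS
    return starts_ok and stops_ok


# Primer pairs copied from Table S6 (forward, reverse), both written 5' to 3'.
PRIMER_PAIRS = {
    "ØX174-1": ("ATGGTTCGTTCTTATTACC", "TCATTTTCCGCCAGCA"),
    "ØLambda-1": ("ATGAAAAACATCGCCGC", "TCATACACTTGCTCCTTTC"),
    "ØT4-2": ("ATGAGTAAAACAACACCGAC", "TTAGCCCATCGCCGAAT"),
    "Øc2-1": ("ATGGCTAAAGAAAAATATGTC", "TCAACGCTTATTCAATTTAAC"),
}

for fragment, (fwd, rev) in PRIMER_PAIRS.items():
    print(fragment, primers_flank_complete_cds(fwd, rev))
```

A check of this kind only confirms that the primers are placed at the ends of an open reading frame; it says nothing about annealing temperature or specificity.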

[Gel image. Lanes: M (marker); ØLambda, 702 bp; ØT4, 648 bp; ØX174, 1542 bp; Øc2, 2121 bp]

Figure S4. PCR amplifications of the selected phage DNA using the forward and reverse primer pair corresponding to the annotated genes in each phage. The correct product size was generated matching the targeted genes in: ØLambda (Replication protein P); ØT4 (Putative baseplate structural protein); ØX174 (Replication-associated protein A) and Øc2 (Probable tape measure protein)

Supplementary 4. Sanger sequencing of the BOXA-PCR fragments

Figure S5. Alignment of the sequencing of the PCR product from ØX174 DNA to the sequence of the replication-associated protein A (Enterobacteria phage phiX174) (grey boxes indicate a match)

Figure S6. Alignment of the sequencing of the PCR product from ØLambda DNA to the sequence of the Replication protein P (Escherichia phage lambda) (grey boxes indicate a match)

Figure S7. Alignment of the sequencing of the PCR product from ØT4 DNA to the sequence of the putative baseplate structural protein (Escherichia coli) (grey boxes indicate a match)

Figure S8. Alignment of the sequencing of the PCR product from ØC2 to the sequence of the probable tape measure protein (Lactococcus phage c2) (grey boxes indicate a match)

CHAPTER 4

Supplementary 1. Electron microscopy measurements

Table 1. Phage 15(Mo9) dimensions based on electron microscopy measurements (n=11)

| Measurement | Head length (nm) | Head width (nm) | Tail length (nm) | Tail width (nm) |
|---|---|---|---|---|
| 1 | 43.72 | 36.88 | 86.24 | 5.98 |
| 2 | 44.52 | 33.64 | 86.31 | 4.48 |
| 3 | 44.50 | 38.50 | 84.38 | 4.98 |
| 4 | 48.88 | 39.06 | 87.74 | 8.89 |
| 5 | 49.94 | 38.66 | 87.26 | 6.38 |
| 6 | 43.06 | 36.60 | 99.09 | 6.59 |
| 7 | 52.78 | 29.84 | 95.93 | 6.38 |
| 8 | 38.90 | 35.17 | 61.87 | 7.75 |
| 9 | 38.88 | 32.59 | 67.42 | 6.90 |
| 10 | 40.15 | 29.32 | 62.00 | 5.49 |
| 11 | 31.07 | 41.94 | 64.32 | 5.80 |
| Mean | 42.01 | 35.45 | 76.73 | 6.36 |

Supplementary 2. Sequencing of the PCR products generated from 15(Mo9) in PCR reactions using primers for the classification of P335 phage species.

Example 1.

1. Product: 268 bp
2. Sequencing primer: 98101Forward

BEST MATCH: Baseplate/Receptor binding protein (Lactococcus phage 98101 - 4)

>15(Mo9)-98101F-Added 243 45 196 0.05 Quality 203
GCCNAGCGNATNGNTATTCGGCATNTGATAAGCTAAGANTGCGGATGTCAATAGT
CAAGCGATTGTTGCTCAAATTGTAGAGAACGGTCAGCCCAAGAGTTTTGAGGGAC
TGCAACCGTTCTTTTGTTTAATGGCACAAGAAACCACAGGGCAAGGGGTATCAGA
AGAAAGTGTTGTCTCCTTTGATGCTAAAAATGGCACATTGAAATATATTGCCAGTG
ATAATGCGATTACAGTTTGAAA

Figure 1. BLASTn sequence homology search in the NCBI database, presented in a former NCBI format

Protein homology search in the UniProt KB databases:

| Entry | Protein names | Identity |
|---|---|---|
| A0A1P8BLR2 | Baseplate/receptor binding protein (Lactococcus phage 98102) | 95.5% |
| A0A1P8BM18 | Baseplate/receptor binding protein (Lactococcus phage 98104) | 95.5% |
| A0A1P8BLK4 | Baseplate/receptor binding protein (Lactococcus phage 98101) | 95.5% |

Selected alignment(s) from match A0A1P8BLR2

A0A1P8BLR2 A0A1P8BLR2_9CAUD - Baseplate/receptor binding protein, Lactococcus phage 98102
E-value: 2.1e-32
Score: 319
Ident.: 95.5%
Positives: 98.5%
Query Length: 243
Match Length: 672

Example 2.

1. Product: ~600 bp
2. Sequencing primer: Tuc2009Forward

BEST MATCH: Baseplate protein (Lactococcus phage 49801); Minor structural protein 4 (Lactococcus phage Tuc2009)

>15(Mo9)-Tuc2009Forward-Added 594 14 554 0.05
NNNNNNNNANNACGGTCAGCCCAAGAGTTTTGAGGGACTGCAACCGTTCTTTTGT
TTAATGGCACAAGAAACCACAGGTCAAGGATTATCAGAAGAAAGTGTTGTCTCCTT
TGATGCCAAAAATGGAACATTGAATTATATTGCCAGTGACAATGCGCTTCAAATGG
TTGGACGAAATGAAGCTTATTTTAGCTTTAGAAAACAAGAAGGAAGTCAATGGATT
GAGCAATTCTCCACTCGGACTTTTCACTATATTGTTGAGAAATCCATTTATTCGCAA
CCCTTCAAAGACTCAAACTACTGGTGGACTTTCAAAGAGCTTTATCGAATCTTTAAT
AAGTATATTGAAGATGGTAAAAATAGCTGGGTACAGTTTGTGGAAGCAAACCGTGA
AATCCTTGAATCAATTGATCCAGGAGGACGGTTACTTTCTGAGGTTTTAGACCTCA
ATAAAATTATTTATCGTAAAGTTCCAAGCGGATTTAATGTAGTAATTGAGCACGATT
CAGAGTATCAACCGGATGTGAAAGTAACTTATTACAAAAATTCAATTGGAACCGAA
GCCGAAAATGGGATTTTTGGATTAACCAAANN

NCBI search results: 100% identity over 93% query cover (total score 998, E-value 0.0) to Lactococcus phage 49801 (KX160205.1) and 94.21% identity (93% query cover, total score 854, E-value 0.0) to Bacteriophage Tuc2009 (AF109874.2)

Protein homology search in the UniProt KB databases:

| Alignment overview | E-value | Score | Ident. |
|---|---|---|---|
| A0A1P8BKH4_9CAUD - Baseplate protein - Lactococcus phag... | 7.7e-131 | 973 | 100.0% |
| Q9AYV5_BPTU2 - Minor structural protein 4 - Lactococcus phag... | 1.6e-125 | 938 | 95.1% |

Supplementary 3. Illumina MiSeq sequencing of 15(Mo9)

Assembly. Three assembly programs (ABySS, SPAdes and Velvet) were run with a range of settings, and the resulting assemblies were compared using several summary statistics. The ABySS assembly with k=95 had the equal-lowest number of pieces (3 versus 20 for SPAdes defined and 21 for SPAdes default), the highest maximum length (14,630 versus 11,073 for SPAdes defined and 12,109 for SPAdes default) and the highest N50 (14,627 versus 5,121 for SPAdes defined and 3,618 for SPAdes default). The SPAdes assemblies had a considerably longer total length than the ABySS assembly (66,103 for defined and 66,415 for default versus 34,436 for ABySS k=95). Comparison details of the assemblies generated with ABySS, SPAdes and Velvet are presented in Figures 1 and 2.
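For readers unfamiliar with the N50 statistic used to compare the assemblies, the sketch below shows one common way of computing it: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The contig lengths in the example are toy values chosen for illustration only; they are not the contig lengths of the 15(Mo9) assemblies, and the function name is hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length of the contig at which the cumulative length,
    counted from the longest contig downwards, first reaches at least
    half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


# Toy example (not thesis data): total = 28, half = 14.
# Cumulative sums are 10, 18, ...; the threshold is reached at the 8-unit contig.
print(n50([10, 8, 5, 3, 2]))
```

Because N50 depends on the total assembly length, it is best read alongside the number of pieces and the total length, as in the comparison above.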

Figure 1. Phage 15(Mo9) assembly statistics

Figure 2. Contig lengths of different assemblies

Annotation. The annotation of the 15(Mo9) genome assembled with ABySS produced 48 ORFs, of which 16 were duplicates. The genome sequences assembled with SPAdes (both defined and default) generated 109/110 gene products when annotated, with just 19 having a known function (see Table 1).

Table 1. The SPAdes default assembly had 110 gene annotations, 19 of which corresponded to recognized functional proteins (listed below)

| locus_tag | ftype | length_bp | gene | EC_number | COG | product |
|---|---|---|---|---|---|---|
| GOMGFBDM_00017 | CDS | 360 | | | | SSB protein |
| GOMGFBDM_00018 | CDS | 216 | | | | Gene product 38 |
| GOMGFBDM_00027 | CDS | 309 | | | | SaV protein |
| GOMGFBDM_00029 | CDS | 189 | | | | SaV protein |
| GOMGFBDM_00037 | CDS | 339 | | | | putative head completion protein 2 |
| GOMGFBDM_00038 | CDS | 360 | | | | putative tail terminator protein |
| GOMGFBDM_00039 | CDS | 699 | | | | Tail tube protein |
| GOMGFBDM_00040 | CDS | 276 | | | | Chaperone protein gp12 |
| GOMGFBDM_00042 | CDS | 2751 | | | | putative tape measure protein |
| GOMGFBDM_00043 | CDS | 897 | | | | Distal tail protein |
| GOMGFBDM_00044 | CDS | 1128 | | | | Baseplate protein gp16 |
| GOMGFBDM_00046 | CDS | 804 | | | | Receptor binding protein |
| GOMGFBDM_00051 | CDS | 627 | | 3.2.1.17 | | Endolysin |
| GOMGFBDM_00059 | CDS | 369 | | | | SaV protein |
| GOMGFBDM_00064 | CDS | 372 | | | | SSB protein |
| GOMGFBDM_00086 | CDS | 324 | | | | Gene product 19 |
| GOMGFBDM_00090 | CDS | 315 | | | | putative head completion protein 1 |
| GOMGFBDM_00092 | CDS | 1050 | | | | Capsid protein |
| GOMGFBDM_00097 | CDS | 927 | | | | putative portal protein |

Further inspection of these genes revealed that, in addition to proteins matching those from the reference genome of Lactococcus phage c2, a large number of both functional and hypothetical proteins were associated with the Lactococcus phage 936 genome. This was considered erroneous, so the 15(Mo9) genome was subsequently re-assembled with ABySS and the scaffolds were mapped onto the reference Lactococcus phage c2 genome (see Figure 3).

Supplementary 4. The annotated products of 15(Mo9) and their similarity with known putative proteins

Table 2. General features of the putative CDSs of the annotated 15(Mo9) genome and summary of homology searches

For each CDS, the first row gives the closest match by functional similarity and, where listed, the second row gives the Lactococcus phage c2 homolog.

| CDS | Coordinates (start-stop codon) | Size (aa) | G+C (%) | Closest match / Lactococcus phage c2 homolog | E-value | Score | Ident. |
|---|---|---|---|---|---|---|---|
| 1 | 253(ATG) - 627(TAA) | 124 | 33.33 | Uncharacterized protein (Lactococcus phage 62606) A0A2Z2RV90 | 8.3e-82 | 623 | 93.5% |
| | | | | E21 protein (Lactococcus phage c2) Q38276 | 1.5e-77 | 595 | 89.5% |
| 2 | 649(ATG) - 951(TAA) | 100 | 33 | Uncharacterized protein (Lactococcus phage 62606) A0A2Z2RUN5 | 7.3e-67 | 520 | 99.0% |
| | | | | E20 protein (Lactococcus phage c2) Q38277 | 2.4e-63 | 497 | 96.0% |
| 3 | 944(ATG) - 1267(TAA) | 107 | 33.64 | Replication protein (Lactococcus phage 50504) A0A2Z2RVF8 | 2.7e-70 | 544 | 98.1% |
| | | | | Gene product 19^R (Lactococcus phage c2) Q38278 | 8.6e-67 | 521 | 93.5% |
| 4 | 1269(ATG) - 1769(TAA) | 166 | 30.54 | Uncharacterized protein (Lactococcus phage 50504) A0A2Z2RUV6 | 2.7e-105 | 786 | 91.6% |
| | | | | E18 protein (Lactococcus phage c2) Q38279 | 3.7e-87 | 667 | 77.7% |
| 5 | 1766(ATG) - 1930(TGA) | 54 | 33.33 | Uncharacterised protein (Lactococcus phage 62606) A0A2Z2RWL6 | 4.9e-37 | 294 | 100.0% |
| | | | | E17 protein (Lactococcus phage c2) Q38280 | 1.6e-34 | 279 | 94.4% |
| 6 | 1943(ATG) - 2314(TAA) | 123 | 36.02 | Single stranded DNA binding protein (SSB) (Lactococcus phage 50504) A0A2Z2RXP0 | 2.9e-86 | 652 | 98.4% |
| | | | | E16 protein (Lactococcus phage c2) Q38281 | 2.1e-66 | 521 | 83.7% |
| 7 | 2380(ATG) - 2919(TAA) | 179 | 36.3 | Recombination protein (Lactococcus phage 62606) A0A2Z2RSI9 | 1.6e-126 | 928 | 100.0% |
| | | | | Recombination protein (Lactococcus phage c2) Q38282 | 2.6e-125 | 920 | 99.4% |
| 8 | 2919(ATG) - 3071(TAA) | 50 | 29.41 | E14 protein (Lactococcus phage c2) Q38283 | 9.7e-35 | 279 | 100.0% |
| 9 | 3065(ATG) - 3316(TGA) | 83 | 31.75 | E13 protein (Lactococcus phage c2) Q38284 | 6.5e-68 | 488 | 100.0% |
| 10 | 3288(ATG) - 3470(TAA) | 60 | 34.43 | Transcription regulator (Lactococcus phage c2) Q38285 | 9.4e-46 | 348 | 100.0% |
| 11 | 3448(ATG) - 3816(TAA) | 122 | 36.31 | E11 protein (Lactococcus phage c2) Q38286 | 9.4e-78 | 596 | 90.2% |
| 12 | 3943(ATG) - 4332(TGA) | 129 | 35.13 | E8 protein (Lactococcus phage c2) Q38289 | 1.1e-93 | 702 | 97.7% |
| 13 | 4332(ATG) - 4712(TAA) | 126 | 31.5 | DNA polymerase subunit (Lactococcus phage 50504) A0A2Z2S094 | 3.8e-79 | 606 | 97.6% |
| | | | | DNA polymerase subunit (Lactococcus phage c2) Q38290 | 4.2e-76 | 586 | 94.4% |
| 14 | 4786(ATG) - 5685(TAA) | 299 | 37 | DNA primase/helicase (Lactococcus phage 50504) A0A2Z2RXP2 | 0.0 | 1,509 | 97.0% |
| | | | | E5 protein (Lactococcus phage c2) Q38292 | 0.0 | 1,488 | 96.0% |
| 15 | 5687(ATG) - 5887(TAA) | 66 | 25.87 | Uncharacterized protein (Lactococcus phage bIL67) Q38245 | 6.4e-43 | 333 | 90.8% |
| | | | | E4 protein (Lactococcus phage c2) Q38293 | 4.5e-38 | 304 | 80.3% |
| 16 | 5992(ATG) - 6132(TGA) | 46 | 32.62 | E1 protein (Lactococcus phage c2) Q38295 | 5.1e-41 | 332 | 100.0% |
| 17 | 6749(ATG) - 6898(TAA) | 49 | 34 | L1 protein (Lactococcus phage c2) Q38296 | 4.1e-35 | 297 | 91.8% |
| 18 | 6909(ATG) - 7394(TAG) | 161 | 35.6 | Holin (Lactococcus phage 50504) A0A2Z2RXP7 | 7.8e-116 | 854 | 100.0% |
| | | | | Holin (Lactococcus phage c2) Q38297 | 1.2e-110 | 820 | 95.7% |
| 19 | 7492(TTG) - 8118(TAA) | 208 | 38.76 | Lysin (Lactococcus phage 62606) A0A2Z2RUP4 | 2.2e-152 | 1,105 | 98.6% |
| | | | | Endolysin^R (Lactococcus phage c2) P62692 | 1.5e-131 | 969 | 86.1% |
| 20 | 8129(ATG) - 8968(TGA) | 279 | 38.69 | Capsid and scaffold (Lactococcus phage M6165) A0A192YAY1 | 0.0 | 1,405 | 97.5% |
| | | | | Minor structural protein (Lactococcus phage c2) Q38299 | 0.0 | 1,386 | 96.1% |
| 21 | 8919(TTG) - 10343(TAA) | 474 | 40.63 | Uncharacterized protein (Lactococcus phage bIL67) Q38239 | 0.0 | 2,355 | 97.7% |
| | | | | Major capsid protein^R (Lactococcus phage c2) Q38300 | 0.0 | 2,328 | 96.2% |
| 22 | 10391(ATG) - 11095(TAG) | 234 | 37.3 | Uncharacterized protein (Lactococcus phage M6165) A0A192YD51 | 1.1e-158 | 1,151 | 94.0% |
| | | | | L6 protein (Lactococcus phage c2) Q38301 | 1.6e-147 | 1,078 | 88.0% |
| 23 | 11113(ATG) - 11730(TAA) | 205 | 42.23 | Major tail shaft protein (Lactococcus phage c2) Q38302 | 2.2e-123 | 913 | 84.9% |
| 24 | 11801(ATG) - 12055(TAA) | 84 | 35.29 | Uncharacterized protein (Lactococcus phage bIL67) Q38236 | 3.5e-64 | 466 | 100.0% |
| | | | | L8 protein (Lactococcus phage c2) Q38303 | 3.6e-63 | 460 | 97.6% |
| 25 | 12097(ATG) - 12270(TAG) | 57 | 40.23 | Structural protein 3 (Lactococcus phage 50102) A0A2Z2S0B1 | 1.6e-39 | 310 | 96.6% |
| 26 | 12243(ATG) - 12644(TAA) | 133 | 35.32 | Structural protein 4 (Lactococcus phage 62403) A0A2Z2S062 | 1.8e-85 | 649 | 94.0% |
| | | | | L9 protein (Lactococcus phage c2) Q38304 | 1.5e-75 | 584 | 82.8% |
| 27 | 12637(ATG) - 14757(TGA) | 706 | 40.74 | Uncharacterized protein Lactococcus lactis A0A2X0SS51 | 0.0 | 3,344 | 93.3% |
| | | | | Probable tape measure protein^R (Lactococcus phage c2) Q38305 | 0.0 | 3,145 | 87.3% |
| 28 | 14754(TTG) - 15041(TAA) | 95 | 35.07 | HNH endonuclease (Lactococcus phage 62402) A0A2Z2RXI3 | 5.9e-66 | 513 | 95.8% |
| | | | | L11 protein (Lactococcus phage c2) Q38306 | 6.9e-65 | 506 | 94.7% |
| 29 | 15054(ATG) - 16610(TAA) | 518 | 36.42 | Terminase subunit (Lactococcus phage bIL67) Q38233 | 0.0 | 2,676 | 98.3% |
| | | | | Terminase (Lactococcus phage c2) Q38307 | 0.0 | 2,652 | 96.9% |
| 30 | 16610(ATG) - 16909(TGA) | 99 | 36 | Uncharacterized protein (Lactococcus phage bIL67) Q38232 | 1.5e-61 | 485 | 100.0% |
| | | | | L13 protein (Lactococcus phage c2) Q38308 | 8.6e-61 | 480 | 99.0% |
| 31 | 16906(ATG) - 18822(TGA) | 638 | 37.61 | Uncharacterized protein (Lactococcus phage M5938) A0A192YC83 | 0.0 | 3,385 | 97.0% |
| | | | | L14 protein (Lactococcus phage c2) Q38309 | 0.0 | 3,235 | 92.8% |
| 32 | 18819(ATG) - 19328(TAG) | 169 | 34.31 | Adhesion protein B (Lactococcus phage 50102) A0A2Z2RWF0 | 1.6e-112 | 834 | 93.5% |
| | | | | Minor structural protein (Lactococcus phage c2) Q38310 | 1.6e-42 | 390 | 70.2% |
| 33 | 19321(ATG) - 20577(TAA) | 418 | 35.24 | Carbohydrate binding domain protein (Lc. phage 62606) A0A2Z2RY15 | 0.0 | 2,266 | 97.4% |
| | | | | Minor structural protein (Lc. phage c2) Q38311 | 0.0 | 1,839 | 79.3% |
| 34 | 20600(ATG) - 20890(TAG) | 96 | 32.30 | Holin (Lactococcus phage bIL67) Q38228 | 6.7e-55 | 441 | 93.8% |
| | | | | Holin (Lactococcus phage c2) Q38312 | 6.3e-53 | 428 | 90.6% |
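The "Size (aa)" column of Table 2 is related to the CDS coordinates in a simple way: the coordinates run from the first base of the start codon to the last base of the stop codon, so the protein length is the number of codons minus one (the stop codon is not translated). The sketch below illustrates this for a few rows of the table; the function name is hypothetical, and the check assumes 1-based inclusive coordinates on the forward strand.

```python
def protein_length_from_cds(start: int, stop: int) -> int:
    """Protein length (aa) for a CDS given 1-based inclusive coordinates.

    `start` is the first base of the start codon and `stop` is the last
    base of the stop codon; the stop codon itself is not translated.
    """
    n_bases = stop - start + 1
    if n_bases < 6 or n_bases % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3 and include start and stop codons")
    return n_bases // 3 - 1


# Coordinates taken from Table 2 (CDS 1, 19 and 27).
for cds, (start, stop) in {1: (253, 627), 19: (7492, 8118), 27: (12637, 14757)}.items():
    print(cds, protein_length_from_cds(start, stop))
```

For CDS 27, the same coordinates give a 2121 bp gene, which matches the expected product size listed for the Øc2 tape measure protein gene in Table S6.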
