
THE ROLES OF MORON GENES IN THE ENTEROBACTERIA PHAGE PHI-80

Yury V. Ivanov

A Dissertation

Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

December 2012

Committee:

Ray A. Larsen, Advisor

Craig L. Zirbel Graduate Faculty Representative

Vipa Phuntumart

Scott O. Rogers

George S. Bullerjahn

© 2012

Yury Ivanov

All Rights Reserved

ABSTRACT

Ray Larsen, Advisor

The TonB system couples cytoplasmic membrane-derived proton motive force

energy to drive ferric siderophore transport across the outer membrane of Gram-negative

bacteria. While much effort has focused on this process, how energy is harnessed to

provide for transport of ligands remains unknown. Several bacterial viruses (“phage”) are

known to require the TonB system to irreversibly adsorb (i.e., establish infection) in Escherichia coli. One such phage is φ80, a “cousin” of the model temperate phage λ. Determining how φ80 uses the TonB system for infection should provide novel insights into the mechanisms of TonB-dependent processes. It had long been known that recombination between λ and φ80 results in a λ-like phage for which TonB is required, and that this recombination involves the λ J gene, which encodes the tail-spike

protein required for irreversible adsorption of λ to E. coli. Thus, we suspected that a φ80

homologue of the λ J gene product was responsible for the TonB dependence of

φ80. While φ80 has long served as a tool for assaying TonB activity, it has not received

the scrutiny afforded λ. Indeed, only a small portion of its genome (not including a J gene

homolog) was known. To facilitate the use of φ80 as a molecular tool, we determined

and annotated the full genome of the φ80vir strain commonly used for TonB studies. The

46,285 bp φ80vir genome contained 71 predicted open reading frames, in which the

structural genes showed strong synteny with other known lambdoid viruses, with an

overall degree of mosaicism indicative of multiple genetic exchanges with other viruses.

There was also evidence of unique gene acquisition in the form of morons, one of which,

termed cor, when cloned and expressed in E. coli, blocked TonB-dependent transport of ferric siderophore and irreversible adsorption of φ80vir to the TonB-dependent outer membrane protein FhuA. These and other findings reported in this dissertation provide a foundation for the use of φ80vir as a tool to dissect the mechanisms of TonB-dependent energy transduction.

Dedicated to my grandfather, Yury Ivanov, who wanted me to become a student, right

after I was born. Unfortunately, I cannot tell you that I decided to go even beyond

ACKNOWLEDGMENTS

My career as a scientist would not have been possible without the many individuals who played very important roles. But before giving them my regards, I would like to acknowledge Bowling Green State University for opening new horizons for me.

Next, I would like to acknowledge members of my committee: Drs. Ray Larsen,

Scott Rogers, George Bullerjahn, Vipa Phuntumart, and Craig Zirbel. Dr. Larsen: I admire all your input, patience, and guidance that you gave me; I was honored to be your

Ph.D. student and I will not let you down; even now, I am still learning from you.

Dr. Rogers: you are the reason why I enjoy doing bioinformatics, genomics, phylogeny, and evolution; I will never forget our discussions during the fermentation seminar.

Dr. Bullerjahn: you were on my admissions committee and I thank you for giving me this opportunity to become a Doctor of Philosophy. Dr. Phuntumart: you are both a great scientist and a charming person; thank you for all your advice. Dr. Zirbel: I wish I could know you better; I am inspired by the research that you, as a mathematician, are doing in collaboration with Dr. Leontis.

Last but not least, I would like to thank my loving family for all their support and patience. To my parents, Victor Ivanov and Alla Ivanova: I owe you everything in my life; I am fortunate to be your son. Maria Ivanova: I cannot wait to see you again, grandmother; thank you for everything you have done for me. Special thanks to my brother, Evgeniy Ivanov: we are so different and yet are very good friends. My aunt,

Valentina Ivanova: I always feel your support. I would like to acknowledge my support team of friends and colleagues: Andrew Fedorov, Maria Kozelkova, Alex Orlov,

Maribeth Spangler, Vadim Solovyov, Alex Goryaynov, Adam Boulton, Tim South, Pavel

Borisov, Emil Khisamutdinov, Kate Butler, Matthew Moreau, and many others. Thanks to Yury Shtarkman and Alex Bludin who contributed directly to my work and are my very best friends. Finally, special thanks to BGSU faculty: Drs. Tami Steveson, Paul Morris, and

Carmen Fioravanti – you all helped me in various ways and I do remember it.

“Never say never” – I refused to believe that I would ever become a Ph.D. when I was in high school; you all proved me wrong.

TABLE OF CONTENTS

CHAPTER I. A REVIEW OF THE LITERATURE
    Tailed Bacteriophages
    Membrane Energetics and Iron Transport Systems in Bacteria
    Iron and Iron Transport in Gram-Negative Bacteria
    Proton Motive Force and TonB-dependent Transporters

CHAPTER II. ENTEROBACTERIOPHAGE PHI-80
    Introduction
    Materials and Methods
        Phage Phi-80 Growth
        DNA Extraction
        Restriction Digestion Map
        Genome Cloning and DNA Library Construction
        Annotation of Genes
        Comparative Genomics
        Electron Microscopy
        Atomic Force Microscopy
    Results and Discussion
        Overall Features of φ80vir Chromosome
        The Left Arm
        Head Genes
        Tail Genes
        Conserved -1 Translational Frameshift
        The Central Region
        Phage Attachment Site
        The Right Arm
        Immunity Region
        Phage Morphology

CHAPTER III. GENES WITH MORON FEATURES
    Introduction
    Materials and Methods
        Identification of Signals
        G+C Skew
    Results and Discussion
        Genes with Moron Features, or Morons

CHAPTER IV. CONVERSION RESISTANT LIPOPROTEIN COR
    Introduction
        Bacterial Lipoproteins
    Materials and Methods
        Bacterial Strains and Construction of
        Colicin and Phage Spot-Titer Assays
        [⁵⁵Fe³⁺]ferrichrome Transport
        Multiple Sequence Alignment and Phylogenetic Analysis of Cor Protein
    Results and Discussion
        Lytic and Lysogenic Conversion: Llp and Cor Lipoproteins

CHAPTER V. CONCLUSIONS

REFERENCES

LIST OF FIGURES/TABLES

Figure/Table

1 Figure 1. TonB-dependent outer membrane receptor, FhuA, and components of the TonB energy transduction system
2 Figure 2. Genetic map of φ80vir chromosome
3 Table 1. Annotation of the φ80vir chromosome
4 Figure 3. Annotation of the φ80vir chromosome
5 Table 2. Percent identity of φ80 “head” proteins to those in λ and N15 phage
6 Figure 4. Morphology of φ80vir
7 Figure 5. Comparison of intergenic regions between genes encoding for a head decoration protein D and major protein E in φ80vir, N15, and λ
8 Figure 6. The guanine + cytosine content analysis of the left arm of φ80vir phage
9 Table 3. Predicted σ70 promoters of the enterobacterial phage φ80vir
10 Table 4. Predicted ρ-independent (stem-loop) transcription terminators of the enterobacterial phage φ80vir
11 Figure 7. Predicted σ70 promoters of the enterobacterial phage φ80vir
12 Figure 8. Polymerase Chain Reaction (PCR) cloning
13 Figure 9. Multiple sequence alignment of cor and llp xenologous genes translated into protein sequences
14 Figure 10. Multiple nucleotide sequence alignment corresponding to Cor and Llp mature lipoproteins
15 Figure 11. Maximum Parsimony (MP) analysis of mature Cor and Llp lipoproteins aligned with MUSCLE
16 Figure 12. Alternative multiple sequence alignment of mature proteins cor and llp
17 Figure 13. Maximum Parsimony (MP) analysis of mature Cor and Llp lipoproteins aligned with MAFFT
18 Figure 14. Summary of computational analyses of protein Cor
19 Figure 15. [⁵⁵Fe³⁺]ferrichrome transport
20 Table 5. Phage and colicin spot-titer assays

CHAPTER I. A REVIEW OF THE LITERATURE

Tailed Bacteriophages

Bacteriophages, or “phages” for short, were discovered at the beginning of the 20th century.

The initial observation occurred in 1896 when the British bacteriologist Ernest Hanbury Hankin noticed that the sewage from the Ganges and Jumna rivers in India contained an unknown agent that had the ability to kill the Vibrio cholerae bacteria responsible for cholera. He also found that this agent was heat labile and able to pass through a fine porcelain filter (Hankin E.H., 1896). A similar phenomenon was observed two years later in Bacillus subtilis by the Russian bacteriologist

Gamaleya (Samsygina and Boni, 1984). About 20 years after Hankin’s observation, another

British pathologist—Frederick William Twort—proposed an antimicrobial filterable agent based on the apparent glassy transformation of Micrococcus colonies. Independently, in 1917, the

French-Canadian microbiologist Félix Hubert d’Hérelle, working at the Institut Pasteur in Paris, discovered an “ultra virus” that infected Shigella bacteria and multiplied within them (d’Hérelle,

1918). This “ultra virus” occurred in filtered bacteria-free stool samples of French troops that were recovering from dysentery. d’Hérelle coined the term “bacteriophage” (literally “bacteria eater”) to describe this filterable agent. His hypothesis, that a phage is a bacterial virus, was not widely accepted by the scientific community until the 1940s.

Four of the six virus groups in the Baltimore classification scheme include agents that infect bacteria: double-stranded DNA viruses, single-stranded DNA viruses, double-stranded

RNA viruses, and positive-sense single-stranded RNA viruses (Ackermann, 2006; Canchaya et al., 2007). With most, if not all, eubacteria and archaea serving as hosts for multiple bacteriophages, it has become apparent that these agents are the most abundant known biological entities in the biosphere (Hendrix et al., 2003). Direct quantification of bacteriophage virions in coastal seawaters where their bacterial and archaeal hosts reside yields values of ~10⁸

bacteriophage particles per cm³ (Wommack and Colwell, 2000). On this basis, the global viral

population was estimated at >10³¹ individuals (Hendrix, 2003). The majority of research emphasis has focused on the double-stranded DNA group Caudovirales – the tailed viruses. Of the over 750 unique sequenced phage genomes, two thirds are from the Caudovirales (Hatfull and Hendrix, 2011). The emphasis on this group of viruses in part reflects their relative abundance in bacteria that have commensal or pathogenic relationships with humans and their domestic animals, and their development as model systems for fundamental biological processes, including: the nature of genetic information and mutation, mechanisms of gene regulation, protein folding and self-assembly, population dynamics, and questions regarding the origins of all viruses.

Early studies regarding the nature of bacteriophages and their establishment as model genetic systems focused on a set of bacteriophages that used Escherichia coli as a host. These

“coliphage” were chosen by a consortium led by Alfred Hershey, Max Delbrück and Salvador

Luria and termed “T-phages”. Specific coliphage were chosen for study because their E. coli

hosts did not develop resistance to infection, thus “growth” studies were straightforward. The T-

phages played a central role in the development of bacteriophage (for a historical

review, see Judson, 1996). Other phage could induce a state of “immunity” in their E. coli hosts

– these immune hosts could induce lysis of other, non-immune E. coli hosts; thus, such strains

were termed “lysogens”. Studies of one such lysogenic bacteriophage, lambda (λ)—discovered

by Esther Lederberg in the early 1950s—led to the recognition that immunity involved the ability of

the phage to establish a cryptic, non-lytic infection. The lysis of non-immune E. coli hosts

resulting from co-incubation with lysogenic E. coli strains represented the occasional production of new phage from the “activation” of the cryptic phage genome. For the phage λ, lysogeny involved the integration of the bacteriophage genome into the E. coli chromosome. This ability to integrate into (and subsequently excise from) the host genome provided for the transfer of genes adjacent to the phage integration sites between E. coli strains. This process of specialized

transduction drove the development of λ-phage as a model system (Judson, 1996; Ptashne,

2004).

While the bacteriophage λ was only isolated once from the environment (Gottesman and

Weisberg, 2004), numerous similar “lambdoid” phages are known (Hendrix and Casjens, 2006).

Probably as a result of biased sampling, the majority of phages isolated from fresh mammal feces

and natural habitats tend to be lambdoid phages that infect Escherichia coli and other Gram-

negative enterics (Casjens, 2008; Dhillon et al., 1976; Dhillon et al., 1980; Kameyama et al.,

1999). One such coliphage, called phi-80 (φ80)—isolated by Matsushiro in the early 1960s

(Matsushiro, 1961)—is the focus of the present work. While originally used for specialized transduction in the same fashion as λ phage (Matsushiro, 1963), phage φ80 was found to exploit the TonB-dependent siderophore transport system of E. coli to gain entry to the host cell

(Hancock and Braun, 1976). This host range specificity has made phage φ80 a valuable tool for exploring the energy-dependent mechanisms of siderophore transport in E. coli (Larsen et al.,

2003).

Membrane Energetics and Iron Transport Systems in Bacteria

The Gram-negative bacterial cell envelope is a dual membrane system, consisting of a protein-rich, energized cytoplasmic membrane (CM) and an unenergized outer membrane (OM), separated by an aqueous compartment termed the periplasmic space, wherein a thin layer of peptidoglycan confers shape and rigidity to the cell. Whereas the CM is a standard phospholipid bilayer, the OM has an unusual architecture, with an external surface of lipopolysaccharide

(LPS), an internal phospholipid face, and a variety of integral and lipid-anchored proteins. The major function of the OM is to serve as a protective barrier against many deleterious agents, including hydrophobic , detergents, and components of host defense systems. This function reflects the ability of LPS to form a permeability barrier to hydrophobic compounds

(such as lipophilic toxins) and large aqueous molecules (such as lysozyme), while certain integral membrane proteins termed “porins” provide aqueous channels that allow for the diffusion of smaller (<600 Da) hydrophilic nutrients (Nikaido, 2003; Crosa et al., 2004). Thus, the OM functions as a diffusion barrier, allowing passive diffusion of small hydrophilic molecules, while excluding both larger hydrophilic molecules as well as many hydrophobic molecules with high relative solubility in conventional phospholipid bilayers. Conversely, the

CM is a standard phospholipid bilayer that provides a permeability barrier that limits the passive diffusion of ions, maintaining electrochemical proton and sodium ion gradients that play major roles in driving ATP synthesis, solute transport, and other membrane associated activities

(White, 2007). The transport of some non-lipid soluble molecules across this barrier occurs by facilitated diffusion through gated channels (uniporters). However, the majority of transport is an energy-dependent process mediated by dedicated integral membrane proteins generally termed “transporters”. These transporters facilitate the energy-dependent flux of solutes across the membrane.

Bacterial CM active transport systems can be divided into primary transport and secondary transport, depending on how they are coupled to the energy source. Primary transport is directly coupled to some energy-generating reactions (e.g., ATP or PEP hydrolysis), whereas secondary transport is associated with an ion-electrochemical gradient. Active transport is a subclass of primary transport in which the solute stays unmodified (e.g., the ATP-dependent

uptake of histidine). Secondary transport systems couple electrochemical ion gradients to the

vectorial transport of various solutes including sugars, amino acids (aa), and ions across the

membrane (White, 2007). For these systems “symport” refers to solute uptake in which two or

more solutes are transported on the carrier in the same direction; whereas antiport refers to the

coupled translocation of two or more solutes in opposite directions.

Iron and Iron Transport in Gram-Negative Bacteria

Iron plays an essential role in bacterial homeostasis. In oxidizing environments iron

occurs primarily in the form of insoluble hydroxides, Fe(OH)₃, and oxide hydrate complexes

(Fe₂O₃ × nH₂O). Under such conditions and at a neutral pH, [Fe³⁺] is limited to 10⁻¹⁸ M in an aqueous environment, and to as little as 10⁻²⁴ M in mammalian serum (Raymond et al., 2003). These values

are below the cell requirement limit. When concentrations of iron are limited, many bacteria

(and fungi) produce Fe(III)-chelators, termed “siderophores”, that specifically bind, solubilize,

and deliver ferric ions to the cell (Crosa et al., 2004). Siderophores have exquisitely high

(subnanomolar) binding affinities for ferric ion (Braun and Hantke, 2011); for example, the E. coli siderophore enterochelin (also called “enterobactin”) forms an Fe(III)-enterochelin complex with a formal stability constant estimated at 10⁴⁹ (Loomis and Raymond, 1991). Fe(III)-

siderophore complexes are readily taken across the cytoplasmic membrane by dedicated ABC

transporter systems, a primary transport process driven by the hydrolysis of ATP. For Gram-

negative bacteria, however, the process is complicated by the diffusion barrier nature of the OM,

as most Fe(III)-siderophore complexes exceed the 600 Da limit for a passive diffusion through

the aqueous channels provided by porins. To capture these important complexes and transport

them across the OM barrier, Gram-negative bacteria have evolved a variety of OM proteins—collectively termed “TonB-dependent OM transporters” (TBDT) (Schauer et al., 2008)—that

provide for the active transport of Fe(III)-siderophores and other molecules into the periplasmic

space. The ability to recover ferric iron seems to provide a competitive advantage for many

bacteria (Hibbing et al., 2010), with much investment placed in its recovery. For example: while

Escherichia coli makes only one siderophore (enterochelin), wild E. coli cells express up to eight

different siderophore receptors, one for enterochelin, and seven others for Fe(III) in complex

with a variety of other siderophores made by other bacterial and fungal species (Noinaj et al.,

2010). Similarly, the genome of the soil bacterium Pseudomonas protegens encodes 45 different

TBDTs predicted to provide for the uptake of a wide range of ferric-coupled siderophore

complexes (Hartney et al., 2011).

Active transport requires energy, yet unlike the CM, the OM is spatially removed from

energy sources. Early studies found two requirements for active transport across the OM; the

presence of a cytoplasmic membrane protein—called TonB—and an intact electrochemical gradient across the CM (Hancock and Braun, 1976; Reynolds et al., 1980). Subsequent studies

strongly suggest that it is the electrochemical potential gradient of the CM proton gradient

(proton motive force: $\Delta\tilde{\mu}_{\mathrm{H}^+}$) that provides the energy (Bradbeer, 1983). This energy appears to

be harnessed by multimeric complexes of ExbB and ExbD proteins and coupled by TonB protein

(Fig. 1), the per-cell ratio of which is 7:2:1, respectively; whereas the exact TonB-ExbB-ExbD

complex composition is unknown (Higgs et al., 2002; Gresock et al., 2011). Despite the fact that

both inactive and active forms of TonB protein can associate with TBDTs, only the active form

facilitates transport of Fe(III)-siderophore complexes across the OM and its dissociation from the

transporter into the periplasm. Notably, some specific TBDTs mediate the transport of other

ligands: in E. coli these include cobalamin (Bassford et al., 1976) and heme (Torres et al., 2001); in other species, TBDTs have been identified for the carbohydrate maltose (Lohmiller et al.,

2008), nickel, and cobalt (Schauer et al., 2007; Stoof et al., 2010). Therefore, it may turn out that

TBDTs mediate many other OM transport functions (Schauer et al., 2008).

Proton Motive Force and TonB-Dependent Transporters

Being a monovalent charged particle, the proton produces an electric potential (ΔΨ)

across a membrane as well as a proton activity gradient when it travels across the membrane. This

sum is called the electrochemical potential gradient, or simply electrochemical energy, $\Delta\tilde{\mu}_{\mathrm{H}^+}$.

Using the Nernst equation for one mole of an ion across a membrane yields an equation for proton motive force (pmf) for the CM:

$$\Delta p\,(\mathrm{V}) = \frac{\Delta\tilde{\mu}_{\mathrm{H}^+}}{F} = \Delta\Psi + \frac{RT}{zF}\,\ln\!\left(\frac{a^{\mathrm{in}}_{\mathrm{H}^+}}{a^{\mathrm{out}}_{\mathrm{H}^+}}\right) \qquad (1)$$

where z is the number of electrons transferred, F is the Faraday constant, R is the universal gas constant, T is the absolute temperature, a is chemical activity, and Δp is pmf.

Using the normal temperature (298 K) and conversion to log10 by multiplying by 2.303,

equation (1) can be expressed in a more concise way, with ΔpH = pH(in) − pH(out):

$$\Delta p\,(\mathrm{mV}) = \Delta\Psi - 59.16\,\Delta\mathrm{pH} \qquad (2)$$

In equation (1), a single electron is being transported at the same time as the proton. In reality,

bacteria utilize both electrogenic electron flow and electrogenic proton flow to produce the

Donnan potential of a membrane (White, 2007).
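As a rough numerical check on equation (2), the short Python sketch below recomputes the conversion factor 2.303RT/F and evaluates the pmf for one set of inputs. The function names and the example values (ΔΨ of −140 mV, ΔpH of 0.5) are illustrative assumptions made for this sketch and are not measurements from this work. The sketch also assumes z = 1 and takes ΔpH as pH(in) − pH(out); the constant 59.16 mV corresponds to 298.15 K, whereas exactly 298 K gives about 59.13 mV.

```python
import math

R = 8.314462618   # universal gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def nernst_factor_mv(temperature_k: float = 298.15, z: int = 1) -> float:
    """Return ln(10) * R * T / (z * F), expressed in millivolts per pH unit."""
    return 1000.0 * math.log(10) * R * temperature_k / (z * F)


def proton_motive_force_mv(delta_psi_mv: float, delta_ph: float,
                           temperature_k: float = 298.15) -> float:
    """Equation (2): pmf = delta_psi - factor * delta_pH (delta_pH = pH_in - pH_out)."""
    return delta_psi_mv - nernst_factor_mv(temperature_k) * delta_ph


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not data from this work).
    print(round(nernst_factor_mv(), 2))
    print(round(proton_motive_force_mv(-140.0, 0.5), 1))
```

With these illustrative inputs the sketch gives a pmf of roughly −170 mV; changing the temperature argument shows how weakly the conversion factor depends on T over physiological ranges.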

The E. coli CM-anchored 239-aa energy transducing protein TonB (~26 kDa) (Postle and

Good, 1983; Gresock et al., 2011) physically interacts with TBDTs (e.g., ferric hydroxamate uptake protein A: FhuA), transferring energy derived from the pmf of the CM via uncharacterized conformational changes to drive the transport of siderophores and other ligands into the periplasm (Postle and Kadner, 2003; Braun and Endriss, 2007; Noinaj et al., 2010). The finding that missense mutations in the tonB gene could suppress mutations in a motif conserved

among TBDTs (termed the “TonB-box”) provided genetic evidence for this physical interaction

(Heller et al., 1988; Schöffler and Braun, 1989; Bell et al., 1990). Further evidence was

suggested by the observations that the over-expression of a TBDT stabilized a normally

proteolytically-labile over-expressed TonB (Günter and Braun, 1990) and the finding that the

introduction of a consensus TonB-box pentapeptide into the periplasmic space inhibited the

transport function of TBDTs (Tuckman and Osburne, 1992). These suggestions were confirmed

by in vivo chemical cross-linking studies that provided direct evidence for interaction between

TonB and TBDTs (Skare et al., 1993; Moeck et al., 1997). Surprisingly, the TBDT FhuA was

found to retain TonB-dependent functions in the absence of its amino-terminal region that includes

the TonB-box and the “plug”, which seals the channel of FhuA from the periplasmic space in the absence

of the active form of TonB (Braun et al., 1999). Further studies suggested that this sensitivity

involved complementation of the deletion by the cork domain of another TBDT (Vakharia and

Postle, 2002). This suggests that it is the conserved features of TBDTs (including the TonB-box)

and not the more divergent features of cork domains that are essential for energization of

transport.

Analyses of crystal structures for the TBDT protein FhuA both alone and in complex

with TonB revealed the dimeric structure of the FhuA and physical evidence for specific

interactions between the amino-terminal TonB-box of FhuA and the carboxy-terminal region of

TonB (Pawelek et al., 2006). Specifically, this study found that the FhuA TonB-box, disordered in the absence of TonB, interacted with a β-sheet region (β3) of the TonB carboxy-terminal

domain by assuming a β-sheet structure stabilized by hydrogen bond linkages with TonB. FhuA TonB box residues Ile-9, Thr-10, Val-11, and Ala-13 strongly reassemble with TonB residues

Val-225, Val-226, Leu-229, and Lys-231, respectively. Group B colicins—plasmid-encoded protein toxins that exploit TonB to facilitate transport by TBDTs—also have amino-terminal

TonB-boxes (Schramm et al., 1987; Mende and Braun, 1990). Furthermore, when the corresponding crystal structure of colicin M in complex with FhuA was solved, it revealed a similar interaction between the FhuA TonB-box and the carboxy-terminal region of TonB (Zeth et al., 2008).

The fact that the group B colicins would contain a motif common to TBDTs essential for their TonB-dependent entry into E. coli suggested to us that TonB-dependent bacteriophage might require a similar motif for irreversible adsorption to E. coli. Two of the phage that utilize the TBDT FhuA, φ80 and T1, clearly require TonB and pmf for irreversible adsorption (Hancock and Braun, 1976). We examined the genome of phage T1 (Roberts et al., 2004), but found no apparent TonB box candidate in any of its tail genes. A similar search was not possible for φ80, as sequence data were not available. With the initial naïve idea that comparison of φ80 sequences against T1 and T5 (Wang et al., 2005), a TonB-independent phage that also uses

FhuA as a receptor (Braun et al., 1973), would reveal common features required for adsorption to

FhuA and unique features involved in TonB-dependence, we set out to sequence the tail genes of

φ80. This undertaking evolved into the determination and annotation of the full genome of the

φ80vir mutant commonly used in TonB phenotype studies (Larsen et al., 2003).

As sequence data became available it also became evident that a canonical TonB box motif was not a feature of φ80, and that more sophisticated analyses involving a larger data set

(i.e., more phage genomes – a project that the Larsen lab is now proceeding with, independent of this study) will likely be required to predict potential motifs that define receptor and TonB specificities. However, with the assembly and annotation of the φ80vir genome one particular

feature relevant to the question at hand stood out – the presence of a gene called “cor”.

A variety of lambdoid phages—including N15, HK022, mEp phages, and our φ80—as well as some non-lambdoid phages—for example UC-1, T1, and T5—require the TBDT protein

FhuA of E. coli as a receptor for adsorption (Braun, 2009; Braun et al., 1973; Kameyama et al.,

1999; Wayne and Neilands, 1975; Luria and Delbrück, 1943; Garen and Puck, 1951; Vostrov et al., 1996). Certain of these bacteriophages carry the cor gene, the product of which appears to be expressed during lysogeny and interacts with the TBDT FhuA to block further infection of the cells by FhuA-dependent phage. It was speculated that the product of this “gene of conversion”,

Cor protein, would also display a TonB-box consensus (Vostrov et al., 1996) based on the similar in vivo inhibition of TonB-dependent processes by the synthetic TonB-box consensus pentapeptide (Tuckman and Osburne, 1992); however, this has not proven to be the case.

Uc-Mass et al. (2004) found that a set of lambdoid mEp phages shares the same phage exclusion mechanism with some λ-like phages: HK022, φ80, and N15. Cells expressing mEp167 cor (Cor+) excluded 13 out of 20 phages from different infection immunity groups in

both tonB- and tonB+ cells. These lambdoid mEp phages were unable to infect fhuA- cells, confirming FhuA as their receptor for adsorption. In addition, the FhuA-mediated uptake of Fe(III)-ferrichrome was inhibited in Cor+ cells (Uc-Mass et al., 2004). It has been estimated that

~85% of heterogeneous lambdoid phages from different

immunity groups are FhuA-dependent, with about half of the phages examined containing a version of the cor

gene (Hernández-Sánchez et al., 2008).

As initially annotated, the φ80 cor gene was incorrect, as the inverse complement was

considered as the open reading frame. In contrast, the llp gene of T5 phage was characterized in greater detail (Braun et al., 1994; Decker et al., 1994; Robichon et al., 2003). Of course, at that

time, the llp gene did not share any homology with the incorrectly annotated locus of the cor gene.

However, the later observations made for Cor protein of lambdoid mEp167 phage (Hernández-

Sánchez et al., 2008) were similar to those made for Llp lipoprotein. My hypothesis, based on a review of the available literature and sequences, was that cor and llp genes are paralogous, and that the cor gene should have its own cis-regulatory elements to be expressed during the lysogenic stage. As detailed in chapters three and four, the cor gene has the hallmarks of a moron, with predicted features that include an independent promoter and terminator, and signature sequences indicative of a lipoprotein that traffics to the outer membrane. Expression of the cor gene product from a plasmid-borne cor gene in the proper orientation is demonstrated to block the ability of the FhuA receptor to support the irreversible adsorption of both φ80 and the

TonB-independent phage T5. Likewise the transport of both colicin M and the ferric siderophore

Fe(III)-ferrichrome were blocked, but not the TonB-dependent processes of other TBDTs. Together, these observations support a model where the Cor protein interacts specifically with FhuA independent of any TonB-dependent interactions made by FhuA, colicin M, or φ80.

Figure 1. TonB-dependent outer membrane receptor, FhuA, and components of the TonB energy

transduction system. The FhuA receptor occurs as a beta-barrel (shown in green), which can be

opened and closed by the N-terminal plug domain, or “cork” (shown in dark blue) to control the

transport of substrates. The TonB-ExbB-ExbD-complex couples the pmf of the CM to FhuA receptor to facilitate transport. Cylinders, within the CM, indicate trans-membrane domains.

Capital letters “C” and “N” correspond to the C- and N-termini, respectively. Ribbons represent the solved crystal structure of FhuA with TonB (Pawelek et al., 2006) and liquid-crystal NMR

structure of C-terminal portion of ExbD protein (Garcia-Herrero et al., 2007), visualized by

Executable build of The PyMOL Molecular Graphics System, Version 1.3, Schrödinger, LLC.

CHAPTER II. ENTEROBACTERIOPHAGE PHI-80

Introduction

Enterobacteria phage phi-80 (φ80) was first isolated from the bacterium Shigella

dysenteriae by Matsushiro (1961). This temperate coliphage belongs to the lambdoid group of the

Siphoviridae family and exhibits features common to this group: a head with cubic symmetry

and a long non-contractile tail (Shinagawa, 1966), the ability to form functional recombinants with

the enterobacteria phage lambda (λ) during coinfections (Singer, 1964), 12-nt long cohesive ends

on a dsDNA chromosome (Yamagishi et al., 1965), and maturation induced by UV light in

E. coli K-12(φ80) lysogens (Matsushiro, 1961).

Historically, φ80 and λ were key players in early studies of lysogeny. Following the

initial observation of λ phage-mediated gal-transduction between E. coli K-12 strains (Morse et

al., 1956), a similar specialized transduction of trp+ genes from tryptophan positive to tryptophan

negative cells of E. coli by φ80 was observed (Matsushiro, 1963). Shortly thereafter, the lac-

transducing system between E. coli K-12 strains using φ80 was developed (Signer and Beckwith,

1966). Later, an integration site of φ80 into the E. coli chromosome was located between supF, galU, and tdk genes on one side, and tonB and trp genes on the other side (27-min region) (Igarashi et al., 1967; Bachmann and Low, 1980); these genes can also be packaged and transferred by φ80 heads during specialized transduction. Fortune seemed to favor bacteriophage

λ: its isolated DNA was found able to transform E. coli (Kaiser and Hodgness, 1960), its sticky ends proved to be the first successfully sequenced DNA (Wu and Taylor, 1971), and ultimately it yielded the first fully sequenced double-stranded DNA genome (Sanger et al.,

1980). Early availability of a sequence led to the harnessing of λ as a major cloning vehicle

(reviewed in Chauthaiwale et al. 1992); and supported the development of λ lysogeny as a 14 dominant model for studies (Ptashne, 2004). In contrast, bacteriophage φ80 languished in the shadow of its more widely studied cousin.

Although φ80 belongs to the same family of phages, it differs from the family prototype

λ in immunity specificity, host range, plaque morphology, prophage location, conversion, thermal stability, buoyant density, and other features (Rybchin, 1984). Furthermore, in contrast to λ phage, φ80 has a relatively narrow host range, adsorbing only onto the B, C and K-12 strains of E. coli and onto the SH strain of Shigella dysenteriae from which it was originally isolated

(Rybchin, 1984). During infection, φ80 specifically adsorbs to FhuA, the OM ferric hydroxamate uptake protein of E. coli (formerly designated TonA and recently reviewed by

Braun, 2009). Adsorption by wild-type λ phage is a two-step process, beginning with low affinity interactions with the OM porin OmpC using side tail fibers (Hendrix and Duda, 1992), followed by high affinity adsorption onto the maltose transporter LamB (Randall-Hazelbauer and

Shwartz, 1973; Shaw et al., 1977) via the tail tip fiber, gpJ. The LamB specific region of the λ gpJ protein maps to the carboxyl-terminal region (Wang et al., 2000). Interestingly, recombination events in E. coli K-12 strains co-infected with both φ80 and λ phages are common and can result in both recombinants with immunity to λ but recognizing FhuA as a receptor

(λh80: Taylor and Yanofsky, 1964) and recombinants with immunity to φ80 but recognizing

LamB as a receptor (φ80h: Franklin et al., 1965). These host-range recombination events map to the gpJ region of λ and a corresponding region of φ80 (Youderian, 1978). Whereas the docking of the λ phage to LamB is sufficient for irreversible adsorption and the initiation of infection, the docking of φ80 phage to FhuA becomes irreversible and infection ensues only if an ion electrochemical potential exists at the CM and if the TonB protein is coupled to that potential

(Hancock and Braun, 1976).

Long before sequencing data were available, genetic studies resulted in genetic maps of a

λ derivative (λimm80), a circular permutation for the prophage (Franklin et al., 1965), and for φ80

and its integrated prophage in E. coli (Rybchin and Singer, 1967). Based on these physical

maps, φ80 and λ were assumed to be very similar phages (Fiandt et al., 1971). Nevertheless, until

the present, sequences for only a few regions of the φ80 genome have been published, and then primarily in the context of comparison with its famous cousin λ. Thus, GenBank has housed a

modest collection of φ80 sequence data: for the φ80 immunity region, including cI and gene N

(GenBank ID: M11919; Tanaka and Matsushiro, 1985), an annotated immunity gene cluster N-oL-4OB-4OA-cI-oR-cro-cII for wild-type φ80 (GenBank ID: X13065; Ogawa et al., 1988), for

the “integrase-excisionase region” responsible for recombination in host range mutant φ80h

phage and containing the att-int-xis genes (GenBank ID: X04051; Leong et al., 1986), and for

the structural gene encoding the major coat protein E (GenBank ID: X06751; Kitao and Nakano,

1988).

Our initial goal was to isolate and characterize the φ80 J gene, with the objective of

identifying regions of the gene that contribute to the TonB dependence of this agent. For this

purpose we used the non-lysogenic φ80vir strain commonly used in our and other labs to assay

TonB function. Our assumption that we would identify a “TonB box” similar to those present in

TBDTs (Weiner, 2005) and colicins that exploit the TonB system (Braun et al., 2002) proved

naïve as our first sequence data were analyzed and no such motif was evident. However, further

examination of the conversion resistant protein gene cor and its role in lysogenic conversion

(Kozyrev et al., 1982) compelled us to continue sequencing. Upon finding evidence suggesting

the presence of moron genes in our data that were distinct from those described in other phage

(Hendrix et al., 2000) and evidence for genetic mosaicism similar to that noted elsewhere (Casjens, 2008), we decided to sequence the entire late region of the φ80 phage, comprising the

“left arm” of the genome and encoding the major structural genes of the phage. As described below, the sequence for the majority of this roughly 24 kb region was solved by a conventional strategy, using Sanger sequencing initially on cloned fragments of the φ80 genome, and subsequently by primer walking on the purified genome itself. Having gotten this far, it seemed only natural to finish the genome, which was then accomplished using the next-generation approach of Ion Torrent sequencing. Analysis of the resultant data confirmed our initial sequencing and resulted in a fully annotated genome for φ80vir that, surprisingly, is clearly distinct from some of the other sequences published for φ80.

Materials and Methods

Phage φ80 Growth

The enterobacteria phage “φ80vir” (from here on simply φ80vir) was acquired from the laboratory of Dr. Kathleen Postle (Pennsylvania State University), having first come into her possession while a graduate student in the 1970’s in the laboratory of William Reznikoff

(University of Wisconsin). The history of the phage prior to this remains unclear. Phage φ80vir was grown similarly to phage λ using the protocol of Sambrook and Russell (2001), as adapted by Cramer (MS Thesis, BGSU 2008). An initial culture of the laboratory wild-type E. coli K-12 strain W3110 (Hill and Harnish, 1981) was grown overnight in 5 ml of the Miller formulation

(Miller, 1972) of lysogeny broth (LB: 450 µl of 10 N NaOH, 10 g of tryptone, 10 g of NaCl, and

5 g of yeast extract per 1 liter of ddH2O) in a 37°C incubator with shaking. The next day, 1.25 ml aliquots of this culture were inoculated into each of three 1-liter Erlenmeyer flasks containing

250 ml of LB medium supplemented with 2.5 ml of 0.5 M CaCl2 and 1.25 ml of 1 M MgSO4. These were incubated with shaking at 37°C for 30 min, then inoculated with 45 µl of a φ80vir

lysate in λ Ca2+ buffer (10 mM Tris-HCl [pH 7.9 at 25ºC], 20 mM MgSO4, 5 mM CaCl2),

containing either 1.8 × 10^5 or 8.0 × 10^3 Plaque Forming Units (PFU), or with λ Ca2+ buffer

alone, the latter to serve as a negative control. Culture densities were periodically monitored by

measuring absorbance at 550 nm in a Spectronic 20 spectrophotometer with a 1.5-cm path

length. After 7 hours of incubation, lysis was evidenced by a dramatic reduction in absorbance

in both of the phage-inoculated cultures (but not the negative control). At this point, 2 ml of

chloroform was added to the flask initially inoculated with 1.8 × 10^5 PFU and mixed with a

stir bar. Cell debris was removed by centrifugation for 15 min at 10,000 × g, with 2 ml of

chloroform then added to the recovered supernatant.

To determine the PFU for this supernatant, serial 10-fold dilutions of samples were made

in λ Ca2+ buffer, mixed with 100 µl aliquots of a fresh overnight W3110 culture grown in LB, suspended in 3 ml of molten (60˚C) T-top agar (10 g tryptone, 8 g NaCl, 7.5 g agar per 1 liter of ddH2O; Miller, 1972), plated on T-plates (10 g tryptone, 8 g NaCl, 15 g agar per 1 liter of ddH2O; Miller, 1972) and incubated overnight at 37˚C. Counts of the resultant plaques indicated

that the bacteriophage lysate contained ~10^11 PFU ml-1 of φ80vir phage, for a total of ~2.5 ×

10^13 PFU. This resultant solution was stored at 4°C.
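The titer arithmetic behind a plaque assay of this kind can be sketched as follows. This is an illustrative calculation only: the plaque count, the dilution, and the volume plated are hypothetical values chosen to reproduce a titer of about 10^11 PFU ml-1, and the function names are not taken from any protocol cited here. Only the 250 ml lysate volume comes from the text above.

```python
def titer_pfu_per_ml(plaque_count, dilution_factor, volume_plated_ml):
    """Return the titer (PFU per ml) of the undiluted lysate.

    dilution_factor is the fraction of the original concentration left in
    the plated sample (e.g. 1e-8 after eight serial 10-fold dilutions).
    """
    if plaque_count < 0 or dilution_factor <= 0 or volume_plated_ml <= 0:
        raise ValueError("count must be >= 0; dilution and volume must be > 0")
    return plaque_count / (dilution_factor * volume_plated_ml)


def total_pfu(titer, lysate_volume_ml):
    """Total plaque-forming units contained in a lysate of the given volume."""
    return titer * lysate_volume_ml


# Hypothetical plate: 100 plaques from 0.1 ml of a 10^-8 dilution.
titer = titer_pfu_per_ml(plaque_count=100, dilution_factor=1e-8, volume_plated_ml=0.1)
print(f"titer ~ {titer:.1e} PFU/ml")
# 250 ml is the culture volume stated in the text.
print(f"total ~ {total_pfu(titer, 250):.1e} PFU in 250 ml")
```

In practice only plates with a countable number of well-separated plaques would be used for the estimate.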

DNA Extraction

The bacteriophage lysate was treated with DNase I and RNase A, each at 0.057 µg ml-1,

for 1 hr at 25°C. DNase I was heat-inactivated for 20 min at 60°C. To concentrate the phage for

DNA extraction, the lysate was adjusted to 1 M NaCl and stirred at 4°C for 120 min, then

centrifuged for 15 min at 10,000 ×g. Following centrifugation the supernatant was treated

overnight with 10% (wt/vol) of polyethylene glycol (PEG 8000) and stirred at 4°C. The lysate was then centrifuged and the precipitate was suspended in SM buffer (100 mM NaCl, 8 mM

MgSO4•H2O, 50 mM Tris-Cl [pH 7.9]), and extracted against an equal volume of chloroform.

To purify the genome, 1 ml of the recovered phage concentrate was supplemented with 40 µl of

500 mM EDTA, 50 µl of 1 mg/ml proteinase K and 50 µl of 10% (wt/vol) SDS and incubated at

60°C for 60 min to release the viral DNA from the nucleocapsid. Following this incubation, the

φ80vir DNA was sequentially extracted against 1 ml of Tris-Cl equilibrated phenol (pH 8), 1 ml

of phenol:chloroform:isoamyl alcohol (25:24:1), and 1 ml of chloroform:isoamyl alcohol (24:1).

The extracted DNA was dialyzed overnight at 4°C against 1 L of TE buffer (10 ml of 1 M Tris-

Cl [pH 7.9], 2 ml of 0.5 M EDTA in 1 liter of ddH2O). Samples of the purified genomic φ80vir

DNA were electrophoresed along with various concentrations of phage λ genome in a 0.7%

(wt/vol) agarose gel to verify the presence and integrity of the DNA (data not shown). The

absorbance of the φ80vir DNA solution was measured on a Beckman Coulter DU50 spectrophotometer (Beckman Instruments, Inc.; Fullerton, CA) at 260 and 280 nm, with the resultant concentration of φ80vir DNA calculated at 460 µg ml-1, with a 260 nm:280 nm ratio

of 1.78.
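A concentration of this kind is conventionally derived from the absorbance at 260 nm using the standard approximation that one A260 unit of double-stranded DNA in a 1-cm path corresponds to about 50 µg ml-1. The sketch below shows that conversion. The absorbance readings and the 50-fold dilution are hypothetical values, picked only so that the result matches the figures reported above; the actual readings are not given in the text.

```python
# Standard approximation for double-stranded DNA in a 1-cm path length.
DSDNA_UG_PER_ML_PER_A260 = 50.0


def dsdna_concentration(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Concentration (ug/ml) of the undiluted dsDNA stock.

    dilution_factor is the fold dilution applied before reading
    (e.g. 50 for a 1:50 dilution).
    """
    return a260 / path_length_cm * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260:A280 ratio, a rough indicator of protein contamination."""
    return a260 / a280


# Hypothetical readings on a 50-fold dilution of the stock.
a260, a280 = 0.184, 0.1034
print(f"{dsdna_concentration(a260, dilution_factor=50):.0f} ug/ml")
print(f"A260:A280 = {purity_ratio(a260, a280):.2f}")
```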

Restriction Digestion Map

The φ80vir genome was subjected to restriction analysis using 40 different restriction

endonucleases (New England Biolabs Inc.; Ipswich, MA) as described by Cramer (MS Thesis,

BGSU 2008). Restriction enzymes were chosen based on their availability and specificity to

palindromic sequences. Each reaction was performed in a total volume of 20 µl, containing

7.5 µl of the 460-µg ml-1 φ80vir genomic DNA, 1 µl of an endonuclease, 2 µl of the

corresponding 10x reaction buffer, and 0.2 µl of a 100x stock of bovine serum albumin (when

necessary). Reaction mixes were incubated at the appropriate temperature for 90 min, after which 5 µl of gel loading buffer was added. The entire digestion was then resolved by

electrophoresis in a 1% agarose gel along with appropriate size standards (2-log ladder: New

England Biolabs Inc.; Ipswich, MA) and visualized with ethidium bromide staining under

ultraviolet illumination.

Genome Cloning and DNA Library Construction

Extracted φ80vir genomic DNA was processed into smaller fragments by limited

restriction with the restriction endonuclease Sau3AI (cuts at 5’-↓GATC-3’). The resulting fragments were then ligated into a BamHI-restricted, dephosphorylated pUC19 plasmid to produce random clones as described by Henry (MS Thesis, BGSU 2008). The resultant constructs were recovered by transformation into NEB 5-α competent E. coli cells and plated on LB agar supplemented with 100 µg/ml ampicillin. Screening by α-complementation was then performed by plating the plasmid-bearing isolates on X-gal/IPTG plates (250 ml of molten LB agar equilibrated at 60˚C, to which 100 µl of 2% (wt/vol) 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal), 100 µl of 20% (wt/vol) isopropyl-β-D-1-thiogalactopyranoside

(IPTG), and 250 µl of 100 mg ml-1 ampicillin were added, then poured in 20 ml aliquots into

standard Petri dishes). Selected white colonies containing putative plasmids bearing φ80vir

DNA inserts were cultured in LB supplemented with 100 µg/ml ampicillin at 37°C for 16-18 hrs,

after which plasmids were recovered by alkaline lysis as described by Sambrook and Russell

(2001). To identify inserted fragments, isolated plasmids were restricted with EcoRI and HindIII

endonucleases, recognizing palindromic sequences that flank the BamHI cloning site in the

polylinker of the pUC19. Digests were resolved on 1% agarose gels, with visualization as

described above (Henry, MS Thesis, BGSU 2008). A total of 590 plasmids were thus screened for inserts. A subset of plasmids bearing inserts greater than ~500 bp in length was chosen for

DNA sequence analysis.
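The fragmentation step can be illustrated in silico. Sau3AI cuts immediately 5’ of every GATC, leaving the same 5’-GATC overhang as BamHI, which is why its fragments ligate into a BamHI-cut vector. The sketch below is a simplified model and not the procedure used in this work: it computes the fragments of a complete digest of a linear molecule, whereas the limited digestion described above cuts only a random subset of these sites. The function name and the example sequence are hypothetical.

```python
def sau3ai_fragments(sequence):
    """Fragment lengths from a complete Sau3AI digest of a linear molecule.

    Sau3AI recognizes GATC and cuts immediately 5' of the G, so each cut
    position equals the index at which a site starts.
    """
    seq = sequence.upper()
    site = "GATC"
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    boundaries = [0] + cuts + [len(seq)]
    # Drop the empty fragment that arises when a site sits at position 0.
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Hypothetical 14-nt example with two GATC sites.
print(sau3ai_fragments("AAGATCTTTGATCC"))
```

A partial digest could be modeled by keeping each cut with some probability before computing the fragment lengths.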

Selected plasmids were sequenced by automated standard Sanger technology using

M13/pUC -20 and -24 reverse primers at The University of Iowa, DNA Core Facility.

Overlapping sequences were used to assemble several contiguous regions (“contigs”) containing sequences with homologies to known structural proteins from other lambdoid bacteriophage.

Gaps between these contigs were resolved by primer walking, using the purified φ80vir genome as a template. From these data a region of 23,204 bp was assembled, with at least two-fold coverage of all bases on both strands.

For next-generation sequencing, a sample of purified φ80vir genomic DNA was submitted to the Genewiz corporation (South Plainfield, NJ), from which a library was generated using emulsion PCR, with ion semiconductor sequencing then performed on an Ion Torrent platform. This run generated 638,547 reads with an average length of 216.8 bp. These data were processed using CLC Genomics Workbench software (CLC Bio, Aarhus, Denmark) on a

Macintosh laptop with a 2.4 GHz Intel Core i5 running Mac OS X (version 10.6.8). Reads were

“trimmed” at the 5’ and 3’ ends such that each read included only bases 15-50 or bases 51-100 of the initial read. These were then pooled and assembled, producing two major contigs. The smaller of the two contigs was 4540-bp long (circular) with an average coverage of 53.2 reads.

This contig was assembled from reads not used in the larger contig. A BLAST search with this sequence identified it as the cryptic E. coli plasmid Sflu5 (AM43197). The larger contig was

46,380 bp long and had assembled with overlap at a region corresponding to the sticky ends of a linear genome. These redundant extensions were resolved to yield the unique 46,285 base pairs representing the φ80vir genome. Average coverage on this contig was 282.8 reads, with no region having fewer than 50 reads. Comparison of this complete contig with the conventionally derived 23,204-bp assembly identified 46 points of conflict, 45 of which were single bases, with

1 involving the removal of 45 additional bases resulting from an error in the editing of the original data. For each of these, the original Sanger sequence data were re-examined, and in each case the initial sequencing was found to be either ambiguous or in error.
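The resolution of redundant contig ends can be sketched as a search for the longest stretch that appears at both the start and the end of the assembled sequence. The two lengths reported above differ by 95 bp (46,380 - 46,285). The code below is a minimal illustration assuming an exact, error-free overlap; it is not the procedure implemented in the assembly software, and the function names and toy sequence are hypothetical.

```python
def terminal_overlap(contig, min_overlap=10):
    """Length of the longest suffix of contig that exactly equals its prefix.

    Returns 0 if no overlap of at least min_overlap bases is found.
    """
    max_len = len(contig) // 2
    for length in range(max_len, min_overlap - 1, -1):
        if contig[:length] == contig[-length:]:
            return length
    return 0


def trim_terminal_redundancy(contig, min_overlap=10):
    """Remove the repeated terminal stretch so each base is represented once."""
    overlap = terminal_overlap(contig, min_overlap)
    return contig[: len(contig) - overlap] if overlap else contig


# Toy example: a 20-nt "genome" whose first 12 nt were assembled again at the end.
unique = "ACGTTGCAAGGCTTACCGTA"
contig = unique + unique[:12]
print(terminal_overlap(contig), trim_terminal_redundancy(contig) == unique)
```

Real read data would call for a tolerant comparison (allowing mismatches) rather than exact string equality.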

Annotation of Genes

We initially predicted open reading frames (ORF) and probable start codons using the

GeneMark program (http://opal.biology.gatech.edu/GeneMark/) (Besemer and Borodovsky,

2005). Subsequent predictions were verified using the DNA Master (version 5.22.1) software, freely available through the J.G. Lawrence laboratory at the University of Pittsburgh

(http://cobamide2.bio.pitt.edu/). Every ORF was thoroughly examined, and adjusted putative start codons were assigned on the basis of their homology with other known orthologous genes of lambdoid and lambda-like viruses.
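To make the notion of an open reading frame concrete, the sketch below performs a naive scan of the three forward frames for stretches running from a start codon to the next in-frame stop codon. This is explicitly not the GeneMark or DNA Master method, both of which rely on statistical models and curated evidence; the choice of start codons, the minimum length, and all names are assumptions made for illustration. The helper `protein_length` shows how the coordinates in Table 1 relate to protein length when the stop codon is included in the coordinates.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons


def forward_orfs(sequence, min_codons=30):
    """Naive ORF scan of the three forward frames.

    Returns (start, stop, frame) tuples with 1-based inclusive coordinates;
    the stop codon is included in the coordinates and in the codon count.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3, frame))
                start = None
    return orfs


def protein_length(start, stop):
    """Amino acids encoded by an ORF whose coordinates include the stop codon."""
    return (stop - start + 1) // 3 - 1


# Toy sequence with one 5-codon ORF (including the stop) in frame 2.
toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
print(forward_orfs(toy, min_codons=5))
# Coordinates of Nu1 from Table 1 (191 to 736).
print(protein_length(191, 736))
```

Scanning the reverse strand would require running the same function on the reverse complement and converting the coordinates back.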

Comparative Genomics

The comparative alignment of the φ80vir genome was constructed utilizing the progressiveMauve algorithm (Darling et al., 2010), a free publicly available genome alignment program (at http://asap.ahabs.wisc.edu/mauve/). Aligned phages were selected according to the results of sequence similarity searches, as briefly summarized in Table 1.

Electron Microscopy

Phage morphology was elucidated using a high-titer phage sample (10^10 PFU ml-1). For transmission electron microscopy, virions were adsorbed to the surface of carbon-coated

(Parlodion) copper grids for 1 min and then washed with sterile ddH2O. Grids were negatively stained with a 2% (wt/vol) aqueous solution of uranyl acetate or with a 1% (wt/vol) aqueous solution of phosphotungstic acid (PTA) for 1 min and then air-dried. Samples were examined

and micrographs obtained using a Zeiss EM-10 transmission electron microscope (Carl Zeiss,

Oberkochen, Germany; housed in the Electron Microscopy facility in the Biological Sciences

Department at Bowling Green State University, OH) operated at 80 kV accelerating voltage

(corresponds to primary magnification of ×54,400). The magnification calibration was

performed with Grating Replica Parallel Lines (Electron Microscopy Sciences, Ft. Washington,

PA) prior to taking the images.

Atomic Force Microscopy

For atomic force microscopy (AFM), the high-titer phage sample was diluted 100-fold with distilled water. Freshly cleaved mica was immersed in the diluted phage sample for

20-30 min. After the adsorption of virions to the surface of the mica, supports were washed with

sterile distilled water and air-dried. The AFM image was taken using a NanoScope IIIa (Digital

Instruments, Santa Barbara, CA; housed in the Center for Materials and Sensor Characterization

at The University of Toledo, OH) with the scanning rate of 0.6 Hz.

Results and Discussion

Overall Features of φ80vir Chromosome

The genome sequence of the enterobacteria phage φ80vir was assembled as described in

Materials and Methods, with Ion Torrent data validating, refining, and extending those data

initially collected by conventional means. The chromosome of lambdoid phage φ80vir has an

architecture that, as in λ, can be partitioned into three functionally distinct regions historically

termed as: (i) the “left arm”, bearing primarily late-expressed genes involved in the assembly of

new virions; (ii) the central region, comprising genes involved in the establishment of lysogeny; and (iii) the “right arm”, containing regulatory components for the establishment of

lysogeny and immunity, and genes encoding products that will trigger lysis of the host and

release of new viral progeny (Fig. 2). The chromosome of φ80vir contains a total of 46,285 base

pairs of double-stranded DNA with an overall G+C content of 50.4%, comparable to the 50.8%

for the E. coli K-12 genome (Blattner et al., 1997). There were 71 open reading frames predicted

for this phage with features suggesting they were in fact genes, the majority of which matched

previously characterized phage genes described from either isolated phage or from Gram-

negative lysogens (Table 1).
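The G+C content quoted above is simply the percentage of G and C bases among all unambiguous bases of the sequence. A minimal sketch of that calculation follows; the function name is hypothetical, the short test sequences are toy examples, and the way ambiguous bases are skipped is an assumption rather than a description of the software used here.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy examples; ambiguous bases such as N are ignored.
print(gc_content("GGCCAATT"))
print(gc_content("acgtNN"))
```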

The preliminary partial 5’ contig containing “late genes” of the enterobacterial phage

φ80vir had been deposited into the EMBL nucleotide sequence database (GenBank ID:

FN582354). Corrections to this sequence were made following comparisons with the Ion Torrent data. We have not yet released the final annotated version of the entire φ80vir genome sequence because of the unexpected organization of its “early region” (see below). It will be released once the laboratory has annotated a complete genome for an ostensibly wild-type φ80 that we have recovered from a lysogen provided by Volkmar Braun (Max Planck Institute for Developmental

Biology: Tübingen, Germany) and compared it to the φ80vir genome.

The Left Arm

The left arm of the φ80vir chromosome is comprised of genes controlling phage particle maturation and genes encoding structural components. We report that the 23,204-bp left arm of

φ80vir consists of 26 predicted genes: (i) an operon of 23 structural genes that are orthologous to the structural genes of lambda-like phages with a similar synteny, interrupted by (ii) a novel predicted moron gene morF encoding a putative lipoprotein, (iii) a putative moron gene cor

encoding the FhuA-blocking lipoprotein (unpublished), and ending with (iv) a hypothetical tail gene (stf). Similar to other lambda-like viruses, about 98% of this operon encodes

proteins; gene density reaches ~1.1 genes per kilobase pair of DNA (~860 nucleotides [nt] per

gene). The first 7130 nucleotides of the φ80vir phage chromosome (Table 1) are identical to the

E. coli K12 strain DH10B(φ80dlacZ∆M15) prophage annotated by Durfee et al. (2008), except

for one base (T instead of C) at position 7,114. In another part of the left arm, from position

39,870 to the end, 18 nt in total are different. To compare the φ80vir structural gene cluster with similar gene clusters from other related and unrelated phages, we performed a comparative genomics study using the sequence of the φ80vir left arm and found that it was highly conserved with the N15 left arm and closely resembled its synteny (Fig. 3, Table 1). In our study, however,

Locally Collinear Blocks (LCBs, or regions of synteny) within the minor tails region of φ80vir did not show significant similarity with the minor tails of λ, but did show significant homology to corresponding regions in N15 and HK022. Some genes were similar to those found in T1

(Fig. 3). Figure 3 represents a comparative genomic study using LCBs constructed with the progressiveMauve algorithm (Darling et al., 2010) (at http://asap.ahabs.wisc.edu/mauve/). Each

LCB above the genetic map corresponds to that particular contig on the DNA chromosome. Based on

the genomes compared, the LCB suggests that the corresponding DNA contig is free of

Horizontal Gene Transfer (HGT) events.


Figure 2. Genetic map of the φ80vir chromosome. Genes are shown using the kilobase pair scale. Rectangles above and below the scale represent genes that are transcribed rightward and leftward, respectively; vertical positions were selected for visual clarity. Gene names are given inside the rectangles (when possible), whereas gene product names (if known) are indicated immediately above or below the rectangles. Predicted σ70 promoters are illustrated by small bent arrows, and ρ-independent (stem-loop) transcription terminators (T) are also presented. The slanted arrow between genes 14 and 15 shows a programmed -1 translational frameshift

(ribosome slippage), predicted based on strong sequence homology to bacteriophage λ. The “?” denotes a presumed pseudogene resulting from a frameshift mutation in the int gene.

Head Genes

The φ80vir genome has 10 predicted genes that are highly conserved with genes that are

required for capsid formation in phage λ (Hendrix and Casjens, 2006). The order of the genes in the head gene cluster is strongly conserved among φ80vir, N15, and λ; therefore, it can be clustered in the same Locally Collinear Blocks (LCBs) (Fig. 3). This genetic synteny is typical for lambda-like phages; it begins with the genes encoding the small and large terminase subunits and proceeds through the portal protein, protease, scaffolding protein, major capsid protein (gpE), and the head completion proteins. In the present study, the predicted φ80vir major capsid protein E (UniProt ID: E4WL25) showed 98% identity with the previously annotated wild-type φ80 gpE analog (UniProt ID: P05481) and 95% with the N15 phage gp8 gene (UniProt

ID: O64322) (Table 1 and Table 2). However, the φ80vir gpE did not score as 100% identical to the gpE in the defective prophage φ80dlacZ∆M15, perhaps only because the corresponding gene

E in this prophage was disrupted (Durfee et al., 2008). Our computational analysis suggests that the major capsid protein gene E is flanked by intrinsic regulatory elements: a σ70 promoter and a

stem-loop ρ-independent transcription terminator, a very unusual phenomenon for a gene

encoding a structural protein (and predictive of a moron, as discussed in Chapter III).

Tail Genes

Sequence similarity searches revealed that some of the predicted tail proteins from the

phage φ80vir, belonging to the Siphoviridae family, shared a significant similarity with the tail

proteins of N15, HK022, HK097, and other unrelated phages (Table 1). Interestingly enough, an

unrelated lytic phage T1—which also belongs to Siphoviridae family of viruses but to a different

genus—shares some significant similarity with φ80vir, N15, and HK022 in the region of the

minor tails (Fig. 3). It should be noted that all of the phages listed above, except for HK097, infect E. coli cells through adsorption to the FhuA receptor, and the region of similarity is known to be involved in host-range specificity (part of the J gene and phage minor tails). Therefore, this finding might reflect horizontal gene transfer events between both related and unrelated bacteriophages and, as a result, illustrate phage diversity and coevolution in the same host. We strongly believe that the carboxyl-terminal domain of gpJ and the minor tail proteins have necessarily been conserved among the various FhuA-dependent phages in order to adopt the proper conformation to recognize the FhuA receptor for irreversible adsorption to occur. Although the predicted moron genes morF and cor are not true tail genes, they are encoded within the tail gene cluster. Both of these genes and their corresponding products are described in detail in

Chapter III and Chapter IV.

Conserved -1 Translational Frameshift

A programmed translational frameshift in phage λ has been known for almost two decades (Levin et al., 1993). Similar to the retroviral translational frameshifts of gag-pol genes, this -1 nt frameshifting in λ and certain other phages (Xu et al., 2004) serves to control the expression of two overlapping coding sequences encoding the tail assembly proteins gpG and gpGT, both of which are required for phage maturation. Notably, neither of these gene products is part of the mature phage tail. Closer to the 3’-end, the gene G of the λ phage contains a

“slippery sequence” in the mRNA, 5’-GGGAAAG-3’ (with the consensus sequence of: 5’-

XXXYYYZ-3’), which experimentally results in ~3.5% of the ribosomes making a 1-nt step back during translation and continuing to read in a new reading frame to produce the fusion protein gpGT

(Xu et al., 2004).

In this study, the conserved “slippery sequence” was found to be present in phage φ80vir

based on the strong homology to the 5’-GGGAAAG-3’ sequence in the corresponding region of the phage λ genome. The predicted gene products, gpG (143 aa) and gpGT (251 aa), were calculated to be 15.6 kDa and 27.7 kDa, respectively (Table 1).
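Candidate slippery sites matching the XXXYYYZ consensus can be located with a simple pattern search, sketched below. This illustrates the consensus only; the site in φ80vir was identified by homology to λ as described above, not by this code. The pattern as written places no further restriction on X, Y, or Z (so X and Y may be the same base), and the function name and test sequence are hypothetical.

```python
import re

# Three identical bases, three identical bases, then any base (XXXYYYZ).
# The lookahead lets overlapping candidate sites be reported.
SLIPPERY = re.compile(r"(?=(([ACGT])\2\2([ACGT])\3\3[ACGT]))")


def find_slippery_sites(sequence):
    """Return (0-based position, heptamer) for every XXXYYYZ match."""
    seq = sequence.upper().replace("U", "T")
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(seq)]


# Hypothetical fragment containing the λ heptamer GGGAAAG.
print(find_slippery_sites("ttGGGAAAGcc"))
```

A match alone does not imply frameshifting; the reading-frame position of the heptamer and downstream sequence context also matter.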

The Central Region

Phage attachment site

The primary attachment site for λ phage on the E. coli strain K-12 chromosome, known

as attB (or BOB’), is located between genes gal and bio (~17.4-min region; Weisberg and

Landy, 1983), whereas BOB’ for φ80 is different (Leong et al., 1985) and is located, as was

mentioned before, in the E. coli strain K-12 chromosome between tdk and trp genes (27-min

region; Igarashi et al., 1967). To our surprise, φ80vir had an identical attachment sequence to

that of λ. Moreover, not only did φ80vir have a λatt, but also the entire recombination att-int-xis

gene cluster (see Table 1), composed of the phage attachment site and the genes predicted to

encode the recombination proteins, integrase and excisionase. The int gene of φ80vir, however,

was mutated (single-nucleotide deletion) and probably expresses a truncated integrase protein.

It is known that integrase is required for integrative (attP × attB) and excisive (attL × attR)

recombination; however, a class of int mutants that were defective in integrative but

proficient in excisive recombination in , was characterized (Weisberg and Landy,

1983). We do not have any experimental evidence of whether this frameshifted integrase is

functional or not, but we speculate that the virulent phenotype of the phage studied might be

associated with a non-functional integrase, and not with a mutated CI repressor, as was believed a

priori. The reason why φ80vir retains the att-int-xis region without any other mutations in it might be explained by the probable functionality of the truncated integrase in excisive

recombination. The near identity between this region and the corresponding region of λ strongly

suggests the occurrence of a recent recombination event between φ80 and λ, quite probably in a

laboratory setting.

The Right Arm

Most of the predicted genes in the right arm of φ80vir were found to be highly homologous to open reading frames occurring in putative prophages identified in the genomes of various strains of E. coli, Shigella and Enterobacter (Table 1). Several predicted genes were found to have >95% homology to genes of bacteriophages HK620 and H19B. Homology searches of φ80vir genes located after the rightward promoter (cro-cII-gene 49-O) did not reveal strong homology to the corresponding DNA region of wild-type φ80 (gene 30-cII-M3RNA-gene 15-gene14) reported earlier (Ogawa et al., 1988). However, the general synteny of this region in φ80vir is conserved and consists of the cII-like—oop—“orf”—O-like—P-like gene motif found within lambdoid phages (Hayes et al., 2012).

Immunity Region

In the lambda family of phages, cellular immunity is triggered by the prophage that causes lysogenic conversion of E. coli cells. The CI repressor protein encoded in the immunity region binds to the operator sites (oL and oR), thereby blocking transcription of the phage genes

controlled by the major leftward and rightward promoters, pL and pR, respectively. Here, we

report that the immunity gene cluster of φ80vir does not correspond to wild-type immφ80 region described by Ogawa et al. (1988). We annotated the immunity gene cluster of φ80vir as having pL-(oL)-gene 45-cI-oR-pR-cro synteny; whereas the wild-type immφ80 was reported to be N-oL-

4OB-4OA-cI-oR-cro. Of course, one would expect a virulent mutant of a temperate phage to have this region mutated or somewhat changed, so that the virulent mutant could efficiently form

plaques on cells lysogenized by the wild-type form of the phage. But unlike λvir that harbors point

mutations v2 in oL, v1 in oR2, and v3 in oR3 (Daniels et al., 1983), little to nothing is known

concerning the nature of the mutations that resulted in the virulent phenotype of the φ80vir used

in this study. In the literature, a deletion termed “AB43” in the cI gene of φ80vir was mentioned in the abstract of an article in the Russian Journal of Genetics (Vasinova et al., 1987) and discussed further by Rybchin (1984). Rybchin described that the AB43 deletion (4.8% λ units, a value that corresponds to ~2300 missing nucleotides) overlapped with the φ80 cI gene and reported that a similar deletion was found in a hybrid φ80hy43. The ORF for cI in the φ80vir phage used in this study (Table 1) consists of 708 bp (corresponding to a 235-aa protein), whereas the ORF for cI in wild-type φ80 isolated by Matsushiro contains 712 bp (corresponding to a 236-aa protein).

Therefore, it is evident that our φ80vir does not carry the AB43 deletion, and it is highly unlikely that our φ80vir and the φ80vir described by Vasinova et al. are the same strain. It should be noted that mutations conferring the virulent phenotype are fairly common; even a spontaneous single-step mutation may result in the appearance of a φ80 virulent mutant in a stock of wild-type φ80 phage (Youderian and King, 1981).

To reconstruct the events that gave rise to φ80vir, we propose a scenario in which wild-type φ80 prophage DNA recombined with λ DNA in E. coli in a laboratory setting, much as int-promoted recombination between λatt24 and λatt+ produced heterozygote

phage particles (Shulman et al., 1973). Because λ and φ80 were commonly used together in

the laboratory, this scenario is plausible. In addition, there must have been another recombination

event in which the immunity region from another unknown lambdoid phage was recombined

with φ80h80attλ. Finally, the resultant phage, whose phenotype was probably similar to that of wild-type φ80, acquired a single-nucleotide mutation (deletion) in the int gene to give rise to a virulent mutant. The overall genotype of the φ80vir phage can therefore be written as φ80h80attλint-imm?lys80 (a φ80 hybrid phage with the host range of φ80, the recombination system of lambda with a deficient integrase, the immunity of an unknown lambdoid phage, and the lysis functions of φ80). Once the virulent phenotype is established, it is likely that mutations might accumulate in the immunity region by genetic drift, as this region is no longer under selection. Obviously the converse is also a possibility: that the frameshift in the int gene occurred after an initial mutation in the immunity region

established the virulent phenotype in our φ80vir strain. These two possibilities may be resolved

with completion of the wild-type φ80 genome.

Phage Morphology

Transmission electron micrographs of wild-type φ80 were first obtained about 5 years after the isolation of the phage (Shinagawa et al., 1966). We wanted to see whether our φ80vir had the same morphology as its wild-type sibling. The TEM micrographs revealed that the general structure of φ80vir was similar to that observed for wild-type φ80 and for other lambdoid and lambda-like phages (Fig. 4). As evident in Figure 4, φ80vir has a capsid about 70 nm in diameter and a long, flexible, non-contractile tail, but lacks apparent “side tail fibers”. These fibers are also absent in λPaPa but present in the true wild-type Ur-λ phage; their absence in φ80vir may reflect a frameshifting event in a tail fiber gene similar to that described for λPaPa (Hendrix R.W. and Duda R.L., 1992). It should be noted that the side tail fibers are fairly delicate, and thus their absence in these preparations is not conclusive. Although the AFM image of φ80vir (Fig. 4C) does not provide accurate dimensions of the virion, AFM is still a useful and efficient technique for visualizing phages. Because the sample had to be dried prior to scanning, the resulting image of the phage looks flat. In retrospect, a pre-staining step using uranyl acetate would have better preserved the phage particle. Nevertheless, the TEM and AFM images support an overall morphological similarity between φ80vir and the wild-type φ80 and φ80pt1 strains
(Shinagawa et al., 1966).

Table 1. Annotation of the φ80vir chromosome

| ORF | Strand | Start | Stop | Length (aa) | Functional annotation^a | Best homolog (BLASTp, Swiss-Prot)^b | Accession # | Bit score | ID% | E value | Overlap |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Nu1 | + | 191 | 736 | 181 | Terminase small subunit / DNA packaging | φ80dlacZ∆M15 | B1XBI2 | 370 | 100 | 4e-129 | 181/181 |
| A | + | 711 | 2633 | 640 | Terminase large subunit / DNA packaging | φ80dlacZ∆M15 | B1XBI3 | 1337 | 100 | 0.0 | 640/640 |
| W | + | 2633 | 2839 | 68 | Adaptor protein / head-tail joining | φ80dlacZ∆M15 | B1XBI4 | 138 | 100 | 4e-41 | 68/68 |
| B | + | 2836 | 4428 | 530 | Head portal protein / head to tail protein | φ80dlacZ∆M15 | B1XBI5 | 1103 | 100 | 0.0 | 530/530 |
| C | + | 4409 | 5752 | 447 | Minor capsid protein / capsid component | φ80dlacZ∆M15 | B1XBI6 | 905 | 100 | 0.0 | 447/447 |
| Nu3 | + | 5339 | 5752 | 137 | Head scaffolding protein / capsid assembly | φ80dlacZ∆M15 | B1XBI6 | 261 | 100 | 3e-83 | 137/137 |
| D | + | 5762 | 6094 | 110 | Head decoration protein | φ80dlacZ∆M15 | B1XBI7 | 225 | 100 | 7e-74 | 110/110 |
| E | + | 6162 | 7187 | 341 | Major head protein / capsid component | Phage φ80 gpE | P05481 | 685 | 98 | 0.0 | 334/341 |
| FI | + | 7233 | 7649 | 138 | DNA-packaging protein | Phage N15 gp9 | O64323 | 182 | 76 | 5e-56 | 138/144 |
| FII | + | 7661 | 8014 | 117 | Tail attachment protein | Phage N15 gp10 | O64324 | 191 | 74 | 1e-60 | 117/117 |
| Z | + | 8024 | 8608 | 194 | Minor tail protein / tail component | Fels-1 gpZ | Q8ZQG7 | 283 | 75 | 3e-94 | 194/194 |
| U | + | 8605 | 9003 | 132 | Tail shaft capping protein / tail component | Phage N15 gp12 | O64326 | 209 | 75 | 4e-67 | 132/132 |
| V | + | 9011 | 9754 | 247 | Major tail protein / tail component | Phage N15 gp13 | O64327 | 352 | 73 | 7e-120 | 245/245 |
| G | + | 9765 | 10196 | 143 | Tail assembly chaperone / tail component | Phage N15 gp14 | O64328 | 94.7 | 39 | 5e-22 | 143/140 |
| G-T | + | 9765 | 10519 | 251 | Tail assembly / G-T programmed frameshift | N15 gp14+gp15 | O64328-9 | 333.2^c | 53.7 | N/A^c | 249/247 |
| H | + | 10503 | 13640 | 1045 | Tail length tape measure / tail component | Fels-1 gpH | Q8ZQG3 | 837 | 48 | 0.0 | 1002/1016 |
| M | + | 13637 | 13975 | 112 | Minor tail protein / tail component | Phage N15 gp17 | O64331 | 85.9 | 40 | 2e-19 | 109/111 |
| L | + | 14032 | 14769 | 245 | Minor tail protein / tail component | Phage N15 gp18 | O64332 | 261 | 51 | 9e-84 | 242/248 |
| K | + | 14771 | 15490 | 239 | Minor tail protein / tail component | HK022 gp19 | Q9MCU3 | 236 | 47 | 3e-74 | 235/234 |
| I | + | 15483 | 16100 | 205 | Tail tip assembly / tail component | ENT39118 tail | F1C572 | 94.7 | 46 | 3e-48 | 203/195 |
| morF | + | 16186 | 16638 | 150 | Putative lipoprotein precursor / unknown | None | | | | | |
| | | 16234 | 16638 | 134 | Mature lipoprotein / cleaved by SPase II | | | | | | |
| J | + | 16730 | 20209 | 1159 | Tail tip fiber / host range | HK022 gp24 | Q9MCU0 | 1390 | 58 | 0.0 | 1153/1180 |
| 22 | + | 20211 | 20513 | 100 | Unknown | HK022p24 | NP_597886.1 | 200 | 100 | 2e-64 | 100/100 |
| 23 | + | 20513 | 21187 | 224 | Unknown | HK022p25 | NP_597887.1 | 360 | 82 | 9e-124 | 224/212 |
| cor | + | 21299 | 21532 | 77 | Lysogenic conversion lipoprotein precursor | HK022 Cor | Q9MCT9 | 138 | 79 | 9e-41 | 77/77 |
| | | 21347 | 21529 | 61 | Mature FhuA-blocking lipoprotein | mEp167 Cor | Q6IWY0 | 117 | 82 | 6e-33 | 61/61 |
| stf | + | 21593 | 22804 | 403 | Hypothetical tail fiber protein | HK75 gp24 | G8C7K5 | 332 | 46 | 2e-106 | 403/433 |
| 25 | + | 22906 | 23076 | 56 | Hypothetical moron gene product | None | | | | | |
| int | - | 23930 | 23280 | 216 | Integrase; pseudo | HK97 Int | Q9MCR4 | 442 | 99 | 4e-154 | 216/216 |
| | - | 24349 | 23924 | 141 | Integrase; partial | Phage λ Int | C5W1L5 | 292 | 100 | 4e-97 | 140/140 |
| xis | - | 24545 | 24327 | 72 | Excisionase / DNA recombination | HK022 Xis | P68927 | 148 | 99 | 7e-45 | 72/72 |
| 28 | - | 24752 | 24585 | 55 | Putative conserved uncharacterized protein | Phage λ gp35 | C6ZCU7 | 110 | 93 | 1e-30 | 55/55 |
| 29 | + | 24932 | 25324 | 130 | Hypothetical protein | Stx2-86 gp43 | Q08J62 | 120 | 90 | 1e-31 | 63/63 |
| 30 | - | 25956 | 25189 | 255 | Conserved putative uncharacterized protein | φ80dlacZ∆M15 | B1XBG9 | 248 | 69 | 7e-79 | 196/195 |
| 31 | - | 26391 | 25960 | 143 | Conserved putative protein | E. coli 541-15 | I4SCN5 | 296 | 99 | 4e-101 | 143/143 |
| 32 | - | 26948 | 26391 | 185 | Ea22-like protein / unknown | E. coli 541-15 | I4SCN6 | 229 | 68 | 2e-73 | 185/182 |
| 33 | - | 27106 | 26945 | 53 | Hypothetical protein | None | | | | | |
| 34 | - | 27831 | 27103 | 242 | EaE-like protein | Phage SE1 gp11 | B8K1D5 | 283 | 64 | 3e-93 | 239/210 |
| 35 | - | 27992 | 27828 | 54 | Uncharacterized protein | HK620 hkaI | Q9AZ32 | 108 | 98 | 8e-30 | 54/54 |
| 36 | - | 28323 | 28009 | 104 | Conserved hypothetical protein | E. coli LF82 | E2QJJ7 | 206 | 100 | 1e-66 | 104/104 |
| 37 | - | 28817 | 28335 | 160 | Resistance factor (Siphovirus gp157) | E. coli UMNF18 | G0FEU8 | 310 | 98 | 9e-106 | 160/160 |
| recT | - | 29712 | 28810 | 300 | Recombinase / DNA-binding | E. coli 042 | D3GYC3 | 621 | 99 | 0.0 | 300/300 |
| 38 | - | 30017 | 29709 | 102 | Hypothetical protein | E. coli LF82 | E4UEB9 | 209 | 100 | 4e-68 | 102/103 |
| 39 | - | 30105 | 29998 | 35 | Hypothetical protein | E. coli HS | A7ZWL9 | 71.6 | 100 | 9e-16 | 35/35 |
| ral | - | 30377 | 30102 | 91 | Ral protein | S. flexneri X | D2A7V6 | 178 | 99 | 5e-56 | 91/91 |
| 41 | - | 30646 | 30458 | 62 | Conserved hypothetical protein | E. coli W | E0IXS6 | 130 | 100 | 3e-38 | 62/62 |
| 42 | - | 30933 | 30646 | 95 | Conserved putative uncharacterized protein | E. coli MS 115-1 | D7Y7S7 | 167 | 97 | 7e-52 | 93/93 |
| N | - | 31323 | 31081 | 80 | Antitermination protein N / early gene regulation | HK620 gpN | Q9AZ23 | 159 | 99 | 1e-48 | 80/80 |
| 44 | - | 32304 | 32008 | 98 | Conserved putative uncharacterized protein | E. coli MS 115-1 | D7Y7T2 | 186 | 99 | 4e-59 | 98/98 |
| cI | - | 33052 | 32345 | 235 | Repressor protein CI / transcriptional regulator | E. coli MS 115-1 | D7Y7T3 | 483 | 100 | 8e-172 | 235/235 |
| cro | + | 33164 | 33388 | 74 | Antirepressor protein Cro / DNA-binding | E. coli O26:H11 | C8TV29 | 152 | 100 | 2e-46 | 74/74 |
| cII | + | 33411 | 33776 | 121 | Regulator protein CII / lytic-lysogeny decision | E. coli MS 115-1 | D7Y7T5 | 246 | 99 | 9e-82 | 121/121 |
| 48 | + | 33811 | 34215 | 134 | Putative conserved uncharacterized protein | E. coli MS 115-1 | D7Y7T6 | 283 | 100 | 6e-96 | 134/134 |
| O | + | 34354 | 35214 | 286 | Phage DNA replication protein O | E. coli DEC13B | H5L941 | 600 | 100 | 0.0 | 286/286 |
| P | + | 35322 | 37202 | 626 | DNA primase/helicase | E. coli MS 115-1 | D7Y7T9 | 1299 | 99 | 0.0 | 626/626 |
| 51 | + | 37288 | 37500 | 70 | Unknown | HK97 gp60 | Q9MCP5 | 125 | 97 | 1e-35 | 70/70 |
| ninB | + | 37481 | 37897 | 138 | Recombinase | E. coli MS 115-1 | D7Y7U0 | 283 | 99 | 5e-96 | 138/138 |
| ninF | + | 37915 | 38091 | 58 | NinF | H19B NinF | O48425 | 113 | 97 | 1e-31 | 58/58 |
| roi | + | 38084 | 38806 | 240 | Roi / DNA-binding antirepressor | HK620 Roi | Q9AZ12 | 487 | 99 | 4e-173 | 240/240 |
| ybcO | + | 38806 | 39096 | 96 | Recombinase | E. coli TA124 | H1FFH3 | 197 | 97 | 1e-63 | 96/96 |
| RUS | + | 39093 | 39488 | 131 | Endodeoxyribonuclease RUS | E. coli 541-15 | I4SLU8 | 271 | 100 | 1e-91 | 131/131 |
| ninH | + | 39485 | 39691 | 68 | NinH / DNA-binding | Phage λ NinH | P03771 | 142 | 100 | 8e-43 | 68/68 |
| ninI | + | 39669 | 40337 | 222 | Serine/threonine-protein phosphatase | φ80dlacZ∆M15 | B1XBH1 | 421 | 91 | 1e-147 | 222/222 |
| ninG | + | 40330 | 40971 | 213 | NinG / recombination | φ80dlacZ∆M15 | B1XBH2 | 440 | 100 | 2e-155 | 213/213 |
| lar | + | 40968 | 41174 | 68 | Lar / restriction alleviation protein | None | | | | | |
| 61 | + | 41171 | 41287 | 38 | Hypothetical protein | E. aerogenes | G0E0P6 | 60.5 | 71 | 2e-11 | 38/38 |
| Q | + | 41287 | 42096 | 269 | Antitermination protein Q | φ80dlacZ∆M15 | B1XBH3 | 557 | 100 | 0.0 | 269/269 |
| S | + | 42575 | 42798 | 74 | Lysis protein S (frameshifted) | φ80dlacZ∆M15 | B1XBH4 | 152 | 100 | 2e-46 | 74/74 |
| R | + | 42776 | 43270 | 164 | R / lysozyme | φ80dlacZ∆M15 | B1XBH5 | 338 | 100 | 1e-116 | 164/164 |
| Rz | + | 43267 | 43725 | 152 | Rz lysis protein | φ80dlacZ∆M15 | B1XBH6 | 309 | 100 | 1e-105 | 152/152 |
| Rz1 | + | 43487 | 43672 | 61 | Lipoprotein Rz1 precursor / lysis protein | φ80dlacZ∆M15 | B1XBH7 | 123 | 100 | 3e-35 | 61/61 |
| | | 43544 | 43669 | 42 | Mature lipoprotein / cleaved by SPase II | φ80dlacZ∆M15 | B1XBH7 | 275 | 100 | 3e-32 | 42/42 |
| Rha | + | 43888 | 44442 | 184 | Rha protein / regulatory | φ80dlacZ∆M15 | B1XBH8 | 278 | 100 | 2e-92 | 184/184 |
| | | | | | | Phage φ80 Rha | Q38450 | 278 | 100 | 2e-92 | 184/184 |
| 68 | - | 45127 | 44517 | 204 | Conserved protein (nonsense mutant - opal) | φ80dlacZ∆M15 | B1XBH9 | 424 | 99 | 2e-149 | 204/204 |
| | | 44993 | 44514 | 159 | ORF159 / truncated product with new start codon | φ80dlacZ∆M15 | B1XBH9 | 312 | 100 | 5e-106 | 151/151 |
| 69 | + | 45108 | 45338 | 76 | gp70 / Fur-regulated protein | φ80dlacZ∆M15 | B1XBI0 | 160 | 100 | 3e-49 | 76/76 |
| 70 | + | 45500 | 46228 | 242 | gp71 / Fur-regulated protein | φ80dlacZ∆M15 | B1XBI1 | 504 | 100 | 1e-179 | 242/242 |

^a Functional annotation was assigned based on homology and/or on computational prediction (e.g., Pfam or CDD search).
^b ID%, percent amino acid sequence identity of corresponding gene products relative to their best homolog in BLASTp alignment; overlap, length of aligned region of query and subject sequences.
^c Because the N15 gpGT sequence is annotated separately as gpG and gpT in the GenBank database, a manually reconstructed gpGT sequence was used to perform the pairwise alignment.
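The coordinate and length columns of Table 1 appear to be related by simple arithmetic, which can serve as a quick consistency check on the annotation. The following is a minimal sketch, assuming inclusive 1-based coordinates and a span that includes the stop codon; both assumptions are inferred from the table values rather than stated in the text, and the function name is illustrative only.

```python
def orf_length_aa(start: int, stop: int) -> int:
    """Protein length (aa) for an ORF given inclusive 1-based coordinates.

    Assumes the annotated span includes the stop codon, so one codon is
    subtracted. Works for either strand because only the span length matters.
    """
    span_nt = abs(stop - start) + 1
    if span_nt % 3 != 0:
        raise ValueError(f"span of {span_nt} nt is not a whole number of codons")
    return span_nt // 3 - 1


# Rows copied from Table 1: (ORF, start, stop, reported length in aa)
rows = [("Nu1", 191, 736, 181), ("E", 6162, 7187, 341), ("cI", 33052, 32345, 235)]
for name, start, stop, reported in rows:
    print(name, orf_length_aa(start, stop), reported)
```

The G-T product (9765 to 10519) spans 755 nt, which is not a multiple of three, as expected for a product of a programmed frameshift; the sketch deliberately rejects such spans rather than guessing a length.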


Figure 3. Mauve progressive alignment of the φ80vir left arm. Each horizontal “panel” (gray or white) represents a separate phage genome with its taxonomic name, in the lower left corner, and a scale bar showing the sequence coordinates of that chromosome. Colored block outlines, above the black center line, indicate Locally Collinear Blocks (LCBs or regions of synteny), conserved segments that appear to be internally free from genome rearrangements. Matches are colored differently based on their Multiplicity (the number of matching genomes). The lines connecting LCBs pinpoint those parts in each chromosome that are homologous. The regions outside of these blocks lack significant homology between the genomes aligned. Annotated features, genes and ORFs, are shown as rectangles and presented below the center line according to their genome coordinates. Black rectangles suggest the positions of moron genes. Gene product names, if known, are presented right next to rectangles. The slanted arrow between genes indicates a programmed -1 translational frameshift (ribosome slippage). A similarity profile for a genome is color coded and represented within each block; the height of the similarity profile is proportional to the level of sequence identity in that region. Regions in white color were not aligned and, potentially, contain segments specific to a particular chromosome.

The alignment was constructed using the progressiveMauve algorithm (Darling et al., 2010) (at http://asap.ahabs.wisc.edu/mauve/).

Table 2. Percent identity of φ80 “head” proteins to those in λ and N15 phage.

| φ80 protein name | %ID^a to λ phage | %ID to N15 |
|---|---|---|
| Nu1 | 86 | 54 |
| A | 90 | 89 |
| W | 81 | 96 |
| B | 88 | 99 |
| C | 74 | 96 |
| Nu3 | 76 | 95 |
| D | 97 | 96 |
| E | 89 | 97 |
| FI | 42 | 76 |
| FII | 63 | 74 |

^a %ID, percent amino acid sequence identity of BLASTp (Swiss-Prot) hits.

Figure 4. Morphology of φ80vir. (A, B) Transmission electron micrographs of the enterobacteria phage φ80vir. Virions were purified, as described in Materials and Methods, and negatively stained with 1% PTA (wt/vol) (A) and 2% uranyl acetate (wt/vol) (B) aqueous solutions. White scale bars represent 100 nm. (C) Atomic force micrograph of φ80vir. The AFM image was captured at a scanning rate of 0.6 Hz on a NanoScope IIIa at the University of Toledo; the image was taken by Alexey Bludin.

CHAPTER III. GENES WITH MORON FEATURES

Introduction

The term “moron gene” (Juhala et al., 2000) was introduced following comparisons of the complete genome sequences of several lambdoid and λ-like phages. The genetic linkages and characteristic features of “phage mosaics” are seen in pair-wise comparisons of two or more phage genomes. Generally, a gene with moron features possesses its own transcription signals, i.e., it is usually flanked by an upstream σ70 promoter and a downstream ρ-independent (stem-loop) transcription terminator. Such genes are mostly found within the structural gene cluster of the phage genome (Hendrix et al., 2000), and the corresponding transcripts, where studied, tend to increase host-cell fitness. As previously shown (Juhala et al., 2000), the G+C base composition of moron genes in the HK022 and HK97 phages differs significantly from that of the adjacent structural genes. Furthermore, these genes tend to travel among various related and unrelated phages (Hendrix et al., 2000); they have therefore been suggested to derive from some unidentified “external” organism and to be transmitted by means of homologous and illegitimate recombination. Analysis of G+C content can therefore be used to tentatively identify the relative chromosomal positions of moron genes. Interestingly, moron genes may provide insights into the persistent question of the origins of viruses (Hendrix et al., 2003), because to date these genes do not seem to have corresponding homologs among the currently sequenced organisms. It has become apparent that these genes are evolutionarily recent additions to the genome, because the same gene can be absent from the corresponding position in a comparison genome with an otherwise high degree of synteny and, in some cases, the moron gene and its associated transcriptional signals are inserted backwards relative to the surrounding genes. Such an arrangement of transcription signals allows these genes to be transcribed autonomously, which is of particular importance in the otherwise repressed prophage; the expressed gene product can therefore perform its function at any time after infection. In most cases, the biological function of identified morons is unknown, but all of the morons described to date appear to confer an advantageous phenotype on the host and/or some selective benefit for the phage genome. For example, in λ phage the lom gene appears to be a moron gene from which a lipoprotein is expressed from the prophage during lysogeny. This lipoprotein is localized to the outside of the cell, where it appears to function as an adhesin, enhancing the ability of E. coli to adhere to human buccal epithelial cells (Vica Pachecho et al., 1997) and thus enhancing host-cell fitness.

In the present study, three open reading frames with features predictive of morons were identified; interestingly, one of them appears to encode the major capsid protein E.

Materials and Methods

Identification of Transcription Signals

Using the assembled left arm of the φ80vir genome, we predicted prokaryotic σ70-dependent promoters and ρ-independent stem-loop terminators using the BPROM and FindTerm programs (SoftBerry, Inc., http://www.softberry.com). The predicted transcription start sites (TSS) were identified using a prokaryotic version of the promoter prediction program based on the Sequence Alignment Kernel (http://mendel.cs.rhul.ac.uk/) (Gordon et al., 2003) (Table 3 and Table 4). For greater sensitivity in promoter identification, 150-nucleotide regions [TSS – 100…TSS + 50] were selected as input for the prediction. Similarly, 90-nt downstream regions (relative to the last coding nucleotide of a gene) were analyzed for the presence of putative ρ-independent stem-loop terminators. The predicted transcription terminators were further examined, and the Gibbs free energy of folding, ∆Gfold, was estimated using the “Quickfold” program at the mfold web server (http://mfold.rna.albany.edu/?q=mfold) (Zuker, 2003). The RNA folding conditions (ver. 3.0) were fixed at a temperature of 37°C and ionic conditions at physiological levels, [Na+] = 1 M and [Mg2+] = 0 M.
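The two input windows described above can be sketched as simple slicing operations. This is only an illustration, not the procedure actually used: it assumes 1-based coordinates on the (+) strand, interprets the 150-nt promoter window as 100 nt upstream of the TSS plus 50 nt starting at the TSS, and uses hypothetical function names and a toy sequence.

```python
def promoter_window(genome: str, tss: int, upstream: int = 100, downstream: int = 50) -> str:
    """Return `upstream` nt before the TSS plus `downstream` nt starting at the TSS.

    `tss` is a 1-based (+)-strand coordinate; the window is clipped at the
    start of the sequence if the TSS lies close to it.
    """
    i = tss - 1  # 0-based index of the TSS
    return genome[max(0, i - upstream): i + downstream]


def terminator_region(genome: str, last_coding_nt: int, length: int = 90) -> str:
    """Return the `length` nt immediately downstream of the last coding nucleotide.

    `last_coding_nt` is a 1-based coordinate, so the slice starts at the
    following nucleotide.
    """
    return genome[last_coding_nt: last_coding_nt + length]


# Toy 400-nt sequence (hypothetical, for illustration only)
toy = "ACGT" * 100
print(len(promoter_window(toy, tss=201)))
print(len(terminator_region(toy, last_coding_nt=300)))
```

Genes on the (-) strand would need the reverse complement of the corresponding region; that case is omitted here for brevity.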

G+C Skew

The guanine + cytosine content (in percent) was calculated using the formula:

$$\mathrm{G{+}C}\,(\%) = \frac{G + C}{A + T + G + C} \times 100\%$$

The G+C content was determined for a 300-bp sliding window across the left arm of the φ80vir phage genome with a step of 50 nt and plotted against the corresponding genome position.
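The sliding-window calculation can be sketched as follows. This is a minimal illustration of the formula and window parameters given above; reporting each window by its 1-based start position is an assumption (the text does not say whether the start or midpoint was plotted), and the function names are illustrative.

```python
def gc_percent(seq: str) -> float:
    """G+C content in percent: (G + C) / (A + T + G + C) * 100."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / total if total else 0.0


def gc_profile(genome: str, window: int = 300, step: int = 50):
    """Return (1-based window start, G+C %) for each full window."""
    return [
        (i + 1, gc_percent(genome[i:i + window]))
        for i in range(0, len(genome) - window + 1, step)
    ]


# Toy sequence: an AT-only half followed by a GC-only half (hypothetical)
toy = "AT" * 150 + "GC" * 150
for start, gc in gc_profile(toy):
    print(start, gc)
```

Regions whose G+C content departs markedly from the profile average are the candidate positions that the text proposes to examine for moron genes.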

Results and Discussion

Genes with Moron Features, or Morons

Based on a computational analysis and the length of the intergenic regions between genes 7 (D) and 8 (E) and between genes 8 (E) and 9 (FI), the major capsid gene E has features strongly suggestive of a moron. This would be the first example of a gene with moron features encoding a structural protein (major capsid protein E). Following the proposed “moron accretion hypothesis” (Hendrix et al., 2003), gene E could have been introduced into the genome of a lambda-like ancestor at some evolutionary time point, providing an advantage for the phage in packaging its genome and, probably, in adjusting the size (DNA packaging capacity) of its protein capsid. Although ρ-independent stem-loop terminators around gene 8 (E) were previously predicted in both λ and N15 phages (Sanger et al., 1982; Ravin et al., 2000), a σ70 promoter for gene E has yet to be reported in the literature. To elucidate the origin of this putative promoter and to test our hypothesis that gene E is a gene with moron features, we extended our analysis to the phages listed above. Upon comparison, the intergenic regions between genes D and E were found to be similar in length, with conservation of the prokaryotic σ70 promoters and factor-independent (stem-loop) terminators (Fig. 5).

        D    Intergenic region    E
φ80vir  TAA TCCGCATTTCTACAACCATCATCATTCATAAAAGCCGCTTGCGCGGCTTTTTTTACGGGAAAAATCT ATG
N15     TAA TCCGCGTTTTTACAACCACCATCATTCATAAAAGCCGCCTGCGCGGCTTTTTTTATGGGAAAAATCT ATG
λ       TAA CTTTACCCTTCATCACTAAAGGCCGCCTGTGCGGCTTTTTTTACGGGATTTTTTT ATG

Figure 5. Comparison of intergenic regions between the genes encoding the head decoration protein D and the major capsid protein E in φ80vir, N15, and λ. Nucleotide sequences for σ70 promoters were predicted using the BPROM program (Softberry, Inc.), and sequences for ρ-independent terminators using the FindTerm program (Softberry, Inc.). The 5’-TAA sequence (in grey) corresponds to the stop codon for the upstream gene D, whereas the 3’-ATG sequence represents the start codon for the downstream gene E. Sequences for the predicted -35 element, Pribnow box (-10 region), and transcription start site (TSS, or +1 region) are represented in each sequence from left to right, respectively, and bolded for visual clarity. An RNA terminator represents a hairpin structure with a stem (underlined), a loop (between the underlined regions), and an oligo(U) tail (in blue). The predicted structure of a putative factor-independent RNA terminator for φ80vir phage is shown above the brace, which corresponds to the location of the terminator in the DNA sequence.

Using the promoter prediction software BPROM and SAK, as described in Materials and
Methods, we identified putative σ70 promoters within those intergenic regions, upstream of gene E. The 67-nt-long intergenic region between φ80vir D and E (denoted φ80vir-⁶⁷IRD/E) is very similar to N15-⁶⁷IRD/E and has almost the same transcription signals. In contrast, λ-⁵⁵IRD/E is only partially similar to the others (Fig. 5) and, more importantly, does not share strong homology with the consensus sequences of bacterial -35 and -10 promoter regions; its predicted promoter could therefore be a false positive. From computational prediction alone, it remains unclear whether these putative promoters are functional, non-functional, or pseudo-promoters. However, a microarray gene expression study of the E. coli K-12 W3350(λ) lysogen (Chen Y. et al., 2005) reported that expression of gene E was 3-fold higher in the lysogen than in the non-lysogen. Although the authors noted the ρ-independent transcription terminators flanking gene E and proposed that this organization of regulatory signals helps the prophage silence gene expression by terminating initiated polycistronic mRNA, this does not explain the increased expression level of gene E. For a gene to be expressed from a repressed prophage, it must possess its own promoter. Our predicted σ70 promoter for E could explain how this gene is expressed from the CI-repressed prophage. Because it was beyond the temporal scope of this project, we were unable to confirm experimentally the presence of a σ70 promoter immediately upstream of gene E. Using the BPROM program (Softberry, Inc.), we predicted a transcription factor binding site (TFBS) for the transcriptional repressor LexA, which controls the DNA damage-inducible SOS response, in all three intergenic regions (λ-⁵⁵IRD/E, φ80vir-⁶⁷IRD/E, and N15-⁶⁷IRD/E), matching the sequence [5’-TTTTTTTA-3’] exactly 13 nt upstream of gene E. However, the SOS box (the TFBS for LexA) was reported to have a 16-nt consensus sequence [CTG-N10-CAG] in E. coli (Erill et al., 2003). Upon further examination, we were unable to find such a consensus sequence upstream of gene E. The absence of an SOS box, however, does not prove that this region is not involved in the SOS response; additional experimental evidence is required to elucidate its function. To conclude, we speculate that gene E is a moron gene, the first known to encode a phage structural protein, contrary to the current consensus that genes encoding structural proteins do not have “moron features”.
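The search for the SOS-box consensus can be illustrated with a short pattern match over the three intergenic sequences of Figure 5 (including the flanking TAA and ATG codons). This is a minimal sketch of one way to perform such a scan, not the procedure actually used; the function name is illustrative, and only exact matches to CTG-N10-CAG are considered. Because this consensus is its own reverse complement, scanning one strand is sufficient.

```python
import re

# CTG-N10-CAG; the lookahead allows overlapping matches to be reported
SOS_BOX = re.compile(r"(?=(CTG[ACGT]{10}CAG))")

# D/E intergenic regions as shown in Figure 5 (stop codon ... start codon)
INTERGENIC = {
    "phi80vir": "TAATCCGCATTTCTACAACCATCATCATTCATAAAAGCCGCTTGCGCGGCTTTTTTTACGGGAAAAATCTATG",
    "N15": "TAATCCGCGTTTTTACAACCACCATCATTCATAAAAGCCGCCTGCGCGGCTTTTTTTATGGGAAAAATCTATG",
    "lambda": "TAACTTTACCCTTCATCACTAAAGGCCGCCTGTGCGGCTTTTTTTACGGGATTTTTTTATG",
}


def find_sos_boxes(seq: str):
    """Return (1-based start, matched 16-mer) for every exact consensus match."""
    return [(m.start() + 1, m.group(1)) for m in SOS_BOX.finditer(seq.upper())]


for phage, seq in INTERGENIC.items():
    print(phage, find_sos_boxes(seq))
```

A scan of this kind only detects exact matches; functional LexA sites that deviate from the consensus would be missed, which is consistent with the caution expressed above.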

As in certain lambdoid and lambda-like phages (HK022, N15, and mEp167), φ80vir putatively encodes the lipoprotein Cor. This protein blocks the activity of the TonB-dependent outer membrane transporter protein FhuA (Chapter IV and Uc-Mass et al., 2004). Kozyrev et al. (1982) described the phenomenon of lysogenic conversion and a gene responsible for preventing the superinfection of φ80 lysogens, which they termed cor (conversion resistant). Although the nucleotide sequence of the ORF encoding the φ80 cor gene was first determined by Matsumoto et al. (1985), the coding sequence was annotated incorrectly. Vostrov et al. (1996) found that Matsumoto’s version of cor was longer and located on the lagging strand of φ80. Our version of φ80vir cor (UniProt ID: E4WL42) is consistent with Vostrov’s cor, with some minor nucleotide differences that may reflect the potentially different histories of these φ80 mutants. The function of the cor gene product will be discussed further in Chapter IV.

Another putative lipoprotein was identified between genes 20 (I) and 21 (J) (Fig. 6). Initially, we annotated this gene using the longest possible ORF; however, upon examining the G+C skew, we suspected that the real in-frame coding sequence was shorter. Based upon the high conservation in length of lipoprotein signal peptides, we were able to find a putative GUG start codon, the most frequent bacterial non-AUG start codon (Schneider et al., 1986), for morF, with predictions of a corresponding upstream promoter (Table 3), a putative 5’-AAGGAG-3’ Shine-Dalgarno sequence seven nucleotides upstream of morF, and a ρ-independent stem-loop terminator (Table 4). Based on lipoprotein trafficking signal studies (Seydel et al., 1999; Terada et al., 2001), this novel lipoprotein was predicted to have a sorting signal that targets the immature lipoprotein product for delivery into the periplasm of E. coli. The purpose and function of this lipoprotein are unknown, but one possibility is that it increases host-cell fitness, as other phage moron gene products do (Hendrix et al., 2000). We named this gene morF because (i) it has moron gene features, and (ii) the capital “F” is derived from the Greek letter φ (“phi”), indicating the phage in which it was first discovered.

Figure 6. The guanine + cytosine content analysis of the left arm of φ80vir phage. The G+C content is expressed in percent for a 300-bp window sliding in 50-nt steps and is plotted versus genome position. Black rectangles indicate putative moron gene positions and sizes, whereas white rectangles represent structural genes of φ80vir. The red solid line across the plot indicates the average G+C content (in percent) of the φ80vir left arm.

Table 3. Predicted σ70 promoters of the enterobacterial phage φ80vir.

| Gene | Coordinate^a | Strand | 5’ flank^b | -35 region^b | Spacer^b | -10 region^b | 3’ flank^b | Spacer, nt |
|---|---|---|---|---|---|---|---|---|
| E | 6,101 | + | CCGCA | TTTCTA | CAACCATCATCATTCA | TAAAAG | CCGCT | 16 |
| morF | 16,126 | + | TGAAC | GTTACA | TTGCAATCACCCTTGGT | TATCAT | GTCCA | 17 |
| cor | 21,236 | + | ACCAA | TTATAC | CCACCTCTTTCATGTTGG | TATTGT | CTAAG | 18 |

^a Position of the first nucleotide of the -35 sequence for promoters located on the leading (+) DNA strand or the last nucleotide of the -10 sequence for promoters located on the lagging (-) DNA strand.
^b Nucleotide sequences for σ70 promoters were predicted using the BPROM program (Softberry, Inc.). Sequences for the Pribnow box (-10 region) and the -35 element are shown in their own columns; the sequence is split into columns for visual clarity.
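The spacer lengths reported in Table 3 can be re-derived from the sequences, and each predicted hexamer can be compared with the canonical E. coli σ70 consensus. The sketch below is illustrative only: the consensus hexamers TTGACA (-35) and TATAAT (-10) are standard textbook values rather than values taken from this chapter, and the simple positional match count is not the scoring scheme used by BPROM.

```python
CONSENSUS_35 = "TTGACA"  # canonical sigma70 -35 hexamer (textbook value)
CONSENSUS_10 = "TATAAT"  # canonical sigma70 -10 hexamer (Pribnow box)

# (-35 region, spacer, -10 region) as listed in Table 3
PROMOTERS = {
    "E": ("TTTCTA", "CAACCATCATCATTCA", "TAAAAG"),
    "morF": ("GTTACA", "TTGCAATCACCCTTGGT", "TATCAT"),
    "cor": ("TTATAC", "CCACCTCTTTCATGTTGG", "TATTGT"),
}


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


for gene, (minus35, spacer, minus10) in PROMOTERS.items():
    print(
        gene,
        "spacer:", len(spacer),
        "-35 matches:", matches(minus35, CONSENSUS_35),
        "-10 matches:", matches(minus10, CONSENSUS_10),
    )
```

All three spacers fall within the 16 to 18 nt range listed in the table.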

Table 4. Predicted ρ-independent (stem-loop) transcription terminators of the enterobacterial phage φ80vir.

| Gene | Coordinate^a | Strand | Stem^b | Loop^b | Stem^b | Tail^b | ∆Gfold^c, kcal/mol | ∆Gfold^c, kJ/mol | Length, nt |
|---|---|---|---|---|---|---|---|---|---|
| D | 6,126 | + | AAGCCGC | UUGC | GCGGCUU | UUUUUACGG | -10.90 | -45.60 | 27 |
| E | 7,197 | + | GUGGCCC | UGUC | GGGCCAC | CUUUCU | -14.40 | -60.25 | 24 |
| M | 13,991 | + | AAAGCUGCC | UCCG | GGCGGCUUU | UUUUUAUGG | -13.30 | -55.65 | 31 |
| morF | 16,693 | + | AACCCAGC | UCAG | GCUGGGUU | UUUUAAUGG | -13.10 | -54.81 | 29 |
| cor | 21,550 | + | AAACCUCGC | UCCG | GCGGGGUUU | UUUUAUUGC | -12.50 | -52.30 | 31 |

^a Position of the first nucleotide of the stem sequence for terminators located on the coding (+) DNA strand or the last nucleotide of the tail sequence for terminators located on the lagging (-) DNA strand.
^b Nucleotide sequences for ρ-independent terminators were predicted by the FindTerm program (Softberry, Inc.). An RNA terminator represents a hairpin structure with a stem, loop, and oligo(U) tail. Ribonucleotides responsible for forming the stem are shown in the two stem columns; the sequence is split into columns for visual clarity.
^c Gibbs free energy of folding, ∆Gfold, was estimated using the “Quickfold” program at the mfold web server (see Materials and Methods). The table presents both the energy values given by Quickfold (kcal mol-1) and the values converted to SI units (kJ mol-1). RNA folding conditions (ver. 3.0) were fixed at 37°C and ionic conditions at physiological levels, [Na+] = 1 M and [Mg2+] = 0 M.

CHAPTER IV. CONVERSION RESISTANT LIPOPROTEIN COR

Introduction

Bacterial Lipoproteins

Secretory proteins possess a transiently attached amino-terminal signal peptide for trafficking purposes across membrane(s) (von Heijne G., 1989). Signal peptides contain three characteristic regions: (i) a positively charged amino-terminal region (n-region), (ii) a hydrophobic central region (h-region), and (iii) a carboxy-terminal polar region (c-region), which specifies the peptidase cleavage site (von Heijne G., 1985). Lipoproteins can be distinguished from non-lipoproteins by the signal peptidase (SPase) that their precursors utilize (Kosic et al., 1993).

Bacterial lipoproteins are exported proteins with a cleavable signal peptide, which is processed by the lipoprotein-specific signal peptidase (LspA, or SPase II; Innis et al., 1984), an enzyme that can be inhibited by globomycin (Inukai et al., 1978). Within the cell envelope, the lipoprotein is covalently acylated at the amino-terminal cysteinyl residue with three fatty acids (Wu H.C., 1996). During cleavage, a glyceryl moiety is attached to the cysteinyl residue, the very first amino acid residue after the SPase II-processing site (+1 amino acid site of the mature protein). The sequence of amino acids around this cysteinyl residue, called a lipoprotein box (or simply lipobox), L-(A/S/I)-(G/A)-↓-C (the arrow indicates the cleavage site), is highly conserved and distinguishes bacterial lipoproteins from the precursors of other bacterial secretory proteins (von Heijne G., 1989; Wu H.C., 1996).
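The lipobox motif given above translates directly into a pattern that can be used to locate a candidate SPase II cleavage site in a precursor sequence. The following is a minimal sketch under stated assumptions: the function name is illustrative, the toy precursor is a made-up sequence (not Cor, MorF, or any real protein), and a real predictor would also evaluate the n- and h-regions of the signal peptide rather than the motif alone.

```python
import re

# Lipobox L-(A/S/I)-(G/A)-C; cleavage occurs immediately before the cysteine
LIPOBOX = re.compile(r"L[ASI][GA]C")


def split_at_lipobox(precursor: str):
    """Split a precursor into (signal peptide, mature protein) at the first lipobox.

    Returns None when no lipobox is found. The mature protein starts with the
    lipid-modified cysteine (+1 position).
    """
    match = LIPOBOX.search(precursor.upper())
    if match is None:
        return None
    cys_index = match.end() - 1  # 0-based index of the +1 cysteine
    return precursor[:cys_index], precursor[cys_index:]


# Hypothetical toy precursor, for illustration only
toy_precursor = "MKRTLLALAVSLLSGCSDK"
print(split_at_lipobox(toy_precursor))
```

In this toy case the motif LSGC is found near the end of the hydrophobic stretch, so the predicted mature product begins with the cysteine that would carry the lipid modification.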

The major outer membrane lipoprotein of E. coli is Braun’s lipoprotein, more simply termed Lpp (Braun and Rehn, 1969). Lpp has an amino-terminal glycerylcysteine with two ester-linked fatty acids and one amide-linked fatty acid. This major lipoprotein has served as a prototype for the study of lipid-modified proteins in bacteria. For example, studies with Lpp (Watanabe et al., 1988) helped to establish that lipoprotein precursors are translocated across the CM by the general protein export machinery, i.e., the Sec system (Snyder and Champness, 2007). Notably, processing of lipid-modified lipoprotein requires functional Sec proteins and energy delivered in the form of pmf (Kosic et al., 1993).

For E. coli and other Gram-negative bacteria, lipoproteins can occur on the periplasmic faces of the CM and OM or exposed on the external surface of the OM (Seydel et al., 1999). Whether a lipoprotein remains attached to the CM, is delivered to the OM, or is secreted outside of the bacterial cell is determined by the amino acid composition following the lipid-modified cysteinyl residue, known as the lipoprotein sorting signal (Yamaguchi et al., 1988; Seydel et al., 1999; Terada et al., 2001). Lipoproteins targeted to the OM are transported by the general lipoprotein Lol system (Yokota et al., 1999). An ABC transporter, LolCDE, performs a LolA-dependent release of OM lipoproteins from the CM into the periplasm (Yakushi et al., 2000). The mechanism of lipoprotein translocation is summarized in Figure 7.

As alluded to in Chapter III, the predicted φ80vir proteins Cor and MorF have features indicative of lipoproteins. This chapter will focus on the characterization of the φ80vir cor gene, its cloning, and the phenotype associated with its expression in E. coli K-12 strains.

Figure 7. Biosynthesis and translocation of outer membrane lipoproteins. From the cytoplasm, the prolipoprotein is translocated across the CM by the Sec translocon. The processing of signal peptides involves several steps: (i) formation of a thioether bond between the cysteine in the mature lipoprotein and diacylglycerol by phosphatidylglycerol:prolipoprotein diacylglyceryl transferase (Lgt), (ii) cleavage of the signal peptide by lipoprotein signal peptidase (LspA), and (iii) amino-acylation by apolipoprotein N-acyltransferase (Lnt). Outer membrane targeted lipoproteins are recognized by the LolCDE complex, an ABC transporter. The release of the lipoprotein from the cytoplasmic membrane is mediated by the periplasmic chaperone LolA. The LolA-lipoprotein complex interacts with the outer-membrane-anchored lipoprotein LolB, followed by incorporation of the mature lipoprotein into the outer membrane. This figure is adapted with the permission of the author (Narita, 2011).

Materials and Methods

Bacterial Strains and Construction of Plasmids

The Escherichia coli K-12 strains W3110 (Hill and Harnish, 1981), RA1051 (W3110 ∆exbBD ∆tolQR; Brinkman and Larsen, 2008), and KP1344 (W3110 ∆tonB::blaM; Larsen et al., 1999) were used in this study. Plasmid pRA033 was constructed from pBAD24 (Guzman et al., 1995) as summarized in Fig. 8. We introduced restriction sites on both sides of the predicted φ80vir cor gene by polymerase chain reaction (PCR) amplification with primers containing EcoRI sites. The resultant amplimers were restricted and cloned into the similarly restricted vector pBAD24 under control of the arabinose promoter (PBAD) to provide for the regulated expression of the cor gene product in E. coli K-12 strains. The plasmid pRA038 was derived from pRA033 and engineered to express a carboxyl-terminal (His)6-tagged version of the φ80vir Cor protein. The bacteriophage T5 was obtained from the American Type Culture Collection. Colicins were prepared from strains and plasmids kindly provided by Volkmar Braun (Max Planck Institute, Tübingen, Germany) and Anthony Pugsley (Institut Pasteur, Paris) as previously described (Larsen et al., 1999).

Colicin and Phage Spot-Titer Assays

Spot-titer assays for determining the relative activities of colicins and bacteriophage were performed as previously described (Larsen et al., 2003). Briefly, serial 5-fold dilutions of colicin preparations and serial 10-fold dilutions of bacteriophage were applied as 5 µl aliquots to lawns of test cells that had been grown to mid-exponential phase in T-broth and then plated on T-plates as 100 µl aliquots suspended in 3 ml of T-top agar supplemented with arabinose and ampicillin as indicated. Following the application of colicin or bacteriophage dilutions, the plates were incubated at 37˚C for 16 hr and then visually scored for clearing of the lawn.

[⁵⁵Fe³⁺]ferrichrome Transport

A radiolabeled iron transport assay was adapted from Köster and Braun (1990) and performed essentially as previously described (Larsen and Postle, 2001). Cultures (10 ml) of mid-exponential phase E. coli cells (corresponding to ~2 × 10⁸ CFU per ml) were grown in T-broth supplemented with ~60 µM L-arabinose (10⁻³% [wt/vol]), as indicated. Samples were incubated at 30ºC with shaking. Transport was initiated by the addition of 4.5 pmol of [⁵⁵Fe]-ferrichrome (prepared by mixing desferriferrichrome with [⁵⁵Fe]Cl₃ at a 5.45:1 molar ratio in the presence of 10 mM HCl at 37ºC for 15 min) to each of the 10 ml cell cultures. Aliquots of 0.5 ml, containing 10⁸ CFU each, were removed in triplicate at the specified time intervals and filtered onto glass filters in a manifold connected to a vacuum system. Samples were then washed 3 times with 8 ml of 0.1 M LiCl. Filters were dried and placed in scintillation tubes with 2.5 ml of Cytoscint, and the [⁵⁵Fe³⁺] taken up by the cells was determined by liquid scintillation counting.
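The two unit conversions quoted in this protocol can be checked with a short calculation. The sketch below assumes a molar mass of 150.13 g/mol for L-arabinose (C5H10O5); the function names are illustrative and not part of any published protocol.

```python
ARABINOSE_MW = 150.13  # g/mol for L-arabinose (C5H10O5)


def percent_wv_to_micromolar(percent_wv, molar_mass):
    """Convert a % (wt/vol) concentration to micromolar.

    1% (wt/vol) is 1 g per 100 ml, i.e. 10 g per litre.
    """
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / molar_mass * 1e6


def cfu_per_aliquot(cfu_per_ml, aliquot_ml):
    """Number of colony-forming units in an aliquot of the given volume."""
    return cfu_per_ml * aliquot_ml


arabinose_um = percent_wv_to_micromolar(1e-3, ARABINOSE_MW)
aliquot_cfu = cfu_per_aliquot(2e8, 0.5)

print(f"0.001% (wt/vol) arabinose = {arabinose_um:.1f} uM")
print(f"0.5 ml of a 2e8 CFU/ml culture = {aliquot_cfu:.1e} CFU")
```

Under this assumption, 10⁻³% (wt/vol) works out to roughly 67 µM, which is in line with the approximate figure quoted in the text, and a 0.5-ml aliquot of a 2 × 10⁸ CFU/ml culture contains 10⁸ CFU.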

Multiple Sequence Alignment and Phylogenetic Analysis of Cor Protein

Phylogenetic analysis was performed according to Hall (2011). To avoid misplaced gaps in protein sequences and to increase the reliability of the alignment, multiple sequence alignments of Cor and Llp proteins were performed on their corresponding coding nucleotide sequences aligned by codons (a codon being the sequence of three nucleotides that encodes an amino acid). Multiple sequence alignments were performed in MEGA5 (Tamura et al., 2011) using the MUSCLE (Edgar, 2004) or MAFFT (Katoh et al., 2002) algorithms. To check the reliability of the alignments, the p-distance (1 - the average amino acid identity) was calculated from pairwise comparisons. Phylogenetic trees were estimated and constructed in MEGA5.
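The p-distance used above as a reliability check can be illustrated with a minimal sketch. It assumes pairwise deletion of gapped sites and a simple mean over all sequence pairs; MEGA5 offers several gap-handling options, and the setting used in this study is not stated, so this is not a reproduction of the MEGA5 calculation. The toy sequences are invented for illustration.

```python
from itertools import combinations

GAP = "-"


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped (pairwise deletion,
    an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == GAP or b == GAP:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def overall_p_distance(alignment):
    """Mean p-distance over all pairs of sequences in the alignment."""
    pairs = list(combinations(alignment.values(), 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not taken from the Cor/Llp data set.
toy_alignment = {
    "seqA": "CSNA-KTC",
    "seqB": "CSDAQKTC",
    "seqC": "CTDA-RSC",
}

print(round(overall_p_distance(toy_alignment), 3))
```

A p-distance of 0 means the sequences are identical at every compared site; values approaching 1 indicate that almost no sites are shared.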

Figure 8. Polymerase chain reaction (PCR) cloning. This diagram shows the cloning rationale for the bacteriophage φ80vir cor gene together with the corresponding 0.8% agarose gel electrophoregram. The gene was amplified using the two-primer technique. As shown in the figure, primers were designed based on the determined sequence of the 5’ and 3’ regions of cor, with each primer bearing an EcoRI restriction site sequence to introduce restriction sites immediately before the start codon and after the stop codon, respectively. The corresponding cor fragment (~270 bp) is shown on the gel.

Results and Discussion

The lipoprotein nature of φ80vir Cor was predicted based on protein sequence analyses using the SignalP 3.0, ProtParam, TMHMM 2.0, TatP 1.0, CELLO v.2.5, LipoP 1.0, SecretomeP 2.0, and Phobius programs (Fig. 14). The same analyses yielded a similar profile and characterization for the novel φ80vir lipoprotein MorF (data not shown). These data strongly suggested that both protein precursors had cleavable signal peptides with conserved lipoboxes, and the resultant mature peptides were predicted to be localized in the periplasmic space.
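The idea of locating a lipobox and the signal peptidase II cleavage site can be sketched with a simple pattern search. This is not how LipoP or the other programs listed above work internally; it is a rough illustration that assumes the relaxed lipobox pattern [LVI][ASTVI][GAS]C, which accepts the LTG↓C motif reported for Cor. The precursor sequence below is hypothetical except for that LTGC motif.

```python
import re

# Relaxed lipobox pattern (an assumption of this sketch):
# positions -3, -2, -1 followed by the invariant +1 cysteine.
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_plus_one_cysteine(precursor, max_signal_len=40):
    """Return the 0-based index of the +1 cysteine, or None if no lipobox.

    Only the first max_signal_len residues are searched, since the lipobox
    sits at the end of the N-terminal signal peptide.
    """
    match = LIPOBOX.search(precursor.upper()[:max_signal_len])
    if match is None:
        return None
    return match.end() - 1


def split_precursor(precursor):
    """Split a precursor into (signal peptide, mature lipoprotein)."""
    site = find_plus_one_cysteine(precursor)
    if site is None:
        return None
    return precursor[:site], precursor[site:]


# Hypothetical precursor; only the LTGC motif comes from the text.
toy_precursor = "MKKLLIAVLSLTGCSSNDKTE"

print(find_plus_one_cysteine(toy_precursor))
print(split_precursor(toy_precursor))
```

The mature protein returned by this sketch starts with the +1 cysteine, the residue that is lipid-modified during processing.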

The lysogenic conversion gene cor of φ80vir was cloned into the commercially available multicopy plasmid pBAD24 to create pRA033, which was then transformed into E. coli K-12 strain W3110. When induced with arabinose, the lipoprotein Cor inhibited colicin uptake, iron transport, and phage infection through the FhuA receptor protein (Table 5; Fig. 15) by interacting with FhuA directly rather than via the TonB protein. To show that Cor interacts with FhuA and not with FepA or TonB, several colicins and phages were used in this study: Col M (TonB- and FhuA-dependent), Col E1 (TolC- and BtuB-dependent), Col B (TonB- and FepA-dependent), T1 phage (FhuA- and TonB-dependent), and T5 phage (FhuA-dependent but TonB-independent). Col M, Col E1, and Col B exploit different TBDTs to gain access to the host, but induced W3110/pRA033 cells were tolerant only to the FhuA-dependent Col M (Table 5), suggesting specific inhibition of the FhuA receptor. These cells were also tolerant to the TonB-independent lytic phage T5. The colicin killing assay was not very sensitive; the ferrichrome transport assay provided a more accurate and sensitive characterization of FhuA activity. Interestingly, cells lacking exbBD and tolQR, whose gene products are involved in the formation of the inner membrane pmf-harvesting complexes, exhibited higher background activity than W3110/pRA033 cells (Fig. 15). This might be explained by a different conformation of FhuA in the presence of the Cor lipoprotein. We speculate that Cor might alter the conformation of the FhuA TBDT in a manner that precludes the binding of ferrichrome complexes from the environment to the receptor. In contrast, cells with the ∆exbBD ∆tolQR background (RA1051) can bind ferrichrome but cannot transport it into the cell because the uncomplexed TonB protein is inactive. However, we have not tested this hypothesis by expressing cor from pRA033 in the ∆exbBD ∆tolQR background and measuring iron transport activity.

All these observations suggest a direct interaction between the Cor and FhuA proteins, which can provide a means to further examine the protein-protein interactions that mediate the functions of TBDT proteins. To facilitate the purification of Cor on a Ni-NTA column, we introduced a hexahistidine tag at its carboxyl terminus. Our data indicated that the resultant strain W3110/pRA038, expressing φ80vir Cor-(His)6 from the pBAD24-derived plasmid, was more resistant to FhuA-dependent phage and colicins (even without added arabinose; see Table 5) and more strongly inhibited in ferrichrome transport (data not shown) than the W3110/pRA033 strain. Notably, when induced with arabinose, Cor-(His)6 retarded bacterial growth and at the same time enhanced the blockage of the FhuA receptor. It is still unclear why the carboxyl-terminal (His)6 tag increased both the efficiency of FhuA blockage and toxicity.

Lytic and Lysogenic Conversion: Llp and Cor Lipoproteins

It was previously assumed that the φ80 cor and T5 llp genes were not related, simply because these genes did not share significant similarity in pairwise comparisons. However, based on position-specific iterated BLAST (ψ-BLAST) searches and the experimental evidence obtained in this study, we hypothesized that the lysogenic conversion gene cor and the lytic conversion gene llp are xenologs (i.e., orthologous sequences found in different species as a result of horizontal gene transfer). To test this hypothesis, we initially aligned the ψ-BLAST hits (obtained with more than 10 iterations) for φ80vir Cor and T5 Llp together (Fig. 9), as described in Materials and Methods. Only homologs and some uncharacterized phage lipoproteins that shared similarity were selected for the multiple sequence alignment (MSA). Because lipoprotein signal sequences are highly conserved and are cleaved following protein translocation, they would most likely bias the phylogenetic estimation; therefore, signal peptides were manually removed from the MSA and the resultant sequences were realigned using the MUSCLE (Fig. 10) and MAFFT (Fig. 12) algorithms. Because these proteins do not share strong identity, the two algorithms gave two alternative alignments, and the corresponding trees are shown in Fig. 11 and Fig. 13. These orthologs most likely share structural similarity rather than amino acid identity.

For instance, three conserved cysteine residues are apparent in Cor, T5-like Llp, and T1 Llp (known as Cor in the literature), proteins of phage that infect E. coli cells through TBDTs. However, these lipoproteins share only two conserved cysteinyl residues with the T1-like lipoprotein of phage Rtp, whose receptor has not been identified (Wietzorrek et al., 2006), and with lipoproteins from Podoviridae, Siphoviridae, and unclassified phage infecting other bacterial species. We believe that this difference is most likely associated with host range. As shown in Fig. 11 and Fig. 13, the lipoproteins from these “other” phage cluster separately from the Cor and Llp proteins that block TBDTs. The corresponding receptors are indicated (in parentheses); however, not all of these lipoproteins have been experimentally confirmed to inhibit host receptor activity. As mentioned before, the +1 cysteine residue after the lipobox is essential for incorporation of the mature lipoprotein into the OM from the periplasm. The other two cysteinyl residues were proposed to be involved in the formation of disulfide bridges and subsequent dimerization of the lipoproteins (Pedruzzi et al., 1998); the intramolecular bond is required for Llp function (Robichon et al., 2003). The presence of the last cysteinyl residue (at the carboxyl terminus) correlates with the receptor that the phage uses to infect the bacteria. This cysteinyl residue is present in lipoproteins found in FhuA- and BtuB-dependent phage and absent in other “unrelated” phage. We propose that this region is potentially involved in recognition of the receptor. Although the Cor lipoproteins have similar amino acid compositions, T5 Llp has a different amino acid content (except for the cysteinyl residues at similar positions). Nonetheless, Llp and Cor have the same function. Together, the experimental evidence and the phylogenetic analyses strongly suggest that φ80 cor and T5 llp are xenologs. This means that over the course of evolution, both genes (llp and cor) were separated from a common ancestor by a speciation event and horizontally transferred from one phage to another, unrelated phage. Therefore, knowing the function of Llp and Cor, our finding might be useful in identifying potential receptors for the bacteriophages clustered together with phage Rtp.
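The comparison of conserved cysteine positions described above can be expressed as a small alignment scan. The sketch below uses an invented toy alignment in which two sequences carry three cysteines and a third carries only two; the sequence names and residues are hypothetical and are not the Cor or Llp sequences analyzed in this study.

```python
def conserved_columns(alignment, names=None, residue="C"):
    """Return alignment columns where every selected sequence has `residue`.

    alignment maps sequence names to equal-length aligned strings;
    names optionally restricts the scan to a subset of sequences.
    """
    selected = list(alignment) if names is None else list(names)
    seqs = [alignment[name] for name in selected]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("aligned sequences must have equal length")
    return [i for i in range(length) if all(seq[i] == residue for seq in seqs)]


# Hypothetical mature-protein alignment (signal peptides already removed).
toy_alignment = {
    "tbdt_phage_1": "CSSNDKCTAEQWCK",
    "tbdt_phage_2": "CSTNEKCSAEQYCR",
    "other_phage":  "CASNDRCTGDQW-K",
}

shared_by_all = conserved_columns(toy_alignment)
shared_by_tbdt = conserved_columns(
    toy_alignment, names=["tbdt_phage_1", "tbdt_phage_2"]
)
group_specific = [col for col in shared_by_tbdt if col not in shared_by_all]

print("conserved in all sequences:", shared_by_all)
print("conserved only in the TBDT group:", group_specific)
```

In this toy case the first two cysteines are shared by every sequence, while the carboxyl-proximal cysteine is specific to one group, mirroring the kind of pattern discussed in the text.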


Figure 9. Multiple nucleotide sequence alignment of the xenologous cor and llp genes, translated into protein sequences. Nucleotide sequences were aligned by codons in MEGA5 (Tamura et al., 2011) using the MUSCLE algorithm (Edgar, 2004), and the resulting alignment was visualized with the JalView browser. The reliability of the alignment is acceptable, with an overall p-distance (1 - amino acid identity) of 0.69.

Figure 10. Multiple nucleotide sequence alignment corresponding to the Cor and Llp mature lipoproteins. These protein sequences represent only the mature gene products, i.e., those processed by SPase II. Nucleotide sequences were aligned by codons (i.e., by three coding nucleotides) in MEGA5 (Tamura et al., 2011) using the MUSCLE algorithm (Edgar, 2004), adjusted manually where needed, and visualized with the JalView browser. The reliability of the resulting alignment is acceptable, with an overall p-distance of 0.71.

Figure 11. Maximum Parsimony (MP) analysis of mature Cor and Llp lipoproteins aligned with MUSCLE. The evolutionary history was inferred using the Maximum Parsimony method. The bootstrap consensus tree was constructed from 2000 replicates (Felsenstein, 1985) and represents the evolutionary history of the taxa analyzed. Only bootstrap percentages greater than 50% are shown next to the corresponding nodes on the tree. The most parsimonious rooted tree is shown. The taxa labeled “FhuA-dependent” were used as the outgroup for rooting the tree. The MP tree was obtained using the Close-Neighbor-Interchange algorithm [pg. 128 in (Nei & Kumar, 2000)] with search level 1, in which the initial trees were obtained by the random addition of sequences (10 replicates). The tree is drawn to scale, with branch lengths calculated using the average pathway method [see pg. 132 in (Nei & Kumar, 2000)] and expressed in units of the number of changes over the whole sequence. The analysis involved 19 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions with less than 85% site coverage were eliminated; that is, fewer than 15% alignment gaps, missing data, and ambiguous bases were allowed at any position. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). The asterisk indicates the FhuA receptor of Salmonella that is recognized by phage ES18.
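The 85% site-coverage cutoff mentioned in the caption can be illustrated with a minimal filter. This sketch is not the MEGA5 implementation; it assumes that gaps ("-"), missing data ("?"), and the ambiguous base "N" are the only uninformative symbols, and it uses an invented seven-sequence alignment.

```python
MISSING = set("-?N")  # gap, missing data, ambiguous base (assumed symbols)


def site_coverage(column):
    """Fraction of sequences with an unambiguous base at this column."""
    informative = sum(1 for base in column if base.upper() not in MISSING)
    return informative / len(column)


def filter_by_site_coverage(alignment, min_coverage=0.85):
    """Drop alignment columns whose site coverage is below min_coverage."""
    names = list(alignment)
    columns = list(zip(*alignment.values()))
    kept = [col for col in columns if site_coverage(col) >= min_coverage]
    return {
        name: "".join(col[i] for col in kept) for i, name in enumerate(names)
    }


# Hypothetical toy alignment: column 3 has 5/7 coverage, column 4 has 6/7.
toy_alignment = {
    "s1": "ATGA-C",
    "s2": "ATGAGC",
    "s3": "ATGAGC",
    "s4": "ATGAGC",
    "s5": "ATGAGC",
    "s6": "ATG-GC",
    "s7": "ATG-GC",
}

filtered = filter_by_site_coverage(toy_alignment)
for name, seq in filtered.items():
    print(name, seq)
```

With seven sequences, a column with one gap (about 86% coverage) is retained, whereas a column with two gaps (about 71% coverage) is removed.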


Figure 12. Alternative multiple sequence alignment of the mature Cor and Llp proteins. Nucleotide sequences were aligned using the MAFFT algorithm and adjusted manually, and the resulting alignment was visualized with the JalView browser. The reliability of the alignment is acceptable, with an overall p-distance of 0.73.

Figure 13. Maximum Parsimony (MP) analysis of mature Cor and Llp lipoproteins aligned with MAFFT. The evolutionary history was estimated using the Maximum Parsimony method. The bootstrap consensus tree was constructed from 2000 replicates (Felsenstein, 1985) and represents the evolutionary history of the taxa analyzed. Only bootstrap percentages greater than 50% are shown next to the corresponding nodes on the tree. The most parsimonious rooted tree is shown. The taxa labeled “FhuA-dependent” were used as the outgroup for rooting the tree. The MP tree was obtained using the Close-Neighbor-Interchange algorithm [pg. 128 in (Nei & Kumar, 2000)] with search level 1, in which the initial trees were obtained by the random addition of sequences (10 replicates). The tree is drawn to scale, with branch lengths calculated using the average pathway method [see pg. 132 in (Nei & Kumar, 2000)] and expressed in units of the number of changes over the whole sequence. The analysis involved 19 nucleotide sequences. Codon positions included were 1st+2nd+3rd+Noncoding. All positions with less than 85% site coverage were eliminated; that is, fewer than 15% alignment gaps, missing data, and ambiguous bases were allowed at any position. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). The asterisk indicates the FhuA receptor of Salmonella that is recognized by phage ES18.


Figure 14. Summary of computational analyses of the Cor protein. The picture summarizes the predictions of the several programs indicated above. This protein appears to be a periplasmic lipoprotein with a cleavable signal peptide (SP). The signal peptidase II cleavage site of φ80vir Cor (LTG↓C) strongly resembled the SPase II consensus LA(G,A)↓C (von Heijne, 1989), with a 100% conserved cysteine residue at the +1 position after the cleavage site. The best prediction for a Gram-negative bacterial cell is shown at the top of the plot (between 1 and 1.2).


Figure 15. [⁵⁵Fe³⁺]ferrichrome transport. Data are presented as mean values of triplicates with the corresponding standard deviations. When needed, cells were supplemented with arabinose (+Ara) to a final concentration of ~60 µM (corresponding to 10⁻³% [wt/vol]). W3110[pBAD24] is the wild-type E. coli K-12 strain W3110 transformed with the arabinose-inducible plasmid pBAD24. RA1051 cells are W3110 carrying the ∆exbBD ∆tolQR deletions. The pRA033 plasmid encodes φ80 cor inserted into the EcoRI site of pBAD24.

Table 5. Phage and colicin spot-titer assays.

Strainᵃ         Ara (%)ᵇ   Col Mᶜ     Col E1ᶜ    Col Bᶜ     φ80virᵈ    T5
W3110[pBAD24]   0          4, 5, 5    6, 6, 6    8, 8, 8    7, 7, 7    7, 7, 7
W3110[pBAD24]   0.001      4, 5, 4    6, 6, 6    8, 8, 8    7, 7, 7    7, 7, 7
RA1051          0          R, R, R    R, R, R    R, R, R    R, R, R    -
RA1051          0.001      R, R, R    R, R, R    R, R, R    R, R, R    -
KP1344          0          R, R, R    6, 6, 6    R, R, R    R, R, R    7, 7, 8
KP1344          0.001      R, R, R    6, 6, 6    R, R, R    R, R, R    7, 8, 7
W3110[pRA033]   0          4, 3, 4    6, 6, 6    8, 7, 8    2, 1, 2    2, 1, 2
W3110[pRA033]   0.001      R, R, R    6, 6, 6    8, 7, 7    R, R, R    R, R, R
W3110[pRA038]   0          R, R, R    6, 6, 6    7, 8, 8    R, R, R    R, R, R
W3110[pRA038]   0.001      R, R, R    6, 6, 6    8, 7, 8    R, R, R    R, R, R

ᵃ All bacterial strains used in this experiment are described in the Materials and Methods section.
ᵇ When needed, growth medium and T-top agar were supplemented with arabinose to a final concentration of 10⁻³% (wt/vol) to test the effect of cor gene induction on bacterial sensitivity to phage and colicins.
ᶜ Col M (TonB- and FhuA-dependent); Col E1 (TolC- and BtuB-dependent); Col B (TonB- and FepA-dependent).
ᵈ Aliquots (5 µl) of 10-fold dilutions of a 10¹⁰ pfu/ml φ80vir phage stock were spotted in triplicate onto a bacterial lawn in T-top agar supplemented with 100 µg of ampicillin per ml and incubated overnight at 37ºC.

CHAPTER V. CONCLUSIONS

This dissertation provided new insights and revealed unexpected twists in phage genomics and in the biochemistry of the lipoprotein Cor of enterobacteria phage φ80vir. We sequenced and annotated the 46,285-bp φ80vir chromosome. Unlike wild-type φ80, the virulent mutant used in this study was characterized as φ80h80attλint-imm?lys80 (see Chapter I). More than half of the annotated genes matched those in φ80dlacZ∆M15, indicating the wild-type origin of φ80vir. Because comparative genetics and evolutionary studies depend on the number of sequenced and annotated genomes, we plan to extend our research to the various φ80 mutants commonly used in laboratories. We hope that newly sequenced and annotated genomes of φ80 mutants will provide a better understanding of phage evolution in general and help elucidate the adaptation of φ80 to the laboratory environment.

In Chapter III, we provided evidence for the moron nature of the major capsid protein gene E. Although we do not have experimental evidence from φ80 to confirm our hypothesis, we speculate, based on our data, that the moron nature of this gene could be associated with an uncharacterized secondary function of the major capsid protein during lysogeny. The presence of moron features around gene E also suggests a recombination mechanism that may be involved in “moving” the gene from one phage to another, thereby providing flexibility in packaging phage genomic DNA of different sizes. Further study of the moron nature of the major capsid protein gene E is needed to elucidate its secondary function (if any) and to better understand virus-host interactions. It will be interesting to compare expression profiles of major capsid components in various phage to see whether this phenomenon is conserved among them. The φ80 phage would certainly be a good candidate for such a study.

In Chapters III and IV, we described the xenologous nature of the lysogenic and lytic conversion genes, cor and llp, respectively. The functional similarity of these lipoproteins is evident. Further studies will focus on the purification of Cor and its biochemical analysis. It will be very informative to crystallize the lipoprotein Cor together with the outer membrane transporter protein FhuA. Meanwhile, an in vivo formaldehyde cross-linking study may provide the necessary information on potential sites of interaction between Cor and FhuA; this study, however, will require antibodies against both proteins. We hope that the ongoing projects will provide new insights into FhuA-dependent transport of iron siderophores and colicins and into phage infection. Because Cor is expressed even from a repressed prophage, it raises a concern about applying sideromycins (antibiotics attached to a siderophore molecule) as potential antibiotics against diseases caused by Gram-negative bacteria.

REFERENCES

Ackermann, H-W. 2006. Classification of bacteriophages, p. 8-16 In: Calendar, R (ed).

“The Bacteriophages”. Oxford University Press, NY.

Bachmann, B., and K. Low. 1980. Linkage map of Escherichia coli K-12, Edition-6.

Microbiol. Rev. 44:1-56.

Bassford Jr, P. J., C. Bradbeer, R. J. Kadner and C. A. Schnaitman. 1976. Transport of

vitamin B12 in tonB mutants of Escherichia coli. J. Bacteriol. 128:242-247.

Bell, P. E., C. D. Mau, J. T. Brown, J. Kopisky, and R. J. Kadner. 1990. Genetic

suppression demonstrates direct interaction of TonB protein with outer membrane transport

proteins in Escherichia coli. J. Bacteriol. 172:3826-3829.

Besemer, J., and M. Borodovsky. 2005. GeneMark: web software for gene finding in

prokaryotes, eukaryotes and viruses. Nucleic Acids Res. 33:W451-454.

Blattner, F. R., G. Plunkett III, C. A. Bloch, N. T. Perna, V. Burland, M. Riley, J.

Collado-Vides, J. D. Glasner, C. K. Rode, G. F. Mayhew, J. Gregor, N. W. Davis, H. A.

Kirkpatrick, M. A. Goeden, D. J. Rose, B. Mau, and Y. Shao. 1997. The complete

genome sequence of Escherichia coli K-12. Science 277:1453-1462.

Braun, M., H. Killmann, and V. Braun. 1999. The beta-barrel domain of FhuADelta5-160 is sufficient for TonB-dependent FhuA activities of Escherichia coli. Mol. Microbiol. 33:1037-1049.

Braun, V. 2009. FhuA (TonA), the career of a protein. J. Bacteriol. 191:3431-3436.

Braun, V., and F. Endriss. 2007. Energy-coupled outer membrane transport proteins and

regulatory proteins. Biometals. 20:219-231.

Braun, V., A. Pramanik, T. Gwinner, M. Koberle, and E. Bohn. 2009. Sideromycins: tools and antibiotics. Biometals. 22:3-13.

Braun, V., and K. Rehn. 1969. Chemical characterization, spatial distribution and function of a lipoprotein (murein-lipoprotein) of the E. coli cell wall. The specific effect of trypsin on the membrane structure. Eur. J. Biochem. 10:426-438.

Braun, V., K. Schaller, and H. Wolff. 1973. A common receptor protein for phage T5 and colicin M in the outer membrane of Escherichia coli B. Biochim. Biophys. Acta. 323:87-97.

Braun, V., S. I. Patzer, and K. Hantke. 2002. Ton-dependent colicins and microcins:

modular design and evolution. Biochimie. 84:365-380.

Braun, V., and K. Hantke. 2011. Recent insights into iron import by bacteria. Curr. Opin.

Chem. Biol. 15:328-334.

Brinkman, K. K., and R. A. Larsen. 2008. Interactions of the energy transducer TonB

with noncognate energy-harvesting complexes. J. Bacteriol. 190:421-427.

Canchaya, C. A., M. Ventura, and D. van Sinderen. 2007. Bacteriophage Bioinformatics and Genomics, p. 43-60. In S. McGrath and D. van Sinderen (eds.), “Bacteriophage: Genetics and Molecular Biology”. Caister Academic Press, Norfolk, UK.

Casjens, S. R. 2008. Diversity among the tailed bacteriophage that infect the Enterobacteriaceae. Res. Microbiol. 159:340-348.

Chauthaiwale, V. M., A. Therwath, and V. V. Deshpande. 1992. Bacteriophage lambda as a cloning vector. Microbiol. Rev. 56:577-591.

Chen, Y., I. Golding, S. Sawai, L. Guo, and E. Cox. 2005. Population fitness and the regulation of Escherichia coli genes by bacterial viruses. PLoS Biol. 3:1276-1282. doi: 10.1371/journal.pbio.0030229.

Cramer, T. J. 2008. Genetic Mosaicism Between The Bacteriophage φ80 And

Bacteriophage λ. MS Degree, Biology. Bowling Green State University.

Crosa, J. H., A. R. Mey, and S. M. Payne. 2004. Iron transport in bacteria. ASM Press,

Washington, D.C.

Daniels, D.L., J. L. Schroeder, W. Szybalski, F. Sanger, A.R. Coulson, G. Hong, D. Hill,

G. Petersen, and F. Blattner. 1983. Complete annotated lambda sequence. In: Hendrix

RW, Roberts JW, Stahl FW, Weisberg RA, eds. Lambda II. Cold Spring Harbor, New York:

Cold Spring Harbor Laboratory. pp 519–676.

Darling, A. E., B. Mau, and N. T. Perna. 2010. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 5:e11147. doi: 10.1371/journal.pone.0011147.

d'Hérelle, F. 1918. Technique de la recherche du microbe filtrant bactériophage (Bacteriophagum intestinale). C. R. Soc. Biol. 81:1160-1162.

Dhillon, E. K., T. S. Dhillon, A. N. Lai, and S. Linn. 1980. Host range, immunity and antigenic properties of lambdoid coliphage HK97. J. Gen. Virol. 50:217-220.

Dhillon, T. S., E. K. Dhillon, H. C. Chau, W. K. Li, and A. H. Tsang. 1976. Studies on

bacteriophage distribution: virulent and temperate bacteriophage content of mammalian

feces. Appl. Environ. Microbiol. 32:68-74.

Durfee, T., R. Nelson, S. Baldwin, G. Plunkett III, V. Burland, B. Mau, J. F. Petrosino, X. Qin, D. M. Muzny, M. Ayele, R. A. Gibbs, B. Csorgo, G. Posfai, G. M. Weinstock, and F. R. Blattner. 2008. The complete genome sequence of Escherichia coli DH10B: insights into the biology of a laboratory workhorse. J. Bacteriol. 190:2597-2606.

Edgar, R. C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res. 32:1792-1797.

Erill, I., M. Escribano, S. Campoy, and J. Barbe. 2003. In silico analysis reveals

substantial variability in the gene contents of the gamma proteobacteria LexA-regulon.

Bioinformatics. 19:2225-2236.

Felsenstein, J. 1985. Confidence-Limits on Phylogenies - an Approach using the Bootstrap.

Evolution. 39:783-791.

Fiandt, M., Z. Hradecna, H. A. Lozeron, and W. Szybalski. 1971. Electron Micrographic Mapping of Deletions, Insertions, Inversions, and Homologies in the DNAs of Coliphages Lambda and Phi 80, p. 329-354. In A. D. Hershey (ed.), The Bacteriophage Lambda vol. 2. CSHL Press, Cold Spring Harbor, N.Y., U.S.A.

Franklin, N. C., W. E. Dove, and C. Yanofsky. 1965. The linear insertion of a prophage into the chromosome of E. coli shown by deletion mapping. Biochem. Biophys. Res. Commun. 18:910-923.

Garcia-Herrero, A., R. S. Peacock, S. P. Howard, and H. J. Vogel. 2007. The solution structure of the periplasmic domain of the TonB system ExbD protein reveals an unexpected structural homology with siderophore-binding proteins. Mol. Microbiol. 66:872-889.

Garen, A., and T. T. Puck. 1951. The first two steps of the invasion of host cells by bacterial viruses. II. J. Exp. Med. 94:177-189.

Gordon, L., A. Y. Chervonenkis, A. J. Gammerman, I. A. Shahmuradov, and V. V.

Solovyev. 2003. Sequence alignment kernel for recognition of promoter regions.

Bioinformatics. 19:1964-1971.

Gottesman, M. E., and R. A. Weisberg. 2004. Little Lambda, who made thee? Microbiol. Mol. Biol. Rev. 68:796-813.

Gresock, M. G., M. I. Savenkova, R. A. Larsen, A. A. Ollis, and K. Postle. 2011. Death of the TonB Shuttle Hypothesis. Front. Microbiol. 2:206. doi: 10.3389/fmicb.2011.00206.

Günter, K., and V. Braun. 1990. In vivo evidence for FhuA outer membrane receptor interaction with the TonB inner membrane protein from Escherichia coli. FEBS Lett. 274:409-415.

Guzman, L. M., D. Belin, M. J. Carson, and J. Beckwith. 1995. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 177:4121-4130.

Hall, B. G. 2011. Phylogenetic trees made easy: a how-to manual. Sinauer Associates,

Sunderland, Mass.

Hankin, E. H. 1896. L'action bactericide des eaux de la Jumna et du Gange sur le vibrion du cholera. Annales De l'Institut Pasteur. 511-523.

Hancock, R. W., and V. Braun. 1976. Nature of the energy requirement for the irreversible adsorption of bacteriophages T1 and phi80 to Escherichia coli. J. Bacteriol. 125:409-415.

Hartney, S. L., Mazurier, S., Kidarsa, T. A., Quecine, M. C., Lemanceau, P., and J. E.

Loper. 2011. TonB-dependent outer-membrane proteins and siderophore utilization in

Pseudomonas fluorescens Pf-5. Biometals 24: 193-213.

Hatfull, G. F. and R. W. Hendrix. 2011. Bacteriophages and their genomes. Curr. Opin.

Virol. 1:298-303.

Hayes, S., M. A. Horbay, and C. Hayes. 2012. A cI-independent form of replicative inhibition: turn off of early replication of bacteriophage lambda. PLoS One. 7:e36498. doi: 10.1371/journal.pone.0036498.

Heller, K. J., R. J. Kadner, and K. Günter. 1988. Suppression of the btuB451 mutation by mutations in the tonB gene suggests a direct interaction between TonB and TonB-dependent receptor proteins in the outer membrane of Escherichia coli. Gene 64:147-153.

Hendrix, R. W. 2003. Bacteriophage genomics. Curr. Opin. Microbiol. 6:506-511.

Hendrix, R. W. and R. L. Duda. 1992. Bacteriophage lambda PaPa: not the mother of all lambda phages. Science. 258:1145-1148.

Hendrix, R. W., J. G. Lawrence, G. F. Hatfull, and S. Casjens. 2000. The origins and ongoing evolution of viruses. Trends Microbiol. 8:504-508.

Hendrix, R. W., and S. Casjens. 2006. Bacteriophage λ and its genetic neighborhood, p.

409-447. In: Calendar, R. (ed). “The Bacteriophages”. Oxford University Press, NY.

Henry, M. S. 2008. Characterization of a Lambdoid Phage Gene Encoding a Host Cell

Attachment Spike. MS Thesis, Biology. Bowling Green State University.

Henthorn, K. S., and D. I. Friedman. 1995. Identification of related genes in phages phi

80 and P22 whose products are inhibitory for phage growth in Escherichia coli IHF mutants.

J. Bacteriol. 177:3185-3190.

Hernandez-Sanchez, J., A. Bautista-Santos, L. Fernandez, R. M. Bermudez-Cruz, A. Uc-Mass, E. Martinez-Penafiel, M. A. Martinez, J. Garcia-Mena, G. Guarneros, and L. Kameyama. 2008. Analysis of some phenotypic traits of feces-borne temperate lambdoid bacteriophages from different immunity groups: a high incidence of cor+, FhuA-dependent phages. Arch. Virol. 153:1271-1280.

Hibbing, M. E., C. Fuqua, M. R. Parsek, and S. B. Peterson. 2010. Bacterial competition: surviving and thriving in the microbial jungle. Nat. Rev. Microbiol. 8:15-25.

Higgs, P. I., R. A. Larsen, and K. Postle. 2002. Quantification of known components of the Escherichia coli TonB energy transduction system: TonB, ExbB, ExbD and FepA. Mol. Microbiol. 44:271-281.

Hill, C. W., and B. W. Harnish. 1981. Inversions between ribosomal-RNA genes of Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 78:7069-7072.

Igarashi, K., S. Hiraga, and T. Yura. 1967. A Deoxythymidine Kinase Deficient Mutant of Escherichia coli. II. Mapping and Transduction Studies with Phage Phi80. Genetics. 57:643.

Innis, M. A., M. Tokunaga, M. E. Williams, J. M. Loranger, S. Y. Chang, S. Chang, and

H. C. Wu. 1984. Nucleotide sequence of the Escherichia coli prolipoprotein signal peptidase (lsp) gene. Proc. Natl. Acad. Sci. U. S. A. 81:3708-3712.

Inukai, M., M. Takeuchi, K. Shimizu, and M. Arai. 1978. Mechanism of action of globomycin. J. Antibiot. (Tokyo). 31:1203-1205.

Judson, H. F. 1996. The Eighth Day of Creation. Cold Spring Harbor Laboratory Press,

Cold Spring Harbor, NY. 715 pp

Juhala, R. J., M. E. Ford, R. L. Duda, A. Youlton, G. F. Hatfull, and R. W. Hendrix. 2000. Genomic sequences of bacteriophages HK97 and HK022: pervasive genetic mosaicism in the lambdoid bacteriophages. J. Mol. Biol. 299:27-51.

Kaiser, A. D., and D. S. Hogness. 1960. The transformation of Escherichia coli with deoxyribonucleic acid isolated from bacteriophage lambda-dg. J. Mol. Biol. 2:392-415.

Kameyama, L., L. Fernandez, J. Calderon, A. Ortiz-Rojas, and T. A. Patterson. 1999. Characterization of wild lambdoid bacteriophages: detection of a wide distribution of phage immunity groups and identification of a nus-dependent, nonlambdoid phage group. Virology. 263:100-111.

Katoh, K., K. Misawa, K. Kuma, and T. Miyata. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059-3066.

Kitao, S., and E. Nakano. 1988. Nucleotide sequence from bacteriophage phi 80 with high homology to the major coat protein gene of lambda. Nucleic Acids Res. 16:764.

Kosic, N., M. Sugai, C. K. Fan, and H. C. Wu. 1993. Processing of lipid-modified prolipoprotein requires energy and sec gene products in vivo. J. Bacteriol. 175:6113-6117.

Köster, W., and V. Braun. 1990. Iron(III) hydroxamate transport of Escherichia coli: restoration of iron supply by coexpression of the N- and C-terminal halves of the cytoplasmic membrane protein FhuB cloned on separate plasmids. Mol. Gen. Genet. 223:379-384.

Kozyrev, D. P., M. G. Djus, and V. N. Rybchin. 1983. Lysogenic conversion caused by phage phi 80. II. Mapping of locus cor. Genetika. 19:940.

Kozyrev, D. P., and V. N. Rybchin. 1987. Lysogenic conversion caused by phage phi 80. III. The mapping of the conversion gene and additional characterization of the phenomenon. Genetika. 23:793.

Kozyrev, D. P., A. N. Svarchevskii, E. N. Zaitsev, and V. N. Rybchin. 1982. Lysogenic conversion induced by phages phi 80. I. A description of the phenomenon and the cloning of the conversion gene. Genetika. 18:555-560.

Larsen, R. A. and K. Postle. 2001. Conserved residues Ser(16) and His(20) and their relative positioning are essential for TonB activity, cross-linking of TonB with ExbB, and the ability of TonB to respond to proton motive force. J. Biol. Chem. 276:8111-8117.

Larsen, R. A., M. G. Thomas, and K. Postle. 1999. Protonmotive force, ExbB and ligand-bound FepA drive conformational changes in TonB. Mol. Microbiol. 31:1809-1824.

Larsen, R. A., G. J. Chen, and K. Postle. 2003. Performance of standard phenotypic assays for TonB activity, as evaluated by varying the relative levels of functional, wild-type TonB. J. Bacteriol. 185:4699-4706.

Larsen, R. A., T. E. Letain, and K. Postle. 2003. In vivo evidence of TonB shuttling between the cytoplasmic and outer membrane in Escherichia coli. Mol. Microbiol. 49:211-218.

Leong, J. M., S. E. Nunes-Duby, A. B. Oser, C. F. Lesser, P. Youderian, M. M. Susskind, and A. Landy. 1986. Structural and regulatory divergence among site-specific recombination genes of lambdoid phage. J. Mol. Biol. 189:603-616.

Levin, M. E., R. W. Hendrix, and S. R. Casjens. 1993. A programmed translational frameshift is required for the synthesis of a bacteriophage lambda tail assembly protein. J. Mol. Biol. 234:124-139.

Lohmiller, S., K. Hantke, S. I. Patzer, and V. Braun. 2008. TonB-dependent maltose transport by Caulobacter crescentus. Microbiology. 154:1748-1754.

Loomis, L. D., and K. N. Raymond. 1991. Solution equilibria of enterobactin and metal-enterobactin complexes. Inorg. Chem. 30:906-911.

Luria, S. E. and M. Delbruck. 1943. Mutations of Bacteria from Virus Sensitivity to Virus Resistance. Genetics. 28:491-511.

Matsumoto, M., N. Ichikawa, S. Tanaka, T. Morita, and A. Matsushiro. 1985. Molecular cloning of phi-80 adsorption-inhibiting cor gene. Jpn. J. Genet. 60:475-483.

Matsushiro, A. 1961. Isolation of UV-inducible temperate phage phi80. Biken J. 4:133-135.

Matsushiro, A. 1963. Specialized transduction of tryptophan markers in Escherichia coli K12 by bacteriophage phi-80. Virology. 19:475-482.

Mende, J. and V. Braun. 1990. Import-defective colicin B derivatives mutated in the TonB box. Mol. Microbiol. 4:1523.

Miller, J. H. 1972. Experiments in molecular genetics. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Moeck, G. S., J. W. Coulton, and K. Postle. 1997. Cell envelope signaling in Escherichia coli: Ligand binding to the ferrichrome-iron receptor FhuA promotes interaction with the energy-transducing protein TonB. J. Biol. Chem. 272:28391-28397.

Morse, M. L., E. M. Lederberg, and J. Lederberg. 1956. Transduction in Escherichia coli K-12. Genetics. 41:142-156.

Narita, S. 2011. ABC Transporters Involved in the Biogenesis of the Outer Membrane in Gram-Negative Bacteria. Biosci. Biotechnol. Biochem. 75:1044-1054.

Nikaido, H. 2003. Molecular basis of bacterial outer membrane permeability revisited. Microbiol. Mol. Biol. Rev. 67:593-656.

Noinaj, N., M. Guillier, T. J. Barnard, and S. K. Buchanan. 2010. TonB-dependent transporters: regulation, structure, and function. Annu. Rev. Microbiol. 64:43-60.

Ogawa, T., H. Masukata, and J. Tomizawa. 1988. Transcriptional Regulation of Early Functions of Bacteriophage-Phi-80. J. Mol. Biol. 202:551-563.

Ogawa, T., H. Masukata, and J. Tomizawa. 1988. Organization of the early region of bacteriophage phi-80. Genes and proteins. J. Mol. Biol. 202:537-550.

Pawelek, P. D., N. Croteau, C. Ng-Thow-Hing, C. M. Khursigara, N. Moiseeva, M. Allaire, and J. W. Coulton. 2006. Structure of TonB in complex with FhuA, E. coli outer membrane receptor. Science. 312:1399-1402.

Pedruzzi, I., J. P. Rosenbusch, and K. P. Locher. 1998. Inactivation in vitro of the Escherichia coli outer membrane protein FhuA by a phage T5-encoded lipoprotein. FEMS Microbiol. Lett. 168:119-125.

Postle, K., and R. F. Good. 1983. DNA sequence of the Escherichia coli tonB gene. Proc. Natl. Acad. Sci. U. S. A. 80:5235-5239.

Postle, K., and R. J. Kadner. 2003. Touch and go: tying TonB to transport. Mol. Microbiol. 49:869-882.

Ptashne, M. 2004. A genetic switch, Third Edition: Phage Lambda Revisited. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. 164 pp.

Randall-Hazelbauer, L., and M. Schwartz. 1973. Isolation of the bacteriophage lambda receptor from Escherichia coli. J. Bacteriol. 116:1436-1446.

Raymond, K. N., E. A. Dertz, and S. S. Kim. 2003. Enterobactin: an archetype for microbial iron transport. Proc. Natl. Acad. Sci. U. S. A. 100:3584-3588.

Roberts, M. D., N. L. Martin, and A. M. Kropinski. 2004. The genome and proteome of coliphage T1. Virology. 318:245-266.

Robichon, C., M. Bonhivers, and A. P. Pugsley. 2003. An intramolecular disulphide bond reduces the efficacy of a lipoprotein plasma membrane sorting signal. Mol. Microbiol. 49:1145-1154.

Rybchin, V. N. 1984. Genetics of bacteriophage phi 80--a review. Gene. 27:3-11.

Sambrook, J. and D. W. Russell. 2001. Molecular cloning: a laboratory manual, 3rd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Samsygina, G. A., and E. G. Boni. 1984. Bacteriophages and phage therapy in pediatric practice. Pediatrica. 4:67-70.

Sanger, F., A. R. Coulson, G. F. Hong, D. F. Hill, and G. B. Petersen. 1982. Nucleotide sequence of bacteriophage lambda DNA. J. Mol. Biol. 162:729-773.

Schauer, K., B. Gouget, M. Carriére, A. Labigne, and H. de Reuse. 2007. Novel nickel transport mechanism across the bacterial outer membrane energized by the TonB/ExbB/ExbD machinery. Mol. Microbiol. 63:1054-1068.

Schauer, K., D. A. Rodionov, and H. de Reuse. 2008. New substrates for TonB-dependent transport: do we only see the 'tip of the iceberg'? Trends Biochem. Sci. 33:330-338.

Schneider, T. D., G. D. Stormo, L. Gold, and A. Ehrenfeucht. 1986. Information Content of Binding Sites on Nucleotide Sequences. J. Mol. Biol. 188:415-431.

Schöffler, H., and V. Braun. 1989. Transport across the outer membrane of Escherichia coli via the FhuA receptor is regulated by the TonB protein of the cytoplasmic membrane. Mol. Gen. Genet. 217:378-383.

Schramm, E., J. Mende, V. Braun, and R. M. Kamp. 1987. Nucleotide sequence of the colicin B activity gene cba: consensus pentapeptide among TonB-dependent colicins and receptors. J. Bacteriol. 169:3350.

Seydel, A., P. Gounon, and A. P. Pugsley. 1999. Testing the '+2 rule' for lipoprotein sorting in the Escherichia coli cell envelope with a new genetic selection. Mol. Microbiol. 34:810-821.

Shaw, J. E., H. Bingham, C. R. Fuerst, and M. L. Pearson. 1977. The multisite character of host-range mutations in bacteriophage λ. Virology. 83:180-194.

Shinagawa, H., Y. Hosaka, H. Yamagishi, and Y. Nishi. 1966. Electron microscopic studies on phi80 and phi80pt1 phage virions and their DNA. Biken J. 9:135-148.

Shulman, M. J., K. Mizuuchi, and M. M. Gottesman. 1976. New att mutants of phage lambda. Virology. 72:13-22.

Signer, E. R. 1964. Recombination between coliphages lambda and phi80. Virology. 22:650-651.

Signer, E. R., and J. R. Beckwith. 1966. Transposition of lac region of Escherichia coli. III. Mechanism of attachment of bacteriophage phi 80 to the bacterial chromosome. J. Mol. Biol. 22:33-51.

Signer, E. R., and V. N. Rybchin. 1967. Recombination between prophages of Escherichia coli. Genetika. 3:114-121.

Skare, J. T., R. M. M. Ahmer, C. L. Seachord, R. P. Darveau, and K. Postle. 1993. Energy transduction between membranes – TonB, a cytoplasmic membrane protein, can be chemically cross-linked in vivo to the outer membrane receptor FepA. J. Biol. Chem. 268:16302-16308.

Snyder, L., and Champness, W. 2007. Molecular Genetics of Bacteria, 3rd Edition. ASM Press, Washington, D.C. 735 pp.

Stoof, J., E. J. Kuipers, and A. H. van Vliet. 2010. An ABC transporter and a TonB ortholog contribute to Helicobacter mustelae nickel and cobalt transport. Infect. Immun. 78:4261-4267.

Tamura, K., D. Peterson, N. Peterson, G. Stecher, M. Nei, and S. Kumar. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28:2731-2739.

Tanaka, S., and A. Matsushiro. 1985. Characterization and sequencing of the region containing gene N, the nutL site and tL1 terminator of bacteriophage phi 80. Gene. 38:119-129.

Taylor, W. and C. Yanofsky. 1964. Transformation of bacterial markers and transfer of phage markers with DNA isolated from a λ−φ80 hybrid phage carrying the tryptophan genes of E. coli. Biochem. Biophys. Res. Comm. 17:798-804.

Terada, M., T. Kuroda, S. I. Matsuyama, and H. Tokuda. 2001. Lipoprotein sorting signals evaluated as the LolA-dependent release of lipoproteins from the cytoplasmic membrane of Escherichia coli. J. Biol. Chem. 276:47690-47694.

Torres, A. G., P. Redford, R. A. Welch, and S. M. Payne. 2001. TonB-dependent systems of uropathogenic Escherichia coli: aerobactin and heme transport and TonB are required for virulence in the mouse. Infect. Immun. 69:179-185.

Tuckman, M., and M. S. Osburne. 1992. In vivo inhibition of TonB-dependent processes by a TonB box consensus pentapeptide. J. Bacteriol. 174:320.

Uc-Mass, A., E. J. Loeza, M. de la Garza, G. Guarneros, J. Hernandez-Sanchez, and L. Kameyama. 2004. An orthologue of the cor gene is involved in the exclusion of temperate lambdoid phages. Evidence that Cor inactivates FhuA receptor functions. Virology. 329:425-433.

Vakharia, H. L., and K. Postle. 2002. FepA with globular domain deletions lacks activity. J. Bacteriol. 184:5508-5512.

Vasinova, N. A., D. P. Kozyrev, and V. N. Rybchin. 1987. Physical mapping of the immunity region of phage phi 80. Genetika. 23:389-396.

Vica Pacheco, S., O. Garcia Gonzalez, and G. L. Paniagua Contreras. 1997. The lom gene of bacteriophage lambda is involved in Escherichia coli K12 adhesion to human buccal epithelial cells. FEMS Microbiol. Lett. 156:129-132.

von Heijne, G. 1985. Signal sequences. The limits of variation. J. Mol. Biol. 184:99-105.

von Heijne, G. 1989. The structure of signal peptides from bacterial lipoproteins. Protein Eng. 2:531-534.

Vostrov, A. A., O. A. Vostrukhina, A. N. Svarchevsky, and V. N. Rybchin. 1996. Proteins Responsible for Lysogenic Conversion Caused by Coliphages N15 and phi80 Are Highly Homologous. J. Bacteriol. 178:1484-1486.

Wang, J., M. Hofnung, and A. Charbit. 2000. The C-terminal portion of the tail fiber protein of bacteriophage Lambda is responsible for binding to LamB, its receptor at the surface of Escherichia coli K-12. J. Bacteriol. 182:508-512.

Wang, J., Y. Jiang, M. Vincent, Y. Sun, H. Yu, Q. Bao, H. Kong, and S. Hu. 2005. Complete genome sequence of bacteriophage T5. Virology. 332:45-65.

Watanabe, T., S. Hayashi, and H. C. Wu. 1988. Synthesis and export of the outer membrane lipoprotein in Escherichia coli mutants defective in generalized protein export. J. Bacteriol. 170:4001-4007.

Wayne, R., and J. B. Neilands. 1975. Evidence for common binding sites for ferrichrome compounds and bacteriophage phi 80 in the cell envelope of Escherichia coli. J. Bacteriol. 121:497-503.

Weiner, M. C. 2005. TonB-dependent outer membrane transport: going for Baroque? Curr. Opin. Struct. Biol. 15:394-400.

Weisberg, R., and A. Landy. 1983. Site-specific recombination in phage lambda. In R. Hendrix, J. Roberts, F. Stahl, and R. Weisberg (eds.), Lambda II, pp. 211-250. Cold Spring Harbor Press, NY.

White, D. 2007. The physiology and biochemistry of prokaryotes. Oxford University Press, New York. 628 pp.

Wietzorrek, A., H. Schwarz, C. Herrmann, and V. Braun. 2006. The genome of the novel phage Rtp, with a rosette-like tail tip, is homologous to the genome of phage T1. J. Bacteriol. 188:1419-1436.

Wommack, K. E., and R. R. Colwell. 2000. Virioplankton: viruses in aquatic ecosystems. Microbiol. Mol. Biol. Rev. 64:69-114.

Wu, H. C. 1996. Biosynthesis of lipoproteins, p. 1005-1014. In F. C. Neidhardt (ed.), Escherichia coli and Salmonella: Cellular and Molecular Biology vol. 1. American Society for Microbiology Press, Washington D.C.

Wu, R., and E. Taylor. 1971. Nucleotide sequence analysis of DNA. II. Complete nucleotide sequence of the cohesive ends of bacteriophage lambda DNA. J. Mol. Biol. 57:491-511.

Xu, J., R. W. Hendrix, and R. L. Duda. 2004. Conserved translational frameshift in dsDNA bacteriophage tail assembly genes. Mol. Cell. 16:11-21.

Yakushi, T., K. Masuda, S. Narita, S. Matsuyama, and H. Tokuda. 2000. A new ABC transporter mediating the detachment of lipid-modified proteins from membranes. Nat. Cell Biol. 2:212-218.

Yamagishi, H., K. Nakamura, and H. Ozeki. 1965. Cohesion occurring between DNA molecules of temperate phages phi 80 and lambda or phi 81. Biochem. Biophys. Res. Commun. 20:727-732.

Yamaguchi, K., F. Yu, and M. Inouye. 1988. A single amino acid determinant of the membrane localization of lipoproteins in E. coli. Cell. 53:423-432.

Yokota, N., T. Kuroda, S. Matsuyama, and H. Tokuda. 1999. Characterization of the LolA-LolB system as the general lipoprotein localization mechanism of Escherichia coli. J. Biol. Chem. 274:30995-30999.

Youderian, P., and J. King. 1981. New genes in the left arm of the bacteriophage phi 80 chromosome. J. Virol. 37:976-986.

Youderian, P. 1978. Genetic control of the length of lambdoid phage tails. Ph.D. Dissertation. Massachusetts Institute of Technology, Cambridge.

Zeth, K., C. Romer, S. I. Patzer, and V. Braun. 2008. Crystal structure of colicin M, a novel phosphatase specifically imported by Escherichia coli. J. Biol. Chem. 283:25324-25331.

Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31:3406-3415.