Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations

2018 Comparative genomics and phylogenetic assignment of Extraintestinal Pathogenic Escherichia coli Daniel Ward Nielsen Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Bioinformatics Commons, and the Microbiology Commons

Recommended Citation Nielsen, Daniel Ward, "Comparative genomics and phylogenetic assignment of Extraintestinal Pathogenic Escherichia coli" (2018). Graduate Theses and Dissertations. 17278. https://lib.dr.iastate.edu/etd/17278

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Comparative genomics and phylogenetic assignment of Extraintestinal Pathogenic Escherichia coli

by

Daniel Ward Nielsen

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Microbiology

Program of Study Committee: Catherine M. Logue, Co-major Professor Qijing Zhang, Co-major Professor Heather K. Allen Adina Howe F. Chris Minion

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University

Ames, Iowa

2018

Copyright © Daniel Ward Nielsen, 2018. All rights reserved. ii

TABLE OF CONTENTS

Page

LIST OF FIGURES ...... iii

LIST OF TABLES ...... v

NOMENCLATURE ...... vi

ACKNOWLEDGMENTS ...... viii

ABSTRACT ...... ix

CHAPTER 1. INTRODUCTION ...... 1

CHAPTER 2. THE IMPACT OF MEDIA, PHYLOGENETIC CLASSIFICATION, AND E. COLI PATHOTYPES ON BIOFILM FORMATION IN EXTRAINTESTINAL AND COMMENSAL E. COLI FROM HUMANS AND ANIMALS ...... 66

CHAPTER 3. ASSOCIATION OF POLYMORPHISMS IN OUTER MEMBRANE A WITH EXTRAINTESTINAL PATHOGENIC ESCHERICHIA COLI’S PHYLOGENETIC GROUPS AND SEQUENCE TYPES ...... 102

CHAPTER 4. COMPLETE GENOME SEQUENCE OF AVIAN PATHOGENIC ESCHERICHIA COLI STRAIN APEC O2-211 ...... 138

CHAPTER 5. COMPLETE GENOME SEQUENCE OF THE MULTI-DRUG RESISTANT NEONATAL MENINGITIS ESCHERICHIA COLI SEROTYPE O75:H5:K1 STRAIN MCJCHV-1 (NMEC-O75)...... 143

CHAPTER 6. GENOMIC CHARACTERIZTION OF THE NEONATAL MENINGITIS ESCHERICHIA COLI (NMEC) SUBPATHOTYPE USING WHOLE GENOME SEQUENCING ...... 149

CHAPTER 7. CONCLUSION ...... 193 iii

LIST OF FIGURES

Page

Figure 1.1: Phylogenetic groupings between the different ExPEC subpathotypes...... 40

Figure 2.1: k-means Clustering of E. coli by Media ...... 75

Figure 2.2: Prevalence of Biofilm Formation in ExPEC subpathotypes and E. coli Commensals...... 76

Figure 2.3: Optical Density of ExPEC and Commensal E. coli Strains...... 77

Figure 2.4: Optical Density of E. coli isolates as Designated Using Clermont’s Original Phylogenetic Groups...... 81

Figure 2.5: Optical Density of E. coli isolates as Designated Using Clermont’s Revised Phylogenetic Groups...... 84

Figure 2.6: Prevalence of Biofilm Formation by E. coli classified using Clermont’s Revised Phylogenetic Groups in Each Medium...... 85

Figure 3.1: Polymorphisms present in OmpA (3.1A) and their location within the OmpA protein (3.1B)...... 110

Figure 3.2: OmpA amino acid polymorphism patterns (x-axis) as associated with Clermont’s revised phylogenetic groups for the E. coli strains analyzed in the present study that were accessed from GenBank (n=493)...... 115

Figure 3.3: OmpA of E. coli strains from GenBank (n=493) can differ by their phylogenetic group (x-axis) when examined between the (3A) linker/dimerization domain and (3B) N-terminal domain...... 116

Figure 3.4: Association of polymorphism patterns with Warwick sequence types (x- axis) for publicly-available E. coli (n=493) (25)...... 119

Figure 3.5: Polymorphism patterns (x-axis) and prevalence (y-axis) of each pattern for APEC (n=171), NMEC (n=80), and UPEC (n=148)...... 120

Figure 3.6: ExPEC subpathotypes polymorphisms differ across the phylogenetic groups (facetted plots) by their (6A) linker/dimerization and (6B) N- terminal domains ...... 122

Figure 3.7: Polymorphism differences occur between the ExPEC subpathotypes (x- axis) when separated by their serogroups (facetted plots)...... 125 iv

Figure 6.1: Genomes of the NMEC subpathotype show diversity of virulence- associated genes...... 156

Figure 6.2: Maximum-Likelihood phylogenetic tree of NMEC core genome alignment...... 158

Figure 6.3: Presence of the T2SS genes among NMEC and K-12 isolates (W3110 and MG1655...... 159

Figure 6.4: NMEC clusters into four clades by the presence of conserved genes...... 162

Figure 6.5: Complete or nearly-complete sequenced NMEC chromosomes relative to NMEC strain RS218...... 165

Figure 6.6: Plasmid replicons within the NMEC genomes...... 166

Figure 6.7: NMEC ColV plasmids are similar to each other and other APEC ColV plasmids, including the prototypical ColV plasmid pAPEC-O2-ColV (dark blue) (65)...... 169 v

LIST OF TABLES

Page

Table 2.1: Isolates used in this study as classified by subpathotype or commensal category and Clermont’s phylogenetic typing schemes...... 72

Table 2.2: Chi-square Discrete Categorization of E. coli by Clermont’s Original Phylogenetic Groups*...... 80

Table 3.1: Phylogenetic groups of the ExPEC and publicly available E. coli chromosomes examined in this study...... 108

Table 3.2: Historic OmpA variant types ...... 111

Table 3.3: Liao et al. (20) polymorphism patterns identified by the total collection in this study and in Liao et al...... 112

Table 3.4: Polymorphism patterns for the different domains and their identifying names...... 113

Table 6.1: Conserved genes unique to each clade with adhesins, toxins, iron regulation, and lipoproteins shown...... 163

Table 6.2: Complete and draft chromosomes for 13 NMEC strains and reference chromosomes...... 164

Table 6.3: Large plasmids carried by NMEC isolates...... 167

vi

NOMENCLATURE

AIEC Adherent Invasive E. coli

APEC Avian Pathogenic E. coli

BBB Blood-Brain Barrier

BHI Brain Heart Infusion

BMEC Brain Microvascular Endothelial Cells

CDEC Cell Detaching E. coli

CSF Cerebrospinal Fluid

CNS Central Nervous System

DAEC Diffusely Adherent E. coli

E. coli Escherichia coli

EHEC Enterhemorrhagic E. coli

EIEC Enteroinvasive E. coli

EPEC Enteropathogenic E. coli

ETEC Enterotoxigenic E. coli

ExPEC Extraintestinal Pathogenic E. coli

HBMEC Human Brain Microvascular Endothelial Cells

HUS Hemolytic Uremic Syndrome

InPEC Intestinal Pathogenic E. coli

LB Luria Broth

MLST Multilocus Sequence Typing

NBM Neonatal bacterial meningitis

NMEC Neonatal Meningitis E. coli vii

OmpA Outer Membrane Protein A

PAI Pathogenicity Island

ST Sequence Type

TSB Tryptic Soy Broth

UPEC Uropathogenic E. coli

UTI Urinary Tract Infection

VEC Vaginal E. coli

WGS Whole Genome Sequencing

viii

ACKNOWLEDGMENTS

Although I have had substantial support from many individuals throughout graduate school, I would specifically like to thank Drs. Catherine M. Logue, Lisa K.

Nolan, and Heather K. Allen, for providing opportunities to explore science, teaching me laboratory techniques, and exposing me to different laboratory experiences. I would also like to thank Nicole Ricker, Yvonne Wannemuehler, and Nicolle L. Barbieri for their patience and support as they all had the great misfortune of sharing office space with me.

I would also like to acknowledge Dr. Qijing Zhang for serving as my co-major professor and Drs. Adina Howe and Chris Minion for serving on my committee; all of these individuals have committed their time, expertise, and knowledge. Additionally, I would like to recognize James S. Klimavicz who assisted me in troubleshooting programming problems and encouraged me to be a better programmer. Finally, I would like to thank my parents, family, and friends for their support. ix

ABSTRACT

Extraintestinal pathogenic Escherichia coli (ExPEC) are a group of E. coli associated with disease outside the intestinal tract of the host. These organisms are known to cause a diversity of disease states in humans and animals. ExPEC are often divided into subpathotypes including Neonatal Meningitis E. coli (NMEC), associated with neonatal bacterial meningitis (NBM) in humans; uropathogenic E. coli (UPEC), associated with urinary tract infections in both humans and animals; and Avian

Pathogenic E. coli (APEC), which causes colibacillosis in poultry. However, in vivo studies have demonstrated that the subpathotypes are more fluid than their names recognize. This work compares ExPEC subpathotypes to assess phenotypic and genomic differences amongst them. Specifically, differences between the subpathotypes are assessed via a phenotypic biofilm assay and polymorphisms in a known virulence factor, outer membrane protein A (OmpA). The results indicate that statistically significant similarities and differences were found between subpathotypes in their polymorphisms of

OmpA and ability to form biofilms in certain media. Statistically significant differences were observed when the chromosomal differences of these E. coli are assessed via

Clermont’s phylogenetic typing scheme. Both OmpA polymorphisms and biofilm formation results suggest that the phylogenetic groups are important when attempting to make generalizations about ExPEC subpathotypes. To examine an ExPEC subpathotype in greater resolution than traditional typing schemes allow, NMEC was chosen for a large sequencing study as NBM is a devastating disease that affects newborns resulting in sepsis and meningitis and mortality rate ranging from 15-40%. An examination of 58

NMEC genomic sequences found that NMEC had no conserved chromosomal type, as x indicated via their serogroups, sequence types, phylogenetic groups, or identification of four NMEC clades based on the presence of conserved genes for each clade. Nineteen completed or nearly completed NMEC chromosomes concurred with this conclusion. An examination of NMEC plasmids revealed significant diversity among the Inc replicon types including the presence of Col replicons, yet IncFIB replicons were prevalent in most NMEC genomes. The NMEC sequencing data suggests that the NMEC subpathotype has considerable variation and that no strain should be considered prototypical.

1

CHAPTER 1. INTRODUCTION

Escherichia coli: The Bacteria of Pathological Variants

Escherichia coli is an important component of the gut microbiota as it is one of the initial colonizers of the gut after birth (1). As an initial colonizer, E. coli acts as a pioneer species by consuming the initial oxygen in the gut, so other obligate anaerobes can colonize (1-4). Later in life, E. coli may contribute to the health of the gut by enhancing the mucosal integrity of the gut and can be consumed as a probiotic (5, 6). E. coli is also used in a wide variety of industrial and laboratory applications including drug and production (7, 8). Despite these beneficial attributes, E. coli is also associated with a variety of disease states in human and animal hosts. E. coli strains responsible for specific disease states using common virulence factors have been grouped into pathotypes (9), which can be divided broadly into Intestinal Pathogenic E. coli

(InPEC) and Extraintestinal Pathogenic E. coli (ExPEC) based on their primary sites of infection. Pathotypes or subpathotypes of E. coli may also be referred to as pathovars, a grouping of strains with similar distinctive pathogenicity characteristics.

Intestinal Pathogenic E. coli: A Grouping of Queasiness

Intestinal pathogenic E. coli (InPEC) cause or are associated with disease in the gastrointestinal tract. This pathotype consists of six long-standing subpathotypes including enteropathogenic E. coli (EPEC), enterohemorrhagic E. coli (EHEC), enterotoxigenic E. coli (ETEC), enteroaggregative E. coli (EAEC), enteroinvasive E. coli

(EIEC), and diffusely adherent E. coli (DAEC) (9). More InPEC subpathotypes have also been proposed including adherent invasive E. coli (AIEC), necrotoxic E. coli (NTEC), 2 and cell-detaching E. coli (CDEC) (9, 10). Though all InPEC have a primary focus on the intestinal tract and may share similar signs of disease, the virulence factors and mechanisms by which they cause disease differ.

Enteropathogenic E. coli (EPEC)

Enteropathogenic E. coli (EPEC) may be one of the most well-studied pathovars of E. coli (9). EPEC is the causative agent of acute to persistent diarrhea in children worldwide (11). EPEC-caused disease occurs primarily in developing countries and has a mortality rate of 30% for infants (12). In adults, the infectious dose of EPEC is high, between 108 to 1010 CFU (13). The pathogenesis of EPEC-caused diarrhea involves ingestion of EPEC in contaminated water or food, bacterial adhesion to the small intestine enterocytes, protein translocation of a receptor from bacteria to host via a type

III secretion system (T3SS), and pedestal formation through actin filament rearrangement

(9). The formation of the pedestal destroys the intestinal microvilli, which induces the attaching and effacing lesion and lessens the surface area for fluid absorption, which contributes to diarrhea (9).

EPEC’s key virulence factors include the locus of enterocyte effacement pathogenicity island (LEE PAI)(14). The LEE PAI encodes a T3SS that encodes Intimin and Tir. Tir is translocated from the bacteria into the host cell to be used as a receptor. On the surface of the bacteria, Intimin mediates the attachment of EPEC to Tir (15, 16). Tir has also been shown to bind to nucleolin, a cell-growth protein that shuttles between the nucleus and cytoplasm of the host cell (17). Another virulence factor of EPEC is a 70-100 kb plasmid known as pEAF for plasmid of EPEC Adherence Factor (9). pEAF encodes the bundle forming pilus, which adheres to the host cell, and the plasmid enhances the 3 efficiency of LEE PAI genes (18, 19). Some EPEC, known as atypical EPEC (aEPEC), lack the EAF plasmid and do not make the bundle forming pilus, as compared to more typical EPEC (aka tEPEC). Globally, atypical EPEC are more often isolated from cases of diarrhea in the developed world while tEPEC cases are more prevalent in developing countries (18, 20). Genomic analyses indicate that the separation of atypical and typical

EPEC based on the presence or absence of the bundle forming pilus is likely an oversimplification (21).

Enterohemorraghic E. coli (EHEC)

Like EPEC, enterohemorrhagic E. coli (EHEC) also form a pedestal through effectors of the LEE PAI, including Tir and Intimin. Tir is translocated into the host cell by the bacteria via a T3SS, and intimin serves as the receptor on the bacterial cell (9).

EPEC and EHEC both form the pedestal complex, but the Tir proteins are not functionally identical due to differences in signaling (22). Another difference is location,

EPEC’s attaching and effacing (AE) lesions occur in the small intestine while EHEC induces the AE lesions in the large intestine (9). Nevertheless, the greatest difference between EPEC and EHEC is that EHEC commonly produce a shiga toxin (Stx). Shiga toxin E. coli (STEC) can be differentiated from EHEC as STEC strains produce the Stx, but not all STEC are EHEC as demonstrated by the 2011 German Outbreak of STEC

O104:H4 (23). Greater than 380 serotypes of E. coli can produce Stx, but most are not associated with disease in humans without also containing the LEE PAI (24). Therefore,

EHEC may be defined as E. coli that produce the Stx and contain the LEE PAI (9).

EHEC infections occur via a fecal-oral route, and cattle are a known reservoir with 75% of outbreaks attributable to bovine-derived products (24). The low infectious 4 dose of EHEC, 10-100 CFUs, assists in the transmission of the human pathogen (24).

EHEC causes disease in the large intestine, and disease can manifest as bloody diarrhea, non-bloody diarrhea, or hemolytic uremic syndrome (HUS). Bloody diarrhea (or hemorrhagic colitis) is a result of local damage in the colon from Stx, which results in the termination of protein synthesis by cleaving ribosomal RNA (9). EHEC that cause HUS can result in renal failure or death (25). For HUS to occur, Stx production occurs in the large intestine before Stx binds to the endothelial cells, is absorbed into the bloodstream, and is disseminated (24).

Upon entering the kidney, Stx damages renal endothelial cells and blocks the microvessels through direct toxicity and host immune response related to cytokine and chemokine production leading to inflammation (9). Damage within the kidney can lead to

HUS, which can result in fatal acute renal failure. Treatment of EHEC infection is complicated by the production of Stx as antibiotics promote the replication and expression of stx genes since the genes are contained on a lambda prophage, which has been integrated into the chromosome (26, 27). The Stx also promotes intestinal colonization (28).

In addition to the Stx, EHEC isolates are also known to contain plasmids, which encode other virulence factors and toxins (29, 30). Although the most well-known EHEC belong to the serogroup O157, EHEC isolates of other O-antigens types are also recognized including O26, O103, O111, and O145 (31). Overall, EHEC isolates have larger than average genomes for E. coli and their genomes contain large numbers of prophages and integrative elements (32).

5

Enterotoxigenic E. coli (ETEC)

Enterotoxigenic E. coli (ETEC) is responsible for approximately 15-20% of diarrheal cases in children under five in developing countries and 60% of traveler’s diarrhea (33). ETEC harbor numerous virulence factors, including enterotoxins and colonization factors. ETEC contain at least one of two enterotoxins, a heat-labile enterotoxin (LT) or a heat-stable one (ST). The LT toxin is nearly identical to the cholera toxin of Vibrio cholerae, while the STa toxin is small with only 18-19 amino acids and forms three disulphide bridges via six cysteine residues (34). The LT results in chloride secretion by secretory crypt cells, leading to watery diarrhea (9). The STa activates kinases, which ultimately increase secretion (9).

In addition to the enterotoxins, ETEC can harbor colonization factors, and more than 22 different colonization factors have been described among human ETEC (35). The majority of these factors enable ETEC to colonize the small intestine, which brings the enterotoxins produced by ETEC in close proximity to the intestinal epithelium. Seventy five percent of human ETEC cases involve colonization factors CFA/I, CFA/II, or

CFA/IV (36). ETEC can also cause disease in suckling animals, but the virulence factors differ slightly from its human analog. In animals, the enterotoxins used are the STaP,

STb, or the LT-II enterotoxin variants (37). Common animal colonization factors are K88 or K99 antigens (9).

Enteroaggregative E. coli (EAEC)

Enteroaggregative E. coli (EAEC), previously known as enteroadherent- aggregative E. coli, produce a thick biofilm and secrete toxins leading to watery diarrhea

(9, 38). EAEC biofilms are distinct from the biofilms of non-pathogenic E. coli as they 6 are produced independently of biofilm-promoting factors like curli, flagella, and antigen

43 (39). For these organisms to cause disease, they colonize the mucosa of the small and large intestine. To do so, aggregative adherence is critical and is facilitated by aggregative adherence fimbriae (AAFs) (39). These adhesins are important in classification as their designation as a pathotype is due to their ability to form a “stacked- brick” formation adherence to HEp-2 cells (40). The designation from the stacked- bricked phenotype and the lack of LT or ST is the primary classification mechanism of the pathotype (9).

Research has shown EAEC to be genetically heterogeneous, and there is controversy regarding conserved virulence factors (41). EAEC is sometimes described as diarrheagenic E. coli that produce aggregative adherence in cultured epithelial cells without the virulence factors which define EPEC, ETEC, EHEC, or EIEC (42). Although no EAEC virulence factor is present in all isolates, several virulence factors have been described. Two of its toxins include ShET1 and EAST-1. The ShET1 toxin is also found in Shigella flexneri 2a, and EAST-1 is a homolog of the heat-stable enterotoxin of ETEC

(43, 44). Some EAEC also contain two serine protease autotransporters (SPATEs): Pic

(protease involved in intestinal colonization) and Pet (plasmid-encoded autotransporter)

(42, 45). Pic cleaves surface glycoproteins and mediates hemagglutinin and mucus cleavage while Pet is a cytotoxin, which leads to cell rounding and detachment (46, 47).

Like some other diarrheagenic pathotypes, it has been proposed that EAEC be separated into typical or atypical (aEAEC) groups. The presence of aggR, a gene encoding a global regulator, distinguishes between EAEC and aEAEC. In general, the 7 majority of virulence factors, including AAFs, EAST-1, aggR, and Pet, are encoded by

100kb pAA plasmids (42, 48).

Enteroinvasive E. coli (EIEC)

Enteroinvasive E. coli is a major etiologic agent of dysentery in developing countries, and the mechanisms of EIEC pathogenesis are similar to those of Shigella.

Diarrhea occurs following bacterial invasion of enterocytes via endocytosis, leading to host cell death (9). A 220 kb virulence plasmid, pInv, is critical to EIEC virulence, as it carries the invasion plasmid antigen H gene (ipaH), invasion-associated locus (ial), and invasion transcriptional regulation genes (virF and virB) (49). Other potential virulence factors include the ShET1 and ShET2 enterotoxins, Pic, and autotransporters SepA and

SigA (9). Recent research involving 169 genomic sequences has shown a lack of monophyletic groups and no distinct evolutionary groups between EIEC and Shigella, suggesting Shigella should be classified as enteroinvasive E. coli (50). Nevertheless,

EIEC isolates appear to cause less severe infections than Shigella species (51).

Diffusely Adherent E. coli (DAEC)

Diffusely adherent E. coli (DAEC) have been implicated in diarrhea in children older than 12 months (52). Like EAEC, DAEC have been defined by their adherence pattern to cultured cells as DAEC have a diffuse adherence pattern on HeLa and Hep-2 cells (39). Additionally, DAEC (like EAEC) are classified into other pathovar groups if they contain the virulence factors indicative of a different pathotype. E. coli with the

DAEC phenotype have been categorized into other E. coli groups including uropathogenic E. coli (UPEC), ETEC, or EPEC (53). The Afa/Dr adhesins are important 8 to the DAEC virulence as they assist in eliciting the enterocytes to develop long cellular projections, which wrap around the bacteria (9). Nevertheless, without additional virulence factors, the role of DAEC in diarrhea is questionable as volunteers that ingested

1010 CFU of prototypic DAEC strains did not experience diarrhea (53, 54).

Adherent Invasive E. coli (AIEC)

In 2002, Darfeuille-Michaud described a new pathotype of InPEC, adherent- invasive E. coli (AIEC) (10). This pathotype colonizes the intestinal mucosa and adheres to intestinal epithelial cells. Like EIEC, AIEC are invasive pathogens as AIEC survive and replicate intracellularly (55). Clinically, AIEC may be involved in the pathogenesis of Crohn’s Disease. AIEC have been enriched from the ileal lesions of Crohn’s Disease patients, and AIEC is associated with the ileal mucosa (56). In the ileum, AIEC adheres to CEACAM6, which is overexpressed in Crohn’s disease patients (57). Virulence factors of AIEC strains include type I fimbriae, long polar fimbriae, pyelonephritis-associated pili, S fimbriae, Dr antigen-specific fimbriae, K1 capsule, iron acquisition proteins, the vacuolating autotransporter toxin, alpha-hemolysin, cytotoxic necrotizing factor, and the

IbeA invasin (58). Interestingly, these virulence factors are found in many extraintestinal pathogenic E. coli isolates, which is discussed in a later section (59, 60).

Necrotoxic E. coli (NTEC)

Necrotoxic E. coli (NTEC) are defined based on their production of cytotoxic necrotizing factor 1 (CNF1) or 2 (CNF2). With the toxin, NTEC strains were able to cause a variety of disease states including diarrhea, systemic infections, septicemia, and urinary tract infections (61). Since CNF1 is often found in other pathotypes including 9

AIEC and ExPEC, the NTEC pathotype has become a less common classification for E. coli isolates. It may be more relevant to include E. coli with a cytotoxic necrotizing factor gene in a pathotype or subpathotype grouping that accounts for their clinical condition.

As such, UPEC are occasionally subdivided by the presence of cnf1 or cnf2 to distinguish between those with the toxin, an NTEC UTI, and those without, an UPEC UTI (62).

However, this distinction is poorly recognized among the UPEC field.

Cell-Detaching E. coli (CDEC)

Cell-detaching E. coli (CDEC) are associated with diarrhea in children (9).

Members of the CDEC pathovar cause cultured epithelial cells to detach from glass or plastic. This trait is associated with cell-bound alpha hemolysin (HlyA) (63). HlyA is found in other pathotypes of E. coli, which makes the CDEC pathotype similar to the

NTEC pathotype as other E. coli groupings contain their defining virulence factors. (9,

64).

Extraintestinal Pathogenic E. coli: The Mix and Match Pathogens

In 2000, Russo et al. proposed a super grouping of the E. coli causing disease outside the intestine and named this pathotype Extraintestinal Pathogenic E. coli (ExPEC)

(65). This pathotype has several subpathotypes that are named by the host species or disease from which they are isolated, including: E. coli that cause colibacillosis in birds

(Avian Pathogenic E. coli (APEC)), urinary tract infections (Uropathogenic E. coli

(UPEC)), sepsis (Sepsis Producing E. coli (SPEC)), and bacterial meningitis in newborns

(Neonatal Meningitis Causing E. coli (NMEC)). The ExPEC terminology acknowledges 10 that ExPEC strains exist on a continuum of virulence due to a wide-variety of virulence factors, the inoculum size, and potential pre-existing conditions (65).

Avian Pathogenic E. coli (APEC)

Avian pathogenic E. coli (APEC) infections result in severe economic losses worldwide to the poultry production industry annually (66). Avian pathogenic E. coli is responsible for colibacillosis, a localized or systemic infection that is the most common infectious bacterial disease of poultry (60). Colibacillosis can be separated further into descriptors of the localized or systemic infection such as colispeticemia, hemorrhagic septicemia, air sac disease, swollen-head syndrome, peritonitis, coligranuloma, or enteritis.

Although colibacillosis is most often reported in commercial chicken, turkey, and duck flocks, most avian species are susceptible (60). Even though colibacillosis can affect birds of any age, young birds are the most frequently affected and disease severity is often greater. Factors that can increase APEC susceptibility include co-infection with viruses, bacteria, or parasites or poor environmental conditions such as overcrowding, inadequate ventilation, or contaminated water (60). Unfortunately, the pathogenesis of

APEC is not well characterized, but APEC share a significant number of virulence factors with the ExPEC of mammals (67). As such, it has been proposed that APEC or the E. coli from poultry may be a reservoir for ExPEC that cause human disease (68, 69).

Uropathogenic E. coli (UPEC)

Urinary tract infections (UTIs) affect 150 million people annually worldwide and are the most common pathogen encountered in ambulatory or emergency settings in the 11

United States (70, 71). Indeed, 40% of women and 12% of men will experience a symptomatic UTI in their lifetime (72). Clinically, UTIs are categorized as uncomplicated or complicated, but the causative agent for 65-75% for both categories is

UPEC (73, 74). Complicated UTIs result due to the host defense or urinary tract becoming compromised via urinary obstruction, urinary retention, immunosuppression, renal failure, pregnancy, or indwelling catheters (73). In the United States, approximately

75% of complicated UTIs are a result of indwelling catheters, also known as catheter- associated UTIs (CAUTIs) (75). CAUTIs are the most common nosocomial infection

(74, 76). Risk factors for CAUTI include diabetes, advanced age, or prolonged catherization (73, 75).

Contrary to complicated UTIs, uncomplicated UTIs affect individuals that are healthy and without structural abnormalities to the urinary tract (71, 73). Uncomplicated

UTIs can be further separated into cystitis and pyelonephritis (71, 73). Cystitis, the infection of the lower urinary tract, is a bladder infection that can result in pain, hematuria, dysuria, and changes in urination pattern or frequency (73). Risk factors for cystitis include a prior UTI, sexual activity, vaginal infection, diabetes, obesity, genetic susceptibility, or being of the female sex (77). Pyelonephritis, the infection of the upper urinary tract, is a kidney infection that can result in fever, nausea, and vomiting, as well as the signs and symptoms of cystitis (73). Significant morbidity and mortality can occur due to UTIs, and untreated pyelonephritis can result in sepsis or renal damage or failure

(72).

Uncomplicated UTIs occur when UPEC colonizes the urethra before migrating to the bladder (73). Pili and adhesins mediate the colonization of the bladder epithelium, 12 which is composed of umbrella cells, intermediate cells, and basal cells (73, 78). Next,

UPEC binds uroplakins, a major component of the umbrella cell apical membrane.

Intracellular bacterial community formation occurs in these umbrella cells, which promotes survival of UPEC by protecting UPEC from host defenses and antibiotics in a biofilm-like environment (79, 80).

Like intracellular bacterial communities, biofilms are important to UPEC pathogenesis as they can result in complicated UTIs when a UPEC biofilm forms on a catheter. The production of a biofilm by bacteria result in a different bacterial lifestyle as biofilms can prevent treatment by antibiotics, subvert innate defenses, and promote the exchange of extracellular DNA, such as plasmids (81). Biofilms form through the initial attachment, which leads to irreversible attachment, then maturation and dispersion (81).

Complicated UTIs occur when UPEC binds to a catheter, kidney or bladder stone or is retained by an obstruction within the urinary tract (73). Toxins produced by UPEC further result in tissue damage and release nutrients needed for survival. In the kidney,

UPEC binds to glycolipids on the surface of renal epithelial cells (82). If UPEC crosses the tubular epithelial barrier in the kidneys, urosepsis or bacteremia can occur (73, 83).

The vat, fyuA, chuA, and yfcV genes are UPEC-predictor genes correlated with the uropathogenic potential of E. coli isolates (84). A number of publications further elaborate the pathogenesis and virulence factors of UPEC (72-74, 77, 84, 85). Virulence factors of ExPEC are discussed in the virulence factors section.

Neonatal Meningitis E. coli (NMEC)

Neonatal Meningitis E. coli (NMEC) is responsible for 28-30% of neonatal bacterial meningitis (NBM) cases in developed countries and incidence is between 2-4 13 newborns per 10,000 births (86, 87). However, among preterm and very preterm infants

NMEC is the leading cause of meningitis (86). Without treatment, NBM is often fatal, but NBM also often progresses more quickly than culture results can be obtained for initial diagnosis. Therefore, infants that are discharged from the hospital and later present with fever, fast or difficult breathing, cough, lethargy, poor feeding, or convulsion should be treated with antibiotics (88). With such a mandate, the ratio of non-infected to blood- culture-positive neonates ranged between 15:1-28:1 (88, 89). Although the use of antibiotics on so many false negatives is alarming, it should be noted that false-negative blood cultures can occur for NMEC; approximately 37% of NBM cases are initially not diagnosed when a single blood culture is performed (86, 90). Therefore, the gold standard remains isolation of the bacteria from the cerebrospinal fluid. Eighty-nine percent of

NBM cases were detected via lumbar puncture in a six-year epidemiological study involving 2,951 cases of bacterial meningitis (86).

Clinically, NBM is currently treated by antibiotics including ampicillin and gentamicin with 3rd generation cephalosporins (91). Nevertheless, antibiotic resistant

NMEC present a unique challenge to treat clinically. The Iqbal et al. clinical case study demonstrates this challenge as additional antibiotics such as amikacin did not prevent mortality (91). Indeed, treating bacterial infection in newborns is difficult as randomized controlled trials for antibiotic therapy resulted in treatment failure and mortality in numerous studies of newborns with sepsis (88, 92-94). Further treatment for NBM beyond antibiotics may include craniotomy to debride the infected tissue (91).

Unfortunately, even with survival, serious sequalae may be the result of NBM as antibiotics may be correlated with decreased mortality but not morbidity (87). Thirty-one 14 percent of survivors experience neurological sequelae in the form of cerebral palsy, epilepsy, hearing impairment, and cognitive impairment (87). In a survey of NBM cases

(n=200) that occurred from 1985-1987 in England and Wales, 6.5% of the initially- surviving children had died before age five. At the age of five, 2.9% of NBM-surviving children experienced sensorineural hearing loss, 7% suffered from epilepsy, 8% experienced a learning problem, and 8.7% had cerebral palsy (n=172) (87).

The pathogenesis of NMEC is thought to occur by exposure of NMEC to the digestive tract perinatally. In the digestive tract, NMEC moves to the intestinal tract and attaches to enterocytes and is transcytosed into the bloodstream (39). Furthering the importance of the intestinal tract pathogenesis in NMEC infection, experimental evidence in a neonatal rat model demonstrates that there is incomplete development of the mucus barrier during early neonatal life (95). Once in the bloodstream, NMEC must evade the innate immune system. To do this, NMEC mimics the host immune system by producing capsular glycoproteins (K1) similar to those of humans (96). Additionally, NMEC invade and replicate in macrophages; thus, evading the immune system and further proliferating in the bloodstream (97). Such proliferation is responsible for acute bacteremia.

Bacteremia is associated with meningitis in newborns as experimental evidence has found that greater than 103 CFU/mL of blood is more likely to result in NBM and death (9, 98). With a high quantity of CFUs in the blood, it is unsurprising that high numbers of bacteria are exposed to the blood-brain barrier, a barrier of microvascular endothelial cells that separate the central nervous system from the bloodstream (99).

Virulence factors including outer membrane protein A, invasion of the brain endothelium

A, and cytotoxic necrotizing factor 1 are thought to be utilized by some NMEC to invade 15 and transverse the blood-brain barrier (39). Alternatively, NMEC may cross the BBB and into the CNS by “hiding” in macrophages in a Trojan horse mechanism (100). To cause disease, NMEC must survive and inhabit a wide range of niches within the body. To do so, it must use a wide range of virulence factors that are described in the virulence factor section.

Vaginal E. coli (VEC)

Although little is known about Vaginal E. coli, it has been proposed to be an

ExPEC subpathotype as vaginal E. coli may be the reservoir for the fecal-vaginal- urinary/neonatal transmission among ExPEC (101). VEC is presumed to be a member of the ExPEC pathotype due to its location of infection and shared virulence factors with other ExPEC (101). Clinically, E. coli colonization of the vagina is associated with pelvic inflammatory disease, urinary tract infections, and early-onset neonatal septicemia and meningitis (102).

E. coli infection can result in both male and female infertility. Pelvic inflammatory disease results in reproductive disability due to damage to the Fallopian tubes and the resulting inflammation of the female upper reproductive tract from VEC

(103). In males, E. coli can also impair fertility by irreversibly adhering to spermatozoa

(104-106). Additionally, E. coli is the most-common non-sexually transmitted cause of pain, swelling, or inflammation of the epididymis or testicles (107). The Gram-negative bacteria is also involved in as many as 80% of prostatitis cases (107).

E. coli can also lead to the rupture of fetal membranes (108, 109). The premature rupture of membranes can be managed after 36 weeks, but if earlier, it may result in up to

20% of perinatal deaths (110). The premature rupture of membranes is associated with 16 neonatal brain injury and chronic lung injury of the neonate (111). Risk factors for VEC include the use of antibiotics within two weeks prior to examination, carriage of Group B streptococcus, prior preterm delivery or prior UTI, employment as a sex worker, or the presence of cats or dogs in the home (102, 112, 113). Although this dissertation will focus on the three well-recognized subpathotypes of ExPEC (APEC, NMEC, and UPEC), information about other ExPEC subpathotypes, such as VEC, may be included when readily available.

Septicemia E. coli (SPEC)

Septicemia E. coli results in sepsis for the host. This ExPEC subpathotype is often represented within the other subpathotypes as APEC, NMEC, and UPEC and can result in sepsis in certain situations. Therefore, the boundaries between these subpathotypes and

SPEC remain unclear. Some ExPEC literature, including that by Ron (2007), defines all

ExPEC subpathotypes as septicemic (114). There may be a correlation between the clonal lineage or O-antigen of E. coli and septicemia (114).

ExPEC Models & Infections

Although the ExPEC subpathotypes have been presented here independently, there is growing evidence that the subpathotypes may be able to cause the same disease in different hosts. In 2010, Tivendale et al. demonstrated that APEC isolates could result in NBM in a neonatal rat model, and NMEC isolates could cause disease in chicks as well as the deaths of chick embryos (115). Likewise, certain APEC isolates have shown the ability to grow in urine similar to UPEC (116). In a chicken model, APEC and UPEC isolates also demonstrated a similar lethal dose (LD50) (117). Additionally, an in vivo 17 murine model showed APEC and UPEC isolates express genes in a similar tendency

(117). Thus, there is mounting evidence of a zoonotic potential for APEC for humans

(115, 118-120). Although these characteristics blur the line between the subpathotypes, the clinical definition of the isolates is the rationale for studying each individual subpathotype, and the subpathotypes are known to contain certain virulence factors with different prevalences (121). Therefore, caution is warranted when attempting to identify subpathotypes solely by a phenotypic characteristic or virulence factor. Throughout this dissertation APEC, NMEC, and UPEC will be used to identify the initial host and apparent disease in keeping with previous works (68, 115, 122-124).

Virulence Factors of ExPEC Subpathotypes

Virulence factors allow bacteria to replicate and disseminate within hosts, which results in disease. The virulence factors associated with ExPEC, and the genes that encode them, can be broadly divided into different functional categories including adhesion, invasion, iron acquisition, toxins, and protectins; key virulence factors are described within their respective category (125).

Adhesins

Adhesion is critical for the pathogenesis of many bacteria, including the E. coli

ExPEC pathotype. Key adhesins of ExPEC include the type I fimbriae (Fim), the Pap or

P fimbriae (Pap/Prf), S/FIC fimbriae (Sfa/Foc), N-acetyl D-glucasamine-specific fimbriae

(Gaf), M-agglutinin (Bma), afimbrial adhesin (Afa), and temperature sensitive hemagglutinin (Tsh) (64, 125). 18

Type I fimbriae (fim)

The type I fimbriae (Fim) binds to D-mannose-containing structures through the

FimH protein (126). The fimH gene contains regions of conservation and variability in

APEC, NMEC, and UPEC (127-130). Point mutations in fimH have also been shown to lead to increased urovirulence (129, 131). Interestingly, the polymorphisms in fimH also correspond to the chromosomal history of the E. coli isolate as assigned by phylogenetic group, and fimH polymorphisms have been suggested as a way to type strains (130, 132).

These point mutations may help explain the differing role of FimH in pathogenesis as fimH is ubiquitous in Escherichia coli.

Although it has a slightly different role in the pathogenesis of APEC, NMEC, and

UPEC, FimH acts as an adhesin for all three subpathotypes. In UPEC strains, FimH is important for adhesion and invasion of the intestinal epithelium by ExPEC strain CP9

(133). When a fimH- mutant cystitis isolate was used to colonize the gut, there was reduced persistence of the strain (134). Additionally, FimH is needed by UPEC for binding the host mannosylated uroplakins that coat the umbrella cells of the bladder

(135). The type I fimbriae is also required for the production of intracellular bacterial communities, a biofilm-like environment which confers protection to the bacteria (80). In

NMEC, type I fimbriae interact with human brain microvascular endothelial cells

(HBMEC) (136). FimH is not only an adhesin for HBMEC, but the protein triggers signaling that allows NMEC to invade via the host’s CD48 receptor (137). In APEC strains, the type I fimbriae is expressed during the initial colonization of the tracheal epithelial cells (60). Additionally, FimH is necessary for adhesion to cultured chicken tracheal cells in vitro (138). 19

A variety of ExPEC vaccines have been developed and tested with respect to

FimH (139-142). Nevertheless, immunization with the binding domain of fimH did not provide protection to chickens against APEC (142).

Pap fimbriae or P pili

While the type I fimbriae is critical for UPEC binding in lower UTIs, the P pili are necessary for pyelonephritis to occur as the P pili adhere to renal tissue (82). Of 531

UPEC surveyed, the P pili (as indicated by papC, the usher in P pilus assembly) was present in approximately 60% of isolates (143). In APEC, P pili are commonly expressed among isolates that colonize the blood, air sacs, kidney, blood, and pericardial fluid, but the P pili are not present in isolates that colonize the trachea (144). It has been suggested that in some pathogenic APEC, the P pili can be exchanged for other adhesins (145). P pili genes have been shown to vary among different APEC collections (22.7%, 30%, and

40.4%) (64, 146, 147). Interestingly, this variation may be due to some selection by the poultry host as Rodriguez-Siek et al. observed that APEC from turkeys were far more likely to contain papC than APEC from chickens (64). papC was present in 35.6% of

NMEC isolates (143) and 45% of vaginal E. coli (VEC) isolates (101). Even though papC is not present within ExPEC subpathotypes at the same prevalence, it has been proposed to be a molecular marker of the pathotype along with tsh, iucD, and cnf (148).

FIC fimbriae (foc)

The FIC fimbriae are encoded by the foc gene cluster, which is related to the S fimbriae genetic cluster (149). The FIC and S fimbriae are part of the same super family, but they bind different receptors (149). UPEC use FIC fimbriae to bind kidney tubular cells (150). The importance of the FIC fimbriae remains understudied in APEC and

NMEC. 20

S fimbriae (sfa)

S-fimbriae bind to luminal surfaces of the vascular endothelium of neonatal rat brain tissues by binding to sialic galactosides (151). S-fimbriae also adhere to the enterocytes and the bladder and kidney epithelium (151). As the S fimbriae and FIC fimbriae are part of the same superfamily, they are often surveyed together. A majority of

NMEC strains (51.1%) contain sfa or foc unlike UPEC (26.4%) or APEC (4.4%) (143).

Dr adhesins

Dr adhesins use the Dr blood group antigen or decay accelerating factor as receptors (9). UPEC use the Dr adhesins to adhere to enterocytes and cell lines derived from the kidneys (151). An isogenic UPEC mutant was less virulent in a mouse model of pyelonephritis (152).

M-agglutinin (bma)

The M-agglutinin adhesin is involved in the binding of UPEC to uroepithelial cells, but the adhesin is rarely found in UPEC that cause pyelonephritis (153, 154).

Interestingly, bmaE is statistically more likely to be present in UPEC strains that were resistant to both nalidixic acid and ampicillin (155). Overall, the gene may be associated with the ExPEC pathotype, but it remains underexplored (156).

Afimbrial adhesin (afa)

Afimbrial adhesins are adhesins that form fimbriae that are difficult to view, and it is strongly suggested that afa plays an important role in UPEC pathogenesis (152, 157).

When a large collection of ExPEC isolates was examined, none of the ExPEC subpathotypes had a majority of isolates containing afa, but NMEC had the greatest percent of isolates containing afa (25.6%) (143). 21

Hek outer membrane protein (hek)

In NMEC strain RS218, the Hek outer membrane protein can cause agglutination of red blood cells and can mediate autoaggregation (158). When expressed by E. coli, the protein can facilitate adherence and invasion of epithelial cells (158). The process of adherence and invasion requires heparin-containing proteoglycans on the host cell surface

(159). The prevalence of the Hek outer membrane protein within the ExPEC community remains underexplored.

Temperature sensitive hemagglutinin (tsh)

Tsh is both a protease and agglutinin, and tsh confers a hemagglutinin-positive phenotype at 26 ˚C (160). Tsh uses hemoglobin, fibronectin, or collagen as a receptor and has been shown to adhere to chicken erythrocytes (151). Approximately 53, 31, and 3% of APEC, NMEC, and UPEC contain tsh, respectively (143, 147). Of the tsh-positive

APEC, a majority were ranked as the most virulent isolates, but tsh is not associated with plasmid virulence, unlike iss or iucA (161, 162). The tsh gene is identified as a defining part of the APEC pathotype (64). Historically, the vat autotransporter was also identified as tsh (84).

Invasins

Like adhesion to the host’s tissue, invasion of host tissue is often critical to the pathogenesis of a wide range of bacteria, including ExPEC subpathotypes. Adhesins and invasins are often specialized surface proteins on the bacteria and are necessary for pathogenesis, and a distinction between adhesins and invasins remains obscure (163).

Therefore, this section distinguishes invasins from adhesins as proteins historically termed invasins within the literature. 22

Invasion of brain endothelium (ibeA)

An important NMEC invasin is IbeA, which was discovered in the prototypic

NMEC strain RS218, and is encoded by the ibeA gene; it is found in approximately 59% of NMEC isolates (143, 164). ibeA is also known as ibe10, and the gene is encoded on a

20.3 kb pathogenicity island known as GimA, which is associated with phylogenetic group B2 (164-166). Experimentally, IbeA was shown to significantly increase invasion of cultured human brain microvascular endothelial cells, a lab model of the blood brain barrier, by binding to the host receptor vimentin (100, 164). Interestingly, IbeA has been experimentally shown to assist AIEC in colonizing the mouse intestine, which prompts us to ask if IbeA may form a similar function in NMEC pathogenesis (167). Although ibeA is found in AIEC and other E. coli strains, some researchers have suggested defining the

NMEC subpathotype by the presence of ibeA and the K1 capsular antigen (69).

In contrast to NMEC, fewer APEC and UPEC contain ibeA, 14.2 and 19.2% of isolates, respectively (143). In APEC isolates, experimental evidence has shown that

IbeA may assist in the regulation of type I fimbriae (FimH) and increase resistance to hydrogen peroxide (H2O2) killing (168). Regulation of FimH is critical as FimH is an important adhesin for the ExPEC subpathotypes. Additionally, IbeA enhanced the invasion of brain microvascular endothelial cells by an APEC isolate (169). The deletion of ibeA also lead to decreased biofilm formation by an APEC isolate (170).

Invasion of brain endothelium (ibeB)

Although ibeA and ibeB both contribute to the invasion of the brain microvascular endothelial cells, the two genes are not located within the same operon or pathogenicity island (165, 166). In APEC, an ibeB deletion resulted in reduced virulence and invasion of a chicken fibroblast cell line (171). For NMEC strain RS218∆ibeB, the strain was less 23 invasive towards BMEC in vitro, and there were significantly fewer animals with E. coli present in the CSF when newborn rats were infected (172). When the gene of ibeB was examined, it was determined to be identical to cusC, an outer membrane porin that is part of the copper/silver transport system (Nielsen et al., unpublished).

Invasion of the brain endothelium (ibeC)

The IbeC protein contributes to the invasion of BMEC (99). In a recent NMEC outbreak in Spain, all bacteria contained ibeC, unlike other invasins sometimes absent like ibeA (173). ibeC is not on the same genetic island as ibeA (165), and the IbeC protein sequence is identical to that of CptA, a phosphoethanolamine (Nielsen et al., unpublished data).

Genetic island associated with newborn meningitis (gimB)

In APEC strains that caused severe cellulitis in broiler chickens, gimB is associated with increased disease severity, and in NMEC, gimB is associated with increased invasion of host cells (174, 175). The genetic island containing gimB is not the same island as GimA. 8.8%-9.7% of APEC, 22.6% of UPEC, and 56.7% of NMEC contain gimB (143, 174).

Arylsulfatase (aslA)

aslA contributes to invasion of brain microvascular endothelial cells in vivo and in vitro as demonstrated by NMEC isolate RS218 (176). However, the aslA gene remains understudied within the ExPEC pathotype.

Iron regulation

For many hosts, the first line of defense against pathogens is nutritional immunity; arguably, the sequestration of iron is one of the most important factors in protecting the host (177). To compete as a successful pathogen in a host, ExPEC must 24 acquire iron. Therefore, it is unsurprising that many virulence genes function to acquire or regulate iron. Siderophores can assist bacteria in the capture of iron. Of all the proteins associated with ExPEC iron acquisition, the majority of them are regulated by Fur (178).

Aerobactin synthesis and transport (iucABCD)

The iucABCD operon encodes a siderophore and has been found in surveyed

NMEC plasmids (179, 180). During the growth of non-pathogenic E. coli Nissle 1917 in urine or a biofilm community, iucABCD was significantly upregulated (181). When iucD was knocked-out from both an APEC and UPEC strain, there were fewer lesions in the heart, liver, and lung of infected chickens (180). iuc has also been shown to experimentally contribute to the virulence of APEC strain c7122 (182).

Enterobactin receptor (iha)

In UPEC strains, Iha promotes adherence to uroepithelial cells and acts as a receptor (183). In a mouse model, iha was expressed in the kidney, and UPEC strains with iha outcompeted their deletion counterparts (183, 184). NMEC (26.7%) and UPEC

(39.2%) genomes were more likely to contain iha compared to APEC (3.5%) (143).

Eit operon (eitA)

The Eit operon encodes an ABC iron transporter and a periplasmic binding protein. Interestingly the Eit operon is more likely to be present in APEC isolates (37.2%) compared to NMEC or UPEC (4.3-5.6%) (143). The operon is found in APEC plasmid- linked pathogenicity islands (60).

Metal transporter (feoB)

FeoB is a metal transporter that is the primary mechanism for mediated ferrous

(Fe2+) iron uptake (60). In a combined ∆feoB-∆sit strain, APEC c7122 demonstrated 25 increased colonization of the liver, but there was a decreased persistence of the strain in the blood (185). feoB has a higher prevalence in asymptomatic bacteriuria in children than other UPEC types (186).

Iron responsive element (ireA)

ireA encodes an outer membrane protein and is found on a chromosomal pathogenicity island in strain APEC O1 (187, 188). Expression of ireA is greatly enhanced in human urine or blood (189). In a mouse urinary tract infection model, ireA significantly contributed to colonization of the bladder (189). ireA is more prevalent in

APEC isolates compared to NMEC or UPEC (143). The ireA gene may play a role in adhesion and stress resistance in APEC (188).

Ferric aerobactin receptor precursor (iutA)

The ferric aerobactin receptor gene, iutA, encodes an outer membrane receptor. iutA is a core gene of NMEC plasmids and is localized to APEC plasmid-linked pathogenicity islands (60, 179). APEC isolates responsible for colibacillosis were significantly more likely to contain the iutA gene (190). Indeed, 77.8% and 80.8% of

NMEC and APEC, respectively, carry iutA (143). The prevalence of iutA is lower for

UPEC (48.4%), but UPEC belonging to phylogenetic group B2 with iutA were more likely to be resistant to nalidixic acid and susceptible to gentamicin (143, 155).

Salmochelins synthesis and transport (iroBCDEN)

The Salmochelin operon contains the catecholate siderophore receptor gene, iroN

(60). iroN has been shown to experimentally contribute to the virulence of APEC strain c7122, and iroN was strongly upregulated by APEC in chick embryos (182, 191). In a neonatal rat meningitis model, NMEC lacking iroN was less virulent and resulted in a 26 lower level of bacteremia (192). When iroN was knocked-out from both an APEC and

UPEC strain and challenged in chickens, no lesions were present in the heart, liver, or lungs 24 hours post-infection (180). iroBCDEN are core plasmid genes of NMEC plasmids and are located on APEC plasmid-linked pathogenicity islands (60, 179).

During the growth of E. coli strain Nissle 1917 in urine or a biofilm community, iroBCDEN was significantly upregulated (181).

Iron transporter (sitABCD)

First discovered in Salmonella species, sitABCD is an operon that encodes divalent metal ion transporters (193). The importance of having sitA is demonstrated for the ExPEC subpathotypes as the gene was detected in 83.4-95.6% of each of the subpathotypes examined (143). In an APEC isolate c7122, the presence of the sitABCD genes resulted in increased transport of iron and manganese, and the operon also provides some resistance to H2O2 (194). In APEC, reduced colonization of the lungs, liver, and spleen was seen in a ∆sit strain (185). The operon has been found on both the chromosome and on APEC colicin-V-type virulence plasmids (143, 194). The sitABCD operon is found within NMEC plasmids and is found on APEC plasmid-linked pathogenicity islands (60, 179). During the growth of non-pathogenic E. coli Nissle 1917 in urine or a biofilm community, sitABCD was significantly upregulated (181).

Heme receptor (chuA)

ChuA is an outer membrane protein that is a heme receptor and is produced in response to iron deficient environments (195). Heme uptake via chuA is involved in bladder and kidney colonization by UPEC (196). Notably, chuA is also a part of the E. 27 coli phylogenetic typing schemes (197, 198). As it is used to sort E. coli by their phylogenetic groups, chuA is found on the chromosome.

Yersiniabactin receptor/synthesis (fyuA/irp2)

The Yersiniabactin operon was discovered as a mechanism by which Yersinia acquire iron in a mouse model of infection, resulting in virulence to the mice (199). In

1998, it was reported that the pathogenicity island was present in E. coli that caused in sepsis in their human hosts, and in 2001, the Yersiniabactin genes were found in APEC that resulted in death via colibacillosis (146, 199). The genes of the Yersiniabactin operon are localized to chromosomal pathogenicity islands described in APEC (60) with

58.2% of isolates positive for the fyuA gene; in contrast, 80.6% of UPEC isolates and

68.9% of NMEC harbored fyuA. (143).

Heme receptor (hma)

The Hma protein binds hemin. An hma mutant was outcompeted by the wild-type

UPEC strain CFT073 in the kidneys and spleens of a murine model, and a chuA-hma mutant was unable to successfully colonize the murine kidneys, unlike the wild-type

UPEC strain CFT073 (200). The prevalence of Hma in the ExPEC subpathotypes remains underexplored.

Toxins

Important virulence factors of ExPEC include endotoxins and exotoxins. In Gram negative bacteria, endotoxins are part of the LPS and are a complex of lipid and polysaccharide (201). The O-serogroup, a typing mechanism of E. coli, composes part of the LPS. Endotoxins can cause fever, inflammation, and lethal shock (201). Exotoxins are a group of toxins composed of proteins and secreted from pathogenic bacteria. 28

Cytotoxic necrotizing factor 1 (cnf1)

CNF1 is an exotoxin that activates GTPases and stimulates myosin rearrangement

(202). CNF1 contributes to the ability of NMEC strain RS218 to cross the blood brain barrier; the receptor for CNF1 is known as 67LR or RPSA (203, 204). Yet, only 4.4% of

NMEC harbored cnf1. 1.3% of tested APEC isolates carried cnf1 while 23.4% of UPEC harbored the toxin gene (143). For UPEC, expression of CNF1 results in bladder inflammation (205). The influence of CNF1 on APEC virulence remains underexplored.

Hemolysin (hlyA)

The alpha-hemolysin, HlyA, is a pore-forming toxin that triggers proteolysis of host cell proteins, which results in inflammation and a loss of cell adhesion (206). As a result of the HlyA toxin, UPEC can invade and disseminate through the superficial facet cells (39). HlyA also allows UPEC to disable macrophages (206). When the UPEC isolate CFT073 was grown in urine, the hlyA hemolysin gene was induced, but it was not in other non-ExPEC E. coli (181). While a majority of human ExPEC contain the hlyA gene, the gene was less likely to be found in APEC (67).

Hemolysin (hlyF)

A majority of APEC (75.4%) and NMEC (58.9%) isolates contained hlyF in contrast to UPEC (5.6%) when the subpathotypes were surveyed (143). Due to its presence in APEC, HlyF has been termed the avian E. coli hemolysin (60). The hemolysin is encoded on plasmids within a cluster of virulence genes in APEC strains

(207, 208). Further work on hlyF has demonstrated that the gene is directly involved with the enhanced production of outer membrane vesicles that are associated with the release of toxins (209). 29

Cytolethal distending toxin (cdtB)

The cytolethal distending toxin-B protein is suggested to have DNA-damaging properties and shares sequencing homology to mammalian DNase I (210). Although the cdtB gene is present in less than 8% of APEC and UPEC, the gene is present in approximately 38% of NMEC (121, 124). The cytolethal distending toxin may have been transferred to certain ExPEC by phage transduction as the toxins are flanked by lambdoid prophage genes (211).

Vaculating autotransporter toxin (vat)

vat encodes a serine protease autotransporter (SPATE). The toxin induces cytoxic effects similar to those of VacA by H. pylori by creating intracellular vacuoles (212). An

APEC vat- mutant demonstrated no virulence in broiler chickens (212). The vat gene was approximately present in half of APEC, NMEC, and UPEC isolates tested (67) and is located on chromosomal pathogenicity islands (60).

Secreted autotransporter toxin (sat)

Sat is significantly associated with clinical pyelonephritis (213). The toxin is a

SPATE, similar to Vat (213). The toxin was cytopathic to kidney and bladder cell lines and was able to elicit a strong antibody response after UPEC strain CFT073 was used to infect mice (213). Sat is more common in NMEC (34.6%) and UPEC (21.2%) compared to APEC (0.4%) (67).

Protectins

Protectins prevent complementation by the immune system and are commonly membrane-bound proteins. By preventing death via the immune system, ExPEC are able to further disseminate throughout their host, resulting in further disease. 30

Outer membrane protein A (ompA)

OmpA is an outer membrane protein that performs a wide-variety of functions in different E. coli types (214). In NMEC, OmpA promotes bloodstream survival and assists in crossing the blood brain barrier (172, 215, 216). Structurally, the loops of OmpA contribute to NMEC’s survival and entry into the human brain microvascular endothelial cells (HBMEC) by binding the Ecgp glycoprotein (217, 218). The protein also contributes to the binding and survival of NMEC inside macrophages (219). As a result, it has been suggested that OmpA might be a good vaccine target to prevent NMEC infection (220). Nevertheless, the effect OmpA has on invasion is not additive with other virulence factors, like IbeB (172). When UPEC were used to infect a murine UTI model, deletion of ompA prevented formation of the intracellular bacterial communities and reduced bacterial counts in the bladder (221). Although ompA is found in nearly all E. coli, OmpA is known to contain differences within its protein sequence (214, 222).

Outer membrane protein T (ompT)

OmpT is an outer membrane protease. The majority of APEC and UPEC have ompT chromosomally, and a majority of APEC and NMEC have ompT episomely (143). ompT is frequently associated with urinary tract infections and usp, a uropathogenic specific protein associated with pyelonephritis (223). OmpT may allow longer persistence of E. coli in the urinary tract by enabling resistance to the antimicrobial activity of urinary cationic peptides (224). The deletion of ompT from an APEC strain reduced the adherence and invasion to BMEC by 43.8 and 28.8%, respectively (225). ompT deletion also resulted in reduced colonization and invasion in the brains, lungs, and blood 1.7-2-fold in murine or duckling models (225). ompT inactivation also resulted in

8.7 or 15.2-fold reduced virulence in murine or duckling models, respectively (225). 31

Increased serum survival (iss)

Iss is an outer membrane protein associated with complement resistance (226).

Experimentally, iss increased the lethal dose (LD50) of APEC 100-fold in day-old chicks

(60, 227). iss was also strongly up-regulated when chick embryos were infected with

APEC (191). UPEC of phylogenetic group B2 containing iss were significantly more likely to be resistant to nalidixic acid (155). When initially surveyed the iss gene was found on the plasmids of 82.7%, 55.6%, and 26.6% APEC, NMEC, and UPEC examined, respectively (143). The gene is statistically more prevalent in APEC than avian fecal E. coli (64).

Studies on iss are complicated as iss closely resembles another gene, bor. The two genes have been shown to be 88.4-95.5% identical, and the iss gene is often misannotated as bor (228). Bor, encoded by bor, is a protein that commonly occurs in E. coli as an outer membrane protein and is derived from lambda (226, 228, 229). Additionally, the generation of Iss-specific antibodies has been hampered by their cross-reactivity to Bor

(226). Johnson et al. (2009) characterized the iss gene into three different types. Although iss was thought to occur exclusively on virulence plasmids, only iss type 1 is specifically plasmid borne (228). Among APEC and NMEC strains, iss type 1 was the most prevalent

(78% and 66%, respectively) (228). The prevalence of bor was lower for human ExPEC

(4%) than APEC (25%) (228). Although the Bor protein is found widely among avian E. coli pathogens and commensals, it may also play a role in serum resistance (226, 229).

K capsular antigen (kps)

In 1974, Robbins et al. discovered a relationship between the capsular (K1) polysaccharide and neonatal meningitis as 84% of NMEC had the K1 antigen (230). 32

Indeed, the importance of the capsular polysaccharide persists in the literature as NMEC may be alternatively described as Escherichia coli K1. The K1 antigen protects NMEC from a host immune response by enhancing serum resistance (231). NMEC with the K1 antigen are concealed from the immune system by the K1 capsule since the K1 antigen is composed of a sialic acid (232). Sialic acids are self-recognition molecules and are common on the surface of host cells (233). The K1 antigen also prevents lysosomal fusion in HBMEC when NMEC are localized within a membrane-bound vacuole (234).

Thus, live bacteria can transverse the BBB. NMEC isolates appear to differ from APEC and UPEC regarding the prevalence of the K1 capsule. A majority of NMEC (70-84%) contain the K1 capsular antigen in contrast to APEC (15.7%) and UPEC (29.2%) (143,

230).

TraT

TraT is a plasmid-encoded outer membrane protein. The protein enhances resistance to phagocytosis of the bacterium (235). traT is associated with the K1 capsule and ColV plasmids (236). TraT promotes surface exclusion by preventing DNA transfer during conjugation; therefore, it prevents the acquisition of similar plasmids (237). TraT may also interact with OmpA (238). A majority of APEC (78.1%), NMEC (85.6%), and

UPEC (67.8%) contain traT (143).

Although not every putative ExPEC virulence factor has been presented here, the virulence factors documented demonstrate the numerous possibilities in which an ExPEC strain may become virulent. Clearly, no single ExPEC isolate or subpathotype contains 33 all of these virulence factors, but the “mix-and-match” virulence factor approach makes it difficult to investigate, treat, and prevent infections in both avian and human hosts.

Virulence plasmids of ExPEC

Although virulence factors can be critical to the severity and disease outcome, the vectors by which these virulence factors move should not be overlooked. Transmissible plasmids that carry a significant number of virulence factors are common among the

ExPEC subpathotypes (60, 179). Indeed, these plasmids are occasionally considered virulence factors themselves as these plasmids have been associated with improved growth in urine and under acidic pH conditions, resistance to chlorine and disinfectants, and bacteriophage resistance (239, 240). Virulence plasmids of ExPEC are commonly classified by colicin or replicon type.

Colicin Typing

Colicins are produced by strains of E. coli carrying a plasmid encoding colicin genes and are toxic to other strains of E. coli not carrying this plasmid (241). Colicins kill susceptible strains by binding outer membrane proteins and are translocated in the periplasm before they form a voltage-dependent channel in the inner membrane or use their endonuclease activity on the host’s genetic material (241). Because colicins are carried on colicinogenic plasmids, some plasmids have been named after the colicins they carry (191, 207, 208, 227, 240). Some colicin-related genes include cvaC, cbi, and cma.

Some plasmids of APEC have been named after the colicin they carry, such as pAPECO1-ColBM, pAPEC-O2-ColV, and pAPEC-O103-ColBM (207, 208, 240). The

ColV plasmids have long been associated with the ExPEC pathotype, and they were named because a colibacillosis strain produced “principle V”, which lysed other cells 34

(239). Further work illustrated that these ColV plasmids carried numerous virulence factors including genes such as iss, traT, and tsh as well as iron acquisition operons (182,

242).

The prevalence of different colicins varies among the ExPEC subpathotypes.

When screened for the cvaC, the colicin V precursor gene, 67.5% of APEC and 54.4% of

NMEC contained ColV; only 5.6% of UPEC carried cvaC (143). When screened for cmi, the colicin-M immunity gene, approximately 25% of APEC carried the gene while less than 5% of NMEC or UPEC did (140). Colicin B is a channel-forming colicin encoded by cba. Approximately 4% of UPEC, 21% of NMEC, and 34.3% of APEC carry the cba gene (143). The higher prevalence of the different colicins for the APEC subpathotype compared to the NMEC and UPEC subpathotypes presents an interesting observation.

Yet, the presence of these genes does not imply a plasmid for each colicin type as hybrid plasmids exist, like hybrid ColBM plasmids (240).

Replicons

The failure to maintain stable plasmids of a similar type in bacteria is known as plasmid incompatibility. Plasmid incompatibility families are named beginning with

“Inc”. These Inc families have also been described by correlations to the biological functions of the things they encode. For example, the different IncF plasmids are capable of carrying MDR, transfer, and virulence function genes while IncT are plasmids capable of carrying transfer and DNA metabolism functions (239).

In 2005, Carattoli et al. revolutionized the high-throughput naming of plasmids by developing a PCR-based replicon typing method (243). When the PCR-based scheme was applied to a large collection of ExPEC, 86.9% of APEC, and 80.0% of NMEC, and

33.5% of UPEC contained the IncFIB replicon (143). Other replicon types occurring in 35 more than 10% of one ExPEC subpathotype include IncB/O, IncFIC, IncP, IncFIIA,

IncI1, and IncN (143). With genomic data, in silico analyses can be performed to detect the replicons present in a given strain via PlasmidFinder (244). With an ever-growing number of replicons being added to the database, it is evident that there is still much to discover in plasmid biology.

Classification Schemes Used with E. coli

In addition to grouping virulence plasmids by replicon or pathogenic E. coli into pathotypes, various methods are employed to parse E. coli into distinct cohorts according to certain traits. In the past, there were many typing schemes that attempted to infer the genetic relatedness among isolates. In addition to the use of serology, which classifies E. coli by antibody recognition of surface structures, such as O-antigen, pili and capsule, a number of typing methods rely on discerning genetic relatedness among isolates, such as

Sequence Typing, Phylogenetic Typing, fim typing, and other such schemes. One of the newest and most thorough tools to assess the genetic relatedness of E. coli is whole genome sequencing. Interestingly, the information gathered from the various E. coli typing schemes is still used to infer the E. coli types from sequencing data. Relevant E. coli typing schemes including the serotype, multi-locus sequence typing (MLST), and phylogenetic group have been applied to sequencing data. In this dissertation, wherever possible, the results of APEC, NMEC, and UPEC are included within each typing scheme when available.

36

Serotyping

In the 1940s, serotyping was developed by Kauffmann. To serotype, E. coli are assigned to different groups based on serological detection of certain O, H, and K antigens (245, 246). The serogroup is composed of the O-antigens, the exterior lipopolysaccharides of the LPS; the K antigen, the capsular antigen; and the H or flagellar antigen (247). The term serogroup refers to typing of E. coli by its O-antigen only; whereas, serotype refers to a more comprehensive type, including O-antigen and H and/or

K antigen typing (9). There are currently 187 recognized serogroups (248). Due to the effort required for serological typing, DNA- or computational-based methods are gaining use as a substitute method to assign E. coli to serotypes (249). These newer methods may also overcome the difficulty with cross reactivity of some O-group antisera (248). Recent research examining the serogroups and DNA sequences of O-antigens has indicated that several serogroups or serogroup subdivisions may be redundant (248).

Common APEC serogroups in the United States and Germany include O2, O8,

O18, and O78 (64, 147). Nevertheless, other serogroups are also present as represented by APEC O1, the first completely sequenced APEC that has an O1 serogroup (187). In

NMEC, the most common serogroup is O18, followed by O1, O7, and O83 (124). For

UPEC, the most common serogroup was O6, with serogroups O1, O2, O4, O18, and O75 common as well (121). Thus, there are shared serogroups between APEC, NMEC, and

UPEC isolates. However, serogroups cannot fully explain E. coli isolates as 25% of E. coli remain untypeable, and these isolates are categorized as negative, rough, autoagglutinating, or belonging to multiple groups (248). The importance of the K- antigen to virulence is discussed within the virulence factor section.

37

Multi-locus sequence typing

Multi-locus sequence typing (MLST) is a PCR and Sanger sequencing-based typing scheme. Three MLST schemes exist, and the schemes may be referenced by their location of MLST databases or their inventor. The Warwick or Achtman scheme may be one of the most used MLST schemes, and the scheme contains the 4,499 sequence types

(ST), more than any other MLST scheme (250, 251). Some Achtman STs may be associated with extended-spectrum B-lactamase-producing E. coli (252). The Pasteur

Institute scheme contains 771 STs (251, 253). The Achtman and Pasteur schemes have wide appeal as both can be completed computationally via the Center for Genomic

Epidemiology (254). The Michigan State MLST contains 1,081 STs, but this MLST may favor pathogenic STEC due to the sequencing data collected (251). The Michigan State

MLST is computationally available from shigatox.net. Several MLST reviews have been published (251, 255).

Although few ExPEC have published MLST data, some common NMEC

Achtman STs were ST95, ST390, ST3157, and 3194 (119, 256, 257). Common APEC

STs include ST95, ST1354, ST1827, and ST3191, and UPEC STs commonly included

ST95 and ST1349 (119). Therefore, the ExPEC commonly shared ST95 among the currently available data, yet a larger sample size would allow for improved analyses.

Phylogenetic groups

The original phylogenetic types were based on the profiles of multilocus enzyme electrophoresis (MLEE) results for E. coli pathotypes. In this scheme, 10 enzyme loci and the sequence of the mdh gene were used to study the genetic relationships of pathogenic

E. coli and Shigella. With MLEE, Whittam et al. discovered statistically distinct 38 groupings of E. coli (258) that became known as phylogenetic groups, and these groups were correlated with the presence of virulence factors (259).

In 2000, phylogenetic groups were built upon as Clermont et al. released a phylogenetic typing scheme, which was based on the presence or absence of three DNA targets—chuA, yjaA, and TspE4.C2. These three DNA targets separated E. coli isolates into four groups—A, B1, B2, and D (198). However, the accuracy of the method was limited with an 80-85% accuracy to MLST data, but the information gained from the typing may still be biologically relevant (122, 260). Therefore, several other phylogenetic typing schemes were proposed (261, 262). In 2013, Clermont et al. published a revised phylogenetic typing scheme to improve the original, which was based on six targets.

These six targets separate E. coli isolates into nine groups—A, B1, B2, C, D, E, F, Clade

I, and unknown (197). The results of Clermont’s phylogenetic typing schemes are correlated with MLST data (251). The proportions for each ExPEC subpathotype are shown in Figure 1.1.

Whole Genome Sequencing Comparisons & Significance

Although Caratolli et al. streamlined replicon typing via PCR, the comparison of plasmid replicons has further been transformed in an era of whole genome sequencing

(WGS) (243). With genomic assemblies, it is feasible to screen for replicons, antibiotic resistance genes, and insertion elements in a more time efficient manner through tools such as PlasmidFinder, ResFinder, ISEscan, or FimTyper (244, 263). However, the greatest benefit of such tools and WGS is the ability to determine potential patterns or novel genes, which may have gone undetected using more traditional tools, such as PCR. 39

WGS has even lead some individuals to ask if pathotypes are still relevant in a WGS era

(53). Although such a question is intriguing, WGS tools can certainly allow us to closely examine the pathotypes and subpathotypes of E. coli.

The majority of conclusions drawn about virulence factors and pathogenesis of the ExPEC subpathotypes have been the result of work on a few isolates. In NMEC, the majority of known virulence factors have been discovered in RS218 and its rifampicin- resistant derivate strain, E44. For example, the importance of genes such as ibeA, ibeB, ibeC, ompA, cnf1, and yjiP were discovered using NMEC strain RS218 (164, 172, 215,

264-266). Nevertheless, in the last decade, more NMEC are being sequenced to provide a more comprehensive view of ExPEC and its subpathotypes. For example, NMEC strains

CE10, IHE3034, NMEC O18, RS218, and S88 have completed chromosomes publicly available and genomic assemblies are currently available for SP-4, SP-5, SP-13, SP-46, and SP-45 (267-271). Nevertheless, no large-scale genomic comparison has been completed with these isolates. With an expanding number of genome sequences, we may be able to more clearly delineate NMEC chromosomal and plasmid-linked pathogenicity islands. Therefore, 48 additional NMEC were sequenced with an Illumina MiSeq instrument and 14 of these isolates were chosen for sequencing with Pacific Bioscience’s

RSII instrument.

40

40

Figure 1.1: Phylogenetic groupings between the different ExPEC subpathotypes.

The data presented here is published in Logue et al. 2017 (123) and shows significant changes occurred between the distributions of phylogenetic groups for the subpathotypes (x-axis) when the old and new typing schemes were applied. Phylogenetic group B2 was more common among human ExPEC while revised phylogenetic group C was more common among APEC isolates.

41

In addition to the NMEC isolates sequenced, this work also sequenced avian pathogenic isolate APEC O2-211. The sequencing of APEC O2-211 provides additional plasmid sequences to compare against other landmark APEC plasmids, such as pAPEC-

O1-ColBM and pAPEC-O2-ColV, for future work (207, 208). Comparisons of the APEC and NMEC virulence plasmids is critical as these plasmids carry a significant number of virulence factors. With an examination of the different genomic sequences, there is the distinct possibility of finding patterns between the clinical features of the isolates and their genomic constituents.

Aims

The aims of the dissertation presented here include an examination of the differences between the chromosomal history of extraintestinal pathogenic E. coli via assignment to a phylogenetic group and the phenotypic and genomic information associated with the phylogenetic groups and subpathotypes.

Organization of this Dissertation

The dissertation presented here is organized into seven chapters in a journal format. Chapter 1 is a literature review of Escherichia coli with an emphasis on the

ExPEC pathotype and E. coli classification and typing schemes. Chapter 2 examines the association between phylogenetic groups, ExPEC subpathotypes, and biofilm formation.

Chapter 3 is an investigation of the polymorphisms of the ExPEC virulence factor outer membrane protein A (OmpA) and the association between the polymorphism patterns, phylogenetic groups, sequence types, subpathotypes, and serology of E. coli. Chapter 4 introduces the complete genome of APEC O2-211, an APEC strain that was isolated from

42 the air sac of a chick. Chapter 5 announces the complete genome of NMEC-O75

(MCJCHV-1), an NMEC isolate that resulted in the death of an infant at day 45 of life.

Chapter 6 is a genomic analysis of the NMEC subpathotype. Chapter 7 is a conclusion of the work presented within this dissertation.

References

1. Secher T, Brehin C, Oswald E. 2016. Early settlers: which E. coli strains do you not want at birth? Am J Physiol Gastrointest Liver Physiol 311:G123-G129.

2. Ley RE, Peterson DA, Gordon JI. 2006. Ecological and evolutionary forces shaping microbial diversity in the human intestine. Cell 124:837-848.

3. Sakata H, Yoshioka H, Fujita K. 1985. Development of the intestinal flora in very low birth weight infants compared to normal full-term newborns. Eur J Pediatr 144:186-190.

4. Palmer C, Bik EM, DiGiulio DB, Relman DA, Brown PO. 2007. Development of the human infant intestinal microbiota. PLoS Biol 5:e177.

5. Ukena SN, Singh A, Dringenberg U, Engelhardt R, Seidler U, Hansen W, Bleich A, Bruder D, Franzke A, Rogler G, Suerbaum S, Buer J, Gunzer F, Westendorf AM. 2007. Probiotic Escherichia coli Nissle 1917 inhibits leaky gut by enhancing mucosal integrity. PLoS One 2:e1308.

6. Jacobi CA, Malfertheiner P. 2011. Escherichia coli Nissle 1917 (Mutaflor): new insights into an old probiotic bacterium. Dig Dis 29:600-607.

7. Ajikumar PK, Xiao WH, Tyo KE, Wang Y, Simeon F, Leonard E, Mucha O, Phon TH, Pfeifer B, Stephanopoulos G. 2010. Isoprenoid pathway optimization for Taxol precursor overproduction in Escherichia coli. Science 330:70-74.

8. Critchley RJ, Jezzard S, Radford KJ, Goussard S, Lemoine NR, Grillot-Courvalin C, Vassaux G. 2004. Potential therapeutic applications of recombinant, invasive E. coli. Gene Ther 11:1224-1233.

9. Kaper JB. 2004. Pathogenic Escherichia coli. Nat Rev Microbiol 2:123-140.

10. Darfeuille-Michaud A. 2002. Adherent-invasive Escherichia coli: a putative new E. coli pathotype associated with Crohn's disease. Int J Med Microbiol 292:185- 193.

43

11. Ochoa TJ, Contreras CA. 2011. Enteropathogenic Escherichia coli infection in children. Curr Opin Infect Dis 24:478-483.

12. Senerwa D, Olsvik O, Mutanda LN, Lindqvist KJ, Gathuma JM, Fossum K, Wachsmuth K. 1989. Enteropathogenic Escherichia coli serotype O111:HNT isolated from preterm neonates in Nairobi, Kenya. J Clin Microbiol 27:1307- 1311.

13. Bieber D, Ramer SW, Wu CY, Murray WJ, Tobe T, Fernandez R, Schoolnik GK. 1998. Type IV pili, transient bacterial aggregates, and virulence of enteropathogenic Escherichia coli. Science 280:2114-21188.

14. McDaniel TK, Jarvis KG, Donnenberg MS, Kaper JB. 1995. A genetic locus of enterocyte effacement conserved among diverse enterobacterial pathogens. P Natl Acad Sci USA 92:1664-1668.

15. Jerse AE, Yu J, Tall BD, Kaper JB. 1990. A genetic locus of enteropathogenic Escherichia coli necessary for the production of attaching and effacing lesions on tissue culture cells. P Natl Acad Sci USA 87:7839-7843.

16. Kenny B, DeVinney R, Stein M, Reinscheid DJ, Frey EA, Finlay BB. 1997. Enteropathogenic E. coli (EPEC) transfers its receptor for intimate adherence into mammalian cells. Cell 91:511-520.

17. Sinclair JF, O'Brien AD. 2002. Cell surface-localized nucleolin is a eukaryotic receptor for the adhesin intimin-gamma of enterohemorrhagic Escherichia coli O157:H7. J Biol Chem 277:2876-2885.

18. Trabulsi LR, Keller R, Tardelli Gomes TA. 2002. Typical and atypical enteropathogenic Escherichia coli. Emerg Infect Dis 8:508-513.

19. Giron JA, Ho AS, Schoolnik GK. 1991. An inducible bundle-forming pilus of enteropathogenic Escherichia coli. Science 254:710-713.

20. Scotland SM, Smith HR, Cheasty T, Said B, Willshaw GA, Stokes N, Rowe B. 1996. Use of gene probes and adhesion tests to characterise Escherichia coli belonging to enteropathogenic serogroups isolated in the United Kingdom. J Med Microbiol 44:438-443.

21. Kallonen T, Boinett CJ. 2016. EPEC: a cocktail of virulence. Nat Rev Microbiol 14:196.

22. DeVinney R, Stein M, Reinscheid D, Abe A, Ruschkowski S, Finlay BB. 1999. Enterohemorrhagic Escherichia coli O157:H7 produces Tir, which is translocated to the host cell membrane but is not tyrosine phosphorylated. Infect Immun 67:2389-2398.

44

23. Beutin L, Martin A. 2012. Outbreak of Shiga toxin-producing Escherichia coli (STEC) O104:H4 infection in Germany causes a paradigm shift with regard to human pathogenicity of STEC strains. J Food Prot 75:408-418.

24. Nguyen Y, Sperandio V. 2012. Enterohemorrhagic E. coli (EHEC) pathogenesis. Front Cell Infect Microbiol 2:90.

25. Sandvig K. 2001. Shiga toxins. Toxicon 39:1629-1635.

26. Wong CS, Jelacic S, Habeeb RL, Watkins SL, Tarr PI. 2000. The risk of the hemolytic-uremic syndrome after antibiotic treatment of Escherichia coli O157:H7 infections. N Engl J Med 342:1930-1936.

27. Karch H, Strockbine NA, O’Brien AD. 1986. Growth of Escherichia coli in the presence of trimethoprim-sulfamethoxazole facilitates detection of Shiga-like toxin producing strains by colony blot assay. FEMS Microbiology Lett 35.

28. Robinson CM, Sinclair JF, Smith MJ, O'Brien AD. 2006. Shiga toxin of enterohemorrhagic Escherichia coli type O157:H7 promotes intestinal colonization. P Natl Acad Sci USA 103:9667-9672.

29. Burland V, Shao Y, Perna NT, Plunkett G, Sofia HJ, Blattner FR. 1998. The complete DNA sequence and analysis of the large virulence plasmid of Escherichia coli O157:H7. Nucleic Acids Res 26:4196-4204.

30. Lorenz SC, Monday SR, Hoffmann M, Fischer M, Kase JA. 2016. Plasmids from Shiga toxin-producing Escherichia coli strains with rare enterohemolysin gene (ehxA) subtypes reveal pathogenicity potential and display a novel evolutionary path. Appl Environ Microbiol 82:6367-6377.

31. Eichhorn I, Heidemanns K, Semmler T, Kinnemann B, Mellmann A, Harmsen D, Anjum MF, Schmidt H, Fruth A, Valentin-Weigand P, Heesemann J, Suerbaum S, Karch H, Wieler LH. 2015. Highly virulent non-O157 enterohemorrhagic Escherichia coli (EHEC) serotypes reflect similar phylogenetic lineages, providing new insights into the evolution of EHEC. Appl Environ Microbiol 81:7041-7047.

32. Ogura Y, Ooka T, Iguchi A, Toh H, Asadulghani M, Oshima K, Kodama T, Abe H, Nakayama K, Kurokawa K, Tobe T, Hattori M, Hayashi T. 2009. Comparative genomics reveal the mechanism of the parallel evolution of O157 and non-O157 enterohemorrhagic Escherichia coli. P Natl Acad Sci USA 106:17939-17944.

33. Mirhoseini A, Amani J, Nazarian S. 2018. Review on pathogenicity mechanism of enterotoxigenic Escherichia coli and vaccines against it. Microb Pathog 117:162-169.

34. Sears CL, Kaper JB. 1996. Enteric bacterial toxins: mechanisms of action and linkage to intestinal secretion. Microbiol Rev 60:167-215.

45

35. Qadri F, Svennerholm AM, Faruque AS, Sack RB. 2005. Enterotoxigenic Escherichia coli in developing countries: epidemiology, microbiology, clinical features, treatment, and prevention. Clin Microbiol Rev 18:465-483.

36. Wolf MK. 1997. Occurrence, distribution, and associations of O and H serogroups, colonization factor antigens, and toxins of enterotoxigenic Escherichia coli. Clin Microbiol Rev 10:569-584.

37. Nagy B, Fekete PZ. 1999. Enterotoxigenic Escherichia coli (ETEC) in farm animals. Vet Res 30:259-284.

38. Nataro JP, Deng Y, Cookson S, Cravioto A, Savarino SJ, Guers LD, Levine MM, Tacket CO. 1995. Heterogeneity of enteroaggregative Escherichia coli virulence demonstrated in volunteers. J Infect Dis 171:465-468.

39. Croxen MA, Finlay BB. 2010. Molecular mechanisms of Escherichia coli pathogenicity. Nat Rev Micro 8:26-38.

40. Nataro JP, Kaper JB, Robins-Browne R, Prado V, Vial P, Levine MM. 1987. Patterns of adherence of diarrheagenic Escherichia coli to HEp-2 cells. Pediatr Infect Dis J 6:829-831.

41. Okeke IN, Wallace-Gadsden F, Simons HR, Matthews N, Labar AS, Hwang J, Wain J. 2010. Multi-locus sequence typing of enteroaggregative Escherichia coli isolates from Nigerian children uncovers multiple lineages. PLoS One 5:e14093.

42. Gomes TA, Elias WP, Scaletsky IC, Guth BE, Rodrigues JF, Piazza RM, Ferreira LC, Martinez MB. 2016. Diarrheagenic Escherichia coli. Braz J Microbiol 47 Suppl 1:3-30.

43. Savarino SJ, Fasano A, Watson J, Martin BM, Levine MM, Guandalini S, Guerry P. 1993. Enteroaggregative Escherichia coli heat-stable enterotoxin 1 represents another subfamily of E. coli heat-stable toxin. P Natl Acad Sci USA 90:3093- 3097.

44. Vila J, Vargas M, Henderson IR, Gascon J, Nataro JP. 2000. Enteroaggregative Escherichia coli virulence factors in traveler's diarrhea strains. J Infect Dis 182:1780-1783.

45. Dautin N. 2010. Serine protease autotransporters of enterobacteriaceae (SPATEs): biogenesis and function. Toxins (Basel) 2:1179-1206.

46. Eslava C, Navarro-Garcia F, Czeczulin JR, Henderson IR, Cravioto A, Nataro JP. 1998. Pet, an autotransporter enterotoxin from enteroaggregative Escherichia coli. Infect Immun 66:3155-3163.

46

47. Henderson IR, Czeczulin J, Eslava C, Noriega F, Nataro JP. 1999. Characterization of pic, a secreted protease of Shigella flexneri and enteroaggregative Escherichia coli. Infect Immun 67:5587-5596.

48. Czeczulin JR, Whittam TS, Henderson IR, Navarro-Garcia F, Nataro JP. 1999. Phylogenetic analysis of enteroaggregative and diffusely adherent Escherichia coli. Infect Immun 67:2692-2699.

49. Hale TL. 1991. Genetic basis of virulence in Shigella species. Microbiol Rev 55:206-224.

50. Pettengill EA, Pettengill JB, Binet R. 2015. Phylogenetic analyses of Shigella and enteroinvasive Escherichia coli for the identification of molecular epidemiological markers: whole-genome comparative analysis does not support distinct genera designation. Front Microbiol 6:1573.

51. Moreno AC, Ferreira LG, Martinez MB. 2009. Enteroinvasive Escherichia coli vs. Shigella flexneri: how different patterns of gene expression affect virulence. FEMS Microbiol Lett 301:156-163.

52. Scaletsky IC, Fabbricotti SH, Carvalho RL, Nunes CR, Maranhao HS, Morais MB, Fagundes-Neto U. 2002. Diffusely adherent Escherichia coli as a cause of acute diarrhea in young children in Northeast Brazil: a case-control study. J Clin Microbiol 40:645-648.

53. Robins-Browne RM, Holt KE, Ingle DJ, Hocking DM, Yang J, Tauschek M. 2016. Are Escherichia coli pathotypes still relevant in the era of whole-genome sequencing? Front Cell Infect Mi 6:141.

54. Tacket CO, Moseley SL, Kay B, Losonsky G, Levine MM. 1990. Challenge studies in volunteers using Escherichia coli strains with diffuse adherence to HEp-2 cells. J Infect Dis 162:550-552.

55. Boudeau J, Glasser AL, Masseret E, Joly B, Darfeuille-Michaud A. 1999. Invasive ability of an Escherichia coli strain isolated from the ileal mucosa of a patient with Crohn's disease. Infect Immun 67:4499-4509.

56. Darfeuille-Michaud A, Boudeau J, Bulois P, Neut C, Glasser AL, Barnich N, Bringer MA, Swidsinski A, Beaugerie L, Colombel JF. 2004. High prevalence of adherent-invasive Escherichia coli associated with ileal mucosa in Crohn's disease. Gastroenterology 127:412-421.

57. Barnich N, Carvalho FA, Glasser AL, Darcha C, Jantscheff P, Allez M, Peeters H, Bommelaer G, Desreumaux P, Colombel JF, Darfeuille-Michaud A. 2007. CEACAM6 acts as a receptor for adherent-invasive E. coli, supporting ileal mucosa colonization in Crohn disease. J Clin Invest 117:1566-1574.

47

58. Yang Y, Liao Y, Ma Y, Gong W, Zhu G. 2017. The role of major virulence factors of AIEC involved in inflammatory bowl disease-a mini-review. Appl Microbiol Biotechnol 101:7781-7787.

59. Martinez-Medina M, Mora A, Blanco M, Lopez C, Alonso MP, Bonacorsi S, Nicolas-Chanoine MH, Darfeuille-Michaud A, Garcia-Gil J, Blanco J. 2009. Similarity and divergence among adherent-invasive Escherichia coli and extraintestinal pathogenic E. coli strains. J Clin Microbiol 47:3968-3979.

60. Nolan LK, Barnes HJ, Vaillancourt JP, Abdul-Aziz T, Logue CM. 2013. Colibacillosis, p 751-805, Diseases of Poultry, 13th ed doi:10.1002/9781119421481. Ch18. John Wiley & Sons, Ltd.

61. Mainil JG, Jacquemin E, Pohl P, Fairbrother JM, Ansuini A, Le Bouguenec C, Ball HJ, De Rycke J, Oswald E. 1999. Comparison of necrotoxigenic Escherichia coli isolates from farm animals and from humans. Vet Microbiol 70:123-135.

62. Rahman H, Deka M. 2014. Detection & characterization of necrotoxin producing Escherichia coli (NTEC) from patients with urinary tract infection (UTI). Indian J Med Res 139:632-637.

63. Sobieszczanska BM, Osek J, Gryko R, Dobrowolska M. 2006. Association between cell-bound haemolysin and cell-detaching activity of Escherichia coli isolated from children. J Med Microbiol 55:325-330.

64. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Nolan LK. 2005. Characterizing the APEC pathotype. Vet Res 36:241-256.

65. Russo TA, Johnson JR. 2000. Proposal for a new inclusive designation for extraintestinal pathogenic isolates of Escherichia coli: ExPEC. J Infect Dis 181:1753-1754.

66. Dho-Moulin M, Fairbrother JM. 1999. Avian pathogenic Escherichia coli (APEC). Vet Res 30:299-316.

67. Ewers C, Li G, Wilking H, Kiessling S, Alt K, Antao EM, Laturnus C, Diehl I, Glodde S, Homeier T, Bohnke U, Steinruck H, Philipp HC, Wieler LH. 2007. Avian pathogenic, uropathogenic, and newborn meningitis-causing Escherichia coli: how closely related are they? Int J Med Microbiol 297:163-176.

68. Johnson TJ, Logue CM, Wannemuehler Y, Kariyawasam S, Doetkott C, DebRoy C, White DG, Nolan LK. 2009. Examination of the source and extended virulence genotypes of Escherichia coli contaminating retail poultry meat. Foodborne Pathog Dis 6:657-667.

69. Mitchell NM, Johnson JR, Johnston B, Curtiss R, 3rd, Mellata M. 2015. Zoonotic potential of Escherichia coli isolates from retail chicken meat products and eggs. Appl Environ Microbiol 81:1177-1187.

48

70. Stamm WE, Norrby SR. 2001. Urinary tract infections: disease panorama and challenges. J Infect Dis 183:S1-S4.

71. Hooton TM. 2012. Clinical practice. Uncomplicated urinary tract infection. N Engl J Med 366:1028-37.

72. Sivick KE, Mobley HL. 2010. Waging war against uropathogenic Escherichia coli: winning back the urinary tract. Infect Immun 78:568-585.

73. Flores-Mireles AL, Walker JN, Caparon M, Hultgren SJ. 2015. Urinary tract infections: epidemiology, mechanisms of infection and treatment options. Nat Rev Micro 13:269-284.

74. Foxman B. 2010. The epidemiology of urinary tract infection. Nat Rev Urol 7:653-660.

75. Lo E, Nicolle LE, Coffin SE, Gould C, Maragakis LL, Meddings J, Pegues DA, Pettis AM, Saint S, Yokoe DS. 2014. Strategies to prevent catheter-associated urinary tract infections in acute care hospitals: 2014 update. Infect Control Hosp Epidemiol 35:464-479.

76. Stenzelius K, Laszlo L, Madeja M, Pessah-Rasmusson H, Grabe M. 2016. Catheter-associated urinary tract infections and other infections in patients hospitalized for acute stroke: a prospective cohort study of two different silicone catheters. Scand J Urol 50:483-488.

77. Foxman B. 2002. Epidemiology of urinary tract infections: incidence, morbidity, and economic costs. Am J Med 113 Suppl 1A:5S-13S.

78. Khandelwal P, Abraham SN, Apodaca G. 2009. Cell biology and physiology of the uroepithelium. Am J Physiol Renal Physiol 297:F1477-1501.

79. Hannan TJ, Totsika M, Mansfield KJ, Moore KH, Schembri MA, Hultgren SJ. 2012. Host–pathogen checkpoints and population bottlenecks in persistent and intracellular uropathogenic Escherichia coli bladder infection. FEMS Microbiol Rev 36:616-648.

80. Wright KJ, Seed PC, Hultgren SJ. 2007. Development of intracellular bacterial communities of uropathogenic Escherichia coli depends on type 1 pili. Cell Microbiol 9:2230-2241.

81. Kostakioti M, Hadjifrangiskou M, Hultgren SJ. 2013. Bacterial biofilms: development, dispersal, and therapeutic strategies in the dawn of the postantibiotic era. Cold Spring Harb Perspect Med 3.

82. Roberts JA, Marklund BI, Ilver D, Haslam D, Kaack MB, Baskin G, Louis M, Mollby R, Winberg J, Normark S. 1994. The Gal(alpha 1-4)Gal-specific tip

49

adhesin of Escherichia coli P-fimbriae is needed for pyelonephritis to occur in the normal urinary tract. P Natl Acad Sci USA 91:11889-11893.

83. Ben Mkaddem S, Chassin C, Vandewalle A. 2010. Contribution of renal tubule epithelial cells in the innate immune response during renal bacterial infections and ischemia-reperfusion injury. Chang Gung Med J 33:225-240.

84. Spurbeck RR, Dinh PC, Jr., Walk ST, Stapleton AE, Hooton TM, Nolan LK, Kim KS, Johnson JR, Mobley HL. 2012. Escherichia coli isolates that carry vat, fyuA, chuA, and yfcV efficiently colonize the urinary tract. Infect Immun 80:4115-4122.

85. Wright KJ, Hultgren SJ. 2006. Sticky fibers and uropathogenesis: bacterial adhesins in the urinary tract. Future Microbiol 1:75-87.

86. Gaschignard J, Levy C, Romain O, Cohen R, Bingen E, Aujard Y, Boileau P. 2011. Neonatal bacterial meningitis: 444 cases in 7 years. Pediatr Infect Dis J 30:212-217.

87. Harvey D, Holt DE, Bedford H. 1999. Bacterial meningitis in the newborn: a prospective study of mortality and morbidity. Semin Perinatol 23:218-225.

88. Sivanandan S, Soraisham AS, Swarnam K. 2011. Choice and duration of antimicrobial therapy for neonatal sepsis and meningitis. Int J Pediatr 2011.

89. Hammerschlag MR, Klein JO, Herschel M, Chen FC, Fermin R. 1977. Patterns of use of antibiotics in two newborn nurseries. N Engl J Med 296:1268-1269.

90. Wiswell TE, Baumgart S, Gannon CM, Spitzer AR. 1995. No lumbar puncture in the evaluation for early neonatal sepsis: will meningitis be missed? Pediatrics 95:803-806.

91. Iqbal J, Dufendach KR, Wellons JC, Kuba MG, Nickols HH, Gómez-Duarte OG, Wynn JL. 2016. Lethal neonatal meningoencephalitis caused by multi-drug resistant, highly virulent Escherichia coli. Infectious Diseases 48:461-466.

92. Snelling S, Hart CA, Cooke RW. 1983. Ceftazidime or gentamicin plus benzylpenicillin in neonates less than forty-eight hours old. J Antimicrob Chemother 12:353-356.

93. Miall-Allen VM, Whitelaw AG, Darrell JH. 1988. Ticarcillin plus clavulanic acid (Timentin) compared with standard antibiotic regimes in the treatment of early and late neonatal infections. Br J Clin Pract 42:273-279.

94. Metsvaht T, Ilmoja ML, Parm U, Maipuu L, Merila M, Lutsar I. 2010. Comparison of ampicillin plus gentamicin vs. penicillin plus gentamicin in empiric treatment of neonates at risk of early onset sepsis. Acta Paediatr 99:665- 672.

50

95. Birchenough GM, Johansson ME, Stabler RA, Dalgakiran F, Hansson GC, Wren BW, Luzio JP, Taylor PW. 2013. Altered innate defenses in the neonatal gastrointestinal tract in response to colonization by neuropathogenic Escherichia coli. Infect Immun 81:3264-3275.

96. Kim KS, Itabashi H, Gemski P, Sadoff J, Warren RL, Cross AS. 1992. The K1 capsule is the critical determinant in the development of Escherichia coli meningitis in the rat. J Clin Invest 90:897-905.

97. Selvaraj SK, Prasadarao NV. 2005. Escherichia coli K1 inhibits proinflammatory cytokine induction in monocytes by preventing NF-kappaB activation. J Leukoc Biol 78:544-554.

98. Dietzman DE, Fischer GW, Schoenknecht FD. 1974. Neonatal Escherichia coli septicemia--bacterial counts in blood. J Pediatr 85:128-130.

99. Kim KS. 2001. Escherichia coli translocation at the blood-brain barrier. Infect Immun 69:5217-5222.

100. Che X, Chi F, Wang L, Jong TD, Wu CH, Wang X, Huang SH. 2011. Involvement of IbeA in meningitic Escherichia coli K1-induced polymorphonuclear leukocyte transmigration across brain endothelial cells. Brain Pathol 21:389-404.

101. Obata-Yasuoka M, Ba-Thein W, Tsukamoto T, Yoshikawa H, Hayashi H. 2002. Vaginal Escherichia coli share common virulence factor profiles, serotypes and phylogeny with other extraintestinal E. coli. Microbiol 148:2745-2752.

102. Cools P. 2017. The role of Escherichia coli in reproductive health: state of the art. Res Microbiol 168:892-901.

103. Laufer N, Simon A, Schenker JG, Sekeles E, Cohen R. 1984. Fallopian tubal mucosal damage induced experimentally by Escherichia coli in the rabbit. A scanning electron microscopic study. Pathol Res Pract 178:605-610.

104. Monga M, Roberts JA. 1994. Spermagglutination by bacteria: receptor-specific interactions. J Androl 15:151-156.

105. Kaur K, Kaur S, Prabha V. 2015. Exploitation of sperm-Escherichia coli interaction at the receptor-ligand level for the development of anti-receptor antibodies as the vaginal contraceptive. J Androl 3:385-394.

106. Kaur K, Prabha V. 2014. Spermagglutinating Escherichia coli and its role in infertility: in vivo study. Microb Pathog 69-70:33-38.

107. Pellati D, Mylonakis I, Bertoloni G, Fiore C, Andrisani A, Ambrosini G, Armanini D. 2008. Genital tract infections and infertility. Eur J Obstet Gynecol Reprod Biol 140:3-11.

51

108. Karat C, Madhivanan P, Krupp K, Poornima S, Jayanthi NV, Suguna JS, Mathai E. 2006. The clinical and microbiological correlates of premature rupture of membranes. Indian J Med Microbiol 24:283-285.

109. Pitari GM, Zingman LV, Hodgson DM, Alekseev AE, Kazerounian S, Bienengraeber M, Hajnoczky G, Terzic A, Waldman SA. 2003. Bacterial enterotoxins are associated with resistance to colon cancer. P Natl Acad Sci USA 100:2695-2699.

110. Caughey AB, Robinson JN, Norwitz ER. 2008. Contemporary diagnosis and management of preterm premature rupture of membranes. Rev Obstet Gynecol 1:11-22.

111. Martin RJ. 2017. Pathophysiology of apnea of prematurity, p 1595-1604. In Polin RA, Abman, S.A, Rowitch, D.H., Benitz, W.E., Fox, W.W (ed), Fetal and Neonatal Physiology, 5th ed doi:10.1016/B978-0-323-35214-7.00157-8. Elsevier.

112. Stokholm J, Schjorring S, Pedersen L, Bischoff AL, Folsgaard N, Carson CG, Chawes B, Bonnelykke K, Molgaard A, Krogfelt KA, Bisgaard H. 2012. Living with cat and dog increases vaginal colonization with E. coli in pregnant women. PLoS One 7:e46226.

113. Cools P, Jespers V, Hardy L, Crucitti T, Delany-Moretlwe S, Mwaura M, Ndayisaba GF, van de Wijgert JH, Vaneechoutte M. 2016. A multi-country cross- sectional study of vaginal carriage of Group B Streptococci (GBS) and Escherichia coli in resource-poor settings: prevalences and risk factors. PLoS One 11:e0148052.

114. Ron EZ. 2006. Host specificity of septicemic Escherichia coli: human and avian pathogens. Curr Opin Microbiol 9:28-32.

115. Tivendale KA, Logue CM, Kariyawasam S, Jordan D, Hussein A, Li G, Wannemuehler Y, Nolan LK. 2010. Avian-pathogenic Escherichia coli strains are similar to neonatal meningitis E. coli strains and are able to cause meningitis in the rat model of human disease. Infect Immun 78:3412-3419.

116. Skyberg JA, Johnson TJ, Johnson JR, Clabots C, Logue CM, Nolan LK. 2006. Acquisition of avian pathogenic Escherichia coli plasmids by a commensal E. coli isolate enhances its abilities to kill chicken embryos, grow in human urine, and colonize the murine kidney. Infect Immun 74:6287-6292.

117. Zhao L, Gao S, Huan H, Xu X, Zhu X, Yang W, Gao Q, Liu X. 2009. Comparison of virulence factors and expression of specific genes between uropathogenic Escherichia coli and avian pathogenic E. coli in a murine urinary tract infection model and a chicken challenge model. Microbiol 155:1634-1644.

118. Maluta RP, Logue CM, Casas MR, Meng T, Guastalli EA, Rojas TC, Montelli AC, Sadatsune T, de Carvalho Ramos M, Nolan LK, da Silveira WD. 2014.

52

Overlapped sequence types (STs) and serogroups of avian pathogenic (APEC) and human extra-intestinal pathogenic (ExPEC) Escherichia coli isolated in Brazil. PLoS One 9:e105016.

119. Danzeisen JL, Wannemuehler Y, Nolan LK, Johnson TJ. 2013. Comparison of multilocus sequence analysis and virulence genotyping of Escherichia coli from live birds, retail poultry meat, and human extraintestinal infection. Avian Dis 57:104-108.

120. Johnson TJ, Wannemuehler Y, Johnson SJ, Stell AL, Doetkott C, Johnson JR, Kim KS, Spanjaard L, Nolan LK. 2008. Comparison of extraintestinal pathogenic Escherichia coli from human and avian sources reveals a mixed subset representing potential zoonotic pathogens, Appl Environ Microbiol.

121. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Fakhr MK, Nolan LK. 2005. Comparison of Escherichia coli isolates implicated in human urinary tract infection and avian colibacillosis. Microbiol 151:2097-2110.

122. Nielsen DW, Klimavicz JS, Cavender T, Wannemuehler Y, Barbieri NL, Nolan LK, Logue CM. 2018. The impact of media, phylogenetic classification, and E. coli pathotypes on biofilm formation in extraintestinal and commensal E. coli from humans and animals. Front Microbiol 9.

123. Logue CM, Wannemuehler Y, Nicholson BA, Doetkott C, Barbieri NL, Nolan LK. 2017. Comparative analysis of phylogenetic assignment of human and avian ExPEC and fecal commensal Escherichia coli using the (previous and revised) Clermont phylogenetic typing methods and its impact on avian pathogenic Escherichia coli (APEC) classification. Front Microbiol 8:283.

124. Logue CM, Doetkott C, Mangiamele P, Wannemuehler YM, Johnson TJ, Tivendale KA, Li G, Sherwood JS, Nolan LK. 2012. Genotypic and Phenotypic Traits That Distinguish Neonatal Meningitis-Associated Escherichia coli from Fecal E. coli Isolates of Healthy Human Hosts. Appl Environ Microb 78:5824- 5830.

125. Kohler CD, Dobrindt U. 2011. What defines extraintestinal pathogenic Escherichia coli? Int J Med Microbiol 301:642-647.

126. Krogfelt KA, Bergmans H, Klemm P. 1990. Direct evidence that the FimH protein is the mannose-specific adhesin of Escherichia coli type 1 fimbriae. Infect Immun 58:1995-1998.

127. Vandemaele F, Vandekerchove D, Vereecken M, Derijcke J, Dho-Moulin M, Goddeeris BM. 2003. Sequence analysis demonstrates the conservation of fimH and variability of fimA throughout avian pathogenic Escherichia coli (APEC). Vet Res 34:153-163.

53

128. Nicholson BA. 2015. Genomic epidemiology and characterization of neonatal meningitis Escherichia coli and their virulence plasmids. Iowa State University Digital Repository.

129. Rosen DA, Pinkner JS, Walker JN, Elam JS, Jones JM, Hultgren SJ. 2008. Molecular variations in Klebsiella pneumoniae and Escherichia coli FimH affect function and pathogenesis in the urinary tract. Infect Immun 76:3346-3356.

130. Hommais F, Gouriou S, Amorin C, Bui H, Rahimy MC, Picard B, Denamur E. 2003. The FimH A27V mutation is pathoadaptive for urovirulence in Escherichia coli B2 phylogenetic group isolates. Infect Immun 71:3619-3622.

131. Sokurenko EV, Chesnokova V, Dykhuizen DE, Ofek I, Wu XR, Krogfelt KA, Struve C, Schembri MA, Hasty DL. 1998. Pathogenic adaptation of Escherichia coli by natural variation of the FimH adhesin. P Natl Acad Sci USA 95:8922- 8926.

132. Dias RC, Moreira BM, Riley LW. 2010. Use of fimH single-nucleotide polymorphisms for strain typing of clinical isolates of Escherichia coli for epidemiologic investigation. J Clin Microbiol 48:483-488.

133. Poole NM, Green SI, Rajan A, Vela LE, Zeng XL, Estes MK, Maresso AW. 2017. Role for FimH in extraintestinal pathogenic Escherichia coli invasion and translocation through the intestinal epithelium. Infect Immun 85:IAI-00581.

134. Russell CW, Fleming BA, Jost CA, Tran A, Stenquist AT, Wambaugh MA, Bronner MP, Mulvey MA. 2018. Context-dependent requirements for FimH and other canonical virulence factors in gut colonization by extraintestinal pathogenic Escherichia coli. Infect Immun 86:IAI-00746.

135. Zhou G, Mo WJ, Sebbel P, Min G, Neubert TA, Glockshuber R, Wu XR, Sun TT, Kong XP. 2001. Uroplakin Ia is the urothelial receptor for uropathogenic Escherichia coli: evidence from in vitro FimH binding. J Cell Sci 114:4095-4103.

136. Teng CH, Cai M, Shin S, Xie Y, Kim KJ, Khan NA. 2005. Escherichia coli K1 RS218 interacts with human brain microvascular endothelial cells via type 1 fimbria bacteria in the fimbriated state. Infect Immun 73:2923-2931.

137. Khan NA, Kim Y, Shin S, Kim KS. 2007. FimH-mediated Escherichia coli K1 invasion of human brain microvascular endothelial cells. Cell Microbiol 9:169- 178.

138. Dozois CM, Chanteloup N, Dho-Moulin M, Bree A, Desautels C, Fairbrother JM. 1994. Bacterial colonization and in vivo expression of F1 (type 1) fimbrial antigens in chickens experimentally infected with pathogenic Escherichia coli. Avian Dis 38:231-239.

54

139. Langermann S, Palaszynski S, Barnhart M, Auguste G, Pinkner JS, Burlein J, Barren P, Koenig S, Leath S, Jones CH, Hultgren SJ. 1997. Prevention of mucosal Escherichia coli infection by FimH-adhesin-based systemic vaccination. Science 276:607-611.

140. Langermann S, Mollby R, Burlein JE, Palaszynski SR, Auguste CG, DeFusco A, Strouse R, Schenerman MA, Hultgren SJ, Pinkner JS, Winberg J, Guldevall L, Soderhall M, Ishikawa K, Normark S, Koenig S. 2000. Vaccination with FimH adhesin protects cynomolgus monkeys from colonization and infection by uropathogenic Escherichia coli. J Infect Dis 181:774-778.

141. Poggio TV, La Torre JL, Scodeller EA. 2006. Intranasal immunization with a recombinant truncated FimH adhesin adjuvanted with CpG oligodeoxynucleotides protects mice against uropathogenic Escherichia coli challenge. Can J Microbiol 52:1093-1102.

142. Vandemaele F, Ververken C, Bleyen N, Geys J, D'Hulst C, Addwebi T, van Empel P, Goddeeris BM. 2005. Immunization with the binding domain of FimH, the adhesin of type 1 fimbriae, does not protect chickens against avian pathogenic Escherichia coli. Avian Pathol 34:264-272.

143. Johnson TJ, Wannemuehler Y, Johnson SJ, Stell AL, Doetkott C, Johnson JR, Kim KS, Spanjaard L, Nolan LK. 2008. Comparison of extraintestinal pathogenic Escherichia coli strains from human and avian sources reveals a mixed subset representing potential zoonotic pathogens. Appl Environ Microb 74:7043-7050.

144. Pourbakhsh SA, Dho-Moulin M, Bree A, Desautels C, Martineau-Doize B, Fairbrother JM. 1997. Localization of the in vivo expression of P and F1 fimbriae in chickens experimentally inoculated with pathogenic Escherichia coli. Microb Pathog 22:331-341.

145. Dziva F, Stevens MP. 2008. Colibacillosis in poultry: unravelling the molecular basis of virulence of avian pathogenic Escherichia coli in their natural hosts. Avian Pathol 37:355-366.

146. Janben T, Schwarz C, Preikschat P, Voss M, Philipp HC, Wieler LH. 2001. Virulence-associated genes in avian pathogenic Escherichia coli (APEC) isolated from internal organs of poultry having died from colibacillosis. Int J Med Microbiol 291:371-378.

147. Ewers C, Janssen T, Kiessling S, Philipp HC, Wieler LH. 2004. Molecular epidemiology of avian pathogenic Escherichia coli (APEC) isolated from colisepticemia in poultry. Vet Microbiol 104:91-101.

148. Stromberg ZR, Johnson JR, Fairbrother JM, Kilbourne J, Van Goor A, Curtiss RR, Mellata M. 2017. Evaluation of Escherichia coli isolates from healthy chickens to determine their potential risk to poultry and human health. PLoS One 12:e0180599.

55

149. Sabitha B, Kumar KV, Geetha RK. 2016. Adhesins of uropathogenic Escherichia coli (UPEC). Int J Med Microb Trop Dis 2:10-18.

150. Khan AS, Kniep B, Oelschlaeger TA, Van Die I, Korhonen T, Hacker J. 2000. Receptor structure for F1C fimbriae of uropathogenic Escherichia coli. Infect Immun 68:3541-3547.

151. Antao EM, Wieler LH, Ewers C. 2009. Adhesive threads of extraintestinal pathogenic Escherichia coli. Gut Pathog 1:22.

152. Le Bouguenec C, Lalioui L, du Merle L, Jouve M, Courcoux P, Bouzari S, Selvarangan R, Nowicki BJ, Germani Y, Andremont A, Gounon P, Garcia MI. 2001. Characterization of AfaE adhesins produced by extraintestinal and intestinal human Escherichia coli isolates: PCR assays for detection of Afa adhesins that do or do not recognize Dr blood group antigens. J Clin Microbiol 39:1738-1745.

153. Archambaud M, Courcoux P, Ouin V, Chabanon G, Labigne-Roussel A. 1988. Phenotypic and genotypic assays for the detection and identification of adhesins from pyelonephritic Escherichia coli. Ann Inst Pasteur Microbiol 139:557-573.

154. Le Bouguenec C, Archambaud M, Labigne A. 1992. Rapid and specific detection of the pap, afa, and sfa adhesin-encoding operons in uropathogenic Escherichia coli strains by polymerase chain reaction. J Clin Microbiol 30:1189-1193.

155. Horcajada JP, Soto S, Gajewski A, Smithson A, Jimenez de Anta MT, Mensa J, Vila J, Johnson JR. 2005. Quinolone-resistant uropathogenic Escherichia coli strains from phylogenetic group B2 have fewer virulence factors than their susceptible counterparts. J Clin Microbiol 43:2962-2964.

156. Smith JL, Fratamico PM, Gunther NW. 2007. Extraintestinal pathogenic Escherichia coli. Foodborne Pathog Dis 4:134-163.

157. Keller R, Ordonez JG, de Oliveira RR, Trabulsi LR, Baldwin TJ, Knutton S. 2002. Afa, a diffuse adherence fibrillar adhesin associated with enteropathogenic Escherichia coli. Infect Immun 70:2681-2689.

158. Fagan RP, Smith SG. 2007. The Hek outer membrane protein of Escherichia coli is an auto-aggregating adhesin and invasin. FEMS Microbiol Lett 269:248-255.

159. Fagan RP, Lambert MA, Smith SG. 2008. The hek outer membrane protein of Escherichia coli strain RS218 binds to proteoglycan and utilizes a single extracellular loop for adherence, invasion, and autoaggregation. Infect Immun 76:1135-1142.

160. Provence DL, Curtiss R, 3rd. 1994. Isolation and characterization of a gene involved in hemagglutination by an avian pathogenic Escherichia coli strain. Infect Immun 62:1369-1380.

56

161. Dozois CM, Dho-Moulin M, Bree A, Fairbrother JM, Desautels C, Curtiss R, 3rd. 2000. Relationship between the Tsh autotransporter and pathogenicity of avian Escherichia coli and localization and analysis of the Tsh genetic region. Infect Immun 68:4145-4154.

162. Tivendale KA, Allen JL, Ginns CA, Crabb BS, Browning GF. 2004. Association of iss and iucA, but not tsh, with plasmid-mediated virulence of avian pathogenic Escherichia coli. Infect Immun 72:6554-6560.

163. Niemann HH, Schubert WD, Heinz DW. 2004. Adhesins and invasins of pathogenic bacteria: a structural view. Microbes Infect 6:101-112.

164. Huang SH, Wass C, Fu Q, Prasadarao NV, Stins M, Kim KS. 1995. Escherichia coli invasion of brain microvascular endothelial cells in vitro and in vivo: molecular cloning and characterization of invasion gene ibe10. Infect Immun 63:4470-4475.

165. Homeier T, Semmler T, Wieler LH, Ewers C. 2010. The GimA locus of extraintestinal pathogenic E. coli: does reductive evolution correlate with habitat and pathotype? PLoS One 5:e10877.

166. Huang SH, Chen YH, Kong G, Chen SH, Besemer J, Borodovsky M, Jong A. 2001. A novel genetic island of meningitic Escherichia coli K1 containing the ibeA invasion gene (GimA): functional annotation and carbon-source-regulated invasion of human brain microvascular endothelial cells. Funct Integr Genom 1:312-322.

167. Cieza RJ, Hu J, Ross BN, Sbrana E, Torres AG. 2015. The IbeA invasin of adherent-invasive Escherichia coli mediates interaction with intestinal epithelia and macrophages. Infect Immun 83:1904-1918.

168. Flechard M, Cortes MA, Reperant M, Germon P. 2012. New role for the ibeA gene in H2O2 stress resistance of Escherichia coli. J Bacteriol 194:4550-4560.

169. Germon P, Chen YH, He L, Blanco JE, Brée A, Schouler C. 2005. ibeA, a virulence factor of avian pathogenic Escherichia coli. J Microbiol 151:1179-1186.

170. Wang S, Niu C, Shi Z, Xia Y, Yaqoob M, Dai J, Lu C. 2011. Effects of ibeA deletion on virulence and biofilm formation of avian pathogenic Escherichia coli. Infect Immun 79:279-287.

171. Wang S, Shi Z, Xia Y, Li H, Kou Y, Bao Y, Dai J, Lu C. 2012. IbeB is involved in the invasion and pathogenicity of avian pathogenic Escherichia coli. Vet Microbiol 159:411-419.

172. Wang Y, Kim KS. 2002. Role of OmpA and IbeB in Escherichia coli K1 invasion of brain microvascular endothelial cells in vitro and in vivo. Ped Res 51:559-563.

57

173. Saez-Lopez E, Bosch J, Salvia MD, Fernandez-Orth D, Cepas V, Ferrer-Navarro M, Figueras-Aloy J, Vila JP, Soto SM. 2017. Outbreak caused by Escherichia coli O18: K1: H7 sequence type 95 in a neonatal intensive care unit in Barcelona, Spain. Pediatr Infect Dis J 36:1079-1086.

174. Barbieri NL, de Oliveira AL, Tejkowski TM, Pavanelo DB, Rocha DA, Matter LB, Callegari-Jacques SM, de Brito BG, Horn F. 2013. Genotypes and pathogenicity of cellulitis isolates reveal traits that modulate APEC virulence. PLoS One 8:e72322.

175. Matter LB, Spricigo DA, Tasca C, Vargas AC. 2015. Invasin gimB found in a bovine intestinal Escherichia coli with an adherent and invasive profile. Braz J Microbiol 46:875-878.

176. Hoffman JA, Badger JL, Zhang Y, Huang SH, Kim KS. 2000. Escherichia coli K1 aslA contributes to invasion of brain microvascular endothelial cells in vitro and in vivo. Infect Immun 68:5062-5067.

177. Skaar EP. 2010. The battle for iron between bacterial pathogens and their vertebrate hosts. PLoS Pathog 6:e1000949.

178. Porcheron G, Garenaux A, Proulx J, Sabri M, Dozois CM. 2013. Iron, copper, zinc, and manganese transport and regulation in pathogenic Enterobacteria: correlations between strains, site of infection and the relative importance of the different metal transport systems for virulence. Front Cell Infect Microbiol 3:90.

179. Nicholson BA, West AC, Mangiamele P, Barbieri NL, Wannemuehler Y, Nolan LK, Logue CM, Li G. 2016. Genetic characterization of ExPEC-like virulence plasmids among a subset of NMEC. PLOS ONE 11:e0147757.

180. Gao Q, Wang X, Xu H, Xu Y, Ling J, Zhang D, Gao S, Liu X. 2012. Roles of iron acquisition systems in virulence of extraintestinal pathogenic Escherichia coli: salmochelin and aerobactin contribute more to virulence than heme in a chicken infection model. BMC Microbiol 12:143.

181. Hancock V, Vejborg RM, Klemm P. 2010. Functional genomics of probiotic Escherichia coli Nissle 1917 and 83972, and UPEC strain CFT073: comparison of transcriptomes, growth and biofilm formation. Mol Genet Genomics 284:437- 454.

182. Dozois CM, Daigle F, Curtiss R, 3rd. 2003. Identification of pathogen-specific and conserved genes expressed in vivo by an avian pathogenic Escherichia coli strain. P Natl Acad Sci USA 100:247-252.

183. Johnson JR, Jelacic S, Schoening LM, Clabots C, Shaikh N, Mobley HL, Tarr PI. 2005. The IrgA homologue adhesin Iha is an Escherichia coli virulence factor in murine urinary tract infection. Infect Immun 73:965-971.

58

184. Leveille S, Caza M, Johnson JR, Clabots C, Sabri M, Dozois CM. 2006. Iha from an Escherichia coli urinary tract infection outbreak clonal group A strain is expressed in vivo in the mouse urinary tract and functions as a catecholate siderophore receptor. Infect Immun 74:3427-3436.

185. Sabri M, Caza M, Proulx J, Lymberopoulos MH, Bree A, Moulin-Schouleur M, Curtiss R, 3rd, Dozois CM. 2008. Contribution of the SitABCD, MntH, and FeoB metal transporters to the virulence of avian pathogenic Escherichia coli O78 strain chi7122. Infect Immun 76:601-611.

186. Yun KW, Kim HY, Park HK, Kim W, Lim IS. 2014. Virulence factors of uropathogenic Escherichia coli of urinary tract infections and asymptomatic bacteriuria in children. J Microbiol Immunol Infect 47:455-461.

187. Johnson TJ, Kariyawasam S, Wannemuehler Y, Mangiamele P, Johnson SJ, Doetkott C, Skyberg JA, Lynne AM, Johnson JR, Nolan LK. 2007. The Genome Sequence of Avian Pathogenic Escherichia coli Strain O1:K1:H7 Shares Strong Similarities with Human Extraintestinal Pathogenic E. coli Genomes. J Bacteriol 189:3228-3236.

188. Li Y, Dai J, Zhuge X, Wang H, Hu L, Ren J, Chen L, Li D, Tang F. 2016. Iron- regulated gene ireA in avian pathogenic Escherichia coli participates in adhesion and stress-resistance. BMC Vet Res 12:167.

189. Russo TA, Carlino UB, Johnson JR. 2001. Identification of a new iron-regulated virulence gene, ireA, in an extraintestinal pathogenic isolate of Escherichia coli. Infect Immun 69:6209-6216.

190. Delicato ER, de Brito BG, Gaziri LC, Vidotto MC. 2003. Virulence-associated genes in Escherichia coli isolates from poultry with colibacillosis. Vet Microbiol 94:97-103.

191. Skyberg JA, Johnson TJ, Nolan LK. 2008. Mutational and transcriptional analyses of an avian pathogenic Escherichia coli ColV plasmid. BMC Microbiol 8:24.

192. Negre VL, Bonacorsi S, Schubert S, Bidet P, Nassif X, Bingen E. 2004. The siderophore receptor IroN, but not the high-pathogenicity island or the hemin receptor ChuA, contributes to the bacteremic step of Escherichia coli neonatal meningitis. Infect Immun 72:1216-1220.

193. Zhou D, Hardt WD, Galan JE. 1999. Salmonella typhimurium encodes a putative iron transport system within the centisome 63 pathogenicity island. Infect Immun 67:1974-1981.

194. Sabri M, Leveille S, Dozois CM. 2006. A SitABCD homologue from an avian pathogenic Escherichia coli strain mediates transport of iron and manganese and resistance to hydrogen peroxide. Microbiol 152:745-758.

59

195. Torres AG, Payne SM. 1997. Haem iron-transport system in enterohaemorrhagic Escherichia coli O157:H7. Mol Microbiol 23:825-833.

196. Garcia EC, Brumbaugh AR, Mobley HL. 2011. Redundancy and specificity of Escherichia coli iron acquisition systems during urinary tract infection. Infect Immun 79:1225-1235.

197. Clermont O, Christenson JK, Denamur E, Gordon DM. 2013. The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups. Env Microbiol Rep 5:58-65.

198. Clermont O, Bonacorsi S, Bingen E. 2000. Rapid and simple determination of the Escherichia coli phylogenetic group. Appl Environ Microb 66:4555-4558.

199. Schubert S, Rakin A, Karch H, Carniel E, Heesemann J. 1998. Prevalence of the "high-pathogenicity island" of Yersinia species among Escherichia coli strains that are pathogenic to humans. Infect Immun 66:480-485.

200. Hagan EC, Mobley HL. 2009. Haem acquisition is facilitated by a novel receptor Hma and required by uropathogenic Escherichia coli for kidney infection. Mol Microbiol 71:79-91.

201. Peterson JW. 1996. Bacterial pathogenesis. In Baron 5 (ed), Medical Microbiology, Galveston (TX).

202. Essler M, Linder S, Schell B, Hufner K, Wiedemann A, Randhahn K, Staddon JM, Aepfelbacher M. 2003. Cytotoxic necrotizing factor 1 of Escherichia coli stimulates Rho/Rho-kinase-dependent myosin light-chain phosphorylation without inactivating myosin light-chain phosphatase in endothelial cells. Infect Immun 71:5188-5193.

203. Kim KS. 2008. Mechanisms of microbial traversal of the blood-brain barrier. Nat Rev Microbiol 6:625-634.

204. Kim KJ, Chung JW, Kim KS. 2005. 67-kDa laminin receptor promotes internalization of cytotoxic necrotizing factor 1-expressing Escherichia coli K1 into human brain microvascular endothelial cells. J Biol Chem 280:1360-1368.

205. Smith YC, Rasmussen SB, Grande KK, Conran RM, O'Brien AD. 2008. Hemolysin of uropathogenic Escherichia coli evokes extensive shedding of the uroepithelium and hemorrhage in bladder tissue within the first 24 hours after intraurethral inoculation of mice. Infect Immun 76:2978-2990.

206. Dhakal BK, Mulvey MA. 2012. The UPEC pore-forming toxin alpha-hemolysin triggers proteolysis of host proteins to disrupt cell adhesion, inflammatory, and survival pathways. Cell Host Microbe 11:58-69.

60

207. Johnson TJ, Johnson SJ, Nolan LK. 2006. Complete DNA sequence of a ColBM plasmid from avian pathogenic Escherichia coli suggests that it evolved from closely related ColV virulence plasmids. J Bacteriol 188:5973-5983.

208. Johnson TJ, Siek KE, Johnson SJ, Nolan LK. 2006. DNA sequence of a ColV plasmid and prevalence of selected plasmid-encoded virulence genes among avian Escherichia coli strains. J Bacteriol 188:745-758.

209. Murase K, Martin P, Porcheron G, Houle S, Helloin E, Penary M, Nougayrede JP, Dozois CM, Hayashi T, Oswald E. 2016. HlyF produced by extraintestinal pathogenic Escherichia coli is a virulence factor that regulates outer membrane vesicle biogenesis. J Infect Dis 213:856-865.

210. De Rycke J, Oswald E. 2001. Cytolethal distending toxin (CDT): a bacterial weapon to control host cell proliferation? FEMS Microbiol Lett 203:141-148.

211. Toth I, Nougayrede JP, Dobrindt U, Ledger TN, Boury M, Morabito S, Fujiwara T, Sugai M, Hacker J, Oswald E. 2009. Cytolethal distending toxin type I and type IV genes are framed with lambdoid prophage genes in extraintestinal pathogenic Escherichia coli. Infect Immun 77:492-500.

212. Parreira VR, Gyles CL. 2003. A novel pathogenicity island integrated adjacent to the thrW tRNA gene of avian pathogenic Escherichia coli encodes a vacuolating autotransporter toxin. Infect Immun 71:5087-5096.

213. Guyer DM, Henderson IR, Nataro JP, Mobley HL. 2000. Identification of sat, an autotransporter toxin produced by uropathogenic Escherichia coli. Mol Microbiol 38:53-66.

214. Smith SG, Mahon V, Lambert MA, Fagan RP. 2007. A molecular Swiss army knife: OmpA structure, function and expression. FEMS Microbiol Lett 273:1-11.

215. Prasadarao NV, Wass CA, Weiser JN, Stins MF, Huang SH, Kim KS. 1996. Outer membrane protein A of Escherichia coli contributes to invasion of brain microvascular endothelial cells. Infect Immun 64:146-153.

216. Weiser JN, Gotschlich EC. 1991. Outer membrane protein A (OmpA) contributes to serum resistance and pathogenicity of Escherichia coli K-1. Infect Immun 59:2252-2258.

217. Maruvada R, Kim KS. 2011. Extracellular loops of the Eschericia coli outer membrane protein A contribute to the pathogenesis of meningitis. J Infect Dis 203:131-140.

218. Prasadarao NV. 2002. Identification of Escherichia coli outer membrane protein A receptor on human brain microvascular endothelial cells. Infect Immun 70:4556-4563.

61

219. Sukumaran SK, Shimada H, Prasadarao NV. 2003. Entry and intracellular replication of Escherichia coli K1 in macrophages require expression of outer membrane protein A. Infect Immun 71:5951-5961.

220. Gu H, Liao Y, Zhang J, Wang Y, Liu Z, Cheng P, Wang X, Zou Q, Gu J. 2018. Rational design and evaluation of an artificial Escherichia coli K1 protein vaccine candidate based on the structure of OmpA. Front Cell Infect Microbiol 8:172.

221. Nicholson TF, Watts KM, Hunstad DA. 2009. OmpA of uropathogenic Escherichia coli promotes postinvasion pathogenesis of cystitis. Infect Immun 77:5245-5251.

222. Power ML, Ferrari BC, Littlefield-Wyer J, Gordon DM, Slade MB, Veal DA. 2006. A naturally occurring novel allele of Escherichia coli outer membrane protein A reduces sensitivity to bacteriophage. Appl Environ Microb 72:7930- 7932.

223. Kanamaru S, Kurazono H, Ishitoya S, Terai A, Habuchi T, Nakano M, Ogawa O, Yamamoto S. 2003. Distribution and genetic association of putative uropathogenic virulence factors iroN, iha, kpsMT, ompT and usp in Escherichia coli isolated from urinary tract infections in Japan. J Urol 170:2490-2493.

224. Hui CY, Guo Y, He QS, Peng L, Wu SC, Cao H, Huang SH. 2010. Escherichia coli outer membrane protease OmpT confers resistance to urinary cationic peptides. Microbiol Immunol 54:452-459.

225. Hejair HMA, Ma J, Zhu Y, Sun M, Dong W, Zhang Y, Pan Z, Zhang W, Yao H. 2017. Role of outer membrane protein T in pathogenicity of avian pathogenic Escherichia coli. Res Vet Sci 115:109-116.

226. Lynne AM, Skyberg JA, Logue CM, Nolan LK. 2007. Detection of Iss and Bor on the surface of Escherichia coli. J Appl Microbiol 102:660-666.

227. Binns MM, Davies DL, Hardy KG. 1979. Cloned fragments of the plasmid ColV,I-K94 specifying virulence and serum resistance. Nature 279:778-781.

228. Johnson TJ, Wannemuehler YM, Nolan LK. 2008. Evolution of the iss gene in Escherichia coli. Appl Environ Microbiol 74:2360-2369.

229. Barondess JJ, Beckwith J. 1995. bor gene of phage lambda, involved in serum resistance, encodes a widely conserved outer membrane lipoprotein. J Bacteriol 177:1247-1253.

230. Robbins JB, McCracken GHJ, Gotschlich EC, Ørskov F, Ørskov I, Hanson LA. 1974. Escherichia coli K1 capsular polysaccharide associated with neonatal meningitis. New Engl J Med 290:1216-1220.

62

231. Allen PM, Roberts I, Boulnois GJ, Saunders JR, Hart CA. 1987. Contribution of capsular polysaccharide and surface properties to virulence of Escherichia coli K1. Infect Immun 55:2662-2668.

232. Silver RP, Aaronson W, Vann WF. 1988. The K1 capsular polysaccharide of Escherichia coli. Rev Infect Dis 10 S282-S286.

233. Varki A. 2008. Sialic acids in human health and disease. Trends Mol Med 14:351- 360.

234. Kim KJ, Elliott SJ, Di Cello F, Stins MF, Kim KS. 2003. The K1 capsule modulates trafficking of E. coli-containing vacuoles and enhances intracellular bacterial survival in human brain microvascular endothelial cells. Cell Microbiol 5:245-252.

235. Aguero ME, Aron L, DeLuca AG, Timmis KN, Cabello FC. 1984. A plasmid- encoded outer membrane protein, TraT, enhances resistance of Escherichia coli to phagocytosis. Infect Immun 46:740-746.

236. Montenegro MA, Bitter-Suermann D, Timmis JK, Aguero ME, Cabello FC, Sanyal SC, Timmis KN. 1985. traT gene sequences, serum resistance and pathogenicity-related factors in clinical isolates of Escherichia coli and other gram-negative bacteria. J Gen Microbiol 131:1511-1521.

237. Manning PA, Beutin L, Achtman M. 1980. Outer membrane of Escherichia coli: properties of the F sex factor TraT protein which is involved in surface exclusion. J Bacteriol 142:285-294.

238. Isolde R, Marie-Luise E. 1986. Evidence that TraT interacts with OmpA of Escherichia coli. FEBS Lett 205:241-245.

239. Johnson TJ, Nolan LK. 2009. Pathogenomics of the virulence plasmids of Escherichia coli. Microbiol Mol Biol Rev 73:750-774.

240. Johnson TJ, Jordan D, Kariyawasam S, Stell AL, Bell NP, Wannemuehler YM, Alarcon CF, Li G, Tivendale KA, Logue CM, Nolan LK. 2010. Sequence analysis and characterization of a transferable hybrid plasmid encoding multidrug resistance and enabling zoonotic potential for extraintestinal Escherichia coli. Infect Immun 78:1931-1942.

241. Cascales E, Buchanan SK, Duche D, Kleanthous C, Lloubes R, Postle K, Riley M, Slatin S, Cavard D. 2007. Colicin biology. Microbiol Mol Biol R 71:158-229.

242. Johnson TJ, Giddings CW, Horne SM, Gibbs PS, Wooley RE, Skyberg J, Olah P, Kercher R, Sherwood JS, Foley SL, Nolan LK. 2002. Location of increased serum survival gene and selected virulence traits on a conjugative R plasmid in an avian Escherichia coli isolate. Avian Dis 46:342-352.

63

243. Carattoli A, Bertini A, Villa L, Falbo V, Hopkins KL, Threlfall EJ. 2005. Identification of plasmids by PCR-based replicon typing. J Microbiol Meth 63:219-228.

244. Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Moller Aarestrup F, Hasman H. 2014. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother 58:3895-3903.

245. Kauffmann F. 1947. The serology of the coli group. J Immunol 57:71-100.

246. Orskov I, Orskov F, Jann B, Jann K. 1977. Serology, chemistry, and genetics of O and K antigens of Escherichia coli. Bacteriol Rev 41:667.

247. Iguchi A, Iyoda S, Seto K, Morita-Ishihara T, Scheutz F, Ohnishi M, Pathogenic EcWGiJ. 2015. Escherichia coli O-genotyping PCR: a comprehensive and practical platform for molecular O serogrouping. J Clin Microbiol 53:2427-2432.

248. DebRoy C, Fratamico PM, Roberts E. 2018. Molecular serogrouping of Escherichia coli. Anim Health Res Rev doi:10.1017/S1466252317000093:1-16.

249. Joensen KG, Tetzschner AM, Iguchi A, Aarestrup FM, Scheutz F. 2015. Rapid and easy in silico serotyping of Escherichia coli isolates by use of whole-genome sequencing data. J Clin Microbiol 53:2410-2426.

250. Wirth T, Falush D, Lan R, Colles F, Mensa P, Wieler LH, Karch H, Reeves PR, Maiden MC, Ochman H, Achtman M. 2006. Sex and virulence in Escherichia coli: an evolutionary perspective. Mol Microbiol 60:1136-1151.

251. Clermont O, Gordon D, Denamur E. 2015. Guide to the various phylogenetic classification schemes for Escherichia coli and the correspondence among schemes. Microbiol 161:980-988.

252. Oteo J, Diestra K, Juan C, Bautista V, Novais A, Pérez-Vázquez M. 2009. Extended-spectrum β-lactamase-producing Escherichia coli in Spain belong to a large variety of multilocus sequence typing types, including ST10 complex/A, ST23 complex/A and ST131/B2. Intl J Antimicrob Agents 34:173-176.

253. Jaureguy F, Landraud L, Passet V, Diancourt L, Frapy E, Guigon G, Carbonnelle E, Lortholary O, Clermont O, Denamur E, Picard B, Nassif X, Brisse S. 2008. Phylogenetic and genomic diversity of human bacteremic Escherichia coli strains. BMC Genom 9:560.

254. Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, Jelsbak L, Sicheritz-Ponten T, Ussery DW, Aarestrup FM, Lund O. 2012. Multilocus sequence typing of total-genome-sequenced bacteria. J Clin Microbiol 50:1355- 1361.

64

255. Perez-Losada M, Cabezas P, Castro-Nallar E, Crandall KA. 2013. Pathogen typing in the genomics era: MLST and the future of molecular epidemiology. Infect Genet Evol 16:38-53.

256. Tivendale KA, Logue CM, Kariyawasam S, Jordan D, Hussein A, Li G. 2010. Avian-pathogenic Escherichia coli strains are similar to neonatal meningitis E. coli strains and are able to cause meningitis in the rat model of human disease. Infect Immun 78:3412-3419.

257. Wirth T, Falush D, Lan R, Colles F, Mensa P, Wieler LH. 2006. Sex and virulence in Escherichia coli: an evolutionary perspective. Mol Microbiol 60:1136-1151.

258. Whittam TS, Ochman H, Selander RK. 1983. Multilocus genetic structure in natural populations of Escherichia coli. P Natl Acad Sci USA 80:1751-1755.

259. Picard B, Garcia JS, Gouriou S, Duriez P, Brahimi N, Bingen E. 1999. The link between phylogeny and virulence in Escherichia coli extraintestinal infection. Infect Immun 67:546-553.

260. Gordon DM, Clermont O, Tolley H, Denamur E. 2008. Assigning Escherichia coli strains to phylogenetic groups: multi-locus sequence typing versus the PCR triplex method. Environ Microbiol 10:2484-2496.

261. Doumith M, Day MJ, Hope R, Wain J, Woodford N. 2012. Improved multiplex PCR strategy for rapid assignment of the four major Escherichia coli phylogenetic groups. J Clin Microbiol 50:3108-3110.

262. Carlos C, Pires MM, Stoppe NC, Hachich EM, Sato MI, Gomes TA, Amaral LA, Ottoboni LM. 2010. Escherichia coli phylogenetic group determination and its application in the identification of the major animal source of fecal contamination. BMC Microbiol 10:161.

263. Xie Z, Tang H. 2017. ISEScan: automated identification of insertion sequence elements in prokaryotic genomes. Bioinformatics 33:3340-3347.

264. Huang SH, Chen YH, Fu Q, Stins M, Wang Y, Wass C, Kim KS. 1999. Identification and characterization of an Escherichia coli invasion gene locus, ibeB, required for penetration of brain microvascular endothelial cells. Infect Immun 67:2103-2109.

265. Khan NA, Wang Y, Kim KJ, Chung JW, Wass CA, Kim KS. 2002. Cytotoxic necrotizing factor-1 contributes to Escherichia coli K1 invasion of the central nervous system. J Biol Chem 277:15607-15612.

266. Kim KJ, Chung JW, Kim KS. 2005. 67-kDa laminin receptor promotes internalization of cytotoxic necrotizing factor 1-expressing Escherichia coli K1 into human brain microvascular endothelial cells. J Biol Chem 280:1360–1368.

65

267. Lu S, Zhang X, Zhu Y, Kim KS, Yang J, Jin Q. 2011. Complete genome sequence of the neonatal-meningitis-associated Escherichia coli strain CE10. J Bacteriol 193.

268. Moriel DG, Bertoldi I, Spagnuolo A, Marchi S, Rosini R, Nesta B, Pastorello I, Corea VAM, Torricelli G, Cartocci E, Savino S, Scarselli M, Dobrindt U, Hacker J, Tettelin H, Tallon LJ, Sullivan S, Wieler LH, Ewers C, Pickard D, Dougan G, Fontana MR, Rappuoli R, Pizza M, Serino L. 2010. Identification of protective and broadly conserved vaccine antigens from the genome of extraintestinal pathogenic Escherichia coli. P Natl Acad Sci USA 107:9072-9077.

269. Nicholson BA, Wannemuehler YM, Logue CM, Li G, Nolan LK. 2016. Complete genome sequence of the neonatal meningitis-causing Escherichia coli strain NMEC O18. Genome Announc 4.

270. Wijetunge DSS, Katani R, Kapur V, Kariyawasam S. 2015. Complete genome sequence of Escherichia coli strain RS218 (O18:H7:K1), associated with neonatal meningitis. Genome Announc 3:e00804-15.

271. Peigne C, Bidet P, Mahjoub-Messai F, Plainvert C, Barbe V, Médigue C. 2009. The plasmid of Escherichia coli strain S88 (O45: K1: H7) that causes neonatal meningitis is closely related to avian pathogenic E. coli plasmids and is associated with high-level bacteremia in a neonatal rat meningitis model. Infect Immun 77:2272-2284.

66

CHAPTER 2. THE IMPACT OF MEDIA, PHYLOGENETIC CLASSIFICATION, AND E. COLI PATHOTYPES ON BIOFILM FORMATION IN EXTRAINTESTINAL AND COMMENSAL E. COLI FROM HUMANS AND ANIMALS

Modified from a paper published in Frontiers in Microbiology

Daniel W. Nielsena, James S. Klimaviczb, Tia Cavendera, Yvonne Wannemuehlera,

Nicolle L. Barbieric, Lisa K. Noland and Catherine M. Loguea,c

aDepartment of Veterinary Microbiology and Preventive Medicine, College of Veterinary

Medicine, Iowa State University, Ames, IA, USA.

bInterdepartmental Toxicology Program, Iowa State University, Ames, IA, USA.

cDepartment of Population Health, College of Veterinary Medicine, University of

Georgia, Athens, GA, USA.

dDepartment of Infectious Diseases, College of Veterinary Medicine, University of

Georgia, Athens, GA, USA.

Abstract

Extraintestinal pathogenic Escherichia coli (ExPEC) include avian pathogenic E. coli (APEC), neonatal meningitis E. coli (NMEC), and uropathogenic E. coli (UPEC) and are responsible for significant animal and human morbidity and mortality. This study sought to investigate if biofilm formation by ExPEC likely contributes to these losses since biofilms are associated with recurrent urinary tract infections, antibiotic resistance, and bacterial exchange of genetic material. Therefore, the goal of this study was to examine differences in biofilm formation among a collection of ExPEC and to ascertain

67 if there is a relationship between their ability to produce biofilms and their assignment to phylogenetic groups in three media types—M63, diluted TSB, and BHI. Our results suggest that ExPEC produce relatively different levels of biofilm formation in the media tested as APEC (70.4%, p=0.0064) and NMEC (84.4%, p=0.0093) isolates were poor biofilm formers in minimal medium M63 while UPEC isolates produced significantly higher ODs under nutrient-limited conditions with 25% of strains producing strong biofilms in diluted TSB (p=0.0204). Additionally, E. coli phylogenetic assignment using

Clermont’s original and revised typing scheme demonstrated significant differences among the phylogenetic groups in the different media. When the original phylogenetic group isolates previously typed as group D were phylogenetically typed under the revised scheme and examined, they showed substantial variation in their ability to form biofilms, which may explain the significant values of revised phylogenetic groups E and F in M63

(p=0.0291, 0.0024). Our data indicates that biofilm formation is correlated with phylogenetic classification and subpathotype or commensal grouping of E. coli strains.

Introduction

Extraintestinal pathogenic E. coli (ExPEC) is a pathotype of Escherichia coli responsible for morbidity and mortality in a wide range of hosts. In humans, neonatal meningitis E. coli (NMEC) is responsible for approximately 28-29% of cases of neonatal bacterial meningitis with a mortality rate of 12% (1, 2). Uropathogenic E. coli (UPEC) is responsible for the greatest number of both catheter-associated urinary tract infections and uncomplicated urinary tract infections worldwide (3). Serious cases of urinary tract infections can result in pyelonephritis, potentially leading to sepsis and death. Avian pathogenic E. coli (APEC) is a non-human subpathotype of ExPEC and the causative

68 agent of colibacillosis in poultry. Collectively, human infections caused by ExPEC strains result in billions of dollars in healthcare costs annually, while APEC also poses a significant financial burden to the poultry industry (3, 4). In addition, ExPEC subpathotypes from different host sources have also been shown to exhibit genomic (5) and phenotypic similarities (6, 7).

Biofilms are complex communities of surface-associated microorganisms that are enclosed in a structured, highly hydrated extracellular polysaccharide (8, 9). In themselves, biofilms are a distinct bacterial lifestyle since biofilm formation promotes resistance to antibiotics and sanitation as well as exchange of DNA (10, 11). Biofilms can lead to persistent and chronic infections in humans, and an estimated 65% of hospital infections are of biofilm origin (12). Additionally, persistence of APEC in the poultry production environment may be due to biofilm development on plastic surfaces such as water feeding systems (13, 14). Escherichia coli K1 biofilms have also been isolated from neonatal nasogastric feeding tubes (15), and biofilm-like intracellular bacterial communities (IBCs) are known to play in important role in the pathogenesis of UPEC since IBC formation allows UPEC to continue colonization of the bladder and resist expulsion (3, 16, 17). The ability of ExPEC to exchange DNA in biofilms is also of concern due to potential acquisition of virulence and antimicrobial resistance plasmids

(18, 19).

Although biofilm analyses have been done independently in APEC (20), UPEC

(21), and NMEC (22), cross comparisons among these ExPEC subpathotypes are problematic due to procedural differences in the studies. This lack of comparative data can be a roadblock to study of the zoonotic potential of ExPEC, development of universal

69 mitigation strategies to control ExPEC-caused diseases, and our understanding of

ExPEC’s environmental persistence, transmission and pathogenesis. Additionally, failure to understand the differences in biofilm production between ExPEC and their commensal

E. coli counterparts, including avian fecal E. coli (AFEC) and human fecal E. coli

(HFEC), can undermine our understanding of ExPEC’s disease pathogenesis and prevent clarity on the importance of biofilm production in the survival of ExPEC outside vs. inside the host.

Another issue that may limit our clarity on the importance of biofilm production to ExPEC pathogenesis is that previous studies (20-22) of ExPEC biofilm production were analyzed by phylogenetic assignment according to the original typing scheme of

Clermont (23). Here, we compare different ExPEC, assigned to different phylogenetic groups using Clermont’s more recent and refined typing scheme (24). Such data may lend great insight into the role of biofilm in pathogenesis since phylogenetic group assignment may predict an E. coli’s capacity to cause disease (25).

In addition to looking for differences in biofilm production between ExPEC and

E. coli commensals, among members of different ExPEC subpathotypes, and members of different phylogenetic groups, we sought to determine if these differences were best discerned under three different conditions of growth, including nutrient-poor, nutrient- limited, or nutrient-rich conditions (20).

In the present study, we seek to remedy deficits in our understanding of ExPEC biofilm production. To do so, biofilm production of ExPEC from APEC, NMEC and

UPEC subpathotypes and their commensal counterparts, all assigned to phylogenetic

70 groups according to Clermont’s original and revised typing schemes and using the same methodology under three conditions of growth was assessed.

Materials and methods

Strain selection

To assess biofilm formation, 175 E. coli isolates were tested including 108

ExPEC strains and 67 commensal E. coli strains. These isolates were randomly selected from collections that have been described previously (7, 18, 26-29), and efforts were made to ensure that members of all phylogenetic groups were represented. Since Logue et al. (2017) demonstrated natural differences in the distribution of the phylogenetic groups among isolates of the different subpathotypes, this study did not equilibrate the isolates selected based on their subpathotypes or commensal classification thus avoiding production of erroneous results when generalizing the groups (29). Table 2.1 provides information about the isolates examined in this study, and data for each strain is found in the supplementary file.

Biofilm assay

Isolates to be tested were struck from frozen stock to tryptic soy agar (TSA,

Difco, BD, Franklin Lakes, NJ, USA) and incubated at 37 ºC for 18 to 24 hours. Three colonies were selected per strain and incubated for 16 hours in Luria Bertani (LB) Miller broth (Difco). These cultures were tested based on methods previously described (20, 30,

31). Previous work has demonstrated that the assay used here, a 96-well microtiter plate with crystal violet staining, has a high repeatability and broad applicability and is useful for the examination of the early stages of biofilm development (31-34). After incubation

71 for 16 hours, the strains were diluted 1:100 with each broth; these included brain heart infusion (BHI) broth (Bacto), 1/20 diluted tryptic soy broth (TSB) (Bacto), and M63 minimal media (20, 35). 200 uL aliquots of each diluted strain in medium were dispensed into seven wells of a Sarstedt 96-well flat bottom microtest plate (Sarstedt, Nümbrecht,

Germany). Uninoculated medium was used as a negative control in the eighth well. Plates were incubated at 37 ºC for 24 hours without shaking before the contents of the plates were decanted, and the plates washed once with sterile deionized water. Following washing, microplates were stained for 30 minutes with 200 uL of 0.1% crystal violet solution (J.T. Baker Chemical Co., Phillipsburg, N.J.). After staining, the plates were washed four times with sterile deionized water and allowed to air-dry for one hour.

Following drying, the biofilms were resolubilized with 200 uL of an 80:20 solution of ethanol and acetone, and 150 uL of the resulting solution was transferred into a new plate. Biofilm formation was quantified by measuring the optical density (OD) of the dissolved crystal violet at 600 nm using an ELx808 Ultra MicroPlate Reader with Gen5

Microplate Reader and Imaging software (Bio-Tek Instruments, Winooski, VT). The

ODs of the three biological replicates in seven wells were averaged for each test strain in each medium tested.

Clermont’s phylogenetic typing

All isolates were assigned to phylogenetic group based on the original and revised

Clermont phylogenetic typing schemes (23, 24). These protocols have been described previously and are based on amplification of the chuA and yjaA genes as well as the

TSPE4.C2 DNA fragment (23) or revised chuA, yjaA, and TspE4.C2 DNA fragment along with the arpA and trpA genes (24, 29). Amplified products were run on a 2%

72 agarose gel in 1X TAE buffer with known controls (ECOR collection) and classification of the strains was determined by the presence or absence of gene products.

Table 2.1: Isolates used in this study as classified by subpathotype or commensal category and Clermont’s phylogenetic typing schemes.

Original Scheme Revised Scheme

n= A B1 B2 D A B1 B2 C D E F

ExPEC APEC 44 11 4 7 22 4 4 6 8 5 9 8

NMEC 32 8 2 15 7 5 2 15 3 1 1 5

UPEC 32 9 5 7 11 4 5 5 5 7 1 5

Commensal AFEC 33 9 6 8 10 5 5 4 4 5 5 5

HFEC 34 9 6 9 10 7 6 8 1 6 1 5

Sum 175 46 23 46 60 25 22 38 21 24 17 28

Extraintestinal pathogenic E. coli (ExPEC) subpathotypes examined included avian pathogenic E. coli (APEC), neonatal meningitis E. coli (NMEC), and uropathogenic E. coli (UPEC). Commensal E. coli isolates included could be separated by host—avian fecal E. coli (AFEC) and human fecal E. coli (HFEC).

Biostatistics

The ODs measured for the negative control wells for each test strain and media combination were measured and averaged. The ODs for each test strain in a given medium were averaged and normalized against the negative control by subtracting the average OD of the negative control from the average OD of the test strain. Statistical analyses were performed in MATLAB (Mathworks®) using the normalized, average data for each test strain in each of the three media. As the obtained ODs deviate from a normal distribution, the Kruskal-Wallis (KW) test (one-way ANOVA on ranks) was used as a

73 non-parametric test to determine significant differences in biofilm formation between the original or revised Clermont’s phylogenetic types or subpathotype or commensal E. coli

(36). Multiple comparisons were performed in relation to ANOVA using Tukey’s honest significant difference test (HSD) to reduce the incidence of Type I error (37). Direct comparisons between two groups were made using the Mann-Whitney U test (MW) (38).

To reduce human bias introduced by setting cutoffs for levels of biofilm formation, cluster analysis was performed by an unsupervised machine learning technique. For each medium, isolates were clustered by level of biofilm formation using k-means clustering (39) with four means identified to represent negligible, low, moderate, and high levels of biofilm formation; the use of these four categories is consistent with previous work (20, 30). The k-means algorithm was performed with the L1-norm; 10,000 replicates for each media were performed to produce optimal clustering. When biofilm production was treated as a discrete classifier, the chi-square test of homogeneity was used to determine statistically significant differences. Significance for all statistical tests was determined at the α=0.05 level.

Results

Overall associations of medium and biofilm production

In the three different media tested, biofilm formation exhibited limited correlation between media (Pearson’s correlation coefficient: M63 to TSB, r=0. 3461, M63 to BHI, r=0. 3217, BHI to TSB, r=0. 1892), which suggests biofilm formation in each medium should be examined individually. In general, the optical densities were significantly greater for isolates when grown in M63 than in diluted TSB or BHI (KW, p=1.492E-09).

74

Supplementary figure 2.1 shows the optical densities of each medium for all strains examined with the standard error of mean included.

When all E. coli were classified as a negligible, low, moderate, or high biofilm in each medium using k-means clustering (Figure 2.1), approximately 33.1% of E. coli strains formed moderate or high biofilms in M63; classification of moderate or high biofilms in diluted TSB and BHI were lower at 23.4 and 26.9% respectively (Figure 2.2,

Supplementary Table 2.1). The final OD measurements of each strain tested are available in the supplementary file.

Associations between biofilm formation by E. coli commensals and ExPEC subpathotypes

When all test strains were examined in M63 broth and classified as negligible, low, moderate, or high biofilm producers, strains classified as APEC and NMEC were found to be significantly weaker biofilm producers (χ2, p=0.0064 and 0.0092, respectively) among all of the E. coli examined. In APEC, the majority of strains (56.8%) produced negligible biofilms while the majority of NMEC (53.1%) produced low-level biofilms (Figure 2.2). UPEC, AFEC, and HFEC strains could not be differentiated from any other E. coli groups (MW, p= 0.6007, 0.6371, 0.4956, respectively). In addition, the majority of UPEC (62.5%), AFEC (57.6%), and HFEC (52.9%) isolates produced low or moderate biofilms using k-means clustering (Figure 2.2).

75

Figure 2.1: k-means Clustering of E. coli by Media k-means clustering was used to determine the level of biofilm production for each E. coli isolate in four categories: negligible (red), low (green), moderate (blue), and high (purple).

In diluted TSB, UPEC strains produced significantly higher ODs compared to

APEC, NMEC, AFEC, and HFEC isolates (MW, p=0.02045, Figure 2.3). Even though

UPEC did not produce a significant p-value (χ2, p=0.0505) when the biofilm levels of

UPEC strains were categorized, a greater proportion of UPEC strains classified as high- level biofilm formers (25%) compared to any other commensal or ExPEC group (Figure

2.2). In contrast, most (range 53-73%) of the APEC, NMEC, AFEC, and HFEC isolates produced negligible biofilms in 1/20 TSB (Figure 2.2).

76

Figure 2.2: Prevalence of Biofilm Formation in ExPEC subpathotypes and E. coli Commensals.

Percentage of negligible (red), low (green), moderate (blue), and high (purple) biofilm forming isolates by subpathotype or commensal E. coli designation. APEC and NMEC are significantly different in their abilities to form biofilms in M63 (0.0064 and 0.0093, respectively).

In BHI, no significant differences were observed for the ODs of the groups

(Figure 2.2), yet more APEC and NMEC strains were categorized as negligible biofilm producers than any other category of biofilm (38.6 and 37.5%, respectively, Figure 2.2).

AFEC and HFEC isolates produced more low-level biofilms than any other category of biofilm formation (54.5 and 44.1%, respectively). Additionally, HFEC produced more

77 high-level biofilms than the isolates from any other subpathotype or commensal group

(14.7%, Figure 2.2).

Figure 2.3: Optical Density of ExPEC and Commensal E. coli Strains.

Biofilm formation relative to the ExPEC commensal and subpathotype classification and media: M63 (red), 1/20 TSB (green), and BHI (blue). The mean OD600 is plotted; error bars represent ± standard error of mean.

Associations between biofilm formation by E. coli and host

Since APEC and NMEC showed significant OD differences in M63 broth, we attempted to determine if there were differences between E. coli isolated from chickens

78

(APEC and AFEC) or humans (NMEC, UPEC, HFEC). Interestingly, no significant differences were observed between strains isolated from the two hosts, suggesting that biofilm formation is not associated solely with the host origin of the strain

(Supplementary Figure 2.2).

Associations between Clermont’s original phylogenetic groups and biofilm formation

Significant differences were observed between the ODs of each isolate classified by Clermont’s original phylogenetic groups when all E. coli were compared in M63 broth

(KW, p=0.0055, Figure 2.4). The OD of phylogenetic group B2 showed significantly greater biofilm formation than phylogenetic group A (HSD, p=0.0131, Figure 2.4). The majority of phylogenetic group A strains were found to be significantly different (χ2, p=0.0172) than the other E. coli when the level of biofilm production was examined. A majority of group A isolates produced negligible biofilms (55.6%, Table 2.2). There were no significant differences between groups B2 and B1 or D since a greater prevalence of these isolates produced moderate or high-level biofilms at 43.5, 41.3, and 27.9%, respectively among the isolates examined (Table 2.2).

Like M63, significant differences were observed between all E. coli grown in diluted TSB (KW, p=2.999E-3, Figure 2.4). When the ODs of phylogenetic groups B1 and B2 were examined, these groups demonstrated significantly greater biofilm formation than phylogenetic group A (HSD, p=0.01084 and p=8.366E-3, respectively).

When the biofilm level of phylogenetic group B2 was compared to others, it was significantly different (χ2, p=0.0183) than the other phylogenetic groups, and 45.7% of group B2 produced low or moderate biofilms (Table 2.2). 43.5% of phylogenetic group

79

B1 isolates produced moderate or high biofilms while the majority of phylogenetic group

A and D isolates produced negligible biofilms (75.6 and 62.3%, respectively, Table 2.2).

Phylogenetic groups A, B2, and D were statistically different from the other phylogenetic groups (χ2, p=0.0047, p=0.0240, p=0.0367, respectively) in BHI when their biofilm levels were assessed. The majority of phylogenetic group A isolates (75.6%) produced negligible or low-level biofilms (Table 2.2). Phylogenetic group B2 isolates

(60.9%) formed more low-level biofilms than any other group. Phylogenetic group D strains (31.1%) produced more moderate level biofilms than phylogenetic groups A, B1, or B2 (Table 2.2).

Associations between Clermont’s revised phylogenetic groups and E. coli biofilm formation

A Kruskal-Wallis test based on the ODs of each isolate classified by Clermont’s revised phylogenetic typing scheme identified significant differences among the revised phylogenetic groups and strains grown in M63 (p=1.627E-5). The OD of phylogenetic groups A and F were less than groups B2 or E (p values in Supplementary file, Figure

2.5). The ODs of phylogenetic group C compared to any different phylogenetic group were not significant, but the difference between group C and E was suggestive of a possible difference (p=0.0527). When their biofilm levels were compared to the remaining isolates using the χ2 test, phylogenetic groups A, E, and F were significantly different (p=0.0243, p=0.0290, p=0.0024, respectively, Supplementary table 2.2). In

M63, phylogenetic groups A and F overwhelmingly produced negligible biofilms, (60 and 67.9%, respectively, Figure 2.6). In contrast, phylogenetic group E was the most prolific biofilm former with 58.8% of isolates forming moderate or high biofilms (Figure

80

2.6). Of the other phylogenetic groups, 50% of phylogenetic group D isolates formed low biofilms, and 57.1% of phylogenetic group C isolates formed negligible biofilms (Figure

2.6). The majority of phylogenetic group B2 isolates (65.8%) produced low or moderate biofilms (Figure 2.6).

Table 2.2: Chi-square Discrete Categorization of E. coli by Clermont’s Original

Phylogenetic Groups*.

M63 Negligible Low Moderate High p A 55.6% 17.8% 15.6% 11.1% 0.0172 B1 21.7% 34.8% 39.1% 4.3% 0.0847 B2 21.7% 37.0% 23.9% 17.4% 0.0803 D 39.3% 32.8% 18.0% 9.8% 0.7572

1/20 TSB Negligible Low Moderate High p A 75.6% 13.3% 6.7% 4.4% 0.0644 B1 47.8% 8.7% 17.4% 26.1% 0.0750 B2 45.7% 23.9% 21.7% 8.7% 0.0183 D 62.3% 18.0% 4.9% 14.8% 0.2369 BHI Negligible Low Moderate High p A 46.7% 28.9% 11.1% 13.3% 0.0047 B1 34.8% 34.8% 21.7% 8.7% 0.9032 B2 21.7% 60.9% 13.0% 4.3% 0.0240 D 26.2% 39.3% 31.1% 3.3% 0.0368

* Percentages may not add up to 100.0% due to rounding. Bolded values indicate p < 0.05.

81

Figure 2.4: Optical Density of E. coli isolates as Designated Using Clermont’s Original Phylogenetic Groups.

Biofilm formation relative to Clermont’s original phylogenetic typing scheme and media: M63 (red), 1/20 TSB (green), and BHI (blue). The mean OD600 is plotted; error bars represent ± standard error of mean.

Like M63, a Kruskal-Wallis test based on the OD of tested isolates and

Clermont’s revised phylogenetic typing scheme identified significant differences in diluted TSB (p=4.501E-7, Figure 2.5). The ODs of the different phylogenetic groups revealed phylogenetic groups A and F were characterized by significantly lower biofilm

ODs than other phylogenetic groups with higher optical densities observed for B1, B2, D, and E (HSD, p-values in supplementary file, Figure 2.5). Phylogenetic group C was not statistically different from any group. When the level of biofilm formation was

82 determined via k-means clustering, phylogenetic groups A, B1, and B2 were all significantly different from the other phylogenetic groups (χ2, p=0.0161, p=0.0176, p=0.0078, Figure 2.6, Supplementary table 2). A majority of isolates classified as phylogenetic groups A, C, D, and F produced negligible biofilms (88.0, 66.7, 50, and

78.6% respectively, Figure 2.6). 50.0 and 41.2% of isolates from phylogenetic groups B2 and E produced low and moderate biofilms, respectively (Figure 2.6). Isolates identified as phylogenetic group B1 produced more high-level biofilms than any other group in diluted TSB (31.8% of isolates examined, Figure 2.6).

In BHI, phylogenetic groups A and B2 were statistically different from groups other than themselves (χ2, p=0.0001 and p=0.0258, Supplementary table 2.2). 56% of phylogenetic group A isolates produced negligible biofilms while 63.2% of phylogenetic group B2 isolates produced low-level biofilms (Figure 2.6). For phylogenetic group E,

47.1% of the isolates examined formed low-level biofilms (9.1%, Figure 2.6).

Discussion

In extraintestinal pathogenic E. coli, there is substantial morbidity and mortality associated with urinary tract infections and neonatal meningitis in humans as well as colibacillosis in avian hosts. The suffering and costs associated with diseases caused by

ExPEC are undoubtedly significant (3, 4). Therefore, we assessed the associations between APEC, NMEC, UPEC, AFEC, and HFEC with biofilm formation using the crystal violet phenotype assay. Additionally, we examined the genetic similarity of the bacterial isolates tested and their ability to form biofilms using Clermont’s original and revised phylogenetic typing schemes. Although previous work has been done surveying biofilm formation in APEC and AFEC (20), NMEC and HFEC (22), UPEC (21), and E.

83 coli derived from raw meat and eggs (40), this study is the first to examine biofilm formation across the subpathotype and commensal groups under standardized conditions, and the work presented demonstrates a novel, unbiased way to classify biofilms in a high throughput manner.

In order to classify biofilm formation, we utilized a machine learning-based clustering methodology to remove human bias in classifying the negligible, low, moderate, and high biofilm forming groups. The four biofilm categories used in this study were selected to allow for comparison with the earlier work of our lab reported by

Skyberg et al. (20). We believe this method to be advantageous to other methods previously used to distinguish the relative formation of biofilms as we do not classify biofilms by an author drawn cutoff or use a standard deviation system from the control wells (20-22, 31, 40, 41), yet we do note that the approach presented may classify isolates slightly differently than previous methods. In a recent study by Mitchell et al. (41), 94% of ExPEC-like E. coli were found to produce biofilms in Luria-Bertani medium. These numbers differ significantly from our strain classifications here, but these E. coli strains were assayed in round-bottom 96-well plates. Additionally, the E. coli strains were reported to be isolated from meat and eggs, not ill avian or human hosts (41). These isolates had their subpathotype identified based on the presence or absence of genes associated with virulence or phenotypic differences. The current study classified the pathotype based on the source of the original isolate and the presence or absence of disease, similar to literature precedent (20, 21, 42). Therefore, APEC isolates were isolated from diseased birds while NMEC and UPEC were isolated from cases of neonatal meningitis or urinary tract infections in humans. AFEC and HFEC isolates were

84 obtained from avian or human hosts, respectively, which did not demonstrate apparent disease.

Figure 2.5: Optical Density of E. coli isolates as Designated Using Clermont’s Revised Phylogenetic Groups.

Biofilm formation relative to Clermont’s revised phylogenetic group and media: M63 (red), 1/20 TSB (green), and BHI (blue). The mean OD600 is plotted; error bars represent +/- standard error of mean.

85

Figure 2.6: Prevalence of Biofilm Formation by E. coli classified using Clermont’s Revised Phylogenetic Groups in Each Medium.

Cumulative percentage of negligible (red), low (green), moderate (blue), and high (purple) biofilm forming isolates by Clermont’s revised phylogenetic groups. The significant differences between groups are shown in Supplementary Table 2.2.

Interestingly, when we tested ExPEC isolates, we found our results to be analogous to previous work with ExPEC sensu stricto. For NMEC, it has previously been reported that approximately 77.4% of isolates form biofilms in Luria-Bertani broth, and

52.8% of isolates form biofilms in a minimal medium (22). Although we did not test the nutrient-rich Luria-Bertani medium, we did find that a majority (68.8%) of NMEC formed biofilms in a minimal medium, yet the majority of these biofilms were classified as low-level biofilms (53.1%). Likewise, Skyberg et al. found that the majority of APEC

86

(range 53.3-98.1%) and AFEC (range 56.3-83.4%) formed “none/weak” biofilms in M63,

1/20 TSB, and BHI (20). We concur that a majority of APEC (range 70.4-84.1%) and

AFEC (range 60.6-84.8%) produce negligible and low biofilms in all media tested, but we found that the production of moderate- or high-level biofilms was slightly greater when compared with Skyberg et al. (20).

Like APEC and NMEC, subtle differences were observed for UPEC strains when they were examined based on their biofilm classification and previously reported studies.

Soto et al. found 40-43% of UPEC produce biofilms (21) while the current study found that 37.5% of UPEC produced moderate- or high-level biofilms in minimal medium.

Nevertheless, the similarity of biofilm production among these ExPEC, even with slightly different methods of biofilm classifications between studies, gives us confidence in utilizing a k-means clustering methodology for biofilm classification.

To our knowledge, this is the first time an unbiased cluster analysis-based approach has been used to classify biofilm formation, yet machine learning clustering techniques have been used to examine other biological systems such as sensing contamination in the food industry (43, 44) and identification of bacteria responsible for urinary tract infections (45). Nevertheless, there are limitations to clustering techniques as biological systems can be variable and statistical significance can be lost. When we performed the statistical Mann-Whitney test on the continuous optical densities of the isolates, UPEC was significantly different than the other E. coli tested in diluted TSB.

However, discretization of the data occurred when the E. coli were sorted into the four biofilm levels, and significance was lost. Even with a loss of significance, the sorting of

87

E. coli into the four classes of biofilm production is central to deliberation and reflection regarding similarities and differences among strains.

Potential impact of subpathotypes and pathogenesis on biofilm formation

In M63 medium, APEC and NMEC were significantly different from the other E. coli when their biofilm levels were categorized. The bulk of APEC and NMEC strains were classified either as negligible or low biofilm producers while UPEC, AFEC, and

HFEC strains could not be differentiated. The limited biofilm production of the APEC and NMEC subpathotypes and the inability to separate E. coli based on their host of isolation (human or avian) leads to further evidence that APEC may be a zoonotic pathogen (6, 27, 46).

Although UPEC could not be differentiated from AFEC and HFEC in M63 broth, it had statistically greater ODs in diluted TSB than the other E. coli tested. Undoubtedly, the ability of UPEC to produce biofilms is well-documented (21, 47). It is well-known that virulence factors associated with the adhesion of UPEC in the host include the F1C,

P, S, and Type I pili as well as the Dr adhesins (3, 48). If the ability to produce biofilms promotes adherence by UPEC, the negligible or weaker biofilms produced by APEC and

NMEC may be expected as expulsion from the urinary tract is not a major impediment to their ability to cause disease. Nevertheless, the UPEC selected for this study may have been inadvertently selected by the ability to survive in a urinary catheter as the isolates studied here were isolated from hospitalized patients; survival in a catheter presumably may be aided by a biofilm. Unfortunately, there is no way to ascertain whether these isolates were isolated from urine or urinary catheters.

88

Interestingly, both groups known to reside in the intestinal tract, AFEC (69.7%) and HFEC (82.4%), produce more biofilms in the rich medium, BHI. A different gut dwelling E. coli pathotype, enteroaggreative E. coli (EAEC) has also shown an ability to produce robust biofilms in BHI (49), suggesting that E. coli of the gut may be more suited to biofilm production in nutrient-rich conditions.

Association between phylogenetic typing and biofilm formation

In Clermont’s original typing scheme, E. coli strains were subtyped into four groups: A, B1, B2, and D (23). Original phylogenetic group A isolates are found as commensals in the human gut (50), in sediment environments (51), or as pathogens in poultry (29). However, phylogenetic group A may be best known for containing the laboratory strain MG1655 K12 (52). When the ability to produce biofilms was examined for phylogenetic group A isolates, a majority of isolates formed negligible biofilms in each media, the only group to do so in all three media types. While group A and B1 have been proposed to be sister clades (53) and have been shown to be inversely related in some aquatic environments (54), group B1 produced more high-level biofilms than any other group in diluted TSB. Group B1 has been shown to occur more often than group A in aquatic environments during high temperature seasons, and the production of biofilms could promote adhesion to surfaces in these warmer temperatures (54). Phylogenetic group B2 has long been known as a group containing many human pathogens, including

NMEC (22, 29), UPEC (7, 21, 29), and strains belonging to irritable bowel disease patients (55). Phylogenetic group B2 had significantly different biofilm levels than the other groups in media that was more nutrient rich—diluted TSB and BHI (Table 4). In these media, B2 isolates produced more biofilms than any other group. Finally,

89 significant differences for group D occurred solely in BHI as group D produced more moderate level biofilms than any other group. While classification of the isolates by

Clermont’s original phylogenetic typing scheme yielded interesting results, the scheme was revised in 2013 as a means to better classify some of the inconsistencies in the original typing scheme. We include the scheme here as a means to interpret the potential influence of the original phylogenetic group for research already completed.

In 2008, Gordon et al. found that 80-85% of the phylogenetic group memberships assignment under Clermont’s original scheme were correct with assignment to groups B1 and B2 correct for 95% of assignments (56). Therefore, the scheme was revised in 2013 to include E. coli specific groups that are used to classify E. coli: A, B1, B2, C, D, E, F, and Clade I (24). Logue et al. (2017) found significant phylogenetic group changes when classification of ExPEC and commensal E. coli strains were examined using the new phylogenetic typing scheme (29). In APEC and UPEC, significant changes were observed from groups A to C and D to E or F (29). In NMEC, group D isolates changed to group F (29). Upon our re-categorization according to Clermont’s revised phylogenetic typing scheme (24), we found E. coli isolates of group E to be the most prolific producers of biofilms in minimal medium. Unfortunately, little is known about group E, but it may consist of pathogens such as diarrheagenic E. coli (57). Unlike phylogenetic group E, group F produced the lowest optical density, suggesting poor biofilm capability in minimal and nutrient limited media under the conditions of this study. The relatively low production of biofilms by group F is thought-provoking as group F strains have been shown to contain relatively high numbers of virulence, resistance, and pathogenicity island associated genes when examined in E. coli stains isolated from poultry (29). The

90

B2 phylogenetic group has been suggested to be a sister group to phylogenetic group F

(58, 59), but we found tested B2 strains were relatively good at producing biofilms in all media, significantly contrasting with group F. Thus, an investigation of the genetic differences between phylogenetic group B2 and F is warranted.

Conclusion

The work presented here is novel and uses a machine learning based approach to remove human bias in the classification of biofilms among a collection or ExPEC and their commensal counterparts. This study has also demonstrated that there was no significant difference between the E. coli isolates from animal and human hosts when they were not separated by commensal or subpathotype categories. When isolates were separated by their demonstrated pathogenesis in their respective host, biofilm production for APEC and NMEC was significantly different from other E. coli in M63 as these subpathotypes produced negligible or weaker biofilms. Interestingly, UPEC was a significantly better biofilm former than the other E. coli tested in diluted TSB. Clearly, further work is necessary to elucidate the mechanisms of pathogenesis and virulence factors of ExPEC that contribute to biofilm formation in moderate and high-level biofilm formers. When considering the prolific biofilm forming abilities of phylogenetic group E in M63, it is evident that more research is warranted to explore this relatively new phylogenetic group and its potential association with disease in human and animal hosts.

91

Declarations

Ethics approval

The work presented was covered under the Institutional Biosafety Committee

(IBC) approval 04-D/I-005-A/H at Iowa State University. No humans were involved in this study and the isolates used were collected from previous studies and de-identified when they were supplied to us. No animals were used in this study and isolates were collected from multiple studies dating back more than 20 years at multiple institutions.

Data/Material availability

All data from this study has been made available in the supplemental files.

Isolates used in the study can be requested but may not be available due to current work that is ongoing. Isolates from our collaborators will require request directly from them.

All materials will be subjected to a material transfer agreement (MTA). We confirm that no authors have any competing interests in this paper.

Funding

This work was funded by the Dean’s Office of Iowa State University College of

Veterinary Medicine and the Iowa Livestock Health Advisory Council.

Authors’ contributions

DN carried out the research, data analysis, and drafting of the paper; JK designed and performed the statistical analysis and co-authored the paper; TC provided assistance in strain analysis and data generation; YW provided assistance in molecular analysis including phylogenetic typing and contributed to the writing of the paper; NB provided

92 assistance in the microbial analysis, biofilm analysis, and editing of the paper; LN provided materials for the study and writing of the paper; CL helped design the study, draft the paper, and provided materials for the study.

Acknowledgements

Special thanks to Joel Maki (Iowa State University, Department of Veterinary

Microbiology) for his laboratory assistance and Dr. James Johnson (University of

Minnesota), Dr. Paul Carson (Meritcare Hospital, Fargo, ND), Dr. James Wynn

(University of Florida), and Dr. Kwang S. Kim (Johns Hopkins University), and Dr.

Lodewijk Spanjaard (Netherlands Reference Laboratory for Bacterial Meningitis) for their gift of bacterial strains.

References

1. Gaschignard J, Levy C, Romain O, Cohen R, Bingen E, Aujard Y, Boileau P. 2011. Neonatal bacterial meningitis: 444 cases in 7 years. Pediatr Infect Dis J 30:212-217.

2. Stoll BJ, Hansen NI, Sanchez PJ, Faix RG, Poindexter BB, Van Meurs KP, Bizzarro MJ, Goldberg RN, Frantz ID, 3rd, Hale EC, Shankaran S, Kennedy K, Carlo WA, Watterberg KL, Bell EF, Walsh MC, Schibler K, Laptook AR, Shane AL, Schrag SJ, Das A, Higgins RD. 2011. Early onset neonatal sepsis: the burden of group B streptococcal and E. coli disease continues. Pediatrics 127:817-826.

3. Flores-Mireles AL, Walker JN, Caparon M, Hultgren SJ. 2015. Urinary tract infections: epidemiology, mechanisms of infection and treatment options. Nat Rev Micro 13:269-284.

4. Nolan LK, Barnes HJ, Vaillancourt JP, Abdul-Aziz T, Logue CM. 2013. Colibacillosis, p 751-805, Diseases of Poultry, 13th ed doi:10.1002/9781119421481. ch18. John Wiley & Sons, Ltd.

5. Johnson TJ, Kariyawasam S, Wannemuehler Y, Mangiamele P, Johnson SJ, Doetkott C, Skyberg JA, Lynne AM, Johnson JR, Nolan LK. 2007. The genome

93

sequence of avian pathogenic Escherichia coli strain O1:K1:H7 shares strong similarities with human extraintestinal pathogenic E. coli genomes. J Bacteriol 189:3228-3236.

6. Tivendale KA, Logue CM, Kariyawasam S, Jordan D, Hussein A, Li G, Wannemuehler Y, Nolan LK. 2010. Avian-pathogenic Escherichia coli strains are similar to neonatal meningitis E. coli strains and are able to cause meningitis in the rat model of human disease. Infect and Immun 78:3412-3419.

7. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Fakhr MK, Nolan LK. 2005. Comparison of Escherichia coli isolates implicated in human urinary tract infection and avian colibacillosis. Microbiol 151:2097-2110.

8. Donlan RM, Costerton JW. 2002. Biofilms: survival mechanisms of clinically relevant microorganisms. Clin Microbiol Rev 15:167-193.

9. Flemming H-C, Wingender J. 2010. The biofilm matrix. Nat Rev Micro 8:623- 633.

10. Beloin C, Roux A, Ghigo J-M. 2008. Escherichia coli biofilms. Curr Top Microbiol 322:249-289.

11. Davey ME, O'toole GA. 2000. Microbial biofilms: from ecology to molecular genetics. Microbiol Mole Biol Rev 64:847-867.

12. Potera C. 1999. Forging a link between biofilms and disease. Science 283:1837- 1839.

13. Lindsay D, Geornaras I, von Holy A. 1996. Biofilms associated with poultry processing equipment. Microbios 86:105-16.

14. Amaral Ld. 2004. Drinking water as a risk factor to poultry health. Rev Bras Cienc Aví 6:191-199.

15. Alkeskas A, Ogrodzki P, Saad M, Masood N, Rhoma NR, Moore K, Farbos A, Paszkiewicz K, Forsythe S. 2015. The molecular characterisation of Escherichia coli K1 isolated from neonatal nasogastric feeding tubes. BMC Infect Dis 15:449.

16. Schwartz DJ, Chen SL, Hultgren SJ, Seed PC. 2011. Population dynamics and niche distribution of uropathogenic Escherichia coli during acute and chronic urinary tract infection. Infect Immun 79:4250-4259.

17. Hannan TJ, Totsika M, Mansfield KJ, Moore KH, Schembri MA, Hultgren SJ. 2012. Host–pathogen checkpoints and population bottlenecks in persistent and intracellular uropathogenic Escherichia coli bladder infection. FEMS Microbiol Rev 36:616-648.

94

18. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Nolan LK. 2005. Characterizing the APEC pathotype. Vet Res 36:241-256.

19. Johnson TJ, Siek KE, Johnson SJ, Nolan LK. 2005. DNA sequence and comparative genomics of pAPEC-O2-R, an avian pathogenic Escherichia coli transmissible R -plasmid. Antimicrob Agents Ch 49:4681-4688.

20. Skyberg JA, Siek KE, Doetkott C, Nolan LK. 2007. Biofilm formation by avian Escherichia coli in relation to media, source and phylogeny. J Appl Microbiol 102:548-554.

21. Soto SM, Smithson A, Martinez JA, Horcajada JP, Mensa J, Vila J. 2007. Biofilm formation in uropathogenic Escherichia coli strains: relationship with prostatitis, urovirulence factors and antimicrobial resistance. J Urology 177:365-368.

22. Wijetunge DSS, Gongati S, DebRoy C, Kim KS, Couraud PO, Romero IA, Weksler B, Kariyawasam S. 2015. Characterizing the pathotype of neonatal meningitis causing Escherichia coli (NMEC). BMC Microbiol 15:211.

23. Clermont O, Bonacorsi S, Bingen E. 2000. Rapid and simple determination of the Escherichia coli phylogenetic group. Appl Environ Microb 66:4555-4558.

24. Clermont O, Christenson JK, Denamur E, Gordon DM. 2013. The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups. Env Microbiol Rep 5:58-65.

25. Picard B, Garcia JS, Gouriou S, Duriez P, Brahimi N, Bingen E. 1999. The link between phylogeny and virulence in Escherichia coli extraintestinal infection. Infect Immun 67:546-553.

26. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Fakhr MK, Nolan LK. 2005. Comparison of Escherichia coli isolates implicated in human urinary tract infection and avian colibacillosis. Microbiol 151.

27. Johnson TJ, Wannemuehler Y, Johnson SJ, Stell AL, Doetkott C, Johnson JR, Kim KS, Spanjaard L, Nolan LK. 2008. Comparison of extraintestinal pathogenic Escherichia coli strains from human and avian sources reveals a mixed subset representing potential zoonotic pathogens. Appl Environ Microb 74:7043-7050.

28. Johnson TJ, Logue CM, Wannemuehler Y, Kariyawasam S, Doetkott C, DebRoy C, White DG, Nolan LK. 2009. Examination of the source and extended virulence genotypes of Escherichia coli contaminating retail poultry meat. Foodborne Pathog Dis 6:657-667.

29. Logue CM, Wannemuehler Y, Nicholson BA, Doetkott C, Barbieri NL, Nolan LK. 2017. Comparative analysis of phylogenetic assignment of human and avian ExPEC and fecal commensal Escherichia coli using the (previous and revised)

95

Clermont phylogenetic typing methods and its impact on avian pathogenic Escherichia coli (APEC) classification. Front Microbiol 8:283.

30. Stepanović S, Ćirković I, Ranin L, Svabić-Vlahović M. 2004. Biofilm formation by Salmonella spp. and Listeria monocytogenes on plastic surface. Lett Appl Microbiol 38:428-432.

31. O'Toole GA, Pratt LA, Watnick PI, Newman DK, Weaver VB, Kolter R. 1999. Genetic approaches to study of biofilms, p 91-109, Method Enzymol, Volume 310. Academic Press.

32. Peeters E, Nelis HJ, Coenye T. 2008. Comparison of multiple methods for quantification of microbial biofilms grown in microtiter plates. J Microbiol Meth 72:157-165.

33. O'Toole GA. 2011. Microtiter dish biofilm formation assay. J Vis Exp doi:10.3791/2437.

34. Merritt JH, Kadouri DE, O'Toole GA. 2005. Growing and analyzing static biofilms. Curr Protoc Microbiol 22:1B-1.

35. Sturgill G, Toutain CM, Komperda J, O'Toole GA, Rather PN. 2004. Role of CysE in production of an extracellular signaling molecule in Providencia stuartii and Escherichia coli: Loss of cysE enhances biofilm formation in Escherichia coli. J Bacteriol 186:7610-7617.

36. Kruskal WH, Wallis WA. 1952. Use of ranks in one-criterion variance analysis. J Am Stat Assoc 47:583-621.

37. Tukey JW. 1949. Comparing individual means in the analysis of variance. Biometrics 5:99-114.

38. Mann HB, Whitney DR. 1947. On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18.1:50-60.

39. MacKay DJC. 2003. Information theory, inference, and learning algorithms. Cambridge University Press, Cambridge, UK; New York.

40. Mellata M, Johnson JR, Curtiss R, 3rd. 2017. Escherichia coli isolates from commercial chicken meat and eggs cause sepsis, meningitis and urinary tract infection in rodent models of human infections. Zoonoses Public Hlth 65:103- 113.

41. Mitchell NM, Johnson JR, Johnston B, Curtiss R, 3rd, Mellata M. 2015. Zoonotic potential of Escherichia coli isolates from retail chicken meat products and eggs. Appl Environ Microbiol 81:1177-1187.

96

42. Logue CM, Doetkott C, Mangiamele P, Wannemuehler YM, Johnson TJ, Tivendale KA. 2012. Genotypic and phenotypic traits that distinguish neonatal meningitis-associated Escherichia coli from fecal E. coli isolates of healthy human hosts. Appl Environ Microbiol 78:5824-5830.

43. Haugen J-E, Kvaal K. 1998. Electronic nose and artificial neural network. Meat Sci 49:S273-S286.

44. Ghasemi-Varnamkhasti M, Mohtasebi SS, Siadat M, Balasubramanian S. 2009. Meat quality assessment by electronic nose (machine olfaction technology). Sensors 9:6058-6083.

45. Goodacre R, Timmins éM, Burton R, Kaderbhai N, Woodward AM, Kell DB, Rooney PJ. 1998. Rapid identification of urinary tract infection bacteria using hyperspectral whole-organism fingerprinting and artificial neural networks. Microbiol 144:1157-1170.

46. Nicholson BA, West AC, Mangiamele P, Barbieri NL, Wannemuehler Y, Nolan LK, Logue CM, Li G. 2016. Genetic characterization of ExPEC-like virulence plasmids among a subset of NMEC. PLOS ONE 11:e0147757.

47. Anderson GG, Palermo JJ, Schilling JD, Roth R, Heuser J, Hultgren SJ. 2003. Intracellular bacterial biofilm-like pods in urinary tract infections. Science 301:105-107.

48. Wright KJ, Hultgren SJ. 2006. Sticky fibers and uropathogenesis: bacterial adhesins in the urinary tract. Future Microbiol 1:75-87.

49. Schiebel J, Bohm A, Nitschke J, Burdukiewicz M, Weinreich J, Ali A, Roggenbuck D, Rodiger S, Schierack P. 2017. Genotypic and phenotypic characteristics in association with biofilm formation in different pathotypes of human clinical Escherichia coli isolates. Appl Environ Microbiol doi:10.1128/AEM.01660-17.

50. Duriez P, Clermont O, Bonacorsi S, Bingen E, Chaventre A, Elion J, Picard B, Denamur E. 2001. Commensal Escherichia coli isolates are phylogenetically distributed among geographically distinct human populations. Microbiol 147:1671-1676.

51. Vignaroli C, Luna GM, Rinaldi C, Di Cesare A, Danovaro R, Biavasco F. 2012. New sequence types and multidrug resistance among pathogenic Escherichia coli isolates from coastal marine sediments. Appl Environ Microbiol 78:3916-3922.

52. Nash JH, Villegas A, Kropinski AM, Aguilar-Valenzuela R, Konczy P, Mascarenhas M, Ziebell K, Torres AG, Karmali MA, Coombes BK. 2010. Genome sequence of adherent-invasive Escherichia coli and comparative genomic analysis with other E. coli pathotypes. BMC Genomics 11:667.

97

53. Lecointre G, Rachdi L, Darlu P, Denamur E. 1998. Escherichia coli molecular phylogeny using the incongruence length difference test. Mol Biol Evol 15.

54. Jang J, Di DY, Lee A, Unno T, Sadowsky MJ, Hur HG. 2014. Seasonal and genotypic changes in Escherichia coli phylogenetic groups in the Yeongsan River basin of South Korea. PLoS ONE 9:e100585.

55. Petersen AM, Nielsen EM, Litrup E, Brynskov J, Mirsepasi H, Krogfelt KA. 2009. A phylogenetic group of Escherichia coli associated with active left-sided inflammatory bowel disease. BMC Microbiol 9:171.

56. Gordon DM, Clermont O, Tolley H, Denamur E. 2008. Assigning Escherichia coli strains to phylogenetic groups: multi-locus sequence typing versus the PCR triplex method. Environ Microbiol 10:2484-2496.

57. Bohlin J, Brynildsrud OB, Sekse C, Snipen L. 2014. An evolutionary analysis of genome expansion and pathogenicity in Escherichia coli. BMC Genom 15:882.

58. Clermont O, Olier M, Hoede C, Diancourt L, Brisse S, Keroudean M, Glodt J, Picard B, Oswald E, Denamur E. 2011. Animal and human pathogenic Escherichia coli strains share common genetic backgrounds. Infect, Genet Evol. 11:654-662.

59. Jaureguy F, Landraud L, Passet V, Diancourt L, Frapy E, Guigon G, Carbonnelle E, Lortholary O, Clermont O, Denamur E, Picard B, Nassif X, Brisse S. 2008. Phylogenetic and genomic diversity of human bacteremic Escherichia coli strains. BMC Genom 9:560.

98

Supplementary Figures

Supplementary Figure 2.1: Optical Densities of E. coli Strains by Media Type

Biofilm formation relative to the media type with the mean OD600 plotted; error bars represent ± standard error of mean.

99

Supplementary Figure 2.2: Optical densities of E. coli classified from their host source

Optical densities of E. coli classified from their source of isolation, human (blue) or avian (green), with the three different media types.

100

Supplementary Tables

Supplementary Table 2.1: Chi-square Discrete Categorization of E. coli by

Commensal and Subpathotype Groups*

M63 Negligible Low Moderate High p APEC 56.8% 13.6% 20.5% 9.1% 0.0064 NMEC 31.3% 53.1% 6.3% 9.4% 0.0093 UPEC 28.1% 34.4% 28.1% 9.4% 0.6007 AFEC 27.3% 33.3% 24.2% 15.2% 0.6371 HFEC 32.4% 23.5% 29.4% 14.7% 0.4956

TSB Negligible Low Moderate High p APEC 65.9% 18.2% 6.8% 9.1% 0.5807 NMEC 53.1% 18.8% 21.9% 6.3% 0.1619 UPEC 43.8% 15.6% 15.6% 25.0% 0.0505 AFEC 57.6% 21.2% 6.1% 15.2% 0.6197 HFEC 73.5% 11.8% 8.8% 5.9% 0.2969

BHI Negligible Low Moderate High p APEC 38.6% 36.4% 20.5% 4.5% 0.6064 NMEC 37.5% 34.4% 21.9% 6.3% 0.7880 UPEC 31.3% 40.6% 25.0% 3.1% 0.7300 AFEC 30.3% 54.5% 9.1% 6.1% 0.2530 HFEC 17.6% 44.1% 23.5% 14.7% 0.0835

* Percentages may not add up to 100.0% due to rounding.

Values in bold are significant to α < 0.05

101

Supplementary Table 2.2: Chi-square Discrete Categorization of E. coli by

Clermont’s Revised Phylogenetic Groups*

M63 Negligible Low Moderate High p A 60.0% 16.0% 8.0% 16.0% 0.0244 B1 18.2% 40.9% 36.4% 4.5% 0.0787 B2 18.4% 39.5% 26.3% 15.8% 0.0717 C 57.1% 19.0% 19.0% 4.8% 0.1900 D 20.8% 50.0% 20.8% 8.3% 0.1274 E 11.8% 29.4% 29.4% 29.4% 0.0291 F 67.9% 14.3% 14.3% 3.6% 0.0024

1/20 TSB Negligible Low Moderate High p A 88.0% 8.0% 0.0% 4.0% 0.0161 B1 45.5% 9.1% 13.6% 31.8% 0.0176 B2 42.1% 26.3% 23.7% 7.9% 0.0078 C 66.7% 19.0% 14.3% 0.0% 0.3483 D 50.0% 20.8% 8.3% 20.8% 0.4438 E 47.1% 29.4% 11.8% 11.8% 0.5437 F 78.6% 7.1% 3.6% 10.7% 0.1193

BHI Negligible Low Moderate High p A 56.0% 16.0% 8.0% 20.0% 0.0002 B1 36.4% 31.8% 22.7% 9.1% 0.7861 B2 18.4% 63.2% 13.2% 5.3% 0.0258 C 42.9% 38.1% 14.3% 4.8% 0.6573 D 25.0% 33.3% 37.5% 4.2% 0.1456 E 17.6% 47.1% 29.4% 5.9% 0.5465 F 28.6% 50.0% 21.4% 0.0% 0.3986

* Percentages may not add up to 100.0% due to rounding.

Values in bold are significant to α < 0.05

102

CHAPTER 3. ASSOCIATION OF POLYMORPHISMS IN OUTER MEMBRANE A WITH EXTRAINTESTINAL PATHOGENIC ESCHERICHIA COLI’S PHYLOGENETIC GROUPS AND SEQUENCE TYPES

Daniel W. Nielsen1, Nicole Ricker2, Nicolle L. Barbieri3, Heather K. Allen2, Lisa

K. Nolan4 and Catherine M. Logue3*

1Department of Veterinary Microbiology and Preventive Medicine, College of

Veterinary Medicine, 1802 University Blvd., Iowa State University, Ames, Iowa 50011

2 Food Safety and Enteric Pathogens Research Unit, National Animal Disease

Center, ARS-USDA, Ames, Iowa

3Department of Population Health, College of Veterinary Medicine, University of

Georgia, Athens, Georgia 30602

4Department of Infectious Diseases, College of Veterinary Medicine, University

of Georgia, Athens, Georgia 30602

Abstract

Neonatal Meningitis Escherichia coli (NMEC) is the second-leading cause of neonatal bacterial meningitis and is a subpathotype of the Extraintestinal Pathogenic E. coli pathotype (ExPEC), which includes Avian Pathogenic E. coli (APEC) and

Uropathogenic E. coli (UPEC). Virulence factors associated with NMEC include outer membrane protein A (OmpA) and the type I fimbriae (FimH), which are also contained within APEC and UPEC. Outer membrane protein A (OmpA) contributes to the ability of

Neonatal Meningitis Escherichia coli (NMEC) to cross the blood-brain barrier and to

103 persist in the bloodstream. OmpA has been suggested as a potential vaccine target for

ExPEC, but the protein has different amino acid variants, which have the potential to alter the pathogenesis or virulence of strains or alter vaccine efficacy. OmpA is present in virtually all E. coli, but differences in its amino acid residues have yet to be surveyed in a large collection of E. coli as a community and within the ExPEC subpathotypes.

Additionally, the relationship between the E. coli chromosome and OmpA polymorphisms has not been fully examined. Here, we surveyed publicly available

OmpA sequences (n=493) and sequenced the ompA gene (n=399) from collections of

ExPEC. Thirty-three different OmpA polymorphism patterns were identified. Often, these OmpA polymorphism patterns separated E. coli by their phylogenetic groups or sequence types. Among the ExPEC collection, nine polymorphism patterns were significantly associated with an ExPEC subpathotype. Additionally, the ExPEC subpathotype had fewer polymorphism patterns than publicly available E. coli and was thus, more conserved. The polymorphism patterns varied among the subpathotypes for phylogenetic groups A, B1, B2, and F.

Introduction

Members of the Extraintestinal Pathogenic Escherichia coli (ExPEC) pathotype are adapted for an extraintestinal lifestyle. ExPEC subpathotypes include Neonatal

Meningitis E. coli (NMEC), Uropathogenic E. coli (UPEC), and Avian Pathogenic E. coli

(APEC), which are named by the host system or species that they negatively impact (1,

2). APEC, the causative agent of avian colibacillosis, is responsible for significant morbidity, mortality, and financial losses to the poultry production industry worldwide

104

(1). UPEC is the leading cause of uncomplicated and catheter-associated urinary tract infections in humans, and serious UPEC infections can result in pyelonephritis, potentially leading to sepsis or death (3). NMEC is the causative agent of 28-29% of neonatal bacterial meningitis cases (4, 5). NMEC-caused neonatal bacterial meningitis has a 33% mortality rate with survivors often suffering lifelong disability (5). Both common and distinguishing virulence factors among ExPEC subpathotypes are important to examine to explain the pathogenesis or virulence of the pathotype or subpathotypes.

One virulence factor of particular interest in NMEC is OmpA, an outer membrane protein that promotes bloodstream survival and assists in crossing the blood brain barrier (6-8).

The structure of OmpA assists in our understanding of the protein. Structurally,

OmpA consists of eight membrane-spanning β-strands that form a β-barrel (9). The N- terminal domain consists of the first 169 amino acids and was characterized by Patutsch and Shulz (10). The C-terminal domain was proposed to interact with the peptidoglycan layer (11), but it has yet to be crystalized (12). Nevertheless, it has been shown that

OmpA can exist as a monomer or dimer and that the soluble C-terminal domain of OmpA is responsible for dimerization of the protein (12). The OmpA protein is known to form four extracellular loops that have been shown to exhibit residue patterns encoded by allelic variants in the ompA gene across the protein’s loops (13). These “alleles” have been described in previous works (13-15). Structurally, the loops of OmpA contribute to

NMEC’s survival and entry into the human brain microvascular endothelial cells

(HBMEC) by binding the Ecgp glycoprotein (16, 17). Gu et al. (2018) have suggested that the loops of OmpA might be a good vaccine target to prevent NMEC infection (18).

OmpA also contributes to the binding and survival of NMEC in macrophages (19).

105

Additionally, the protein contributes to binding tropism by different types of E. coli (20) and acts as a receptor for bacteriophages (13, 14).

While the contribution of OmpA to NMEC pathogenesis has been demonstrated, the importance of OmpA among other ExPEC subpathotypes, such as APEC and UPEC, as well as other E. coli pathotypes remains relatively unexplored. Indeed, OmpA is present in virtually all E. coli isolates, including commensal strains (14, 21). Is OmpA’s relationship to NMEC virulence unique and ascribable to certain polymorphisms? Are certain polymorphisms in OmpA unique to NMEC or other ExPEC? Answering such questions may give us new insight into ExPEC’s ability to cause disease and its evolution, host specificity, or tissue proclivity.

This study assessed differences in OmpA amino acid sequences among the

ExPEC subpathotypes and among a wide-variety of different E. coli. An issue that might complicate such an analysis is the lack of chromosomal relatedness of the E. coli strains being compared since ExPEC subpathotypes have different distributions of the phylogenetic groups (22). An association of chromosomal history and polymorphism patterns in a virulence factor has precedence as polymorphisms in the adhesin FimH, a virulence factor of ExPEC, appear to correspond with a phylogenetic group and increased virulence (23). Thus, in the present study, we examined OmpA amino acid sequences in different E. coli strains, as classified by their phylogenetic group, serogroups, sequence types, and ExPEC subpathotype.

106

Materials and Methods in silico Data Acquisition

To assess phylogenetic similarities across E. coli from many sources, fully closed genomic sequences of E. coli were downloaded from the NCBI GenBank Escherichia coli Genome Assembly and Annotation report page (Supplementary Table 3.1). The downloaded genomes were phylogenetically typed with an in-house script (PhyloTyper), which was designed in reference to Clermont’s revised phylogenetic typing scheme (22,

24). PhyloTyper assigns E. coli isolates to their original and revised phylogenetic groups and allows one nucleotide mismatch per primer. The ompA nucleotide sequences were obtained from the FASTA file of each genome by an in-house script named RoDTIS

(Robust DNA Targeting in silico). RoDTIS returns FASTA files of the DNA region of interest when given primers in a Microsoft Excel® file. Both scripts and their documentation are available at https://github.com/nielsend. The distribution of phylogenetic groups for all isolates is shown in Table 3.1. The Warwick sequence types

(STs) of the genomes from GenBank were obtained by the multi-locus sequencing typing

(MLST) program based on PubMLST.org (https://github.com/tseemann/mlst) (25). The genomes tested (n=493), including their OmpA polymorphism patterns, GenBank accession numbers, and phylogenetic group are presented in Supplementary Table 3.1.

ExPEC Strains and DNA Isolation

A total of 399 ExPEC were used in this study. These 399 ExPEC were selected randomly from collections previously described (26-29). The ExPEC isolates were previously serogrouped through the Escherichia coli Reference Center (Pennsylvania

State University) (27, 29), and all ExPEC isolates for which the ompA gene was

107 sequenced were phylogenetically grouped by Clermont’s revised phylogenetic typing scheme using previously described protocols (Table 3.1) (22, 26). The 399 ExPEC were streaked out onto MacConkey agar plates from -80 °C glycerol stock and incubated at

37°C for 18-24 hours. Single colonies were selected and inoculated into 2 mL of Luria

Bertani or Brain Heart Infusion broth. These were incubated at 37 °C for 18-24 hours.

The cultures were centrifuged for two minutes at 37,000 x g, and the supernatant removed. The pellet was then resuspended in 200 uL of sterile water and placed in a heating block for 10 minutes at 100 °C before centrifugation to remove cellular debris.

The supernatant containing the DNA was transferred to a clean tube and frozen at -20 °C until use.

ompA Gene Amplification and Sequencing

The ompA gene was amplified from each strain twice via PCR with two primer sets and PCR reactions (Supplementary Table 3.2). PCR conditions using a MasterCycler

Gradient thermocycler (Eppendorf, Germany) were 94 °C for 3 minutes, followed by 30 cycles of amplification (denaturation: 30 seconds at 94 °C, annealing: 30 seconds at 54

°C, extension: 72 °C for 90 seconds), and a final extension at 72 °C for seven minutes.

PCR products were confirmed on a 2% agarose gel in 1x TAE buffer. The PCR products were purified with ExoSAP-IT (Affymetrix) to remove primers and remaining dNTPs according to the manufacturer’s instructions before both PCR products were

Sanger sequenced at the Iowa State University DNA Sequencing Facility (Ames, IA).

108

Table 3.1: Phylogenetic groups of the ExPEC and publicly available E. coli chromosomes examined in this study.

Phylogenetic Assignment ExPEC Sequences A B1 B2 C D E F Unknown Total APEC 21 33 26 59 3 2 25 2 171 NMEC 5 2 62 3 0 1 7 0 80 UPEC 5 13 109 7 7 1 6 0 148 ExPEC Total 31 48 197 69 10 4 38 2 399 ExPEC Prevalence 7.8% 12.0% 49.4% 17.3% 2.5% 1.0% 9.5% 0.5% 100%

Other Publicly Available 49 102 98 165 23 5 49 2 493 E. coli Sequences Prevalence 9.9% 20.7% 19.9% 33.5% 4.7% 1.0% 9.9% 0.4% 100%

TOTAL 80 150 295 234 33 9 87 4 892 Total Prevalence 9.0% 16.8% 33.1% 26.2% 3.7% 1.0% 9.8% 0.4% 100%

in silico Analysis of ompA

Nucleotide sequences of ompA from the above ExPEC strains and from GenBank

(Supplementary Table 3.1) were imported into Geneious v. 10.2 (BioMatters LTD,

Auckland, New Zealand), and the sequences were aligned using the Geneious aligner.

The sequences were trimmed to 975 nucleotides for consistent length among all sequences and were translated in silico. Residues were aligned using the Geneious aligner with the Blosum 62 cost matrix, and non-unique residue polymorphisms relative to the consensus sequence were removed. Polymorphisms that occurred at any position fewer than three times among all OmpA sequences were interpreted as potential sequencing errors and subsequently excluded from the analysis. The resulting amino acid sequences were used as the polymorphism pattern strings. The polymorphism pattern strings were

109 imported into R for analysis, where the TidyVerse and ggplot2 packages were used to conduct analyses and generate figures (30, 31).

Statistical analysis

The Chi-square test of homogeneity was used to determine statistically significant differences among the different data sets. Significance for all statistical tests was determined at the α=0.05 level.

Results

The OmpA Protein Has Unique Polymorphism Patterns

The analysis of the ompA sequences obtained from publicly available sequences and from the amplicons generated in the present study revealed 23 different predicted polymorphism sites for OmpA among the 892 E. coli strains examined (Figure 3.1A).

The majority of OmpA polymorphisms are located within the C-terminus region or the loops of the protein, which have previously been designated as part of the linker/dimerization and N-terminal domains, respectively (Figure 3.1B). Previous studies have created naming schemes for certain OmpA polymorphisms, and we first used these schemes to explain our results.

110

(A)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

Power et al. Liao et al. This Study

(B) 11 12 13 14 1 2 3 4 9 10 17 15

6 7 8

18

LOOP 1 LOOP 2 LOOP 3 LOOP 4

5 16 19 20 N’ 21 22 PERIPLASM 23 C’

Figure 3.1: Polymorphisms present in OmpA (3.1A) and their location within the OmpA protein (3.1B).

(3.1A) The x-axis illustrates the relative location of the polymorphism, so individual polymorphisms can be identified. The orange and yellow bars beneath the polymorphism numbers indicate previous studies that the polymorphisms were identified for OmpA although these studies may not have completed analyses based on all the polymorphisms identified (13, 20). Polymorphisms identified in this study and the N-terminal and linker/dimerization domains examined are indicated by the two green bars, respectively. Any amino acid at one position that appeared fewer than three times was not included in the analysis. (3.1B) Structure of OmpA, represented by the black and blue line looping through the outer membrane, with amino acid sequence polymorphisms indicated at their approximate positions. The polymorphisms are numbered as in 3.1A. The OmpA structure shown is based on data presented in other work (10, 12, 20).

Comparison of Previous OmpA Variant Naming Schemes to Our Collection

Previously, Power et al. reported on the prevalence of two OmpA variants originally termed “alleles” that were identified as ompA1 and ompA2 (13). When OmpA

111 polymorphisms were examined in the dataset for these variations, positions 2-4, 6, 10, and 11-15, the results showed that 555/892 (62.2%) were one variant, the ompA1 allele, while 328/892 (36.7%) were another variant, the ompA2 allele (Figure 3.1A, Table 3.2).

The clustering of the polymorphisms into two large groups of variants with this scheme was expected based on previous studies, but interestingly, 9/892 (0.9%) sequences did not belong to either of the two ompA variants, suggesting that although most of the isolates in the current dataset can be categorized, a subset of variants might require an additional or more informative classification. For known pathogenic E. coli, ExPEC sensu stricto, 210/399 (52.6%) isolates harbored the ompA1 variant, and 182/399 (45.6%) isolates contained the ompA2 variant.

Table 3.2: Historic OmpA variant types

Variant Loop 2 Loop 3

ompA1 SVE SY----N

ompA2 DNI APGASFD

Variant types were described by Power et al. and Gophna et al. (13, 15).

OmpA variants were further categorized by their outer loop polymorphism patterns and linker/dimerization domain polymorphism patterns by Liao et al. (2017) (20) resulting in the identification of more polymorphism patterns. Outer loop polymorphism patterns I-IV were characterized and named by the abundance of the loop patterns

(polymorphisms 1-15 and 17, Figure 3.1) and the linker/dimerization domain was characterized by classification as α, ß, γ, and δ (polymorphisms 19, 21, and 23, Figure

112

3.1). In this study, to characterize polymorphisms in OmpA using this scheme, the outer loop and linker/dimerization domains were concatenated for the analysis and assigned identifiers indicated by combinations of Roman numerals and Greek letters. Table 3.3 shows the overall prevalence of these patterns among all test strains examined by Liao et al. and includes our strains for comparison when the same classification approach was applied. However, a significant number of our isolates could not be classified, and a new classification scheme was developed for this work.

Table 3.3: Liao et al. (20) polymorphism patterns identified by the total collection in this study and in Liao et al.

Pattern Name Pattern This study Liao et al. (n=892) NSVES--VY----NHANG I-α 35.65% 29.50% NSVES--VY----NHATA I-δ 12.44% 2.60% PDNIA--VPGASFDNANG II-α 8.86% 20.50% PDNIA--VPGASFDHANG III-α 11.10% 7.70% PDNIA--VPGASFDHVTA III-γ 3.47% 1.30% PDNIA--VPGASFDHATA III-δ 9.30% 1.30% NSVES--FD----NHATA IV-δ 5.27% 1.30% DSVEAHNVT-ESENHANG V-α 0.78% 6.40% DSVES--FD----NHATA VII-δ 0.56% 0.00% Unassigned 12.57% 20.40% TOTAL 87.43% 70.60%

Using this revised classification scheme, 33 different patterns of polymorphisms were found in this study, and these patterns were named based on polymorphisms found in the N-terminal and linker/dimerization domains (Table 3.4, Figure 3.1A). Henceforth, letters A-X were used to indicate the N-terminal pattern, and 1-7 were used to indicate the linker/dimerization domain (Table 3.4). This new scheme classifies the OmpA polymorphisms detected in all of the isolates analyzed in this study.

113

Table 3.4: Polymorphism patterns for the different domains and their identifying names.

Polymorphism C-terminal Region Dimerization Total ExPEC Pattern Region Count Count A3 DDNIIA--VPGASFDYHL AKNVG 3 2 B3 DDNIIA--VPGASFDYNL AKNVG 3 0 C3 DDNIVS--VPGVSTDYHL AKNVG 1 0 D5 DDNIVS--VPGVSTDYHM AKTVG 1 0 E3 DSVEVAHNVT-ESENWHL AKNVG 7 3 F4 DSVEVS--FD----NYHM AKTVA 5 2 F5 DSVEVS--FD----NYHM AKTVG 1 1 G4 DSVEVS--VY----NYHM AKTVA 4 2 G7 DSVEVS--VY----NYHM VKTVA 9 7 H3 NDNIVA--VPGASFDYNL AKNVG 1 1 I4 NDNIVS--VY----NYHM AKTVA 7 1 J3 NSVAIS--VY----NYHM AKNVG 1 1 K2 NSVEIS--VY----NYHM AKNVA 2 0 K3 NSVEIS--VY----NYHM AKNVG 315 36 K6 NSVEIS--VY----NYHM ARNVG 3 0 L4 NSVEVS--FD----NYHM AKTVA 47 37 L5 NSVEVS--FD----NYHM AKTVG 33 26 M4 NSVEVS--VY----NYHM AKTVA 111 91 M5 NSVEVS--VY----NYHM AKTVG 7 4 N3 NSVKIS--VY----NYHM AKNVG 6 0 O5 NSVKVS--FD----NYHM AKTVG 1 1 P3 NSVQIS--VY----NYHM AKNVG 1 1 Q3 PDDIVA--VPGASFDYHM AKNVG 1 1 R4 PDNIIA--VPGASFDYHM AKTVA 83 39 R5 PDNIIA--VPGASFDYHM AKTVG 27 8 R7 PDNIIA--VPGASFDYHM VKTVA 31 24 S3 PDNIIA--VPGASFDYNL AKNVG 15 4 T3 PDNIVA--VPGASFDYHL AKNVG 1 1 U3 PDNIVA--VPGASFDYHM AKNVG 92 69 U1 PDNIVA--VPGASFDYHM AKNAG 6 6 V3 PDNIVA--VPGASFDYNL AKNVG 64 31 W2 PDNIVS--VPGASTDYHM AKNVA 1 0 X7 PSVEVS--VY----NYHM VKTVA 2 0 n= 892 399

C-terminal includes polymorphisms 1-18 as seen in Figure 3.1, and the linker/dimerization domain includes polymorphisms 19-23 also shown in Figure 3.1.

114

Phylogenetic Groups Can Explain Some OmpA Polymorphisms in Publicly Available E. coli Sequences

Since E. coli’s assignment to phylogenetic group and content of polymorphisms reflect chromosomal structure, we hypothesized that there was a relationship between them. Indeed, many of the polymorphism patterns could be assigned to a single phylogenetic group (Figure 3.2), suggesting a relationship between OmpA polymorphisms and chromosomal history. However, certain OmpA polymorphism patterns (I4, K3, L4, L5, R4, R5, S3, U3, V3) were detected among several phylogenetic groups (Figure 3.2), with the K3 pattern having the least phylogenetic affiliation. When the K3 pattern was examined, it was composed of phylogenetic groups A (12.5%), B1

(24.4%), C (49.8%), F (12.5%), and Unknown (0.7%), which suggests that some polymorphism patterns cannot be associated with one major phylogenetic group.

Interestingly, the E3 pattern, which does not correlate to either the ompA1 or ompA2 variant (Table 3.1), was found only in E. coli assigned to phylogenetic group E

(Figures 3.2 and 3.3). Within this unique pattern several amino acid residues differed on the third loop of OmpA, resulting in an insertion of two amino acids at polymorphism number 7 and 8 (Figure 3.1). While all other predicted OmpA sequences contained Gly-

Ala-Ser-Phe (encoded by the ompA2 variant) or a deletion (ompA1 variant), pattern E3 had a unique variant--Glu-Ser-Glu, a change that is predicted to result in a more acidic protein. Some amino acid residues that differ for this pattern are residues 11-14 (Figure

3.1).

115

Figure 3.2: OmpA amino acid polymorphism patterns (x-axis) as associated with Clermont’s revised phylogenetic groups for the E. coli strains analyzed in the present study that were accessed from GenBank (n=493).

Some polymorphisms patterns represent more than one phylogenetic group (y-axis). Any polymorphism pattern that occurred fewer than two times was excluded from the analysis.

Although the entire polymorphism pattern can be informative, the N-terminal and

C-terminal domains of the protein differed in their ability to represent the phylogenetic groups. Despite the observed differences in its N-terminal domain (described above), phylogenetic group E contains the most common linker/dimerization domain pattern

(AKNVG), which is also seen predominantly in phylogenetic groups A, B1, C and F

(Figure 3.3A). Nevertheless, E. coli assigned to phylogenetic groups A, C, E, and F also exhibited other patterns such as ARNVG, AKNVA, and AKTVA (Figure 3.3).

116

Figure 3.3: OmpA of E. coli strains from GenBank (n=493) can differ by their phylogenetic group (x-axis) when examined between the (3A) linker/dimerization domain and (3B) N-terminal domain.

Polymorphism patterns are based on polymorphisms of Figure 3.1A. Any polymorphism pattern that occurred fewer than two times was excluded from the analysis.

117

Polymorphisms among the E. coli assigned to the B2 phylogenetic group were of particular interest when examining the linker/dimerization domain (Figure 3.3A).

Members of this phylogenetic group primarily possessed the AKTVA pattern, a pattern also found in some members of phylogenetic group F. Additionally, members of the B2 phylogenetic group contained the AKTVG pattern, a pattern shared with E. coli assigned to phylogenetic groups E and all group D isolates. Phylogenetic group B2 was also the only group to be composed of the VKTVA pattern. Members of this phylogenetic group harbored more polymorphism patterns than any other group.

When the N-terminus patterns were analyzed (Figure 3.3B), K was the most frequent variant observed, occurring in phylogroups A, B1, C, F and unassignable. In addition, groups A and B1 also shared the S and V polymorphism patterns. Like the linker/dimerization domain of E. coli assigned to phylogenetic group B2, the N-terminal of group B2 was the most diverse among all protein sequences examined, and shared similarities with the group D isolates and a minority of those assigned to phylogenetic group F (Figure 3.3B).

Although the linker/dimerization and N-terminus polymorphism patterns could differentiate a significant number of phylogenetic groups, a single polymorphism could occasionally indicate assignment to the phylogenetic group. For example, the N-terminal

PDNIVA--VPGASFDYNL (pattern V, Table 3.4, Figure 3.3B) was most closely associated with phylogenetic group B1, but this pattern was differentiated from other patterns by polymorphisms at positions 17 and 18, where histidine and methionine were replaced by asparagine and leucine (Figure 3.1 and Table 3.4). Likewise, phylogenetic group B2 dimerization polymorphism patterns were more likely than the other

118 phylogenetic groups to have a threonine and alanine replace an asparagine and glycine at polymorphism locations 21 and 23, respectively (Figures 3.1, 3.3B and Table 3.4).

Interestingly, sequences assigned to phylogenetic group D may be an evolutionary

“bridge” between phylogenetic group B2 and the other phylogenetic groups as polymorphism patterns of phylogenetic group D had a threonine and glycine at the 21 and

23 polymorphism locations, respectively (Figures 3.1 and 3.3). Likewise, a majority of group F had a threonine at position 21. The location of these polymorphisms may be functionally important to the protein as polymorphisms 21 and 23 are part of the linker/dimerization domain (Figure 3.1) (12). However, in the current study, phylogenetic group classification could not fully explain the differences in amino acid residues in

OmpA as polymorphism patterns K3, R5, S3, and U3 were associated with three or more phylogenetic groups.

Some OmpA Polymorphism Patterns Indicate Sequence Type for Publicly Available E. coli Sequences

We also probed for relationships between the occurrence and type of OmpA polymorphisms in publicly available E. coli with assignment to Sequence Types (STs). In order to make these assignments, MLST was performed in silico on the publicly available sequences (n=493) (31), and several STs could be identified by their polymorphism patterns. Notably, E. coli assigned to the ST131 group were characterized by a unique

OmpA polymorphism pattern (Figure 3.4). Since the relationship between polymorphism patterns and virulence on the greater E. coli community could not be assessed fully due to a lack of known information about all of the publicly available isolates used, only the

OmpA of ExPEC isolates of known provenance could be examined.

119

Figure 3.4: Association of polymorphism patterns with Warwick sequence types (x- axis) for publicly-available E. coli (n=493) (25).

Polymorphism Patterns Can Vary with the ExPEC Subpathotype

The distribution of nine polymorphism patterns were found to be statistically different among APEC, NMEC, and UPEC (Figure 3.5). APEC were more likely to exhibit OmpA polymorphism patterns K3, L5, U1, U3, and V3; whereas, UPEC were most likely to exhibit patterns L4, M4, R4, and R7. The majority of NMEC contained

OmpA polymorphism pattern M4, but NMEC also had a greater relative prevalence of polymorphism patterns R4 and R5 than one or more of the other subpathotypes (Figure

3.5). It was also noted that some polymorphism patterns were far less likely to be associated with ExPEC subpathotypes. When the ExPEC polymorphism patterns were compared to those that were publicly available, polymorphism patterns B3, C3, D5, G4,

I4, K2, K6, N3, W2, and X7 were absent or were present fewer than two times for the

120

ExPEC group (Figures 3.2 and 3.5). The ExPEC, however, contained pattern U1, a pattern correlated with APEC, that was absent from GenBank. Although the majority of these differences were statistically significant, the composition of the phylogenetic groups within the ExPEC subpathotypes differs (26). Therefore, the polymorphism patterns of APEC, NMEC, and UPEC were analyzed against the phylogenetic groups.

Figure 3.5: Polymorphism patterns (x-axis) and prevalence (y-axis) of each pattern for APEC (n=171), NMEC (n=80), and UPEC (n=148).

Any polymorphism pattern that occurred fewer than two times was excluded from the analysis. Polymorphism patterns G7, L4, L5, M4, R4, R5, R7, U3, V3 are statistically significant between the subpathotypes (p <0.05).

Polymorphism Patterns are Associated with ExPEC of Different Subpathotypes and Phylogenetic Groups

As found amongst all the publicly available E. coli sequences examined, the phylogenetic groups of the OmpA protein sequences discovered among our ExPEC collection could on occasion be predicted by OmpA polymorphism pattern (Figure 3.6).

121

The linker/dimerization domains were examined for the phylogenetic groups, and there were distinctions among the subpathotypes (Figure 3.6A). Phylogenetic groups A and B1 were unanimously composed of the AKNVG dimerization polymorphism pattern. The dimerization pattern for phylogenetic group C included an additional unique dimerization pattern, AKNAG, and this pattern was only found in APEC. There were also differences in the linker/dimerization domains of B2 as NMEC and UPEC contained the unique polymorphism pattern VKTVA, which was absent from APEC. However, it should be noted that the majority of NMEC and UPEC belong to phylogenetic group B2 (Table

3.1). Phylogenetic group F consisted of AKTVA and AKTVG. This differed greatly from E. coli in the GenBank collection, as the majority polymorphism pattern was

AKNVG followed by AKTVA (Figures 3.3A and 3.6A). The ExPEC assigned to group

B2 were more conserved than the B2 group in the GenBank collection as AKTVG was absent (Figures 3A and 6A). It is well-known that a majority of APEC belong to phylogenetic group C (26), so it was not surprising to find that APEC had a second polymorphism pattern in this group compared to NMEC, as shown by the two linker/dimerization domain patterns AKNAG and AKNVG (Figure 3.6A). Nevertheless,

UPEC did have a different pattern within phylogenetic group C as shown by its two N- terminal domain patterns (K and U).

122

Figure 3.6: ExPEC subpathotypes polymorphisms differ across the phylogenetic groups (facetted plots) by their (6A) linker/dimerization and (6B) N-terminal domains

Polymorphism patterns are based on polymorphisms of Figures 1A. Any polymorphism pattern that occurred fewer than two times was excluded from the analysis.

123

When the N-terminal domain pattern was examined, differences between the

ExPEC subpathotypes were evident for some of the phylogenetic groups (Figure 3.6B).

Important subpathotype differences in OmpA polymorphisms were found in ExPEC assigned to the B2 and F phylogenetic groups. The UPEC phylogenetic group B2 had a greater diversity of polymorphism patterns, and APEC had a different N-terminal domain pattern in phylogenetic group F compared to NMEC and UPEC. The unique N-terminal domain pattern of phylogenetic group F in APEC was also found in the GenBank collection where this pattern was found in the B2, D, and F phylogenetic groups. Data presented in Figures 3.3 and 3.6 illustrate that the ExPEC pathotype has a relatively conserved OmpA compared to the publicly available E. coli as ExPEC sensu stricto appears to have fewer polymorphism patterns in its C-terminal region. Likewise, the

ExPEC phylogenetic groups consistently contained fewer patterns than their GenBank counterparts.

OmpA Polymorphism Patterns are a Relatively Poor Indicator of Serogroup

Subpathotypes of ExPEC have historically been analyzed by their serogroups, which are based on serologic identification of O-antigens (1, 29, 32). Currently, 187 O- antigen groups are recognized. Here, we only examined ExPEC of serogroups O1, O2,

O8, O18, O25, O75, O78, and O83, that is, those commonly found in the different

ExPEC subpathotypes (27, 33-35). The O8, O75, and O83 serogroups were composed of one dimerization pattern (Figure 3.7A). The N-terminal of O75 strains consisted of a single pattern, pattern L (Figure 3.7B). Additionally, the relationship between the serogroups and phylogenetic groups was examined in this study. ExPEC of the O75 and

O83 serogroups (n=12, 9, respectively) were found only in members of the phylogenetic

124 group B2, while all other ExPEC examined were assigned to multiple phylogenetic groups (Supplementary Figure 3.1). Although there is some overlap between the phylogenetic groups and serogroups, the serogroups do not appear to predict phylogenetic group assignment.

Discussion

Nomenclature and Distribution of OmpA Polymorphisms among E. coli Sequences

In this study, ExPEC were more likely to contain the ompA2 allelic variant compared to the greater E. coli community that was examined. Previous work by Power et al. demonstrated that the ompA2 variant differs between humans and non-domesticated vertebrate animals in Australia (13). Experimentally, E. coli with the ompA2 variant have been shown to be more resistant to bacteriophages (13). If this observation is generalizable to other E. coli, then a majority of the ExPEC may be more resistant to bacteriophages than the greater E. coli community as a whole.

When the phylogenetic groups were examined relative to the allelic variant-typing scheme in this study, phylogenetic group E was found to contain an alternate variant to ompA1 and ompA2. Previous work has shown that human and avian E. coli pathogens and commensals belonging to phylogenetic group E are prolific biofilm formers (28).

Additionally, experimental evidence has shown that OmpA is important for biofilm formation (36-38). Together, these observations suggest that an alternate variant may contribute to the success of early biofilm development in E. coli assigned to phylogenetic group E.

125

Figure 3.7: Polymorphism differences occur between the ExPEC subpathotypes (x- axis) when separated by their serogroups (facetted plots).

The polymorphisms of the (7A) linker/dimerization and (7B) N-terminal domains were examined. Any polymorphism pattern that occurred fewer than two times was excluded from the analysis.

126

Re-classifying OmpA

Structurally, OmpA consists of the N-terminal, linker, and dimerization domains

(12). Previous work has indicated that OmpA is partially present as a dimer, and the dimer interface for the protein is located on the soluble C-terminus and has been named the dimerization domain (12). The C-terminal domain also contains a peptidoglycan- associated motif (39). Next to the dimerization domain is the linker domain (12). The region, not part of the linker/dimerization domain, referenced throughout this study as the

N-terminal domain, has also been shown to have biological relevance. The N-terminal domain consists of both loops and transmembrane regions (10, 13-15). These loops influence the action of OmpA as a receptor for bacteriophages and adhesion and invasion of the brain microvascular endothelial cells of the blood brain barrier (13, 16, 40). In contrast to findings reported by Gophna et al., we found the C-terminal domain was less highly conserved than previously reported, as distinct polymorphisms occurred in this region (15). As such, we thought it important to classify OmpA based on these regions, rather than on the variants of the loops (15) while also accounting for the diversity within the protein (20).

When the identification scheme created by Liao was applied to the isolates of this study, certain polymorphism patterns were unrepresented (20). As a consequence, a significant number of our isolates could not be classified, and a new classification scheme was developed. The typing scheme used here has broad applicability to type different E. coli OmpA protein sequences as it was developed from nearly 900 different E. coli from a variety of sources.

127

Phylogenetic Groups and OmpA Polymorphisms in Publicly Available E. coli Sequences

There is evidence that the E. coli of different phylogenetic groups harbor significant genomic differences. When Logue et al. (2017) examined the phylogenetic groups of avian pathogenic E. coli and avian fecal E. coli, distinct patterns were observed among the presence or absence of virulence genes and pathogenicity islands for some of the phylogenetic groups (26). However, the comparison also demonstrated patterns within the phylogenetic groups that could not be explained. When we examined the

OmpA polymorphism patterns by phylogenetic group, B2 members were found to harbor more patterns than members of any other group. This finding may have particular significance since E. coli that cause extraintestinal disease in humans overwhelmingly are assigned to phylogenetic group B2 (26). More polymorphism patterns within the B2 phylogenetic groups further highlights the complicated nature of the B2 phylogenetic group and its role in pathogenesis and hints to the vast ‘mix and match’ array of mechanisms by which ExPEC cause disease. In addition to the numerous polymorphism patterns for B2, there were differences in the polymorphisms among other phylogenetic groups.

Like phylogenetic group B2, human pathogens are frequently assigned to phylogenetic groups D and F (26, 41). Thus, it was interesting to note a potential evolutionary “bridge” demonstrated within the linker/dimerization domain by the polymorphisms at positions 21 and 23 (Figure 3.1). Certain phylogenetic group D and F polymorphisms appeared to be intermediate steps between the polymorphisms of phylogenetic group B2 and phylogenetic groups A, B1, and C (Figures 3 and 6).

128

Sequence Types and OmpA Polymorphisms in Publicly Available E. coli Sequences

Clinically, the ST131 group is considered an epidemic clone responsible for urinary tract infections and septicemia (42, 43). ExPEC of the ST131 group are associated with virulence, multidrug resistance, community-associated urinary tract infections, and sepsis (44). Indeed, this group of organisms is exceedingly important from a human health standpoint, as it is considered to be a pandemic clone (44). Considering the many utilities of OmpA, it is interesting that the E. coli of ST131 appears to contain a different polymorphism pattern, suggesting OmpA may have distinct pathogenic function in ST131 (Figure 3.4). Like ST131, other STs were composed of only one polymorphism pattern. ST95 consisted of a unique polymorphism pattern that was only shared with a group of untypeable MLST E. coli (Figure 3.4). Clinically ST95 is associated with bacteremia or neonatal sepsis (45). As such, OmpA may also have a distinct function in

ST95 pathogens. Nevertheless, a potential pitfall of the differentiation of OmpA polymorphisms by STs is the sheer number of sequence types currently available—4,499

STs (46). Prohibiting an ExPEC-specific MLST analysis is the current lack of comprehensive surveys of the sequence types among ExPEC isolates. Alternatively,

OmpA evolution may occur at a similar rate to differences in the seven genes (adk, icd, fumC, recA, mdh, gyrB, and purA) that differentiate the STs (25).

ExPEC OmpA Polymorphism Patterns

The OmpA loops of NMEC have been experimentally shown to contribute to neonatal bacterial meningitis (16, 40). In 2010, Mittal et al. found that loops 1 and 3 are needed for NMEC survival in macrophages, loops 1 and 2 are necessary for meningitis, and alterations of loop 4 resulted in enhanced severity in NMEC’s pathogenesis (40).

129

Nevertheless, this study found no defining loop pattern for NMEC, which may suggest that an NMEC OmpA-targeting vaccine may not be widely efficacious (18). Like NMEC, the APEC and UPEC subpathotypes did not have defining patterns. There were, however, statistically significant differences between some of the polymorphism patterns and their

ExPEC subpathotypes, which may agree with the assessment that certain subpathotype subsets can be eliminated as zoonotic pathogens (Figure 3.5) (29). Interestingly, the lack of any subpathotype-only differences also provides further evidence of a zoonotic potential of these organisms (32, 47-49).

Previous work has demonstrated the ability of certain APEC to cause disease in a rat model of human neonatal meningitis and certain NMEC to cause disease in animal models of avian colibacillosis (47). Although this work has been an important ‘proof of concept’ of these subpathotypes’ zoonotic potential, the isolates tested were members of phylogenetic group B2, leaving out a number of other important APEC and NMEC phylogenetic groups. Indeed, NMEC OmpA studies have had a singular focus on E. coli assigned to group B2 (6, 17, 19). Since there are differences in OmpA within every subpathotype and among the different phylogenetic groups (Figures 3.5 and 3.6), the question remains if different OmpA polymorphisms contribute to functional differences in these organisms to be able to cause disease in a specific host species or tissue type. For example, do the ExPEC subpathotypes have different affinities for brain microvascular endothelial cells dependent on their different phylogenetic groups? Such questions will warrant future investigation.

130

Polymorphism Patterns Associated with ExPEC Subpathotype and Phylogenetic Assignment

Although the different ExPEC subpathotypes did have significantly different

OmpA polymorphism patterns, these patterns were often most strongly associated with the phylogenetic groups. Interestingly, there were differences found between avian and human ExPEC for some phylogenetic group isolates. APEC belonging to phylogenetic group F had an N-terminus pattern unlike NMEC and UPEC (Figure 3.6). For isolates belonging to phylogenetic group C, UPEC had a unique N-terminus pattern, and APEC had a unique linker/dimerization domain. Although we cannot account for these unique differences, they may have the potential to confer some environmental or pathogenic advantage to the strain possessing them.

Polymorphism Patterns of ExPEC when Examined by Serogroup

Although OmpA is lodged in the outer membrane of E. coli, Gophna et al. also found that E. coli-associated sepsis strains of the serogroup O78 secreted OmpA during all growth phases (15). OmpA was secreted both as soluble proteins and in vesicles (50).

There appears to be a benefit to the bacterium from doing so. That is, secreted OmpA provides protection to bacteria from degradation due to host neutrophil elastase through competitive inhibition (51). Due to the relative lack of UPEC and NMEC isolates of the

O78 serogroup within our extensive collection, we could only draw conclusions based on

APEC O78 isolates (Figure 3.7, n=43). Of these isolates, the majority (83.7%) had one linker/dimerization domain pattern and 90.7% of these isolates did share the same N- terminal domain pattern (Figure 3.7B, pattern U). As there is not one conserved pattern

131 for all O78 strains, not all O78 strains may share these OmpA properties. Alternatively, other factors may influence the biological differences observed within the O78 serogroup.

When the OmpA polymorphisms and serology of ExPEC were examined beyond serogroup O78, conserved patterns were observed for serogroups O75 and O83 (Figure

3.7A), but all isolates with these serogroups typed to phylogenetic group B2. Therefore, we sought to identify the relationship between phylogenetic groups and serogroups, and we found that serogroups O75 and O83 were the only serogroups to fall within one phylogenetic group as the other serogroups (O1, O2, O18, O25, O78) were represented by multiple phylogenetic groups. Based on this information, it may be best to examine E. coli isolates through the lenses of several typing or subtyping schemes before interpreting data.

Conclusion

This study identified 23 unique polymorphisms and 33 unique polymorphism patterns in OmpA sequences of E. coli from different sources, hosts, environments and pathotypes. Numerous OmpA polymorphism patterns are associated with certain phylogenetic groups of E. coli, while other patterns may indicate specific differences among the ExPEC subpathotypes. Phylogenetic groups B2, D, and F have polymorphisms in their linker/dimerization domain, which may be of functional importance to the pathogenesis and virulence of E. coli. In addition to the phylogenetic groups, ST assignment appears to be associated with OmpA polymorphism patterns.

Members of ST95 and ST131, which are strongly linked to disease, have OmpA polymorphisms not present in other sequence types. Further work is needed to demonstrate the biological significance of the OmpA polymorphisms identified here, but

132 this study is an important first step to demonstrating potential relationships between amino acid differences and their respective function in OmpA.

Acknowledgements

Special thanks to Dr. Julian Traschel (USDA National Animal Disease Center) for his assistance in creating figures.

References

1. Nolan LK, Barnes HJ, Vaillancourt JP, Abdul-Aziz T, Logue CM. 2013. Colibacillosis, p 751-805, Diseases of Poultry, 13th ed doi:10.1002/9781119421481. Ch18. John Wiley & Sons, Ltd.

2. Johnson JR, Stell AL. 2000. Extended virulence genotypes of Escherichia coli strains from patients with urosepsis in relation to phylogeny and host compromise. J Infect Dis 181:261-272.

3. Flores-Mireles AL, Walker JN, Caparon M, Hultgren SJ. 2015. Urinary tract infections: epidemiology, mechanisms of infection and treatment options. Nat Rev Micro 13:269-284.

4. Gaschignard J, Levy C, Romain O, Cohen R, Bingen E, Aujard Y, Boileau P. 2011. Neonatal bacterial meningitis: 444 cases in 7 years. Pediatr Infect Dis J 30:212-217.

5. Stoll BJ, Hansen NI, Sanchez PJ, Faix RG, Poindexter BB, Van Meurs KP, Bizzarro MJ, Goldberg RN, Frantz ID, 3rd, Hale EC, Shankaran S, Kennedy K, Carlo WA, Watterberg KL, Bell EF, Walsh MC, Schibler K, Laptook AR, Shane AL, Schrag SJ, Das A, Higgins, RD. 2011. Early onset neonatal sepsis: the burden of group B streptococcal and E. coli disease continues. Pediatrics 127:817-826.

6. Wang Y, Kim KS. 2002. Role of OmpA and IbeB in Escherichia coli K1 invasion of brain microvascular endothelial cells in vitro and in vivo. Ped Res 51:559-563.

7. Prasadarao NV, Wass CA, Weiser JN, Stins MF, Huang SH, Kim KS. 1996. Outer membrane protein A of Escherichia coli contributes to invasion of brain microvascular endothelial cells. Infect Immun 64:146-153.

8. Weiser JN, Gotschlich EC. 1991. Outer membrane protein A (OmpA) contributes to serum resistance and pathogenicity of Escherichia coli K-1. Infect Immun 59:2252-2258.

133

9. Vogel H, Jahnig F. 1986. Models for the structure of outer-membrane proteins of Escherichia coli derived from raman spectroscopy and prediction methods. J Mol Biol 190:191-199.

10. Pautsch A, Schulz GE. 1998. Structure of the outer membrane protein A transmembrane domain. Nat Struct Biol 5:1013-1017.

11. De Mot R, Vanderleyden J. 1994. The C-terminal sequence conservation between OmpA-related outer membrane proteins and MotB suggests a common function in both gram-positive and gram-negative bacteria, possibly in the interaction of these domains with peptidoglycan. Mol Microbiol 12:333-334.

12. Marcoux J, Politis A, Rinehart D, Marshall DP, Wallace MI, Tamm LK, Robinson CV. 2014. Mass spectrometry defines the C-terminal dimerization domain and enables modeling of the structure of full-length OmpA. Structure 22:781-790.

13. Power ML, Ferrari BC, Littlefield-Wyer J, Gordon DM, Slade MB, Veal DA. 2006. A naturally occurring novel allele of Escherichia coli outer membrane protein A reduces sensitivity to bacteriophage. Appl Environ Microb 72:7930- 7932.

14. Smith SG, Mahon V, Lambert MA, Fagan RP. 2007. A molecular Swiss army knife: OmpA structure, function and expression. FEMS Microbiol Lett 273:1-11.

15. Gophna U, Ideses D, Rosen R, Grundland A, Ron EZ. 2004. OmpA of a septicemic Escherichia coli O78--secretion and convergent evolution. Int J Med Microbiol 294:373-381.

16. Maruvada R, Kim KS. 2011. Extracellular loops of the Eschericia coli outer membrane protein A contribute to the pathogenesis of meningitis. J Infect Dis 203:131-140.

17. Prasadarao NV. 2002. Identification of Escherichia coli outer membrane protein A receptor on human brain microvascular endothelial cells. Infect Immun 70:4556-4563.

18. Gu H, Liao Y, Zhang J, Wang Y, Liu Z, Cheng P, Wang X, Zou Q, Gu J. 2018. Rational design and evaluation of an artificial Escherichia coli K1 protein vaccine candidate based on the structure of OmpA. Front Cell Infect Microbiol 8:172.

19. Sukumaran SK, Shimada H, Prasadarao NV. 2003. Entry and intracellular replication of Escherichia coli K1 in macrophages require expression of outer membrane protein A. Infect Immun 71:5951-5961.

20. Liao C, Liang X, Yang F, Soupir ML, Howe AC, Thompson ML, Jarboe LR. 2017. Allelic variation in outer membrane protein A and Its influence on attachment of Escherichia coli to corn stover. Front Microbiol 8:708.

134

21. Confer AW, Ayalew S. 2013. The OmpA family of proteins: roles in bacterial pathogenesis and immunity. Vet Microbiol 163:207-222.

22. Clermont O, Christenson JK, Denamur E, Gordon DM. 2013. The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups. Env Microbiol Rep 5:58-65.

23. Nielsen KL, Stegger M, Kiil K, Godfrey PA, Feldgarden M, Lilje B, Andersen PS, Frimodt-Moller N. 2017. Whole-genome comparison of urinary pathogenic Escherichia coli and faecal isolates of UTI patients and healthy controls. Int J Med Microbiol 307:497-507.

24. Clermont O, Bonacorsi S, Bingen E. 2000. Rapid and simple determination of the Escherichia coli phylogenetic group. Appl Environ Microb 66:4555-4558.

25. Wirth T, Falush D, Lan R, Colles F, Mensa P, Wieler LH. 2006. Sex and virulence in Escherichia coli: an evolutionary perspective. Mol Microbiol 60:1136-1151.

26. Logue CM, Wannemuehler Y, Nicholson BA, Doetkott C, Barbieri NL, Nolan LK. 2017. Comparative analysis of phylogenetic assignment of human and avian ExPEC and fecal commensal Escherichia coli using the (previous and revised) Clermont phylogenetic typing methods and its impact on avian pathogenic Escherichia coli (APEC) classification. Front Microbiol 8:283.

27. Logue CM, Doetkott C, Mangiamele P, Wannemuehler YM, Johnson TJ, Tivendale KA, Li G, Sherwood JS, Nolan LK. 2012. Genotypic and phenotypic traits that distinguish neonatal meningitis-associated Escherichia coli from fecal E. coli isolates of healthy human hosts. Appl Environ Microb 78:5824-5830.

28. Nielsen DW, Klimavicz JS, Cavender T, Wannemuehler Y, Barbieri NL, Nolan LK, Logue CM. 2018. The impact of media, phylogenetic classification, and E. coli pathotypes on biofilm formation in extraintestinal and commensal E. coli from humans and animals. Front Microbiol 9.

29. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Fakhr MK, Nolan LK. 2005. Comparison of Escherichia coli isolates implicated in human urinary tract infection and avian colibacillosis. Microbiol 151:2097-2110.

30. Hadley W. 2016. Ggplot2. Springer Science+Business Media, LLC, New York, NY.

31. Wickham H, Grolemund G. 2016. R for data science : import, tidy, transform, visualize, and model data, First edition. ed. O'Reilly Media, Inc., Sebastopol, CA.

32. Maluta RP, Logue CM, Casas MR, Meng T, Guastalli EA, Rojas TC, Montelli AC, Sadatsune T, de Carvalho Ramos M, Nolan LK, da Silveira WD. 2014. Overlapped sequence types (STs) and serogroups of avian pathogenic (APEC)

135

and human extra-intestinal pathogenic (ExPEC) Escherichia coli isolated in Brazil. PLoS ONE 9:e105016.

33. DebRoy C, Fratamico PM, Roberts E. 2018. Molecular serogrouping of Escherichia coli. Anim Health Res Rev doi:10.1017/S1466252317000093:1-16.

34. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Fakhr MK, Nolan LK. 2005. Comparison of Escherichia coli isolates implicated in human urinary tract infection and avian colibacillosis. Microbiol 151.

35. Wijetunge DSS, Gongati S, DebRoy C, Kim KS, Couraud PO, Romero IA, Weksler B, Kariyawasam S. 2015. Characterizing the pathotype of neonatal meningitis causing Escherichia coli (NMEC). BMC Microbiol 15:211.

36. Barrios AF, Zuo R, Ren D, Wood TK. 2006. Hha, YbaJ, and OmpA regulate Escherichia coli K12 biofilm formation and conjugation plasmids abolish motility. Biotechnol Bioeng 93:188-200.

37. Orme R, Douglas CW, Rimmer S, Webb M. 2006. Proteomic analysis of Escherichia coli biofilms reveals the overexpression of the outer membrane protein OmpA. Proteomics 6:4269-4277.

38. Ma Q, Wood TK. 2009. OmpA influences Escherichia coli biofilm formation by repressing cellulose production through the CpxRA two-component system. Environ Microbiol 11:2735-2746.

39. Singh SP, Williams YU, Miller S, Nikaido H. 2003. The C-terminal domain of Salmonella enterica serovar Typhimurium OmpA is an immunodominant antigen in mice but appears to be only partially exposed on the bacterial cell surface. Infect Immun 71:3937-3946.

40. Mittal R, Krishnan S, Gonzalez-Gomez I, Prasadarao NV. 2011. Deciphering the roles of outer membrane protein A extracellular loops in the pathogenesis of Escherichia coli K1 meningitis. J Biol Chem 286:2183-2193.

41. Franz E, Veenman C, van Hoek AH, de Roda Husman A, Blaak H. 2015. Pathogenic Escherichia coli producing extended-spectrum beta-lactamases isolated from surface water and wastewater. Sci Rep 5:14372.

42. Johnson JR, Johnston B, Clabots C, Kuskowski MA, Castanheira M. 2010. Escherichia coli sequence type ST131 as the major cause of serious multidrug- resistant E. coli infections in the United States. Clin Infect Dis 51:286-294.

43. Johnson JR, Johnston B, Clabots C, Kuskowski MA, Pendyala S, Debroy C, Nowicki B, Rice J. 2010. Escherichia coli sequence type ST131 as an emerging fluoroquinolone-resistant uropathogen among renal transplant recipients. Antimicrob Agents Chemother 54:546-550.

136

44. Rogers BA, Sidjabat HE, Paterson DL. 2010. Escherichia coli O25b-ST131: a pandemic, multiresistant, community-associated strain. J Antimicrob Chemother 66:1-14.

45. Shakir SM, Goldbeck JM, Robison D, Eckerd AM, Chavez-Bueno S. 2014. Genotypic and phenotypic characterization of invasive neonatal Escherichia coli clinical isolates. Am J Perinatol 31:975-982.

46. Clermont O, Gordon D, Denamur E. 2015. Guide to the various phylogenetic classification schemes for Escherichia coli and the correspondence among schemes. Microbiol 161:980-988.

47. Tivendale KA, Logue CM, Kariyawasam S, Jordan D, Hussein A, Li G, Wannemuehler Y, Nolan LK. 2010. Avian-pathogenic Escherichia coli strains are similar to neonatal meningitis E. coli strains and are able to cause meningitis in the rat model of human disease. Infect Immun 78:3412-3419.

48. Johnson TJ, Wannemuehler Y, Johnson SJ, Stell AL, Doetkott C, Johnson JR, Kim KS, Spanjaard L, Nolan LK. 2008. Comparison of Extraintestinal Pathogenic Escherichia coli Strains from Human and Avian Sources Reveals a Mixed Subset Representing Potential Zoonotic Pathogens. Appl Environ Microb 74:7043-7050.

49. Johnson TJ, Logue CM, Johnson JR, Kuskowski MA, Sherwood JS, Barnes HJ. 2012. Associations between multidrug resistance, plasmid content, and virulence potential among extraintestinal pathogenic and commensal Escherichia coli from humans and poultry. Foodborne Pathog Dis 9:37-46.

50. Wai SN, Lindmark B, Soderblom T, Takade A, Westermark M, Oscarsson J, Jass J, Richter-Dahlfors A, Mizunoe Y, Uhlin BE. 2003. Vesicle-mediated export and assembly of pore-forming oligomers of the enterobacterial ClyA cytotoxin. Cell 115:25-35.

51. Belaaouaj A, Kim KS, Shapiro SD. 2000. Degradation of outer membrane protein A in Escherichia coli killing by neutrophil elastase. Science 289:1185-1188.

137

Supplementary Figures

Supplementary Figure 3.1: Number of APEC, NMEC, and UPEC isolates in each phylogenetic group (x-axis) identified by serogroup (y-axis).

Supplementary Tables

Supplementary Table 3.1: Information on the GenBank sequences used in this study.

Due to its large size, this table is available as a supplemental elsewhere for ProQuest.

Supplementary Table 3.2: PCR Primers and Reagents

Primer Set Polymerase Used Source GTTATCTCGTTGGAGATATTCATGG USBtaq (Affymetrix) This GCGGGGTTTTTCTACCAGAC Study CACTGGCTGGTTCGCTAC DreamTaq (20) GCGGCTGAGTTACAACGTCT (Thermo Scientific)

138

CHAPTER 4. COMPLETE GENOME SEQUENCE OF AVIAN PATHOGENIC ESCHERICHIA COLI STRAIN APEC O2-211

Modified from an article published in Microbiology Resource Announcements

Daniel W. Nielsen1, Paul Mangiamele2, Nicole Ricker3, Nicolle L. Barbieri4, Heather K.

Allen3, Lisa K. Nolan5, Catherine M. Logue4

1 Department of Veterinary Microbiology and Preventive Medicine, College of

Veterinary Medicine, Iowa State University, Ames, Iowa

2 Roche Sequencing Solutions, Belmont, California

3 Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center,

ARS-USDA, Ames, Iowa

4 Department of Population Health, College of Veterinary Medicine, University of

Georgia, Athens, Georgia

5 Department of Infectious Diseases, College of Veterinary Medicine, University of

Georgia, Athens, Georgia

Abstract

Avian pathogenic Escherichia coli (APEC) is the causative agent of colibacillosis, a disease that affects poultry production worldwide and leads to multi-million dollar losses annually. Here, we report the genome sequence of an APEC O2-211, an ST117 strain isolated from a diseased chicken.

139

Genome Announcement

APEC O2-211 is a serogroup O2, sequence type 117 strain isolated from the air sac of a chicken clinically diagnosed with colibacillosis (1). APEC O2-211 is also present in the literature as APEC 211 or isolate number 820905 (1-3).

APEC O2-211 was stored at -80 ˚C in glycerol prior to sequencing, and the isolate was cultured on MacConkey agar. Next, a single colony was picked and grown in Luria

Bertani broth at 37˚C. DNA was extracted using the Qiagen DNeasy Blood and Tissue genomic tip kit (Qiagen Hilden, Germany) for Pacific Bioscience (PacBio) sequencing and the ChargeSwitch gDNA mini bacteria kit (Life Technologies Carlsbad, CA) for

Illumina sequencing. DNA yields were quantified using a Qubit fluorimeter dsDNA HS kit (Life Technologies Carlsbad, CA). The genomic library for MiSeq sequencing was prepared with the Nextera Flex kit (Illumina San Diego, CA), and the SMRTbell™ kit

(Pacific Biosciences Menlo Park, CA) was used to prepare the genomic library for

PacBio sequencing with BluePippin (Sage Science Beverly, MA) size selection to target a library size of 10 kb.

Genomic sequencing was performed on the PacBio RSII (Pacific Biosciences,

Menlo Park, CA)) and Illumina MiSeq (Illumina, San Diego, CA) instruments. Two

SMRT cells were used for PacBio sequencing. The Illumina data were trimmed using

Trimmomatic v. 0.36 (4) and assembled de novo using the SPAdes assembler v. 3.10.1

(5). The Pacific Bioscience (PacBio) data were assembled with Canu v. 1.5 (6). Next, the

PacBio assembly was circularized in Geneious v. 10.2.3 (7) and further corrected and polished once with paired Illumina reads utilizing Pilon v. 1.22 (Broad Institute) (8).

140

Potential contigs with plasmids were identified from both sets of sequencing data with the PlasmidFinder database (9) in ABRicate (https://github.com/tseemann/abricate).

Protein-encoding genes and tRNA-carrying genes were assessed via Prokka v. 1.13 (10).

A complete workflow and assembly reports can be found at http://github.com/nielsend/genomeassembly.

The genome of APEC O2-211 consists of a single chromosome, one large plasmid, and two small plasmids. The chromosome consists of 5,114,241 bp and 50.6%

GC content. It encodes 89 tRNAs and contains 4,783 coding sequences. The large plasmid, pAPECO2-211A-ColV is a hybrid IncFIB/IncFIC plasmid. It consists of

197,773 base pairs with 215 coding sequences and a 49.1% GC content. The two small plasmids pAPECO2-211B and pAPECO2-211C have 4,231 and 2,096 base pairs, respectively. Genomic comparisons of APEC O2-211 with other extraintestinal pathogenic E. coli (ExPEC) are ongoing.

Accession Number(s)

The chromosome and plasmids have been deposited in GenBank under the accession numbers CP006834, CP030791, CP030792, and CP030793. The PacBio and

Illumina reads are available in the NCBI Sequence Read Archive (SRP158042).

Acknowledgements

We thank Dr. David Alt and Dr. Darrell Bayles (National Animal Disease Center,

Ames, IA) for their assistance in sequencing and bioinformatics expertise, respectively.

This work was supported by the Dean’s Office at Iowa State University’s College of

141

Veterinary Medicine and USDA ARS CRIS funds. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture. USDA is an equal opportunity provider and employer.

References

1. Kapur V, White DG, Wilson RA, Whittam TS. 1992. Outer membrane protein patterns mark clones of Escherichia coli O2 and O78 strains that cause avian septicemia. Infect Immun 60:1687-1691.

2. Nielsen DW, Klimavicz JS, Cavender T, Wannemuehler Y, Barbieri NL, Nolan LK, Logue CM. 2018. The impact of media, phylogenetic classification, and E. coli pathotypes on biofilm formation in extraintestinal and commensal E. coli from humans and animals. Front Microbiol 9.

3. Logue CM, Wannemuehler Y, Nicholson BA, Doetkott C, Barbieri NL, Nolan LK. 2017. Comparative analysis of phylogenetic assignment of human and avian ExPEC and fecal commensal Escherichia coli using the (previous and revised) Clermont phylogenetic typing methods and its impact on avian pathogenic Escherichia coli (APEC) classification. Front Microbiol 8:283.

4. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.

5. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455- 477.

6. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722-736.

7. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond A. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647-1649.

8. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for

142

comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963.

9. Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Moller Aarestrup F, Hasman H. 2014. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother 58:3895-3903.

10. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068-2069.

143

CHAPTER 5. COMPLETE GENOME SEQUENCE OF THE MULTI-DRUG RESISTANT NEONATAL MENINGITIS ESCHERICHIA COLI SEROTYPE O75:H5:K1 STRAIN MCJCHV-1 (NMEC-O75)

Modified from an article published in Microbiology Resource Announcements

Daniel W. Nielsen1, Nicole Ricker2, Nicolle L. Barbieri3, James L. Wynn4, Oscar G.

Gómez-Duarte5, Junaid Iqbal5, Lisa K. Nolan6, Heather K. Allen2, Catherine M. Logue3*

1Department of Veterinary Microbiology and Preventive Medicine, College of Veterinary

Medicine, Iowa State University, Ames, Iowa

2 Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center,

ARS-USDA, Ames, Iowa

3Department of Population Health, College of Veterinary Medicine, University of

Georgia, Athens, Georgia

4 Departments of Pediatrics and Pathology, Immunology, and Laboratory Medicine,

University of Florida, Gainesville, Florida

5Division of Pediatric Infectious Diseases, Department of Pediatrics, University at

Buffalo, The State University of New York, Buffalo, New York

6Department of Infectious Disease, College of Veterinary Medicine, University of

Georgia, Athens, Georgia

144

Abstract

Neonatal meningitis Escherichia coli (NMEC) is the second-leading cause of neonatal bacterial meningitis (NBM) worldwide. Here, we report the genome sequence of a multi-drug resistant, NMEC serotype O75:H5:K1 strain mcjchv-1, which resulted in the death of an infant. The O75 serogroup is rare among NMEC isolates; therefore, this strain is considered an emergent pathogen.

Genome Announcement

Although NMEC is the second-leading cause of neonatal bacterial meningitis

(NBM) after Group B streptococci, NMEC is responsible for the greatest mortality of

NBM cases (1-3). The NMEC serotype O75:H5:K1 strain mcjchv-1 (NMEC-O75) described here was isolated from a 30-day old newborn with meningitis. The patient subsequently died of infection complications after 16 days of hospitalization despite appropriate antibiotic management (4). Although NMEC strains RS218 (5), CE10 (6),

IHE3034 (7), NMEC O18 (8), and S88 (9) have been sequenced and are publicly available in the NCBI GenBank database, these isolates all belong to serogroups O7 and

O18, and a depth of clinical information about the strains is not readily available. Here, we present the genome sequence of NMEC-O75, a clinical isolate with a previously published clinical history (4).

NMEC-O75 was grown on MacConkey agar and subsequently in Luria Bertani broth at 37˚C. Genomic DNA was extracted using the Qiagen DNeasy Blood and Tissue genomic tip kit (Qiagen Hilden, Germany) for Pacific Bioscience (PacBio) sequencing and the ChargeSwitch gDNA mini bacteria kit (Life Technologies Carlsbad, CA) for

Illumina sequencing. DNA yields were quantified using a Qubit fluorimeter dsDNA HS

145 kit (Life Technologies Carlsbad, CA). The genomic library for MiSeq sequencing was prepared using the Nextera Flex kit (Illumina San Diego, CA).

Genomic sequencing was performed on the PacBio RSII (Pacific Biosciences,

Menlo Park, CA) and Illumina MiSeq (Illumina, San Diego, CA) instruments. Two

SMRT cells were used for PacBio sequencing. The Illumina data were trimmed using

Trimmomatic v. 0.36 (10) and assembled de novo using the SPAdes assembler v. 3.10.1

(11). The Pacific Bioscience (PacBio) data were assembled with Canu v. 1.5 (12). Next, the PacBio assembly was circularized in Geneious v. 10.2 (13) and further corrected and polished with Pilon v. 1.22 (Broad Institute) (14). Potential contigs with plasmids were identified with the PlasmidFinder database (15) in ABRicate

(https://github.com/tseemann/abricate). Protein-encoding genes and tRNA-carrying genes were assessed via Prokka v. 1.13 (16). A complete workflow can be found at http://github.com/nielsend/genomeassembly.

The NMEC-O75 genome consists of a single chromosome, one large plasmid, and four small plasmids. The chromosome consists of 4,939,457 bp with 50.6% GC content.

It encodes 91 tRNAs and contains 4,593 coding sequences. The large plasmid, pNMEC-

O75A, is a hybrid IncFIA/IncFIB plasmid. It consists of 88,420 base pairs and 50.6% GC content and contains 97 coding sequences. The four small plasmids pNMEC-O75B, pNMEC-O75C, pNMEC-O75D, and pNMEC-O75E range from 1,983 to 6,465 base pairs and have 42.6-56.1% GC content. Genomic comparisons of NMEC-O75 with other

NMEC genomes are ongoing.

146

Accession Numbers

The chromosome and plasmids have been deposited in GenBank under the accession numbers CP030111, CP030112, CP030113, CP030114, CP030115, and

CP030116.

Acknowledgements

We thank Dr. David Alt and Dr. Darrell Bayles (National Animal Disease Center, Ames,

IA) for their assistance in sequencing and bioinformatics expertise, respectively. This work was supported by the Dean’s Office at Iowa State University’s College of

Veterinary Medicine and NIH K08GM106143. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.

USDA is an equal opportunity provider and employer.

References

1. Gaschignard J, Levy C, Romain O, Cohen R, Bingen E, Aujard Y, Boileau P. 2011. Neonatal bacterial meningitis: 444 cases in 7 years. Pediatr Infect Dis J 30:212-217.

2. Croxen MA, Finlay BB. 2010. Molecular mechanisms of Escherichia coli pathogenicity. Nat Rev Micro 8:26-38.

3. Logue CM, Doetkott C, Mangiamele P, Wannemuehler YM, Johnson TJ, Tivendale KA. 2012. Genotypic and phenotypic traits that distinguish neonatal meningitis-associated Escherichia coli from fecal E. coli isolates of healthy human hosts. Appl Environ Microb 78:5824-5830.

4. Iqbal J, Dufendach KR, Wellons JC, Kuba MG, Nickols HH, Gómez-Duarte OG, Wynn JL. 2016. Lethal neonatal meningoencephalitis caused by multi-drug resistant, highly virulent Escherichia coli. Infect Dis 48:461-466.

147

5. Wijetunge DS, Karunathilake KH, Chaudhari A, Katani R, Dudley EG, Kapur V. 2014. Complete nucleotide sequence of pRS218, a large virulence plasmid, that augments pathogenic potential of meningitis-associated Escherichia coli strain RS218. BMC Microbiol 14:203.

6. Lu S, Zhang X, Zhu Y, Kim KS, Yang J, Jin Q. 2011. Complete genome sequence of the neonatal-meningitis-associated Escherichia coli strain CE10. J Bacteriol 193.

7. Moriel DG, Bertoldi I, Spagnuolo A, Marchi S, Rosini R, Nesta B, Pastorello I, Corea VAM, Torricelli G, Cartocci E, Savino S, Scarselli M, Dobrindt U, Hacker J, Tettelin H, Tallon LJ, Sullivan S, Wieler LH, Ewers C, Pickard D, Dougan G, Fontana MR, Rappuoli R, Pizza M, Serino L. 2010. Identification of protective and broadly conserved vaccine antigens from the genome of extraintestinal pathogenic Escherichia coli. P Nat Acad Sci USA 107:9072-9077.

8. Nicholson BA, Wannemuehler YM, Logue CM, Li G, Nolan LK. 2016. Complete genome sequence of the neonatal meningitis-causing Escherichia coli strain NMEC O18. Genome Announc 4.

9. Peigne C, Bidet P, Mahjoub-Messai F, Plainvert C, Barbe V, Médigue C. 2009. The plasmid of Escherichia coli strain S88 (O45: K1: H7) that causes neonatal meningitis is closely related to avian pathogenic E. coli plasmids and is associated with high-level bacteremia in a neonatal rat meningitis model. Infect Immun 77:2272-2284.

10. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.

11. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology 19:455-477.

12. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722-736.

13. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond A. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647-1649.

14. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9:e112963.

148

15. Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Moller Aarestrup F, Hasman H. 2014. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother 58:3895-3903.

16. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068-2069.

149

CHAPTER 6. GENOMIC CHARACTERIZTION OF THE NEONATAL MENINGITIS ESCHERICHIA COLI (NMEC) SUBPATHOTYPE USING WHOLE GENOME SEQUENCING

Modified from an article to be submitted to mSphere

Daniel W. Nielsen1, Nicole Ricker2, Nicolle L. Barbieri3, Bryon A. Nicholson4, Ganwu

Li5, Heather K. Allen2, Lisa K. Nolan6 and Catherine M. Logue3*

1Department of Veterinary Microbiology and Preventive Medicine, College of Veterinary

Medicine, Iowa State University, Ames, Iowa

2 Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center,

ARS-USDA, Ames, Iowa

3Department of Population Health, College of Veterinary Medicine, University of

Georgia, Athens, Georgia

4 Boehringer Ingelheim, Ames, IA

5 Department of Veterinary Diagnostics & Production Animal Management, College of

Veterinary Medicine, Iowa State University, Ames, Iowa

6Department of Infectious Diseases, College of Veterinary Medicine, University of

Georgia, Athens, Georgia

150

Abstract

Neonatal Meningitis Escherichia coli (NMEC) is the leading Gram-negative cause of neonatal bacterial meningitis (NBM), a disease that causes mortality in 15-40% of infected newborns and can result in permanent sequelae. Despite the clinical importance of NMEC, most research has been conducted on a single strain (NMEC strain

RS218), and much remains unknown about the diversity of the vast NMEC subpathotype.

To address this research gap, we sequenced, analyzed, and compared 49 clinical isolates with the publicly-available NMEC genomes (n=9). We found that all of the examined

NMEC harbored two conserved genes, which encoded pseudopilins of a T2SS that were not found in K-12 strains MG1655 and W3110. Additionally, an analysis of conserved genes indicated that the NMEC subpathotype clustered into four different clades. The plasmid repertoire was also examined because of the importance of plasmids to NMEC virulence. The results showed that plasmid diversity in the NMEC subpathotype was greater than simply the IncFIB and ColV plasmids that have been previously shown to be associated with the NMEC subpathotype. Small Col plasmids were present within 50% of the surveyed isolates, but their biological importance remains unknown. Overall, strain

RS218 is not representative of the diverse NMEC subpathotype, which suggests that potential virulence factors or mechanisms of pathogenesis may remain undiscovered without the examination of additional NMEC strains.

Introduction

Neonatal meningitis Escherichia coli (NMEC) is a subpathotype of

Extraintestinal Pathogenic E. coli (ExPEC), which also includes Avian Pathogenic E. coli

(APEC) and uropathogenic E. coli (UPEC) (1-3). NMEC is the leading Gram-negative

151 cause of neonatal bacterial meningitis (NBM) and second-leading cause of NBM after

Group B streptococci in developed countries (4). NBM results in a mortality rate ranging from 15-40%, but 30% of survivors suffer permanent sequalae such as deafness, epilepsy, or intellectual disability (4-7). A myriad of virulence factors have been reported to contribute to the virulence and pathogenesis of NMEC such as OmpA, FimH, CNF1, K1 capsule, and IbeA (8-15). However, no known virulence factors are exclusive to the

NMEC subpathotype. The virulence factors OmpA and FimH are present in virtually all

E. coli, and the prevalence of virulence factors such as CNF1, IbeA, and the K1 capsule vary by study (16, 17). The NMEC subpathotype also shows diversity by E. coli typing schemes as NMEC isolates vary by phylogenetic group assignment and serogroup; there is no robust survey of the sequence type (ST) for the NMEC subpathotype.

Of the seven E. coli-specific phylogenetic groups (A, B1, B2, C, D, E, and F), only NMEC belonging to phylogenetic groups B2 and F have been sequenced. NMEC phylogenetic group B2 strains that have sequenced chromosomes include IHE3034,

NMEC O18, NMEC-O75 (MCJCHV-1), RS218, and S88, and the NMEC group F strain that has been sequenced is CE10 (18-22). Additionally, genomic assemblies for several other NMEC strains are now publicly available (23). Since some virulence genes, such as ibeA, are correlated with certain phylogenetic groups, an examination of the chromosomal diversity of NMEC is important, so virulence genes from the different phylogenetic groups are not overlooked (24).

In addition to the NMEC chromosome, plasmids appear to contribute to the virulence of NMEC in vivo as shown by a pRS218, a large plasmid of the NMEC strain

RS218 (25). IncF plasmids, especially IncFIB, are associated with the NMEC

152 subpathotype (26, 27). Historically, studies of plasmids in NMEC strains have primarily focused on large IncF plasmids and have shown that these plasmids encode genes for replication, iron-acquisition, protectins, and adhesins (26, 28). The presence of other replicons, such as the Col replicons, remains unexplored.

Here, we examine NMEC, using whole genome sequencing (WGS), to ascertain the diversity of the subpathotype as WGS provides better resolution than classical typing schemes. Such WGS resolution allows clarification of the composition of virulence- associated genes and identification of plasmids present in the NMEC subpathotype.

Additionally, WGS data may present potential virulence factors that would otherwise go unnoticed within the genomes of diverse NMEC isolates.

Materials & Methods

Bacterial Strain Selection and DNA Isolation

Forty-nine NMEC isolates were randomly selected from a collection of 87 strains, which have been previously described, phylogenetically typed, and serogrouped (17, 28,

29). DNA for short-read sequencing was extracted using a ChargeSwitch gDNA mini bacteria kit (Life Technologies Carlsbad, CA). DNA for long-read sequencing was extracted using a Qiagen DNeasy Blood and Tissue genomic tip kit (Qiagen Hilden,

Germany) or the ChargeSwitch gDNA mini bacteria kit.

Short-read sequencing and assembly

Genomic library DNA was prepared using the Nextera XT kit (Illumina San

Diego, CA) for 48 NMEC strains while one strain was prepared separately using the

153

NexteraFlex kit (Illumina San Diego, CA). The libraries were sequenced on an Illumina

MiSeq Desktop Sequencer using the MiSeq Reagent Kit v2 for 500 cycles at the USDA

National Animal Disease Center (Ames, Iowa). The resultant FASTQ files were assessed using FASTQC, trimmed using Trimmomatic v. 0.36 (30), and de novo assembled using the SPAdes Genome Assembler v. 3.10.1 (31). Assemblies were assessed with Quast (32) to determine contig sizes, N50, and assembly size (Supplementary Table 6.1).

Long-read sequencing and assembly

Thirteen of the 49 NMEC were sequenced via a Pacific Bioscience RSII sequencer at the Yale Center for Genomic Analysis (Pacific Biosciences Menlo Park,

CA) for this study. The resultant reads were assembled de novo using Canu v. 1.5 (33).

Canu assemblies were polished twice using paired-read data with Pilon v. 1.22 (Broad

Institute, Cambridge, Massachusetts) (34). The workflow resembles that of previously completed ExPEC genomes (35, 36).

NMEC Core Genome, Clades, and Virulence Factor Identification

The 49 assembled isolates and publicly available sequences (n=9) were annotated using Prokka v. 1.13, and the core and pan genome assessed with Roary (37, 38). The resultant Roary files were used as input into FastTree to create a Maximum-Likelihood phylogenetic tree (37, 39, 40). The final phylogenetic tree was created with the

Interactive Tree of Life (iTOL) v. 4.2.3 (41, 42). Conserved NMEC genes and NMEC clade groupings were obtained from the Roary output. ExPEC-associated virulence genes were also identified with ABRicate using a custom ExPEC-specific dataset

154

(github.com/nielsend/NMEC-genomics). The resultant data were analyzed and graphed in

R using the tidyverse and ggplot packages, respectively (43, 44).

Plasmid & Chromosome Visualization

Potential plasmid contigs were identified by ABRicate using the PlasmidFinder database (45). These contigs were positively identified as plasmids via BLAST, circularized in Geneious (46), and annotated with Prokka v 1.13 (38). Plasmids larger than 10kbp and NMEC chromosomes were visualized with Blast Ring Image Generator,

BRIG (47).

Results

The ExPEC-associated virulence genes within the NMEC subpathotype were surveyed to determine if any virulence-associated genes were conserved within the

NMEC genomes (Figure 6.1). Genes adhE, csgA, feoA, fimH, ibeC, ompA, potE, and umuC were detected in all NMEC genomes (n=58). However, these genes are present in most E. coli, like the fimH and ompA genes (48). Although not present in all NMEC genomes, genes present in the majority of the NMEC subpathotype included: adhesins

(aslA, ibeB, matA, matB), iron-associated genes (chuA, feoB, fyuA, irp2, iucA, sitA), the toxin usp, plasmid-associated colicin V, protectins (bor, kps, ompT, and traT), and other genes (malX, potE, and speF). When the percent identity of the genes was examined, csgA, ompA, and speE, which are genes that were always or nearly-always present, contained differences that could result in polymorphisms when translated (Figure 6.1).

Invasin ibeA was also harbored by the majority of NMEC surveyed but at a lower prevalence than expected (53.4%); ibeA has long been defined as being associated with

155 the NMEC subpathotype (49-51). Taken together, the results suggest that the NMEC genomes have variation in their carriage of virulence-associated genes, and the NMEC- associated gene ibeA is a poor indicator of the subpathotype. Nevertheless, the conserved genes of NMEC remains unexplored.

NMEC Core Genome Has Two Genes Absent in MG1655

A comparison of the NMEC genomes (n=58) with Roary resulted in the definition of a core genome of 2,531 genes (n>=57), including 550 soft-core genes (55 <= n < 57), and 12,570 genes present in fewer than 55 isolates (Supplemental Figure 6.1).

Phylogenetic analysis of the NMEC core genome revealed that the strains tended to group by phylogenetic group, serogroup, or sequence type (Figure 6.2). The inclusion of these assignments demonstrates no classification scheme represents all of the NMEC.

Indeed, Sequence types (ST) of NMEC showed no conserved clonal lineages were evident; STs 12, 62, 95, 390, 416, and 567 were identified (Figure 6.2). The acknowledgement that NMEC is a diverse subpathotype is important to understanding how NMEC may cause disease.

156

Figure 6.1: Genomes of the NMEC subpathotype show diversity of virulence- associated genes.

ExPEC virulence-associated genes (x-axis) are harbored within the NMEC genomes (y- axis) (1, 3, 16, 17, 52-55). All genes shown have both >=90% coverage and identity. The identity is shown from 90% (steel blue) to 100% (blue). Variants of the gene are indicated with an underscore and number (e.g. sitA_2). The ExPEC specific virulence- associated gene sequences FASTA file is available at github.com/nielsend/NMEC- genomics.

157

Unique core genes to NMEC relative to other E. coli strains may influence the pathogenicity of NMEC. Two genes contained within the NMEC core genome that were absent in E. coli MG1655, a derivative strain of the original E. coli K-12 isolated from a diphtheria patient in 1922, were identified (56, 57). These genes (xcpV and epsH) encode minor pseudopilin components of the type II secretion system (T2SS), as annotated in

Pseudomonas aeruginosa and Vibrio cholerae (58). Further investigation of these genes revealed two E. coli homologues that are annotated as gspI and gspH, pseudopilins of the

T2SS general secretory pathway (59). When all four genes were conceptually translated and aligned, approximately 24-25% identity between GspI and XcpV and between GspH and EpsH was observed. BLAST analysis revealed that the NMEC subpathotype always contained the epsH and xcpV genes, which were absent from MG1655 (Figure 6.3). To examine if this observation was unique to MG1655, E. coli strain W3110 was also included, and the genome displayed the same absence of epsH and xcpV and presence of gspI and gspH as MG1655 (Figure 6.3). Interestingly, gspI and gspH, which were found in the K-12 strains, were missing from 11 different NMEC. Every NMEC isolate assigned to the B2 phylogenetic group carried this T2SS, but the system was not B2- specific, as NMEC strains belonging to other phylogenetic groups, such as A and C, also harbored the genes (Figures 6.2 and 6.3).

158

Figure 6.2: Maximum-Likelihood phylogenetic tree of NMEC core genome alignment.

NMEC core genome phylogenetic tree with phylogenetic group (inner ring), serogroup (middle ring), and Achtman Sequence Type (outer ring) assignments shown around the outside of the tree. Serogroup O18ac is included in the serogroup O18 category.

The NMEC genomic regions encoding the T2SS genes were notably conserved in terms of flanking genes, but some variation was evident. The T2SS conserved within the

NMEC subpathotype was flanked upstream by the glc locus, which is associated with glycolate utilization (60). However, there was variation downstream of the T2SS. In the majority of strains, virulence-associated capsule formation genes kpsM and neuA were in the downstream region. The non-conserved T2SS, carrying the gspI and gspH genes, was flanked upstream by ribosomal genes (rps and rpl) and downstream by bacterioferritin- associated genes (bfr and bfd), a bifunctional enzyme with lysozyme/chitinase activity

159

(chiA), and elongation factors (tufA and fusA) (61-63). For NMEC isolates without the non-conserved (gspI and gspH) T2SS, the rps and rpl genes were present upstream of gspO, which was immediately upsteam of bacterioferritin genes (bfr and bfd).

Interestingly, the gspO gene of the general secretory pathway was present, suggesting a deletion of the other gsp genes may have occurred. The differences presented among these T2SS genes represent a small fraction of the more than 15,000 genes that compose the NMEC subpathotype.

gspH gspI SP46 SP13 SP5 SP4 S88 RS218 NMEC−O18 NM92 NM91 NM89 NM86 NM84 NM83 NM82 NM81 NM74 NM73 NM72 NM71 NM70 NM67 NM65 NM60 NM59 NM57 Percent NM55 Identity NM52 NM49 100.0 NM45 NM43 97.5 NM42 NM40 NM38 95.0 NM37 NMEC Isolate NM36 92.5 NM35 NM34 NM30 NM29 NM28 NM27 NM26 NM23 NM22 NM20 NM19 NM18 NM16 NM15 NM14 NM11 NM08 NM07 NM06 NM03 NM02 IHE3034 CE10 W3110 MG1655 epsH gspH gspI xcpV

Gene

Figure 6.3: Presence of the T2SS genes among NMEC and K-12 isolates (W3110 and MG1655.

The genes xcpV and epsH are present in one secretion system in all NMEC genomes, and gpsH and gspI are present in a different secretion system that is less-conserved in NMEC but also present in K-12 strains.

160

NMEC Clades Contain Distinct Conserved Genes

An analysis of the pan-genome and a maximum likelihood phylogenetic tree revealed four distinct NMEC clades (Figure 6.4A). These four clades also reflect the other E. coli typing schemes with NMEC clade 1 consisting predominantly of serogroups

O18 and O1; clade 2 predominantly consisting of O21; clade 3 predominantly consisting of O83; and clade 4 predominantly consisting of O7 and O12. Likewise, the sequence types (STs) of NMEC clade 1 were composed primarily of ST 95, ST 390, and ST 416; clade 2 is composed of ST 12; clade 3 is composed of ST567; and clade 4 are of the ST62 type. However, not every serogroup or sequence type assignment matched the clade designation as the NMEC pan-genome derived clades offered better resolution of NMEC groupings.

With the four clades derived, this study identified the genes conserved within each clade that were not conserved in the other clades (Figure 6.4B). The functions of the genes that the different clades contained was variable. Indeed, annotated functions of the genes included toxins, toxin-antitoxin systems, transporters, adhesins, porins, hemolysins, and OMPs as well as prophage, CRISPR-associated, and plasmid-associated genes (Table

6.1). Although the influence these genes may have on virulence was not tested, some of these genes have the potential to be ExPEC virulence genes. NMEC clades 1, 2, and 3 are contained solely within phylogenetic group B2, but NMEC clade 4 consisted of a diverse collection of strains classified as the A, B1, C, D, E, and F phylogenetic groups (Figure

6.4A). The comparative genomic analysis revealed that 255 genes were conserved among strains belonging to the B2 phylogenetic group, and these genes included transporters, the matA transcriptional regulator, adhesins, and toxin-antitoxin systems (Table 6.2). Specific genes of interest conserved within this group included imm, the phage T4 and colicin

161 immunity protein; kpsM, the polysialic acid transport protein; and mntB, the manganese transport system. Taken together, the genes conserved among clades 1, 2, and 3 adds to our knowledge of the virulence-associated B2 phylogenetic group.

NMEC Chromosomes Are Important in Understanding Chromosomal Diversity

Thirteen new draft and completed chromosomes allowed us to better visualize the

NMEC chromosome with those already publicly available (n=6) (Figure 6.5). Notably, strain RS218 appears to have large regions not conserved in other NMEC strains (Figure

6.5). Like RS218, other newly sequenced strains contain unique chromosomal regions relative to previously sequenced NMEC isolates; these chromosomes may assist in further studying the NMEC subpathotype (Table 6.3).

The NMEC Subpathotype Harbors Unexpected Plasmids

Analysis of the NMEC genomes, using ABRicate and PlasmidFinder, revealed that the majority of NMEC genomes contained the IncFIB plasmid replicon, and other Inc replicons included IncFII, IncFIC, IncI1, and IncB/O/K/Z (Supplementary Table 6.2,

Figure 6.6). IncFIB plasmids are well-recognized as important to the virulence of NMEC

(25, 28). Long-read sequencing was used to examine the plasmids of 13 NMEC strains beyond their replicons; the diversity of plasmids was striking when compared to reference NMEC plasmids as shown by pS88 (Figure 6.7).

162

(4A)

(4B)

Figure 6.4: NMEC clusters into four clades by the presence of conserved genes.

(6.4A) A maximum-likelihood phylogenetic tree of the core NMEC genome with the phylogenetic group (left), serogroup (center), and Achtman Sequence Type (ST; right) assignments shown. NMEC clades were defined as labeled by color overlaid on the NMEC strain names—clade 1 (red), clade 2 (blue), clade 3 (green), and clade 4 (purple). All nodes had bootstrap proportions greater than 70%, with the exception of the node indicated with a black triangle (6.4B) Non-hypothetical conserved genes to the specific clades in 6.4A (and not conserved in the other clades) are shown. Conserved genes for each clade are defined as genes present in >= n-1 for each grouping.

163

Table 6.1: Conserved genes unique to each clade with adhesins, toxins, iron regulation, and lipoproteins shown.

GROUP 1 GROUP 2 GROUP 3 GROUP 4

alpA: DNA-binding borD: Bacteriophage lambda abgT: p-aminobenzyl CRISPR-associated genes transcriptional regulator Bor glutamate: H+ symporter

dinD: DNA-damaging inducible gadB: Glutamate bfpA: Bundle forming pilus arsB: arsenite: H+ antiporter protein decarboxylase B

kilR: Inhibitor of FtsZ, killing ehaG: Autotransporter adhesin ccdB: Toxin CcdB fecA: Ferric citrate OMP protiein

fliC: Flagellar biosynthesis intZ: Predicted integrase mec: CysO-cystein peptidase frv: PTS permease

neuA: N-acylneuraminate ompT: Outer membrane mleN: Malate-2H(+)/Na(+) hlyE: Hemolysin E cytidyltransferase protease VII lactate antiporter

ompW: Outer membrane symE: toxin-like protein of psiB: Protein PsiB hyf: Hydrogenase 4 Proteins SOS

wzzB: Regulator of O-antigen repB: RepFIB replication Prophage genes iucC: Aerobactin synthase length protein

ycgV: Putative adhesion & scrB: Sucrose-6 phosphate lsr: Autinducer-2 ABC vgrG1: Actin cross-linking toxin penetration protein transporter

yadC: Putatitive fimbrial-like pup: Putrescine: H+ ydcM: Putative transposase tra: Tra genes adhesin symporter puuD: Gamma-glutamyl- ydeQ: Putative fimbrial-like tsh: Putative adhesion & yafN: Antitoxin gamma-aminobutyrate adhesin penetration protein hydrolase yjjL: Galactonate: H+ sfm: Putative fimbrial-like yafO: mRNA interferase yihN: YihN MFS transporter symporter adhesin

yjjM: Putative DNA-binding yddB: Putative porin protein tisB: Toxic peptide transcriptional regulator

ypjA: Adhesin-like yfeW: Penicillin binding ylpA: Lipoprotein autotransporter protein

164

Table 6.2: Complete and draft chromosomes for 13 NMEC strains and reference chromosomes.

CHROMOSOME PHYLOGENETIC ISOLATE %GC CLOSED CONTIGS SEROGROUP ST SIZE GROUP

NM14 5,032,772 50.8% 1 C O7 -

NM19 4,840,395 50.7% 2 B2 - 567

NM20 4,902,218 50.6% X 1 D O7 362

NM23 5,372,956 50.6% 1 F O7 62

NM26 4,966,497 50.6% X 1 B2 O21 12

NM27 5,034,177 50.8% X 1 A O12 10

NM34 5,130,420 50.8% 2 F - 379

NM45 4,889,182 50.7% X 1 B1 O23 453

NM52 4,935,851 50.7% X 1 B2 O25 131

NM55 4,911,895 50.9% 3 C O9 88

NM73 4,829,265 50.9% 6 B1 O29 58

NM83 5,221,029 50.8% 12 F O1 59

NM84 4,735,582 50.6% 1 B2 O21 538

CE10 5,313,531 50.6% X 1 F O7 62

IHE3034 5,108,383 50.7% X 1 B2 O18 95

NMEC 5,002,781 50.7% X 1 B2 O18 416 O18 NMEC- 4,939,457 50.6% X 1 B2 O75 1193 O75

RS218 5,087,638 50.6% X 1 B2 O18 -

S88 5,032,268 50.7% X 1 B2 O45 95

Eight strains exist each as a sole contig. NMEC-O75 is also known as MCJCHV-1 or as its assembly, NM92. ST or serogroups with a (-) have a result of “untypeable”.

165

NMEC Chromosomes

Figure 6.5: Complete or nearly-complete sequenced NMEC chromosomes relative to NMEC strain RS218.

Regions of RS218 not present in the other plasmids are indicated in white. Chromosomes examined are ordered from the inner-most ring (purple-CE10) to the outer-most ring (dark pink-S88).

166

Figure 6.6: Plasmid replicons within the NMEC genomes.

The 58 surveyed NMEC genomes (y-axis) with the presence/absence and percent identity of replicons (x-axis) shown from 96% identity (light green) to 100% identity (dark green). All replicons included had greater than 96% coverage. Reference sequences IHE3034 and NMEC-O18 were both clinically isolated from newborns with plasmid sequences, but the DNA sequences of the plasmids have not been sequenced or publicly released (19).

167

Table 6.3: Large plasmids carried by NMEC isolates.

Strain Plasmid Size %GC Closest Matching Replicon(s) Closed (bp) NM14 pNM14 142,785 49.4 IncFIB, IncFIC(FII) X NM19 pNM19 142,431 49.3 IncFII_1_pSFO_AF401292 X

NM20 pNM20 184,523 51.1 IncFIA, IncFIC(FII) X NM23 pNM23 54,545 49.3 IncFII(29)_1_pUTI89, X Col156_1 NM26 pNM26 154,274 51.9 IncFII_1_pSFO, X Col156_1 NM27 pNM27A 87,517 49.8 IncFIC(FII), Col156_1 X

pNM27B 48,077 44.5 Unknown X NM34 pNM34A 74,865 51.7 IncFII(29)_1_pUTI89 X pNM34B 74,564 50.2 IncFII_1_pSFO, Col156_1 X

pNM34C 48,392 44.6 Unknown X NM45 pNM45 184,477 49.5 IncFIC(FII) X NM52 pNM52 148,714 48.2 IncFIC(FII) X NM55 pNM55A 155,907 51.3 IncFIB(AP001918), IncFIC(FII) pNM55B 98,331 47.8 p0111_1 X pNM55C 36,672 57.5 ColRNAI_1 X pNM55D 30,736 59.3 Unknown NM83 pNM83A 82,041 53.3 IncB/O/K/Z_2 X pNM83B 74,885 50.2 IncFII_1_pSFO, Col156_1 X pNM83C 33,823 42.0 IncX1_1 X pNM83D 18,573 46.8 Unknown NM84 pNM84 142,432 48.0 IncFII_1_pSFO X

Some plasmids, pNM27A, pNM34C, pNM55D, and pNM83D did not match known replicons in PlasmidFinder, but they were included due to their close identity to other publicly available plasmids.

Furthermore, plasmids of the same replicon type did not have large conserved regions among all plasmids with that replicon (Supplementary figures 6.2 and 6.3). In contrast to this, plasmids that carried colicin-V gene (cva) had large conserved regions

168 among NMEC ColV plasmids, even compared to prototypical ColV plasmids, like pAPEC-O2-ColV, and other APEC plasmids (Figure 6.8). ColV plasmids have long been associated with the ExPEC pathotype and disease (3, 64-67), and the plasmids are known to carry sit, tra, psi, cva, ets, iss, and eit virulence or virulence-associated genes (65). The association of ColV genes (cva) with ExPEC plasmid-and-virulence-associated genes can also be seen when all NMEC isolates were surveyed (Figure 6.1). However, non-ColV plasmids were also present within NMEC isolates (Table 6.4 and Figure 6.8).

Plasmids with unknown replicons were identified within the NMEC genomes.

Plasmids pNM27B, pNM34C, pNM55D, and pNM83D had unidentifiable replicons, but they shared identity to publicly available plasmids as determined via BLAST (Table 6.4).

These plasmids showed large regions of dissimilarity via BLAST when compared with previously sequenced NMEC plasmids (Figure 6.7). Plasmids pNM27B and pNM34C are closely related as both plasmids contained two shared regions that totaled approximately

40kb, and these two regions predominantly contained hypothetical and phage-associated genes. Plasmid pNM55D was similar to the plasmids carried by different

Enterobacteriaceae, including Klebsiella, Salmonella, Yersinia, Shigella, and Escherichia species. Plasmid pNM83D resembled plasmid pVR50D (CP011138.1), a plasmid isolated from a uropathogenic E. coli strain although pNM83D was approximately 10kb larger

(68). The four plasmids with unknown replicons demonstrate the diversity of the larger

NMEC plasmids, but they were smaller than other plasmids with the exception of the

IncX and small Col plasmids (Table 6.4).

169

Figure 6.7: NMEC ColV plasmids are similar to each other and other APEC ColV plasmids, including the prototypical ColV plasmid pAPEC-O2-ColV (dark blue) (65).

Regions of pS88 not present in the other plasmids are indicated in white.

Along with the Inc replicons, Col replicons were detected in the NMEC subpathotype and were present in smaller plasmids that ranged in size from approximately 1.5-7.5 kb (Supplementary Table 6.2). The Col156 replicon was found in the most NMEC strains, followed by the ColpVC and Col(BS512) replicons (Figure 6.6).

Nevertheless, when individual plasmids were examined, the ColRNAI_1 replicon was the most prevalent. NMEC 27 and SP5 contained the most small plasmids, with each carrying six different small plasmids of the Col156, ColRNAI, Col8282, Col(BS512), and Col(MG828) replicons.

170

Seventy plasmids were found among the 58 different NMEC strains when the small Col replicon-containing plasmids were examined (Supplementary Table 6.2).

Interestingly, dependent on replicon type, these plasmids appeared to contain distinct contents. Nine ColRNAI plasmids ranging from 6.6-6.8 kbp carried the cea gene, which encodes colicin 10, a bacteriocin that results in the death of closely related strains (69,

70). Similarly, five ColRNAI plasmids ranging from 5-6.8 kbp harbored neo and folP, genes associated with antibiotic resistance (71, 72). Plasmids with the Col156 replicon were likely to have the mobA gene. The small Col(MG828) plasmids often carried hypothetical proteins. Other genes identified included genes that encode methyltransferases and entry exclusion proteins. In the case that small plasmids were not present in a given NMEC strain, a Col replicon may be found within a larger Inc plasmid

(Table 6.4).

Discussion

Extraintestinal pathogenic E. coli have been described as having a mix-and-match approach to virulence, and the prevalence of different ExPEC-associated virulence genes varied within the NMEC genomes (Figure 6.1). The work presented here recognizes the critical terminology “NMEC” as some NMEC were missing the K1 capsular antigen

(Figure 6.1). The K1 capsular antigen was one of the first NMEC virulence factors described, and the subpathotype is occasionally referred to as E. coli K1 due to the presence of the capsular antigen (14, 73, 74). Likewise, the IbeA invasin, a well-known virulence factor of NMEC, was absent from approximately 47% of NMEC strains indicating that previous work, which typed E. coli as NMEC if the isolates contained certain ExPEC genes and both ibeA and kps, was likely incorrect (51). Similarly, cnf1,

171 which encodes the CNF1 toxin, was absent from a majority of strains examined, but the protein was previously reported to be of significant importance to NMEC’s ability to cross the blood-brain barrier (8). Genes csgA, ompA, speF, and usp had variable nucleotide identity among the genomes, suggesting that there were differences among the virulence-associated genes (Figure 6.1). Although work has been completed examining the polymorphisms of the OmpA protein of NMEC, csgA and speF remain unexplored in the NMEC subpathotype and warrant future examination (Nielsen et al., in preparation).

NMEC Core Genome Harbors Two Unique Genes

All NMEC isolates sequenced contained two genes that encode minor pseudopillins of the T2SS that were absent in the K-12 strains W3110 or MG1655

(Figure 6.3). These psuedopilins form a pilus-like structure that bridges the periplasmic space between the inner and outer membrane, which allows for the secretion of proteins such as toxins, cellulases, phospholipases, lipases, and proteases (59). Indeed, the LT and

CT toxins of enterotoxigenic E. coli and Vibrio, respectively, are secreted through T2SS

(58, 59). EspH, the GspH homologue found within all NMEC, is hypothesized to form the tip of the T2SS’s pseudopilus, which allows EspH to interact with other proteins (57).

Additionally, GspI, the XcpV homologue from enterotoxigenic E. coli, forms a trimeric complex with two additional proteins at the tip of the pseudopilus; this complex likely interacts with the T2SS components in the outer membrane (75). Nevertheless, ETEC and

NMEC are not the only E. coli pathovars to contain a T2SS, strains MG1655 and W3110, along with other NMEC, contain another, different T2SS (Figure 6.3). Previous work on the T2SS of MG1655 has shown that the secretion system is rarely expressed, but in vitro experiments demonstrate the T2SS of MG1655 can promote chitinase secretion (59). The

172 lack of expression could explain the lack of pathogenesis seen by the K-12 strains surveyed (59). We hypothesize that this secretion system conserved within the NMEC may secrete virulence factors that assist NMEC in causing disease, but further investigation is required to determine its expression and ability to cause disease.

NMEC Clades Have Sets of Conserved Genes

Serology has long been used to define the clonal groups of many bacteria (76, 77).

E. coli consists of 187 different O-groups that are often used for typing (78). Likewise, clonal lineages of E. coli have been described through sequence typing or by broader phylogenetic groups that are correlated with the sequence types (79, 80). The identification of NMEC STs is important as the STs are often used to describe clonal pandemic ExPEC, especially uropathogens (Johnson et al., in review)(81, 82). Although

NMEC do not appear to share a conserved Achtman sequence type, serogroup, or phylogenetic group, this study identified four characteristic clades (Figure 6.4A). The clades carry different conserved genes that have previously been underdescribed or undescribed in the literature. A significant number of these genes have the potential to contribute to the pathogenicity or virulence of NMEC as the genes encode fimbriae, toxins, toxin-antitoxin systems, symporters, antiporters, autotransporters, lipoproteins, metallo-regulatory proteins, and metabolic functions as well as numerous others like phage, CRISPR-associated, and regulatory genes.

Toxins contribute to the pathogenicity of many different E. coli pathovars (83), and toxin genes undescribed in NMEC are found in separate clades. Examples of genes annotated as toxins include vgrG1 (clade 1), symE and ccdB (clade 2), and tisB (clade 4).

VgrG-1 is most well-known as a toxin of V. cholerae that cross-links actin and is secreted

173 by a T6SS (84). ccdB encodes a toxin that targets gyrases, E. coli protect themselves with ccdA, part of the toxin-antitoxin system. TisB targets the inner membrane of E. coli and results in persister cell formation when ciprofloxacin is administered (85, 86). Of course, other virulence factors also vary between the clades. Notably, bfpA, a pilus of

Escherichia coli commonly associated with typical enteropathogenic E. coli, was conserved within the second NMEC clade (83, 87).

In addition to the underreported genes, the NMEC clades also encode different conserved genes for each of the groups associated with the virulence or pathogenesis of

NMEC (Table 1). Clade 1 carries neuA, a gene involved in the regulation of sialic acid in the capsule of some NMEC (88). Clade 2 covers ompT, an outer membrane protease associated with the APEC and UPEC subpathotypes (53, 89). Clade 3 harbors tra, tsh, repB, and borD, plasmid-associated genes that have been well established in previous

ExPEC literature (1, 26, 27, 53, 64, 66). Taken together, NMEC clades 1, 2, and 3 each carry conserved NMEC associated virulence factors, unlike clade 4.

NMEC have long been associated with phylogenetic group B2, and the majority of NMEC are classified to this phylogenetic group (Figures 6.2 and 6.4A). Therefore, it is unsurprising that NMEC clades 1, 2, and 3 are all contained within phylogenetic group

B2. These groups contain 255 genes conserved to the B2 phylogenetic group compared to

NMEC clade 4. Interestingly, this includes the polysialic acid transport protein (kpsM), the colicin-E7 immunity protein (imm) as well as numerous toxin-antitoxin systems (hip, yefM/yoeB, tabP, etc.), and other potential virulence factors (Table 6.2). Such results indicate unique properties of the NMEC B2 phylogenetic group. Indeed, Nielsen et al. (in progress) found that the B2 phylogenetic group was more likely to contain more

174 polymorphisms in the NMEC virulence protein Outer Membrane Protein A, encoded by ompA. Likewise, Logue et al. reported that the B2 phylogenetic group of ExPEC were more likely to contain virulence-associated genes or pathogenicity islands (29).

In 2008, Johnson et al. found that a mixed cluster of APEC, NMEC, and UPEC had similar genotyping patterns (27). Likewise, Logue et al. demonstrated differences in phylogenetic group composition among the APEC, NMEC, and UPEC subpathotypes

(29). The process by which NMEC causes NBM in newborns remains unknown, although transmission via the mother is expected (8). Even though APEC causes disease in birds,

Tivendale et al. demonstrated that APEC could cause neonatal bacterial meningitis in a rat model (90). Although this may seem conflicting, it is possible that certain APEC and

NMEC produce NBM through different mechanisms of the NMEC clades. For example,

APEC that inflicted NBM in rats belonged to ST95, which resembled Clade 1 NMEC

(90). Other subpathotypes may also have the ability to cause NBM in newborns. In addition to APEC O1, UPEC strains CFT073 and UTI89 resulted in a 0% survival rate when mice were subcutaneously injected with the strains (91). Likewise, adherent invasive E. coli (AIEC) strain LF82 resulted in a 40% survival rate when subcutaneously injected into mice (91). A significant number of AIEC belong to the O83 serogroup, which compose the majority of the third NMEC clade (91). Clearly, more research is warranted to elaborate on how certain pathotypes may cause disease in alternative hosts or niches.

NMEC Chromosomes Publicly Available

Currently, there are six completed, publicly-available NMEC chromosomes—

CE10, IHE3034, NMEC O18, NMEC-O75 (MCJCHV-1), RS218, and S88. Of all these

175 isolates, only CE10 does not belong to phylogenetic group B2, as it is classified to phylogenetic group F. To further discussion on the importance of chromosomal diversity within the NMEC subpathotype, we submit 13 completed or nearly completed chromosomes to the scientific community (Table 6.3). These 13 isolates belong to phylogenetic groups A, B1, B2, C, D, and F (Table 6.3). Likewise, they ensure coverage of each of the NMEC chromosomal types (Figure 6.4A). Clade 1 is represented by

NMEC O18, IHE3034, RS218, and S88; clade 2 contains NM26 and NMEC-O75

(MCJCHV-1) (Figure 6.4A). Clade 3 includes NM19, NM52, and NM84, and clade 4 consists of CE10, NM14, NM20, NM23, NM27, NM34, NM45, NM55, NM73, and

NM83 (Figure 6.4A). Together, this results in a total of 19 completed or nearly completed NMEC chromosomes and 58 NMEC with sequencing data available (Figure

6.4A, Table 6.3, Figure 6.5). With a diverse set of sequencing data, a toolbox of NMEC sequences may further spur exploration of the mechanisms by which this subpathotype causes disease. Comparative work on the abundance of now-available sequencing data is on-going.

NMEC Replicons

Tracing plasmids is of epidemiological importance as plasmids can contribute to the virulence and antibiotic resistance of their host (92). Plasmids can be identified and traced by their replicons via PCR or bioinformatically (45, 92). When the replicons within NMEC isolates were surveyed, the IncFIB replicon was the most prevalent (Figure

6). Our result corresponded with previous work by Johnson et al. (2008) (93).

Additionally, the IncFIB replicon has been associated with the APEC subpathotype (93,

94).

176

In addition to the well-recognized IncFIB and IncFII plasmid replicons, this study found a significant amount of Col replicons. These Col replicons were commonly found on small plasmids. Although previous ExPEC studies have sequenced these small plasmids, this study examined them in depth (18, 35, 36, 68, 95)(Johnson et al., in review). Examination of these plasmids is critical as these plasmids may contribute to virulence or have a direct impact on antibiotic resistance. Reducing antibiotic resistance is an important factor in decreasing the mortality rates of NMEC (96). Clearly, more work is needed to inspect these small plasmids in pathogenic E. coli.

When larger plasmids (18 – 185 kb) were examined, interesting differences emerged (Table 6.4). Previously, Nicholson et. al examined large NMEC virulence plasmids (97.8-157.2 kbp) and found these plasmids had a core genome composed of iroBCDEN, repA1, repA2, yubQ, etsABC, traA, traE, iucABCD, iutA, ompT, and iss (28).

Here, we completed the plasmid sequences examined by Nicholson et al. including pNM14, pNM19, and pNM26 (28). Nevertheless, we also increased the number of

NMEC plasmids studied from 11 to 25 plasmids. In addition to the plasmids with known replicons, we found the plasmids with unknown replicons to be of interest. More research is necessary to examine the plasmids within NMEC strains and the large number of hypothetical genes that they carry.

Conclusion

In this work, we present the largest NMEC genomic sequencing study to date, which examined both chromosomal and episomal sequencing data from 58 NMEC isolates. Clearly, the NMEC subpathotype contains a myriad of virulence factors, but there is also conservation present. NMEC had a core genome of 2,531 genes, and notably,

177 two of these 2,531 genes were not present in the surveyed K-12 E. coli. These two genes, epsH and xcpV are minor pseudopilins of the T2SS, which has been associated with the secretion of virulence factors (59). Four NMEC clades were also delineated from the pan genome, and these clades shared groups of conserved genes. Further investigation is needed to determine any potential role in virulence for conserved, shared genes within these clades. NMEC ColV plasmids were similar to previously sequenced plasmids, but non-ColV plasmids were also present, and they demonstrated diversity. Clearly, more work is needed to elucidate the mechanisms by which NMEC causes disease, investigate potential virulence factors, and discover the unknown functions of the many hypothetical genes carried on both the plasmids and the chromosomes.

Acknowledgements

We thank Dr. David Alt and Dr. Darrell Bayles (National Animal Disease Center,

Ames, IA) for their assistance in sequencing and bioinformatics expertise, respectively.

Additionally, we would like to thank Dr. Tim Johnson (University of Minnesota) for helpful conversation. This work was supported by the Dean’s Office at Iowa State

University’s College of Veterinary Medicine. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.

USDA is an equal opportunity provider and employer.

References

1. Nolan LK, Barnes HJ, Vaillancourt JP, Abdul-Aziz T, Logue CM. 2013. Colibacillosis, p 751-805, Diseases of Poultry, 13th ed doi:10.1002/9781119421481. Ch18. John Wiley & Sons, Ltd.

178

2. Russo TA, Johnson JR. 2000. Proposal for a new inclusive designation for extraintestinal pathogenic isolates of Escherichia coli: ExPEC. J Infect Dis 181:1753-1754.

3. Ewers C, Li G, Wilking H, Kiessling S, Alt K, Antao EM, Laturnus C, Diehl I, Glodde S, Homeier T, Bohnke U, Steinruck H, Philipp HC, Wieler LH. 2007. Avian pathogenic, uropathogenic, and newborn meningitis-causing Escherichia coli: how closely related are they? Int J Med Microbiol 297:163-176.

4. Gaschignard J, Levy C, Romain O, Cohen R, Bingen E, Aujard Y, Boileau P. 2011. Neonatal bacterial meningitis: 444 cases in 7 years. Pediatr Infect Dis J 30:212-217.

5. Grimwood K, Anderson P, Anderson V, Tan L, Nolan T. 2000. Twelve year outcomes following bacterial meningitis: further evidence for persisting effects. Arch Dis Child 83:111-116.

6. Simonsen KA, Anderson-Berry AL, Delair SF, Davies HD. 2014. Early-onset neonatal sepsis. Clin Microbiol Rev 27:21-47.

7. Stoll BJ, Hansen NI, Sanchez PJ, Faix RG, Poindexter BB, Van Meurs KP, Bizzarro MJ, Goldberg RN, Frantz ID, 3rd, Hale EC, Shankaran S, Kennedy K, Carlo WA, Watterberg KL, Bell EF, Walsh MC, Schibler K, Laptook AR, Shane AL, Schrag SJ, Das A, Higgins RD, Eunice Kennedy Shriver National Institute of Child H, Human Development Neonatal Research N. 2011. Early onset neonatal sepsis: the burden of group B streptococcal and E. coli disease continues. Pediatrics 127:817-826.

8. Croxen MA, Finlay BB. 2010. Molecular mechanisms of Escherichia coli pathogenicity. Nat Rev Micro 8:26-38.

9. Weiser JN, Gotschlich EC. 1991. Outer membrane protein A (OmpA) contributes to serum resistance and pathogenicity of Escherichia coli K-1. Infect Immun 59:2252-2258.

10. Wang Y, Kim KS. 2002. Role of OmpA and IbeB in Escherichia coli K1 invasion of brain microvascular endothelial cells in vitro and in vivo. Ped Res 51:559-563.

11. Prasadarao NV. 2002. Identification of Escherichia coli outer membrane protein A receptor on human brain microvascular endothelial cells. Infect Immun 70:4556-4563.

12. Khan NA, Kim Y, Shin S, Kim KS. 2007. FimH-mediated Escherichia coli K1 invasion of human brain microvascular endothelial cells. Cell Microbiol 9:169- 178.

179

13. Kim KJ, Chung JW, Kim KS. 2005. 67-kDa laminin receptor promotes internalization of cytotoxic necrotizing factor 1-expressing Escherichia coli K1 into human brain microvascular endothelial cells. J Biol Chem 280:1360–1368.

14. Kim KS, Itabashi H, Gemski P, Sadoff J, Warren RL, Cross AS. 1992. The K1 capsule is the critical determinant in the development of Escherichia coli meningitis in the rat. J Clin Invest 90:897-905.

15. Hoffman JA, Wass C, Stins MF, Kim KS. 1999. The capsule supports survival but not traversal of Escherichia coli K1 across the blood-brain barrier. Infect Immun 67:3566-3570.

16. Wijetunge DSS, Gongati S, DebRoy C, Kim KS, Couraud PO, Romero IA, Weksler B, Kariyawasam S. 2015. Characterizing the pathotype of neonatal meningitis causing Escherichia coli (NMEC). BMC Microbiol 15:211.

17. Logue CM, Doetkott C, Mangiamele P, Wannemuehler YM, Johnson TJ, Tivendale KA, Li G, Sherwood JS, Nolan LK. 2012. Genotypic and phenotypic traits that distinguish neonatal meningitis-associated Escherichia coli from fecal E. coli isolates of healthy human hosts. Appl Environ Microb 78:5824-5830.

18. Lu S, Zhang X, Zhu Y, Kim KS, Yang J, Jin Q. 2011. Complete genome sequence of the neonatal-meningitis-associated Escherichia coli strain CE10. J Bacteriol 193.

19. Moriel DG, Bertoldi I, Spagnuolo A, Marchi S, Rosini R, Nesta B, Pastorello I, Corea VAM, Torricelli G, Cartocci E, Savino S, Scarselli M, Dobrindt U, Hacker J, Tettelin H, Tallon LJ, Sullivan S, Wieler LH, Ewers C, Pickard D, Dougan G, Fontana MR, Rappuoli R, Pizza M, Serino L. 2010. Identification of protective and broadly conserved vaccine antigens from the genome of extraintestinal pathogenic Escherichia coli. P Nat Acad Sci USA 107:9072-9077.

20. Nicholson BA, Wannemuehler YM, Logue CM, Li G, Nolan LK. 2016. Complete genome sequence of the neonatal meningitis-causing Escherichia coli strain NMEC O18. Genome Announc 4.

21. Wijetunge DSS, Katani R, Kapur V, Kariyawasam S. 2015. Complete genome sequence of Escherichia coli strain RS218 (O18:H7:K1), associated with neonatal meningitis. Genome Announc 3:e00804-15.

22. Peigne C, Bidet P, Mahjoub-Messai F, Plainvert C, Barbe V, Médigue C. 2009. The plasmid of Escherichia coli strain S88 (O45: K1: H7) that causes neonatal meningitis is closely related to avian pathogenic E. coli plasmids and is associated with high-level bacteremia in a neonatal rat meningitis model. Infect Immun 77:2272-2284.

180

23. Xu A, Johnson JR, Sheen S, Sommers C. 2018. Draft genome sequences of five neonatal meningitis-causing Escherichia coli isolates (SP-4, SP-5, SP-13, SP-46, and SP-65). Genome Announc 6:e00091-18.

24. Homeier T, Semmler T, Wieler LH, Ewers C. 2010. The GimA locus of extraintestinal pathogenic E. coli: does reductive evolution correlate with habitat and pathotype? PLoS One 5:e10877.

25. Wijetunge DS, Karunathilake KH, Chaudhari A, Katani R, Dudley EG, Kapur V. 2014. Complete nucleotide sequence of pRS218, a large virulence plasmid, that augments pathogenic potential of meningitis-associated Escherichia coli strain RS218. BMC Microbiol 14:203.

26. Johnson TJ, Nolan LK. 2009. Pathogenomics of the virulence plasmids of Escherichia coli. Microbiol Mol Biol Rev 73:750-774.

27. Johnson TJ, Wannemuehler Y, Johnson SJ, Stell AL, Doetkott C, Johnson JR, Kim KS, Spanjaard L, Nolan LK. 2008. Comparison of extraintestinal pathogenic Escherichia coli strains from human and avian sources reveals a mixed subset representing potential zoonotic pathogens. Appl Environ Microb 74:7043-7050.

28. Nicholson BA, West AC, Mangiamele P, Barbieri NL, Wannemuehler Y, Nolan LK, Logue CM, Li G. 2016. Genetic characterization of ExPEC-like virulence plasmids among a subset of NMEC. PLoS ONE 11:e0147757.

29. Logue CM, Wannemuehler Y, Nicholson BA, Doetkott C, Barbieri NL, Nolan LK. 2017. Comparative analysis of phylogenetic assignment of human and avian ExPEC and fecal commensal Escherichia coli using the (previous and revised) Clermont phylogenetic typing methods and its impact on avian pathogenic Escherichia coli (APEC) classification. Front Microbiol 8:283.

30. Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.

31. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455- 477.

32. Gurevich A, Saveliev V, Vyahhi N, Tesler G. 2013. QUAST: quality assessment tool for genome assemblies. Bioinformatics 29:1072-1075.

33. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 27:722-736.

181

34. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9:e112963.

35. Nielsen DW, Ricker N, Barbieri NL, Wynn JL, Gómez-Duarte OG, Iqbal J, Nolan LK, Allen HK, Logue CM. 2018. Complete genome sequence of the multidrug- resistant neonatal meningitis Escherichia coli serotype O75:H5:K1 strain mcjchv- 1 (NMEC-O75). Microbiol Resour Announc 7:e01043-18.

36. Nielsen DW, Mangiamele P, Ricker N, Barbieri NL, Allen HK, Nolan LK, Logue CM. 2018. Complete genome sequence of avian pathogenic Escherichia coli strain APEC O2-211. Microbiol Resour Announc 7:e01046-18.

37. Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MT, Fookes M, Falush D, Keane JA, Parkhill J. 2015. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31:3691-3693.

38. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068-2069.

39. Price MN, Dehal PS, Arkin AP. 2009. FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 26:1641- 1650.

40. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2--approximately maximum- likelihood trees for large alignments. PLoS ONE 5:e9490.

41. Letunic I, Bork P. 2007. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23:127-128.

42. Letunic I, Bork P. 2016. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res 44:W242-W245.

43. Hadley W. 2016. Ggplot2. Springer Science+Business Media, LLC, New York, NY.

44. Wickham H, Grolemund G. 2016. R for data science : import, tidy, transform, visualize, and model data, First edition. ed. O'Reilly Media, Inc., Sebastopol, CA.

45. Carattoli A, Zankari E, Garcia-Fernandez A, Voldby Larsen M, Lund O, Villa L, Moller Aarestrup F, Hasman H. 2014. In silico detection and typing of plasmids using PlasmidFinder and plasmid multilocus sequence typing. Antimicrob Agents Chemother 58:3895-3903.

46. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C, Thierer T, Ashton B, Meintjes P, Drummond

182

A. 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647-1649.

47. Alikhan NF, Petty NK, Ben Zakour NL, Beatson SA. 2011. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genom 12:1.

48. Johnson JR. 2002. Extraintestinal pathogenic Escherichia coli: the other bad E. coli. J Lab Clin Med 139.

49. Che X, Chi F, Wang L, Jong TD, Wu CH, Wang X, Huang SH. 2011. Involvement of IbeA in meningitic Escherichia coli K1-induced polymorphonuclear leukocyte transmigration across brain endothelial cells. Brain Pathol 21:389-404.

50. Chi F, Jong TD, Wang L, Ouyang Y, Wu C, Li W. 2010. Vimentin-mediated signalling is required for IbeA+ E. coli K1 invasion of human brain microvascular endothelial cells. Biochem J 427:79-90.

51. Huang SH, Chen YH, Kong G, Chen SH, Besemer J, Borodovsky M, Jong A. 2001. A novel genetic island of meningitic Escherichia coli K1 containing the ibeA invasion gene (GimA): functional annotation and carbon-source-regulated invasion of human brain microvascular endothelial cells. Funct Integr Genom 1:312-322.

52. Barbieri NL, de Oliveira AL, Tejkowski TM, Pavanelo DB, Rocha DA, Matter LB, Callegari-Jacques SM, de Brito BG, Horn F. 2013. Genotypes and pathogenicity of cellulitis isolates reveal traits that modulate APEC virulence. PLoS One 8:e72322.

53. Rodriguez-Siek KE, Giddings CW, Doetkott C, Johnson TJ, Nolan LK. 2005. Characterizing the APEC pathotype. Vet Res 36:241-256.

54. Fagan RP, Smith SG. 2007. The Hek outer membrane protein of Escherichia coli is an auto-aggregating adhesin and invasin. FEMS Microbiol Lett 269:248-255.

55. Flores-Mireles AL, Walker JN, Caparon M, Hultgren SJ. 2015. Urinary tract infections: epidemiology, mechanisms of infection and treatment options. Nat Rev Micro 13:269-284.

56. Bachmann BJ. 1996. Derivations and genotypes of some mutant derivatives of Escherichia coli K-12. Escherichia coli and Salmonella: cellular and molecular biology, 2nd ed ASM Press, Washington, DC:2460-2488.

57. Yanez ME, Korotkov KV, Abendroth J, Hol WG. 2008. Structure of the minor pseudopilin EpsH from the type 2 secretion system of Vibrio cholerae. J Mol Biol 377:91-103.

183

58. Douzi B, Durand E, Bernard C, Alphonse S, Cambillau C, Filloux A, Tegoni M, Voulhoux R. 2009. The XcpV/GspI pseudopilin has a central role in the assembly of a quaternary complex within the T2SS pseudopilus. J Biol Chem 284:34580- 34589.

59. Sandkvist M. 2001. Type II secretion and pathogenesis. Infect Immun 69:3523- 3535.

60. Pellicer MT, Badia J, Aguilar J, Baldoma L. 1996. glc locus of Escherichia coli: characterization of genes encoding the subunits of glycolate oxidase and the glc regulator protein. J Bacteriol 178:2051-2059.

61. McHugh JP, Rodriguez-Quinones F, Abdul-Tehrani H, Svistunenko DA, Poole RK, Cooper CE, Andrews SC. 2003. Global iron-dependent gene regulation in Escherichia coli. A new mechanism for iron homeostasis. J Biol Chem 278:29478-29486.

62. Lindahl L, Zengel JM. 1986. Ribosomal genes in Escherichia coli. Annu Rev Genet 20:297-326.

63. Francetic O, Badaut C, Rimsky S, Pugsley AP. 2000. The ChiA (YheB) protein of Escherichia coli K-12 is an endochitinase whose gene is negatively controlled by the nucleoid-structuring protein H-NS. Mol Microbiol 35:1506-1517.

64. Skyberg JA, Johnson TJ, Nolan LK. 2008. Mutational and transcriptional analyses of an avian pathogenic Escherichia coli ColV plasmid. BMC Microbiol 8:24.

65. Johnson TJ, Siek KE, Johnson SJ, Nolan LK. 2006. DNA sequence of a ColV plasmid and prevalence of selected plasmid-encoded virulence genes among avian Escherichia coli strains. J Bacteriol 188:745-758.

66. Montenegro MA, Bitter-Suermann D, Timmis JK, Aguero ME, Cabello FC, Sanyal SC, Timmis KN. 1985. traT gene sequences, serum resistance and pathogenicity-related factors in clinical isolates of Escherichia coli and other gram-negative bacteria. J Gen Microbiol 131:1511-1521.

67. Binns MM, Davies DL, Hardy KG. 1979. Cloned fragments of the plasmid ColV,I-K94 specifying virulence and serum resistance. Nature 279:778-781.

68. Beatson SA, Ben Zakour NL, Totsika M, Forde BM, Watts RE, Mabbett AN, Szubert JM, Sarkar S, Phan MD, Peters KM, Petty NK, Alikhan NF, Sullivan MJ, Gawthorne JA, Stanton-Cook M, Nhu NT, Chong TM, Yin WF, Chan KG, Hancock V, Ussery DW, Ulett GC, Schembri MA. 2015. Molecular analysis of asymptomatic bacteriuria Escherichia coli strain VR50 reveals adaptation to the urinary tract by gene acquisition. Infect Immun 83:1749-1764.

184

69. Cascales E, Buchanan SK, Duche D, Kleanthous C, Lloubes R, Postle K, Riley M, Slatin S, Cavard D. 2007. Colicin biology. Microbiol Mol Biol Rev 71:158- 229.

70. Pilsl H, Braun V. 1995. Novel colicin 10: assignment of four domains to TonB- and TolC-dependent uptake via the Tsx receptor and to pore formation. Mol Microbiol 16:57-67.

71. Beck E, Ludwig G, Auerswald EA, Reiss B, Schaller H. 1982. Nucleotide sequence and exact localization of the neomycin phosphotransferase gene from transposon Tn5. Gene 19:327-336.

72. Dallas WS, Gowen JE, Ray PH, Cox MJ, Dev IK. 1992. Cloning, sequencing, and enhanced expression of the dihydropteroate synthase gene of Escherichia coli MC4100. J Bacteriol 174:5961-5970.

73. Silver RP, Aaronson W, Vann WF. 1988. The K1 capsular polysaccharide of Escherichia coli. Rev Infect Dis 10 S282-S286.

74. Robbins JB, McCracken GHJ, Gotschlich EC, Ørskov F, Ørskov I, Hanson LA. 1974. Escherichia coli K1 capsular polysaccharide associated with neonatal meningitis. New Engl J Med 290:1216-1220.

75. Korotkov KV, Hol WG. 2008. Structure of the GspK-GspI-GspJ complex from the enterotoxigenic Escherichia coli type 2 secretion system. Nat Struct Mol Biol 15:462-468.

76. Orskov I, Orskov F, Jann B, Jann K. 1977. Serology, chemistry, and genetics of O and K antigens of Escherichia coli. Bacteriol Rev 41:667.

77. Kauffmann F. 1947. The serology of the coli group. J Immunol 57:71-100.

78. DebRoy C, Fratamico PM, Roberts E. 2018. Molecular serogrouping of Escherichia coli. Anim Health Res Rev doi:10.1017/S1466252317000093:1-16.

79. Clermont O, Gordon D, Denamur E. 2015. Guide to the various phylogenetic classification schemes for Escherichia coli and the correspondence among schemes. Microbiol 161:980-988.

80. Clermont O, Christenson JK, Denamur E, Gordon DM. 2013. The Clermont Escherichia coli phylo-typing method revisited: improvement of specificity and detection of new phylo-groups. Env Microbiol Rep 5:58-65.

81. Johnson JR, Johnston B, Clabots C, Kuskowski MA, Castanheira M. 2010. Escherichia coli sequence type ST131 as the major cause of serious multidrug- resistant E. coli infections in the United States. Clin Infect Dis 51:286-294.

185

82. Johnson JR, Johnston B, Clabots C, Kuskowski MA, Pendyala S, Debroy C, Nowicki B, Rice J. 2010. Escherichia coli sequence type ST131 as an emerging fluoroquinolone-resistant uropathogen among renal transplant recipients. Antimicrob Agents Chemother 54:546-550.

83. Kaper JB. 2004. Pathogenic Escherichia coli. Nat Rev Microbiol 2:123-140.

84. Satchell KJ. 2009. Actin crosslinking toxins of Gram-negative bacteria. Toxins (Basel) 1:123-133.

85. Unoson C, Wagner EG. 2008. A small SOS-induced toxin is targeted against the inner membrane in Escherichia coli. Mol Microbiol 70:258-270.

86. Dorr T, Vulic M, Lewis K. 2010. Ciprofloxacin causes persister formation by inducing the TisB toxin in Escherichia coli. PLoS Biol 8:e1000317.

87. Melo AR, Lasunskaia EB, de Almeida CM, Schriefer A, Kipnis TL, Dias da Silva W. 2005. Expression of the virulence factor, BfpA, by enteropathogenic Escherichia coli is essential for apoptosis signalling but not for NF-kappaB activation in host cells. Scand J Immunol 61:511-519.

88. Severi E, Hood DW, Thomas GH. 2007. Sialic acid utilization by bacterial pathogens. Microbiol 153:2817-2822.

89. Kanamaru S, Kurazono H, Ishitoya S, Terai A, Habuchi T, Nakano M, Ogawa O, Yamamoto S. 2003. Distribution and genetic association of putative uropathogenic virulence factors iroN, iha, kpsMT, ompT and usp in Escherichia coli isolated from urinary tract infections in Japan. J Urol 170:2490-2493.

90. Tivendale KA, Logue CM, Kariyawasam S, Jordan D, Hussein A, Li G, Wannemuehler Y, Nolan LK. 2010. Avian-pathogenic Escherichia coli strains are similar to neonatal meningitis E. coli strains and are able to cause meningitis in the rat model of human disease. Infection and Immunity 78:3412-3419.

91. Miquel S, Peyretaillade E, Claret L, de Vallee A, Dossat C, Vacherie B, Zineb el H, Segurens B, Barbe V, Sauvanet P, Neut C, Colombel JF, Medigue C, Mojica FJ, Peyret P, Bonnet R, Darfeuille-Michaud A. 2010. Complete genome sequence of Crohn's disease-associated adherent-invasive E. coli strain LF82. PLoS ONE 5.

92. Carattoli A, Bertini A, Villa L, Falbo V, Hopkins KL, Threlfall EJ. 2005. Identification of plasmids by PCR-based replicon typing. J Microbiol Methods 63:219-228.

93. Johnson TJ, Wannemuehler Y, Johnson SJ, Stell AL, Doetkott C, Johnson JR, Kim KS, Spanjaard L, Nolan LK. 2008. Comparison of extraintestinal pathogenic Escherichia coli from human and avian sources reveals a mixed subset representing potential zoonotic pathogens, Appl Environ Microb.

186

94. Johnson TJ, Wannemuehler YM, Johnson SJ, Logue CM, White DG, Doetkott C, Nolan LK. 2007. Plasmid replicon typing of commensal and pathogenic Escherichia coli isolates. Appl Environ Microb 73:1976-1983.

95. Zhu Ge X, Jiang J, Pan Z, Hu L, Wang S, Wang H, Leung FC, Dai J, Fan H. 2014. Comparative genomic analysis shows that avian pathogenic Escherichia coli isolate IMT5155 (O2:K1:H5; ST complex 95, ST140) shares close relationship with ST95 APEC O1:K1 and human ExPEC O18:K1 strains. PLoS ONE 9:e112048.

96. Harvey D, Holt DE, Bedford H. 1999. Bacterial meningitis in the newborn: a prospective study of mortality and morbidity. Semin Perinatol 23:218-225.

Supplementary Figures

Supplementary Figure 6.1: Core & Pan Genome Pie-Chart

187

Supplementary Figure 6.2: NMEC plasmids carrying the IncFIB replicon relative to IncFIB/IncFII plasmid pS88.

188

Supplementary Figure 6.3: NMEC plasmid carrying the IncFII replicon compared to IncFIB/IncFII plasmid pS88

189

Supplementary Tables

Supplementary Table 6.1: Assembly statistics for sequenced E. coli strains & E. coli used in this study

ID Contigs Large Largest Total length GC (%) N50 Reference: Contigs contig NM02 366 313 167,007 5,221,299 50.65 38,937 NM03 128 85 452,586 5,294,023 50.68 223,928 NM06 471 346 59,743 4,970,648 50.93 23,613

NM07 414 340 138,980 5,214,001 50.65 29,342 NM08 87 55 692,599 4,930,509 50.55 266,757 NM11 405 335 116,448 5,211,846 50.61 28,142 NM14 192 155 283,036 5,069,131 50.68 92,889 NM15 207 140 325,626 5,262,260 50.57 89,623 NM16 120 76 447,594 4,930,996 50.56 158,443 NM18 85 57 703,944 5,103,139 50.58 340,894 NM19 147 85 335,651 4,910,728 50.61 119,227 NM20 156 90 255,251 5,035,093 50.63 121,749 NM22 159 107 262,317 5,221,462 50.71 118,099 NM23 409 272 200,647 5,261,475 50.52 41,145 NM26 202 112 266,868 5,056,013 50.55 99,233

NM27 328 214 174,578 5,047,144 50.69 66,014 NM28 141 100 524,611 5,196,498 50.71 136,288 NM29 122 74 591,042 5,223,463 50.67 204,007 NM30 197 117 402,789 5,244,771 50.66 148,034 NM34 476 387 107,868 5,160,639 50.60 26,276 NM35 377 235 231,126 5,143,349 50.61 52,674 NM36 218 146 250,698 5,178,756 50.52 88,169 NM37 151 91 716,672 5,198,133 50.58 245,330 NM38 300 214 321,123 5,254,316 50.58 49,848 NM40 129 78 349,493 5,088,706 50.40 139,412 NM42 206 117 342,737 5,133,181 50.35 110,220 NM43 96 68 703,696 5,194,812 50.57 235,940

NM45 150 115 308,853 4,997,635 50.64 101,299 NM49 87 60 573,840 4,960,651 50.50 178,842 NM52 105 55 523,265 5,041,827 50.62 190,976 NM55 120 94 364,763 5,106,997 50.71 155,215 NM57 85 58 692,979 4,936,656 50.54 201,597 NM59 106 73 860,775 4,999,874 50.54 213,069 NM60 79 55 520,045 5,277,991 50.60 200,060 NM65 85 53 692,823 4,947,064 50.58 211,011 NM67 80 53 692,864 4,934,247 50.56 231,520 NM70 116 78 507,247 5,156,348 50.56 334,505 NM71 78 53 726,101 4,883,899 50.59 233,192 NM72 111 70 438,761 5,034,529 50.52 233,612

NM73 127 92 376,668 4,758,223 50.81 112,160 NM74 371 227 215,018 5,446,521 50.53 70,198 NM81 189 101 367,747 5,140,391 50.33 193,524 NM82 145 82 423,515 5,186,924 50.54 237,199 NM83 362 258 113,581 5,267,783 50.61 41,717 NM84 60 37 585,795 4,844,018 50.52 298,373 NM86 656 497 82,550 4,972,629 51.19 15,991 NM89 134 87 487,548 5,159,155 50.65 267,731 NM91 197 140 285,455 5,248,050 50.50 112,115

190

Supplementary Table 6.1. (continued)

NM92* 112 76 510,962 4,998,284 50.57 196,353 * CE10 5 ------(1) IHE3034 1 ------(2) NMEC- 1 ------(3) O18 RS218 2 ------(4, 5) S88 2 ------(6) SP-4 50 ------(7) SP-5 204 ------(7) SP-13 108 ------(7) SP-46 193 ------(7) *NM92 is an assembly of the now completed genome Escherichia coli MCJCHV-1 (NMEC-O75) (8).

Supplementary Table 6.2: Small plasmids with Col replicons are present in NMEC genomes, and numerous strains have multiple plasmids.

Strain Plasmid Size %GC Closest Matching Replicon Features (bp) NM02 pNM02-1 3257 47.2 ColRNAI_1 rop, yddM

NM08 pNM08-1 6648 48.5 ColRNAI_1 mbeC, rop, cnl, 2 cea, mbeA, 4 CDS

NM11 pNM11-1 4237 40.8 Col(MG828)_1 5 CDS

pNM11-2 1552 51.8 Col(MG828)_1 2 CDS

NM16 pNM16-1 4593 50.5 ColRNAI_1 rop, mbeC, mxibeA, CDS

pNM16-2 1552 51.8 Col(MG828)_1 2 CDS

NM19 pNM19-1 6211 52.5 ColRNAI_1 6 CDS, neo, folP

pNM19-2 4715 41 ColRNAI_1 2 CDS

NM22 pNM22-1 4315 43.7 ColRNAI_1 2 CDS, yhdJ

NM23 pNM23-1 6888 48.6 ColRNAI_1 rop, mbeC, mbeA, cea, 3 CDS

pNM23-2 5163 47.5 Col156_1 mobA, yibT, 3 CDS

pNM23-3 1976 56.3 ColpVC_1 2 CDS

pNM23-4 1549 50.9 Col(MG828)_1 CDS

NM27 pNM27-1 6753 51.2 Col156_1 mobA, imm, cnl, 4 CDS

pNM27-2 5631 47.4 ColRNAI_1 mbeA, mbeC, rop, 2 CDS

pNM27-3 6647 48.6 ColRNAI_1 mbeA, mbeC, rop, cnl, cea, 3 CDS

pNM27-4 4087 49.9 Col8282_1 repE, 4 CDS

pNM27-5 2089 47.1 Col(BS512)_1 2 CDS

pNM27-6 1552 51.9 Col(MG828)_1 1 CDS

191

Supplementary Table 6.2. (continued)

NM28 pNM28-1 4593 50.5 ColRNAI_1 mbeC, rop, 3 CDS

pNM28-2 1552 51.9 Col(MG828)_1 2 CDS

NM34 pNM34-2 5181 47.1 Col156_1 yibT, mobA, 3 CDS

NM35 pNM35-1 6780 46.9 ColRNAI_1 rop, mbeC, mbeA, cea, 3 CDS

pNM35-2 2089 47.2 Col(BS512)_1__NC_010656_dupe 2 CDS

pNM35-3 1593 56.6 ColpVC_1 1 CDS

NM36 pNM36-1 3726 51.2 Col156_1 mobA, 2 CDS

NM45 pNM45-1 6223 52.4 ColRNAI_1 neo, folP, 4 CDS

NM49 pNM49-1 6222 52.5 ColRNAI_1 neo, folP, 5 CDS

NM55 pNM55-2 4067 50.1 Col8282_1 mobA, CDS

NM57 pNM57-1 6200 52.6 ColRNAI_1 neo, folP, 5 CDS

pNM57-2 2443 46.4 ColRNAI_1 2 CDS

NM67 pNM67-1 6211 52.5 ColRNAI_1 neo, folP, 6 CDS

NM70 pNM70-1 5716 42.4 Col156_1 cmo, mobA, 4 CDS

pNM70-2 5631 47.4 ColRNAI_1 rop, mbeA, mbeC, CDS

pNM70-3 5295 47.6 Col156_1 yibT, mobA, 2 CDS

pNM70-4 2934 43.6 Col(MG828)_1 2 CDS

pNM70-5 1909 48.9 Col(MG828)_1 1 CDS

NM74 pNM74-1 6647 48.6 ColRNAI_1 mbeC, rop, cnl, cea, mbeA, 3 CDS

NM82 pNM82-1 6647 48.6 ColRNAI_1 mbeA, mbeC, cnl, cea, 3 CDS

pNM82-2 5150 47 Col156_1 mbeA, mbeC, rop, 3 CDS

NM83 pNM83-2 5631 47.4 ColRNAI_1 mbeA, mbeC, rop, 2 CDS

pNM83-5 4066 50.1 Col8282_1 repE, 4 CDS

pNM83-6 2073 48 Col(BS512)_1 2 CDS

NM86 pNM86-1 6668 48.6 ColRNAI_1 mbeA, mbeC, cnl, cea, 3 CDS

pNM86-2 5689 42.1 Col156_1 cmo, mobA, 4 CDS

pNM86-3 5170 47.1 Col156_1 yibT, mobA, 4 CDS

pNM86-4 4074 50 Col8282_1 2 CDS

pNM86-5 1822 44.5 Col(MG828)_1 mobA, 1 CDS

NM91 pNM91-1 5163 47.5 Col156_1 yibT, mobA, 4 CDS

pNM91-2 4197 42.4 ColRNAI_1 rop, dcm, 3 CDS

pNM91-3 1549 51.1 Col(MG828)_1 2 CDS

NM92 pNM92-1 6465 42.6 Col(MGD2)_1 rop, toxN, 4 CDS

pNM92-2 4082 49.4 Col8282_1 repE, 2 CDS

pNM92-3 2113 47.1 Col(BS512)_1 2 CDS

pNM92-4 1983 56.1 ColpVC_1 2 CDS

192

Supplementary Table 6.2. (continued)

CE10 pCE10B 5163 47.5 Col156_1 mobA, yibT, 4 CDS

pCE10C 4197 42.4 ColRNAI_1 rop, dcm, 4 CDS

pCE10D 1549 51.1 Col(MG828)_1 1 CDS

CE10 pCE10B 5163 47.5 Col156_1 mobA, yibT, 4 CDS

SP5 PNYF02000089 6817 48.5 ColRNAI_1 mbeA, mbeC, rop, cea, 3 CDS

PNYF02000098 5048 47.3 Col156_1 mobA, yibT, 3 CDS

PNYF02000104 4151 49.6 Col8282_1 repE, mobA, CDS

PNYF02000105 4126 42.1 ColRNAI_1 rop, dcm, 3 CDS

PNYF02000123 1998 48.3 Col(MG828)_1 1 CDS

PNYF02000133 1547 50.4 Col(MG828)_1 1 CDS

SP13 PNYD02000060 2707 43 Col(MG828)_1 2 CDS

SP46 POSV02000084 7507 48.3 Col156_1 ygaV, norW, mobA, 3 CDS

POSV02000085 7065 47.3 ColRNAI_1 mbeA, mbeC, cea, 3 CDS

POSV02000090 5737 47.2 ColRNAI_1 mbeA, mbeC, 4 CDS

POSV02000129 1626 50.9 Col(MG828)_1 1 CDS

193

CHAPTER 7. CONCLUSION

The Extraintestinal Pathogenic E. coli (ExPEC) pathotype has demonstrated significant phenotypic and genomic diversity as evidenced by biofilm formation and data from sequencing analyses. ExPEC must clearly utilize virulence factors to cause disease, but no known conserved virulence factor has been identified in this study as being singularly ExPEC-specific. Indeed, even conserved virulence factors, such as OmpA, have significant differences within their protein sequences among the various subpathotypes and even within the same subpathotype. Diversity appears to be critical to the ExPEC pathotype as shown by the myriad of virulence factors and plasmid contents of NMEC. To date, the majority of NMEC studies have focused on prototypical strain

RS218, but this study has demonstrated that the NMEC subpathotype extends beyond a single phylogenetic group, sequence type, or serogroup. With the generation of sequencing data from a variety of different NMEC, numerous isolates may be stronger candidates for in vivo studies, which could elucidate different mechanisms by which

ExPEC cause disease. NMEC-O75 is a potential rival for in vivo work to RS218, as

NMEC-O75 was recently isolated from a severe case of NBM in the United States. Yet, other NMEC isolates should also be examined as the phylogenetic groups appear to be associated with phenotypic traits and, conceivably, differences in the virulence of NMEC.

In conclusion, the ExPEC subpathotypes demonstrated statistically significant differences when their biofilm formation and amino acid polymorphisms of OmpA were compared. Nevertheless, the phylogenetic groups also had statistically significant differences for the same analyses. Taken together, these findings suggest that conclusions should be drawn for each subpathotype after the phylogenetic groups are considered.

194

However, the ExPEC subpathotypes are still a clinical concern. NMEC sequencing data revealed great diversity within the subpathotype, and future studies characterizing NMEC should focus on this diversity, instead of a single prototypical isolate. There is certainly opportunity to elucidate the mechanisms of pathogenicity by further study of the putative virulence factors identified within the NMEC genomes. Specifically, the T2SS of NMEC remains a promising target. Likewise, investigation of the ExPEC plasmids, which have been shown to contribute to virulence, is a high priority since the presence of numerous unidentifiable, hypothetical genes remains concerning.