
The Pennsylvania State University

The Graduate School

Department of Veterinary and Biomedical Sciences Pathobiology Program

PATHOGENOMICS AND SOURCE DYNAMICS OF SALMONELLA ENTERICA

SEROVAR ENTERITIDIS

A Dissertation in

Pathobiology

by

Matthew Raymond Moreau

© 2015 Matthew R. Moreau

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

May 2015

The Dissertation of Matthew R. Moreau was reviewed and approved* by the following:

Subhashinie Kariyawasam
Associate Professor, Veterinary and Biomedical Sciences
Dissertation Adviser
Co-Chair of Committee

Bhushan M. Jayarao
Professor, Veterinary and Biomedical Sciences
Dissertation Adviser
Co-Chair of Committee

Mary J. Kennett
Professor, Veterinary and Biomedical Sciences

Vijay Kumar
Assistant Professor, Department of Nutritional Sciences

Anthony Schmitt
Associate Professor, Veterinary and Biomedical Sciences
Head of the Pathobiology Graduate Program

*Signatures are on file in the Graduate School


ABSTRACT

Salmonella enterica serovar Enteritidis (SE) is one of the most common causes of morbidity and mortality in humans due to consumption of contaminated eggs and egg products. The association between egg contamination and foodborne outbreaks of SE suggests that egg-derived SE might be more adept at causing human illness than SE from other sources.

Therefore, there is a need to understand the molecular mechanisms underlying the ability of egg-derived SE to colonize the chicken intestinal and reproductive tracts and cause disease in the human host. To this end, the present study pursued three objectives.

The first objective was to sequence two egg-derived SE isolates belonging to the PFGE type JEGX01.0004 to identify the genes that might be involved in SE colonization and/or pathogenesis. The two genomes were almost identical (99% identity); each was approximately 4.67 Mb in size with a GC content of 52%. Both genomes contained about 4,600 open reading frames, of which 600 (or 12.5% of the genome) were related to virulence. When these virulence-associated genes of egg-isolated SE were compared with those of human-isolated SE, nine genes contained single nucleotide polymorphisms (SNPs), providing evidence of host-adapted microevolution of SE.

Among these SNPs, two resulted in non-conservative changes in a fimbrial usher and a lipopolysaccharide biosynthesis gene, whereas four SNPs were located in regions.

The second objective of the study was to identify the host-specific genes (poultry vs. human) by pan-genomic analysis of the newly sequenced genomes and the genomes of Salmonella serovars published in the NCBI database. Approximately 2,800 clusters of orthologous genes (COGs) were conserved among all serovars tested, of which 247 were associated with Salmonella virulence. These core virulence genes may be associated with colonization and/or subsequent infection in all host species, whereas the ‘host-specific’ genes most likely determine the host-specific mechanisms. This analysis identified 10 poultry-specific potential virulence genes, including the genes of two fimbrial operons, lpf and sti, and four genes with hypothetical functions. Twelve other genes (e.g., iroD, ttrB/C, and sthA) were present only in Salmonella serovars that can infect the human host.
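The distinction drawn above between core COGs (shared by all serovars) and host-specific COGs (found only in serovars that infect a given host) can be illustrated with a minimal sketch. This is not the pipeline used in this study; the genome labels, COG names, and host groupings below are invented toy data, and the rule used for "host-specific" (present in at least one genome of the group and in no genome outside it) is an assumption made for illustration.

```python
from typing import Dict, Iterable, Set


def core_cogs(presence: Dict[str, Set[str]]) -> Set[str]:
    """Return the COGs found in every genome of the presence/absence map."""
    genomes = list(presence.values())
    if not genomes:
        return set()
    return set.intersection(*genomes)


def group_only_cogs(presence: Dict[str, Set[str]], group: Iterable[str]) -> Set[str]:
    """Return COGs seen in at least one genome of `group` and in no genome outside it.

    This "present only in the group" rule is an illustrative assumption.
    """
    members = set(group)
    inside = set().union(*(presence[g] for g in members))
    outside = set().union(*(cogs for g, cogs in presence.items() if g not in members))
    return inside - outside


# Hypothetical toy data: genome_1 stands in for a broad host-range serovar,
# genome_2 for a poultry-restricted one, genome_3 for a human-restricted one.
toy_presence = {
    "genome_1": {"cog_core", "cog_poultry_like", "cog_human_like"},
    "genome_2": {"cog_core", "cog_poultry_like"},
    "genome_3": {"cog_core", "cog_human_like"},
}
poultry_infecting = {"genome_1", "genome_2"}
human_infecting = {"genome_1", "genome_3"}

if __name__ == "__main__":
    print("core:", sorted(core_cogs(toy_presence)))
    print("poultry only:", sorted(group_only_cogs(toy_presence, poultry_infecting)))
    print("human only:", sorted(group_only_cogs(toy_presence, human_infecting)))
```

A stricter rule, requiring a COG to be present in every genome of the group, would only change the `inside` set from a union to an intersection.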

The third objective of the study was to elucidate whether source attributes affect SE virulence. Here, the colonization ability and virulence potential of SE from three sources (SE grown in Luria broth (LB) medium in the laboratory, SE grown in egg yolk in the laboratory, and SE recovered from feces of mice experimentally infected with SE) were compared using a mouse colitis model of SE infection. The results demonstrated that SE grown in egg yolk possesses enhanced colonization, shedding, and virulence capabilities compared to SE grown in LB medium or SE recovered from mouse feces, as determined by clinical signs, gross pathology, histological lesion scoring, and bacterial enumeration of feces, small and large intestines, and internal organs of the infected mice. These data suggest that egg yolk may condition SE to be better ‘primed’ for transmission to and infection of a second host by upregulating the expression of certain genes. Future studies should be directed towards understanding the mechanisms involved in microevolution of SE virulence in egg yolk using approaches such as microarray analysis and RNA sequencing.

In conclusion, this study provides new insights into the current understanding of SE virulence and identifies potential targets for rational development of vaccines and antimicrobials to minimize human foodborne illness due to SE.


TABLE OF CONTENTS

List of Figures ...... viii

List of Tables ...... ix

Abbreviations ...... xi

Acknowledgements ...... xii

Chapter 1 Overview of the Pathobiological Perspectives of Salmonella enterica Serovar Enteritidis ...... 1

1.1 Classification and General Background of Salmonella and Salmonella enterica Subspecies enterica ...... 2

1.2 Salmonella Enteritidis Colonization, Infection and Contamination of Poultry and Shell Eggs ...... 4

1.3 Salmonella Enteritidis Colonization and Pathogenesis in Humans ...... 9

1.4 References ...... 26

Chapter 2 Evolution of Pathogenicity Revealed by Whole Genome Sequencing and Comparative Genomics of Two Egg Isolates of Salmonella Enteritidis ...... 33

2.1 Abstract ...... 34

2.2 Background ...... 35

2.3 Methods ...... 36

2.3.1 Bacterial Strains ...... 36

2.3.2 PFGE Profiling ...... 36

2.3.3 Genomic DNA Purification ...... 36

2.3.4 Sequencing and Assembly ...... 37

2.3.5 Construction of the Whole Genome Optical Map (OpMap) ...... 37

2.3.6 Genome Annotation ...... 37

2.3.7 Comparative Genomics and SNP Analysis ...... 38

2.4 Results ...... 38

2.4.1 General Overview of SEE1 and SEE2 Genomes ...... 38

2.4.2 Phage Regions in SEE1 and SEE2 ...... 39

2.4.3 Genes That Exhibit Frameshifts ...... 40

2.4.4 Pathogenomics-Based Virulence Gene Profiling of SEE1 and SEE2 ...... 41

2.4.5 Bioinformatic Evidence of Genomic Microevolution during Human Infection ...... 46

2.5 Discussion and Conclusions ...... 47

2.6 Author Contributions ...... 53

2.7 References ...... 98


Chapter 3 Patho-Pan-Genomics of Salmonella enterica Serovars Reveals Host-Specific Factors and Potential Vaccine Targets of Salmonella Enteritidis ...... 101

3.1 Abstract ...... 102

3.2 Background ...... 104

3.3 Materials and Methods ...... 107

3.3.1 Orthologous Gene Clusters and Pan-Genome Matrices ...... 107

3.3.2 Pan-Genome Analysis ...... 107

3.4 Results ...... 108

3.4.1 Genetic Relatedness of the Various Salmonella enterica Serovars ...... 108

3.4.2 Virulence-Associated COGs Present in Salmonella enterica Serovars that Infect Poultry ...... 109

3.4.3 Virulence-Associated COGs Present in Salmonella enterica Serovars that Infect Humans ...... 109

3.4.4 Virulence-Associated COGs in All Serovars of Salmonella enterica and Use for Rational Vaccine Design ...... 110

3.5 Conclusions and Discussion ...... 111

3.6 Author Contributions ...... 117

3.7 References ...... 133

Chapter 4 Growth in Egg Yolk Enhances Salmonella Enteritidis Colonization and Virulence in a Mouse Model of Salmonella Colitis ...... 135

4.1 Abstract ...... 136

4.2 Introduction ...... 137

4.3 Materials and Methods ...... 139

4.3.1 , Media Used and Inocula ...... 139

4.3.2 Animal Experiments ...... 140

4.3.3 Histopathology ...... 141

4.4 Results ...... 141

4.4.1 SE Grown in Yolk Displays Increased Colonization of the GIT ...... 141

4.4.2 SE From Yolk Increases Fecal Shedding ...... 142

4.4.3 SEE1 From Yolk and Mouse Show Variable Dissemination ...... 143

4.4.4 Growth in Egg Yolk Enhances SE Virulence in Vivo ...... 143

4.5 Conclusions and Discussion ...... 144

4.6 Author Contributions ...... 150

4.7 References ...... 156

Chapter 5 Summary and Significance ...... 159

5.1 Synopsis ...... 160

5.2 Evolution of Pathogenicity Revealed by Whole Genome Sequencing and Comparative Genomics of Two Egg Isolates of Salmonella Enteritidis ...... 160

5.2A Summary and Significance ...... 160

5.3 Patho-Pan-Genomics of Salmonella enterica Serovars Reveals Host-Specific Factors and Potential Vaccine Targets of Salmonella Enteritidis ...... 162

5.3A Summary and Significance ...... 162

5.4 Growth in Egg Yolk Enhances Salmonella Enteritidis Colonization and Virulence in a Mouse Colitis Model of Salmonella Infection ...... 164

5.4A Summary and Significance ...... 164

5.5 References ...... 166

Appendices ...... 169

Appendix A Pan-Genome Reference Tables ...... 170

Appendix B Growth Characterizations of SEE1 and SEE2 ...... 248


LIST OF FIGURES

Figure 1.1: Multiple Points of Egg Infection After Oral Intake of SE by Laying Hen ...... 20

Figure 1.2: Components and Areas of the Hen Egg...... 21

Figure 1.3: Relative Locations of SPI-1 through SPI-5 and Virulence Factor Encoding Phage Regions ...... 23

Figure 1.4: Cellular Model of SPI-1 and SPI-2 Mechanisms and Contributions ...... 24

Figure 1.5: GIT Model of Salmonella enterica Infection and Role of the Immune System .... 25

Figure 2.1: PFGE Profiles of SEE1 and SEE2 ...... 54

Figure 2.2A: The Circular Map of SEE1 Genome with its GC Content and GC Skew...... 55

Figure 2.2B: The Circular Map of SEE2 Genome with its GC Content and GC Skew...... 56

Figure 2.3: RAST-Predicted Genome Composition of SEE1 and SEE2...... 57

Figure 2.4: Relative Distribution of Virulence-Associated Genes in SEE1 and SEE2 ...... 58

Figure 2.5: SNP-Based Phylogenetic Analysis of 11 Human Isolates of SE and SEE1...... 59

Figure 2.6: BRIG Analysis of SE Genomes Used in this Study...... 60

Figure 3.1: Genome Composition Similarity of Different Serovars of Salmonella enterica ... 118

Figure 3.2: Cluster Dendrogram of Salmonella enterica Serovars ...... 119

Figure 3.3: SNP-Based Phylogenetic Tree of Serovars Compared to SEE1/2 ...... 121

Figure 4.1: Effect of Source on Colonization and Fecal Shedding of SE In Vivo...... 151

Figure 4.2: Dissemination of SE Grown in Various Sources...... 152

Figure 4.3: Necropsy Evidence of Enhanced Virulence of Egg Yolk-Grown SEE1 and SEE2...... 153

Figure 4.4: Average Total Histopathological Scores of SEE1 in the Cecum...... 154

Figure 4.5: A Hypothetical Pathway for Gene Regulation of SE during Growth in Egg Yolk...... 155

Figure B-1: Growth Curves of SEE1 and SEE2...... 248

Figure B-2: Enumeration and Viability by OD600 ...... 249


LIST OF TABLES

Table 1.1: Phage Encoded Effectors of S. enterica ...... 22

Table 2.1: Genome Statistics for SEE1 and SEE2...... 61

Table 2.2: Strain Identification and Source of Genome...... 61

Table 2.3: Genes that Exhibit Frameshifts...... 62

Table 2.4: Phage Regions and Phage-Associated Loci in SEE1 and SEE2...... 63

Table 2.5: Fimbrial Adherence-Related Genes...... 68

Table 2.6: Non-Fimbrial Adhesin Genes...... 71

Table 2.7: Salmonella Pathogenicity Island 1 (SPI-1) Genes...... 72

Table 2.8: Salmonella Pathogenicity Island 2 (SPI-2) Genes...... 74

Table 2.9: Salmonella Pathogenicity Island 3 (SPI-3) Genes...... 76

Table 2.10: Salmonella Pathogenicity Island 4 (SPI-4) Genes...... 76

Table 2.11: Salmonella Pathogenicity Island 5 (SPI-5) Genes...... 77

Table 2.12: Non-SPI Effectors, Toxins, and Secretion Systems...... 78

Table 2.13: Iron Acquisition Genes ...... 82

Table 2.14: Motility and Chemotaxis Genes ...... 84

Table 2.15: Resistance Genes ...... 86

Table 2.16: Signalling Genes ...... 89

Table 2.17: Miscellaneous Virulence-Associated Genes ...... 91

Table 2.18: Single Nucleotide Polymorphisms in Human vs. Egg Isolates of SE ...... 94

Table 3.1: Strains and Accession Numbers Used in this Study ...... 122

Table 3.2: Poultry-Infectious Serovar COGs and Genes ...... 123

Table 3.3: Human-Infectious Serovar COGs and Genes ...... 123


Table 3.4: Signaling and Motility COGs from Core With Potential Vaccine Targets ...... 124

Table 3.5: LPS COGs from Core With Potential Vaccine Targets ...... 126

Table 3.6: OMPs and Adherence COGs from Core With Potential Vaccine Targets ...... 127

Table 3.7: Transporter COGs from Core With Potential Vaccine Targets ...... 129

Table 3.8: SPI- COGs from Core With Potential Vaccine Targets ...... 130

Table 3.9: Misc. Virulence-Associated COGs from Core With Potential Vaccine Targets .... 132

Table A-1: Full Human-Infectious COG List ...... 170

Table A-2: Full Poultry-Infectious COG List ...... 172

Table A-3: Complete Salmonella enterica Shared COG List ...... 175


Abbreviations

APC Antigen Presenting Cell

ATCG Adenine, Thymine, Cytosine, and Guanine

CDC Centers for Disease Control and Prevention

CFU Colony Forming Units

DC Dendritic Cell

DNA Deoxyribonucleic Acid

GIC Gastrointestinal Cavity

GIT Gastrointestinal Tract

IACUC Institutional Animal Care and Use Committee

IFN-γ Interferon Gamma

IL- Interleukin

LB Luria-Bertani Medium

M-Cell Microfold Cell

OD Optical Density

PFGE Pulsed-Field Gel Electrophoresis

PSU Penn State University

PSU-ADL Penn State University Animal Diagnostic Lab

RNA Ribonucleic Acid

rRNA Ribosomal Ribonucleic Acid

SC Salmonella enterica Choleraesuis

SCV Salmonella-containing Vacuole

SE Salmonella enterica Enteritidis

SEE1 Salmonella enterica Enteritidis from Egg Isolate 1

SEE2 Salmonella enterica Enteritidis from Egg Isolate 2

SG Salmonella enterica Gallinarum

SH Salmonella enterica Heidelberg

SPt Salmonella enterica Paratyphi

SP Salmonella enterica Pullorum

SPI Salmonella Pathogenicity Island

STy Salmonella enterica Typhi

STym Salmonella enterica Typhimurium

tRNA Transfer Ribonucleic Acid

TNF-α Tumor Necrosis Factor Alpha

T1SS Type 1 Secretion System

T2SS Type 2 Secretion System

T3SS Type 3 Secretion System

T4SS Type 4 Secretion System

T5SS Type 5 Secretion System

US United States

WHO World Health Organization


ACKNOWLEDGEMENTS

First and foremost, I would like to convey my utmost appreciation and undying gratitude to my dissertation advisors, Dr. Subhashinie Kariyawasam and Dr. Bhushan Jayarao. Words cannot relay the gratitude I feel toward you both. You took a chance on me, and your help, guidance, and patience helped pave the way for me to follow. Your relentless encouragement of me and my success as a scientist and as a person has helped me grow more than I can put into words, and I have every bit of confidence moving forward thanks to you. Similarly, I would like to thank the rest of my dissertation committee, Dr. Mary Kennett and Dr. Vijay Kumar. Your help and insightful conversations were instrumental for this process to go as well as it did. All of you have not only helped me grow as a scientist, you have changed my life for the better and have given me a very high standard to try to live up to. I have met some wonderful people during my tenure at Penn State University. I would like to extend a big thank you to my wonderful lab mates Saumya, Eranda, Megan, Christine and Sudharsan. We have become extremely close over these past couple of years and I think of you all like family. Thank you for all of your insightful discussions, help with projects, help with undergraduates, making me laugh, and making me feel a part of something good. Thank you to my former mentor Dr. Vivek Kapur, who mentored me through my Master’s degree, and to my former lab mates Ro and Lingling, who have kept me going and cheered me on. Thank you to my extensive network of collaborators, including Maria, Michelle, and Indira, who have helped me with many of my projects, and for your excellent counsel that helped me develop new ideas. To the other friends who were former lab mates of mine or whom I have met here at PSU, and who have continued to have my back and push me through, Claire, Jen, Dale, Anne, Bryan, Colleen, Celine, Jens, and Sarah: you have all been an inspiration to me during times I was in and out of school. To all of my friends from college, including those from Bridgewater State University and Cabrini College, you have helped me immensely. As this work was co-sponsored by my teaching assistantship through the Biochemistry and Molecular Biology Department, I want to thank them. In particular, very big thanks to Dr. Heather Giebink, Dr. Ola Sodeinde, and Dr. Greg Broussard for helping facilitate my assistantship and giving me a chance to learn how to teach. To Dr. Mike Radis, who played an integral role in my time here at PSU both at the Medical Center and here at UP. To all of the undergraduate students that I have taught: you have shown me how to be a better teacher and mentor.


I would like to extend my deepest gratitude to my loving family, who have loved and supported me through this new quest for the family as a whole. I do my best for myself, but in doing so I try my hardest to make them proud of the son, brother, nephew, and grandchild they all know. To my parents and step-parents, Michael & Pam and Sandi & George. Big thanks to my loving grandmothers, Elaine and Marjorie, who keep telling me how proud they are. Thanks to all of my aunts, uncles and extended family who have supported me so far. I thank my many brothers (Jon, Jeff, Daniel, and Michael), sisters (Alicia and Becky), cousins, nieces and nephews for helping me drive myself to set a good example. This work is also in memory of the family and friends I have lost, who did not see me become the man and scientist I am today, including Raymond Moreau, Kenneth P. Smead, Rose and Charlie Grinell, Joe and Phoebe Frazier, and Bill Frazier. I have no way to express my thanks to the friends who have become family to me. I owe a great deal of gratitude to my former lab mate and dearest friend from overseas, Yury. Your help on my projects and your friendship through most of my time at PSU have been invaluable. To my very close friends Megan, Jocelyn, Chris, Mark and Paul, WE DID IT! I owe a very special thank you to Dr. Patricia Mancini, Dr. Michael Carson, and Dr. Jim Strickler. Without your guidance when I was an undergraduate and your constant pushing, none of this would have been possible. My final acknowledgment is to my loving and wonderful fiancée and soon-to-be wife (10/17/15) and to my many furry children Milo, Felicity, Oliver, and KitKat. I really couldn’t have done this without you. You have filled my life with joy, and with the completion of this next step in our future together I feel as if I am almost whole. I look forward to the wonderful science we will be a part of in the future. To everyone in my life, I thank you so very much. In one way or another you have taught me that the best thing to do in life is to “Learn from yesterday, live for today, and hope for tomorrow. The important thing is not to stop questioning.” -A. Einstein. With this degree I promise to never stop questioning and to keep striving to be a good scientist and a good man. I will continue to do you all proud as a sign of my appreciation for all you have done for me.

Chapter 1

Overview of the Pathobiological Perspectives of Salmonella enterica Serovar Enteritidis


1.1 Classification and General Background of Salmonella and Salmonella enterica Subspecies enterica

The genus Salmonella comprises three major species of bacteria: Salmonella enterica, Salmonella subterranea and Salmonella bongori [1-5]. The latter is known to predominantly infect cold-blooded animals and homogeneously maintains Salmonella pathogenicity island 1 (SPI-1) [4,6,7]. Salmonella subterranea is a recently discovered Salmonella species that was isolated from subterranean deposits of high atomic weight metals such as uranium under low pH [2]. Salmonella enterica, however, maintains both SPI-1 and SPI-2, which is the primary reason the species are separate [4]. Prior to this separation, Salmonella bongori had been considered subspecies V of Salmonella enterica. Salmonella enterica still consists of six other subspecies: enterica (I), salamae (II), arizonae (IIIa), diarizonae (IIIb), houtenae (IV), and indica (VI) [8]. As a species, Salmonella enterica infects a plethora of hosts and has subspecies and serovars that are either host-restricted or have a broad host range [9]. Of all the subspecies, however, Salmonella enterica subspecies enterica (also written as Salmonella enterica enterica) poses the greatest threat to human health, as it comprises many serovars that have the capacity to cause human illness and is responsible for 99% of Salmonella-linked infections in the world [1,8-11].

Salmonella enterica are Gram-negative, rod-shaped, peritrichous, facultative anaerobes [5,12]. Though most Salmonella enterica subspecies and their serovars are motile, non-motile variants are found in nature. Most of these non-motile Salmonella arise from frameshift mutations in their flagellar genes. Salmonella enterica subspecies enterica contains over 2,600 different serovars, each with different host specificities and the ability to colonize and/or cause disease within its respective host(s) [5,13]. This subspecies is divided into two groups of bacteria depending on the nature of the disease they cause: typhoidal and non-typhoidal [13,14]. Together, the two types account for 90 million cases of human gastroenteritis and 20 million cases of typhoid-like fever worldwide [3,11]. Non-typhoidal Salmonella enterica infections are estimated at around 1.3 billion annually worldwide, with an average annual cost of 2.3 billion dollars in the US [8].

Typhoidal serovars are those that cause systemic disease generally known as typhoid or enteric fever [15]. Some of the most well-known serovars that are considered typhoidal salmonellae are Salmonella Typhi (STy), S. Paratyphi (SPt), S. Gallinarum (SG), and S. Pullorum (SP) [5,15,16]. The first two are human-restricted serovars, meaning they only infect and cause disease in humans. The last two are poultry-restricted, causing the same symptoms but only in poultry [16]. Non-typhoidal Salmonella serovars cause infections that are characterized predominantly by a self-limiting gastroenteritis [17]. In rare cases, mostly in immunocompromised individuals, non-typhoidal salmonellae can cause a typhoid-like infection [10,17]. Non-typhoidal Salmonella are a leading cause of bacterial foodborne illness around the world and are responsible for more infections, hospitalizations and deaths than any other bacterial foodborne pathogen in the US [11,18-20]. Many serovars belong to non-typhoidal Salmonella; a few examples are Salmonella Typhimurium (STym), Salmonella Heidelberg (SH), and Salmonella Enteritidis (SE) [21,22]. According to the World Health Organization (WHO), these three serovars of Salmonella enterica are responsible for the highest percentage of human cases of salmonellosis in the world [19].


1.2 Salmonella Enteritidis Colonization, Infection and Contamination of Poultry and Shell Eggs

Salmonella Enteritidis is a broad host-range Salmonella enterica serovar with the capacity to infect cattle, mice, rats, hedgehogs, pigs, birds, and humans, among other animals [5,23,24]. The reservoir with the highest impact on human health is poultry and poultry products such as meat and, most importantly, eggs and egg products [25]. Salmonella Enteritidis colonizes chickens either in ovo, through the infected reproductive tract of the mother, or horizontally after oviposition, through the fecal-oral route from feces of infected birds or rodents. In fact, rodents are one of the most important risk factors for the presence of SE within hen houses and/or its spread to other sources such as peanuts [26,27]. Contamination of poultry meat generally occurs when the meat comes into contact with feces at the time of slaughter [28].

Infected chickens carry SE asymptomatically and shed the bacteria periodically in feces, allowing continual passage and re-infection of chickens [29]. Salmonella Enteritidis uses genes located on Salmonella Pathogenicity Island 1 (SPI-1) to colonize the GIT of the chicken. Even though competition with the microbiota is usually required for colonization of the GIT by SE, previous work has shown that SE has only a slight effect on the composition of microbiota in the cecum [12,30]. After colonization of the GIT, SE can then disseminate, using SPI-2 to infect and escape from macrophages, and spread into the reproductive organs of the laying hen, where it can contaminate eggs directly and internally [31,32]. The reproductive tract of the hen contains many compartments, all of which have various roles in the development of the egg and thus allow SE to contaminate the egg at various points [29]. According to the Centers for Disease Control and Prevention (CDC), only about 1 in 20,000 eggs is infected internally, so most cases of contamination may occur from outside exposure post-lay or during the development of the shell [28]. As Figure 1.1 shows, there are multiple points at which SE can infect either the inside or the outside of the egg, from which it attempts to invade and penetrate through to the yolk for extensive growth [32,33].

At the apex of the reproductive tract are the ovaries and the infundibulum. If these sites are infected, SE can contaminate the yolk compartment directly. It is in the yolk that extensive growth can occur due to the abundance of nutrients. If SE colonizes the magnum, it will contaminate the albumen of the egg. The albumen contains lysozyme and other antimicrobial components, making it difficult for bacteria, including SE, to survive [29,34]. Colonization of the isthmus or vagina by SE, or exposure of the egg shell to feces or other environmental sources, can lead to contamination of the egg shell and/or the shell glands and membranes [29,34,35].

Much like the albumen, the egg shell provides both physical and chemical barriers to penetration by invading bacteria, but bacteria may gain entry into the egg through the pores of the egg shell [35].

Of the many layers and compartments of the egg outlined in Figure 1.2, the cuticle is the first level of defense of the egg. This is the hard outer covering of the inner and outer membranes of the shell and is extremely hydrophobic and proteinaceous [36]. The aforementioned membranes are made up of a network of randomly oriented fibers and an electron-dense membrane known as the limiting membrane [37]. All of these, including the albumen, contain abundant lysozyme (which degrades the peptidoglycan of the bacterial cell wall) [38]. Eggs also contain ovotransferrin, which binds iron with high affinity and thereby withholds it from bacteria [29].

Recently, a novel antimicrobial factor has been identified in eggs: ovocalyxin-36. It is believed to be highly similar to lipopolysaccharide (LPS)-binding proteins, which allow for increased permeability and destruction of bacterial cells [29,39]. Despite these defenses, Salmonella has been shown to be able to grow on the egg shell as long as temperature and humidity remain low, especially in the absence of fecal contamination [35].


The egg is not only protected by the barriers provided by the shell and albumen, but also contains many antimicrobial molecules and compounds. Aside from lysozyme and ovotransferrin (which has a dual function of iron chelation and direct interaction with, and induction of damage to, the bacterial membrane), there are many more chemical barriers [29,40]. In the oviduct, 11 different types of gallinacins (chicken β-defensins), which are highly charged antimicrobial peptides, are expressed. Infection by SE of sites where these gallinacins are expressed leads to overexpression of the gallinacins in response to LPS. This occurs primarily with gallinacins 1, 2, and 3. The final line of defense within the egg is the vitelline membrane that surrounds the yolk. The vitelline membrane has been shown to contain many of the components present in the albumen and the shell as well as some other factors [29,41]. Despite the many barriers that protect the egg from environmental insults, bacteria such as Escherichia coli (E. coli) and Salmonella enterica serovars (including SE) still find ways to enter and survive in the egg [42].

In response to the many cellular, chemical, and physical barriers employed by the hen and the egg, SE has evolved a number of mechanisms to evade them and remain a fit pathogen inside the hen and its eggs. Prior to infection of the egg, the bacteria must first either colonize the GIT, leading to fecal shedding and contamination of the egg after it is laid, or disseminate to and colonize the reproductive tract. In 2008, Gantois and colleagues applied in vivo expression technology (IVET) to identify genes that are expressed by SE during colonization of the chicken oviduct [31]. Genes that were over-represented in the study were those involved in bacterial cell wall biosynthesis, fimbrial operon regulation, nucleic acid and amino acid metabolism, stress responses, and motility [29,31]. Considering all of the defense mechanisms employed by the developing egg and the reproductive tract, these data suggest that SE has evolved strategies to subvert or evade these various mechanisms [29].

Other studies have shown the importance of other central virulence loci, although they may not be essential, for the ability to colonize the reproductive tract and the egg. The first is the type 3 secretion system (T3SS) machinery and effectors of SPI-1 and SPI-2. After initial colonization and invasion of macrophages, SE is believed to use SPI-2 to survive in and escape from the macrophages and then SPI-1 for subsequent invasion of the tubular gland cells of the oviduct, mainly in the infundibulum, following type 1 fimbrial adherence to glycosphingolipids [43-46].

The second critical factor in this process is LPS. Lipopolysaccharides have been shown to be important for survival inside macrophages during systemic spread of SE as well as for survival and persistence in the egg and the reproductive tract of the chicken. Another IVET study on SE growth in the egg revealed that the expression of rfbH, an O-antigen biosynthesis gene, is strongly induced at room temperature [47]. The same gene was shown to be important for survival in the albumen, which makes sense given the role of LPS in protecting the cell from antimicrobials, many of which reside in the albumen and the vitelline membrane. Other factors, such as the flagellar system, have also been shown to aid in survival and persistence of SE in the egg but not in colonization of the reproductive tract of poultry [29]. These factors are likely to be important for SE growth in the egg post-lay.

The vitelline membrane deteriorates over time during storage, releasing nutrients into the albumen. If SE contaminates the egg outside of the yolk (shell, membranes, albumen, etc.), this release would create a gradient that could be sensed through chemotaxis. It has been shown that non-motile mutants are unable to proliferate, as they are not able to swim through the albumen to gain access to the nutrient-rich yolk. Given this and the link between motility and chemotaxis, flagellar motility might be important for survival and growth of SE within the egg [48-50]. Curli fibers in most enteric bacteria, including SE, have been shown to be important for biofilm formation and attachment to glycoproteins, such as fibronectin [29]. The vitelline membrane is composed of glycoproteins surrounding the collagenous matrix; thus, curli fimbriae could be important for yolk invasion and proliferation [50]. Lock and Board showed that a curli-deficient SE mutant occurred significantly less often in the yolk than wild-type SE, demonstrating the importance of curli for outgrowth in the yolk [51]. Furthermore, the expression of curli was upregulated during late log phase, which suggests that curli would be expressed when bacterial multiplication occurs rapidly, as is the case within yolk [48]. Although SE grows to very high numbers within the yolk at either 37°C or 42°C, reaching approximately 10⁹ CFU/mL after 24 hours of growth, there is no external change to the egg to indicate contamination [52]. These high numbers of bacteria in the yolk, combined with the absence of external signs of contamination, pose a significant risk to humans who consume SE-contaminated eggs. Some studies suggest that SE can also grow rapidly in the vitelline membrane prior to penetration into the yolk [53].

The primary site of SE contamination of the egg is still unknown, although the vitelline membrane and albumen have been suggested as two possible sites [29,32]. Interestingly, at temperatures of <10°C, very little to no bacterial replication occurs if the bacteria are in fact in the albumen [54]. However, at room temperature (>20°C), even very low doses of bacteria are able to outgrow to >10⁶ CFU/mL in separated albumen [55]. In whole eggs, this effect was enhanced, with high numbers of bacteria migrating towards the yolk. These observations suggest that SE has mechanisms and intrinsic responses to the hostile conditions presented by albumen and has evolved to grow in a multitude of conditions in order to reach the nutrient-rich yolk [56]. The energy expenditure this outgrowth requires in order to adhere to and penetrate the vitelline membrane and reach the yolk is still unknown. Many of the mechanisms underlying chicken colonization and spread are understudied. The number of chickens testing positive for SE has gone down dramatically; however, human illness due to the consumption of contaminated shell eggs and broiler meat and their products has increased [10,28].

1.3 Salmonella Enteritidis Colonization and Pathogenesis in Humans

If humans consume raw or undercooked eggs or poultry meat contaminated with SE, symptoms of illness begin to show between 12 and 72 hours later and usually last no longer than 7 days. Because SE is a non-typhoidal Salmonella serovar, it usually causes a self-limiting gastroenteritis that presents with symptoms such as vomiting, watery stools, abdominal cramps, and fever [17]. The disease is not generally fatal, but it can be fatal when dehydration from the diarrhea is left untreated. Children, the elderly, and immunocompromised individuals are at high risk for disseminated disease and typhoid-like symptoms, with complications such as secondary organ damage. In cases where the diarrhea is extremely severe or the disease becomes disseminated, antibiotic treatment and hospitalization are usually required. Some SE infections can lead to reactive arthritis, but the cause and effect of this phenomenon are still relatively poorly understood. The infection process of SE in the human host is complex, with multiple steps that occur at the chemical, genetic, cellular, tissue, and organism levels [10,28].

Most of the outbreaks and cases of SE infections in humans are linked to the PFGE type JEGX01.0004. Interestingly, a study by Sandt and colleagues showed that this PFGE type was also found in about 99% of all poultry and egg samples tested, thus establishing a strong link between poultry and shell egg contamination and human infection [57].

Much of what is known about the infectious process of SE and other non-typhoidal Salmonella has come from studies on Salmonella Typhimurium (STym). Many of the central pathobiological strategies employed by STym are shared by other non-typhoidal Salmonella, though STym and SE are genetically distinct and have different sets of genes responsible for differences in their pathogenic strategies [22]. The usual route of SE infection in humans and animals is the fecal-oral route or the contaminated egg-to-oral route [5,10]. Because pathogens like SE need to survive in a wide range of changing microenvironments within a host, they have evolved mechanisms to respond to changes in the host, such as pH, sites of attachment, presence of immune cells, presence of mucus, temperature, host genetic regulation and cell differentiation, among others [58]. Also, broad-host-range salmonellae, such as SE, must have evolved various mechanisms to survive in the various microenvironments of different host species [5,59].

Once ingested and passing through the GIT, SE activates the acid shock system to deal with the extremely low pH of the stomach. The acid resistance system includes over 50 genes and gene products, including the two-component signaling system PhoPQ and the σ-factor RpoS [60,61]. Aside from these two regulators, Fur (the iron metabolism regulator) has also been shown to regulate a panel of 8 genes that are important for acid stress. This is quite intriguing as Fur is regulated by intracellular iron concentrations (when iron levels are low, it is inhibited), but can also act as a regulator of acid stress [60]. This is not the only dual regulatory function of Fur, as it has also been shown to engage in back-and-forth regulation with the zinc regulator ZntR [62]. Studies have shown that many of these genes protect not only against the low pH but also against the inorganic and organic acids found in the stomach. If SE survives the stomach environment, the bacteria then pass into the small intestine and spread to the large intestine, which is the main site of infection and colonization.

Once in the GIT, most serovars of Salmonella enterica subspecies enterica (including SE) can deploy their pathogenic strategies through a combination of virulence factors and virulence-associated factors. For example, SE begins to use its flagella as a motor and to express genes related to adherence. The primary gene families used in the adherence of SE to intestinal epithelia are the flagellar and fimbrial adherence families. Most SE isolates have a number of fimbrial operons; fimbriae are long, polar, fibrous adhesins that contain multiple subunits and are terminated by a receptor-specific adhesin that dictates the specificity of binding. One sequenced SE isolate (strain P125109), which belongs to phage type (PT) 4, contains 13 fimbrial operons [7]. Motile SE isolates also carry flagella, which can number up to approximately 10 and are randomly positioned. Flagella, like fimbriae, are multi-subunit structures and are essential for motility and chemotaxis-related motility. Flagella are used not only for motility but also, in some instances, as adhesins. Salmonella Enteritidis, as well as many other Salmonella enterica serovars, also contains non-fimbrial adhesins that may be used for attachment to and invasion of host cells, but their roles in SE adherence have yet to be elucidated.

Non-fimbrial adhesins are mostly single-subunit β-barrel autotransporters which mediate selective binding. Many serovars of Salmonella enterica have been shown to encode multiple Omp-family proteins as well as AidA and MisL (autotransporters) and mediators of hyperadherence such as YidE [63-66]. The MisL autotransporter, like some fimbrial adhesins, is involved in adherence to fibronectin. Fibronectin is a glycoprotein found in the extracellular matrix of multicellular organisms [65]. The roles of many of the Omp-family proteins have not yet been well studied in Salmonella, but it is known that many of them play a critical role in adherence and adherence-related phenotypes of other enteric pathogens, such as Escherichia coli [67,68]. Interestingly, LPS has also been shown to be an important mediator of adherence of other enteric pathogens [69].

SE most likely uses a combination of these non-fimbrial and fimbrial adhesins in concert with flagella to make contact with enterocytes and, preferably, microfold cells (M-cells) to colonize the GIT starting in the small intestine [70]. M-cells are target cells for invasive bacteria such as Salmonella because they are specialized antigen-sampling cells which have less mucus at the apical surface of the cell. M-cells also have a large pocket on the apical surface that allows for endocytosis of incoming antigen-bearing objects such as bacterial cells. Typically, M-cells use transcytosis to move antigens from the apical surface facing the lumen of the GIT to the basolateral surface, where professional antigen-presenting cells (APCs), such as dendritic cells, take them up for presentation to the adaptive immune system [71,72]. Salmonella Enteritidis is able to prevent this transcytosis towards the basal membrane of the cell; instead, the bacteria multiply intracellularly and invade the neighboring cells laterally, contributing to the spread of the infection through the GIT [3,70].

Once attached, SE deploys a series of factors mainly through SPIs. The majority of these genes are contained within five SPIs, with SPI-1, SPI-3, and SPI-5 showing the greatest heterogeneity among different serovars and SPI-2 and SPI-4 showing the greatest conservation [73]. SPI-1 through SPI-5 have a variety of functions, some of which overlap and some of which are completely different. More pathogenicity islands have been identified in different isolates of different serovars, but their presence is highly variable. SPI-3 has been shown to be required for survival in macrophages and growth under low-magnesium conditions [74,75], whereas SPI-4 is known to be important for intra-macrophage survival and contains genes that can induce apoptosis and toxin secretion [75,76]. Finally, SPI-5 carries genes that encode T3SS effectors [75]. Studies show that many effectors and other gene products encoded by pathogenicity islands help facilitate host-specific colonization and exhibition of virulence within the host. Many other virulence genes that play a crucial role in SE pathogenesis are located outside the SPIs. Though many of these are found scattered throughout the chromosome, some genes, such as sseI, which encodes an E3 ubiquitin ligase, and sopE2, which encodes a guanine nucleotide exchange factor, are found in integrated phage regions (Table 1.1) [77]. Both the pathogenicity islands and integrated phage genomes are located at various points throughout the chromosome, as shown in Figure 1.3.

SPI-3 through SPI-5 all appear to have ancillary properties that aid in the processes of invasion, intracellular survival and spread. SPI-1 and SPI-2 are essential for SE’s ability to invade and colonize the mammalian host, as is the case with infections in hens [71,75]. Both of these SPIs encode their own T3SS machinery and effectors with specific functions as they relate to SE’s ability to invade and survive within host cells (Figure 1.4). SPI-1 is critically important for adherence, invasion and toxicity to the host cell. It contains the T3SS activator, hilA, as well as many genes encoding proteins involved in the formation of the T3SS apparatus (spa-, prg- and inv-) and genes encoding effectors critical for the invasion process, such as the sip- operon [71,78-80]. The effectors SipA and SipC have been shown to interact with the actin cytoskeletal components of host cells, causing them to extend outward and facilitate invasion through a process known as ruffling [80,81].

Perhaps one of the most important effectors of SE is AvrA, which is encoded by avrA. AvrA, like its Yersinia homolog YopJ, inhibits NFκB signaling: YopJ inhibits the phosphorylation of IκB by IKK-β that would otherwise allow for its ubiquitination and targeting for proteasomal degradation, whereas AvrA acts at a distal point of regulation that is not yet understood [82]. This inhibition not only prevents the activation of the anti-apoptosis pathway but also prevents immune signaling from the epithelia that would promote a Th1 response to the bacteria [82,83]. The other critical proteins secreted through T3SS-1 are the Sop- proteins (encoded by the sop- operon), including SopB. SopB facilitates the attraction of neutrophils to the site of infection and alterations in ion balance, which can lead to the watery stool phenotype. SopB is an inositol phosphatase that acts on the phosphoinositides of the eukaryotic cell membrane and allows for the complete formation and closing of the phagocytic cup [84]. In concert with Sip- proteins, SopB also interacts with the actin cytoskeleton, causing membrane ruffling. These outward protrusions of the membrane help facilitate the invasion of SE through the formation of Salmonella-containing vesicles or vacuoles (SCVs) [5,84].

Once enclosed within the SCV, SE then activates the second T3SS and its effectors (T3SS-2), which are involved in maintaining the SCV [75,85]. SCVs play critical roles in the survival of SE in resident epithelial cells by preventing phagolysosomal fusion through migration to the basal membrane. SPI-2 is critical for the trafficking of the SCV through many of its effector proteins, which can disrupt the cytoskeleton and motor proteins that aid in this process [84,86]. Proteins such as Srf, SopD, PipB, and SpiC, secreted through the SPI-2 T3SS, have been implicated in the SCV aspect of the infection [87]. In particular, the Sif- proteins, such as SifA/B, have been shown to be critical for fusion of the SCV with other vesicles that may contain nutrients or nutrient-containing molecules. This may be important for intracellular proliferation of SE within the vacuole. It is unclear how these vacuoles are involved in the pathogenesis of Salmonella enterica, including SE. Although the mechanism of eventual escape has yet to be elucidated, it appears that SopA may be involved by attracting ubiquitin near the SCV. The SCVs may protect SE from phagolysosomal or cytosolic bactericidal agents such as ubiquicidin. Escape into the cytosol would be beneficial for intracellular proliferation of SE [86].

The regulation of most of the virulence genes, including SPI-related genes, is poorly understood. There are known activators and virulons involved in their regulation, such as hilA for SPI-1 and ssrA/B for SPI-2, but what activates these is understudied. A previous study performed by Sturm and colleagues revealed that SPI-1 seems to be activated by low oxygen tension [78]. Another study revealed that the global regulator PhoP/Q is involved in regulation of many of the trans-activators of the SPI operons [88]. Similarly, PhoP/Q was shown to be important for proper activation of SPI-2 through SsrA/B. However, it was not the only regulator, as the same study also showed that OmpR/EnvZ is critical for this regulation [89]. Other SPI-related regulation is still being elucidated, but it is clear that a multitude of regulators control expression of these virulence factors. This is important because there is a significant cost to producing virulence factors inappropriately [78].

Many serovars, including SE, produce several different types of endo- and exotoxins that have a wide range of biological consequences. Previous studies have revealed that SE can produce a heat-labile, trypsin-sensitive cytotoxin, which Salmonella Typhi and Choleraesuis also produce, albeit with varying molecular weights. In addition, a Shigella dysenteriae 1-like cytotoxin is produced by SE isolates. Salmonella Enteritidis has also been shown to produce salmolysin and Salmonella enterotoxin, which are also produced by other serovars, such as STy and STm [5].

Endotoxins, like lipopolysaccharide (LPS), are molecules that can elicit a strong host response. In addition to its role in SE colonization of the chicken and survival and growth in the egg, LPS is also known to generate a strong immunogenic response in the host. Because SE is an intracellular pathogen, it is protected from this immune response. Host cell and tissue damage during infection is in part due to the immunopathological effects of this strong host immune response [90].

The many factors that SE can use to mediate protection, invasion, and damage to the host are important for SE to elicit disease, both through its own direct damage and through host-mediated damage generated by a strong immune response [3]. Most non-typhoidal Salmonella, such as SE, trigger an immense immune response through the secretion of IL-18 from cells undergoing pyroptosis and IL-23 from mononuclear cells [91]. These cytokines feed forward in the immune cascade, stimulating other immune cells to secrete more pro-inflammatory cytokines in order to respond to the invasion by SE. IL-18 has been previously shown to induce a Th1 response to intracellular bacteria, in which IL-1, TNF-α, and IFN-γ are secreted, all of which are damaging not only to the bacteria but also to the host [91,92]. IL-23 stimulates the production of IL-17, and together they elicit the Th17 response normally dedicated to helping defend against extracellular pathogens. The role of the Th17 response in protection against Salmonella seems to be limited to preventing dissemination by inhibiting Salmonella access to the mesenteric lymph node; however, after dissemination has occurred, this response is ineffective [93]. Also, host expression of chemokines such as CXCL1, CXCL2, and CXCL5 is important for the chemokine gradient that attracts neutrophils to the site of infection [3,92-94].

The Th17 response may limit the ability of SE to disseminate; however, the Th17 response in the context of an SE infection can be damaging to the host as well [3]. Many of the toxins and certain effectors aimed at damaging host cells have one purpose: to gain access to nutrients generally scarce in the host microenvironment. The host responds in turn by increasing production of IL-17 and of the iron- and siderophore-chelating protein lipocalin-2 [95]. Lipocalin-2 has been previously shown not only to bind bioavailable iron in the presence of a pathogen but also to bind bacterial siderophores, such as enterobactin. Many intestinal bacterial pathogens have evolved specialized siderophores that have affinities for iron similar to that of enterobactin but carry different modifications and thus cannot be bound by lipocalin-2 [95,96]. Another important metal, zinc, is also sequestered by the host through expression of calprotectin [97]. However, as is the case with lipocalin-2, SE encodes high-affinity zinc transporters that can bind zinc even in the presence of calprotectin [98]. The host uses these as preventative countermeasures against invading pathogens in order to limit the amount of damage produced by the inflammatory response while achieving the same result, death of the bacterial cell. However, because SE does not die in the presence of these nutrient-binding proteins, the immune system is forced to use mechanisms that can cause extra damage to the host [3].

Due to the high level of neutrophil recruitment to the sites of infection coupled with stimulation by IL-17, neutrophils begin producing oxidative bursts in the form of nitric oxide (NO) and other reactive oxygen species (ROS) [3,99]. Most human cells, including epithelial cells, do not respond well to high levels of ROS or to the cytokines released with them, such as TNF-α [100,101]. Salmonella Enteritidis, on the other hand, has acquired many genes to combat ROS, such as those encoding the superoxide dismutase SodC and the peroxide resistance enzyme catalase [102]. Although release of ROS is critical for the clearance of bacterial infections, SE has evolved mechanisms to take advantage of this response for its own survival. The release of these reactive oxygen species also begins to kill many members of the microbiota that are obligate anaerobes, such as Clostridium and Bacteroides species, which constitute greater than 90% of the microbiota within the healthy gut [103,104]. These microbiota are considered the first line of defense against many pathogens [72,105]. Perturbing the symbiosis of the microbiota is beneficial to SE on two fronts: first, it creates niche space and an opening to the apical surface of the gastrointestinal epithelial cells, and second, the dysbiosis is well known to cause an enhanced inflammatory state [72]. Furthermore, the composition of the microbiota resulting from the infection may also contribute to the colitis that develops in the mammalian gut [106].

During the infection, under anaerobic conditions, the damage induced by the Th17 response and the high turnover of cells in the GIT release an abundance of ethanolamine, a non-fermentable compound generated from phosphatidylethanolamine [107]. Ethanolamine is used by SE to perform anaerobic respiration. Phosphatidylethanolamine is the phospholipid most commonly found in the membranes of gastrointestinal epithelial cells. Because ethanolamine is non-fermentable in an anaerobic environment (where fermentation is the preferred method of carbon-source metabolism), SE’s ability to respire anaerobically by using unique electron acceptors (such as tetrathionate) gives SE another competitive advantage [107,108]. Furthermore, it appears that the colitis may develop through both MyD88-dependent and -independent mechanisms. This is another way that the host plays into the pathogen’s strategy [109]. Thus, it is apparent that the host’s immune system significantly aids the SE infection and helps generate the disease phenotype associated with non-typhoidal Salmonella infections (Figure 1.5).

Some SE cells, in rare instances, will pass through the basal membrane, where they can then be phagocytized by macrophages and/or dendritic cells. Salmonella Enteritidis contains genes that help it survive this event, and survival and intracellular growth in the macrophage allow the SE infection to become systemic once the macrophage drains in the lymph to a secondary site such as the liver. Most of these genes are expressed by or secreted through SPI-2 and T3SS-2 [85]. When the bacteria are able to disseminate by passing through the basolateral surface of the GIT epithelium, they are usually phagocytized by dendritic cells or macrophages, which are then carried to the mesenteric lymph [110]. This lymph primarily drains into the liver and spleen, allowing SE access to these extra-intestinal sites, as is commonly seen in S. Typhi and Paratyphi infections [3]. Salmonella Enteritidis uses a series of genes to form a replication vacuole and enhance replication in response to TLR-mediated acidification of the macrophages [111]. Through internal secretion of effectors that can inhibit NFκB signaling, the anti-apoptosis pathways are eliminated [82]. This allows SE to cause some of the macrophages to undergo pyroptosis or necroptosis and allows the release of the SCV from the macrophage in the satellite organs [3,111,112]. As previously mentioned, SPI-2 and other factors are critical for this action of SE, but the virulence plasmid also plays a significant role in the spread of SE, though its role in the gastroenteritis and initial colonization of the GIT remains unknown.

Many isolates of different Salmonella enterica serovars contain serotype-specific virulence plasmids, which are low copy and range from approximately 50 to 100 kb. Not all isolates within a serovar contain a virulence plasmid; whether one is carried depends on the genetic background of the strain and its ability to utilize the plasmid. The content and size of the plasmid can even differ between isolates within a particular serovar, and there is little evidence that horizontal transfer of these plasmids occurs between serovars. Most if not all of these plasmids contain a Salmonella plasmid virulence (spv) locus that is important for multiplication in the reticuloendothelial system, such as in the liver or spleen. The spvABCD operon is regulated through spvR, which is induced upon sensing nutrient-limiting conditions such as those found within certain cell types, such as macrophages. The remaining loci on the plasmid vary but can include other virulence factors, toxins, and resistance mechanisms not found in the chromosome of that particular Salmonella enterica serovar [5].

Salmonella enterica serovar Enteritidis is a well-adapted and evolved pathogen of many different hosts, including humans. It has developed many different molecular mechanisms to subvert a host’s immune system, allow for colonization within a host, and acquire nutrients in order to survive within its various hosts. Many of the mechanisms and genes that have been conserved or acquired in SE, and their roles in the overall pathogenic strategies of SE, have yet to be completely elucidated. Aside from the genes used for these strategies, the regulators of the various pathogenicity island and virulence-associated genes, and their epistatic relationships within the various hosts, are still being identified.

Figure 1.1: Multiple Points of Egg Infection After Oral Intake of SE by Laying Hen. Once SE is taken up by a laying hen, it travels down the GI tract and colonizes the gut. Once the gut has been colonized, the infection can become systemic through the invasion of macrophages and eventual escape, allowing for colonization of the reproductive tract. Which of the points on the tract shown above become infected will ultimately determine if and where the bacteria will be able to contaminate the resulting egg. Any egg that has a developing chick inside may pass SE to the chick, which will develop into either a broiler or layer, resulting in further spread of SE. Adapted from: Gantois et al. (2009) Mechanisms of Egg Contamination by Salmonella Enteritidis. FEMS Microbiology Reviews 33: 718-738

Figure 1.2: Components and Areas of the Hen Egg. This figure is linked to Figure 1.1 and shows the various areas of the egg that SE can infect after colonization of the reproductive organs and tissues of the hen. Reprinted by Open Access from Gantois et al. (2009) Mechanisms of Egg Contamination by Salmonella Enteritidis. FEMS Microbiology Reviews 33: 718-738

Table 1.1: Phage Encoded Effectors of S. enterica

Table 1.1 shows the phage type in the left hand column, the protein encoded by the phage in the middle column and the effector function in the far right column. Reprinted by Open Access from Boyd EF, Carpenter MR, and Chowdhury, N (2012). Mobile Effector Proteins on Phage Genomes. Bacteriophage 2: 139-148

Figure 1.3: Relative Locations of SPI-1 through SPI-5 and Virulence Factor Encoding Phage Regions. There are many genes which have been horizontally transferred from other bacteria that have all contributed to the evolution of the S. enterica chromosome, especially for the non-typhoidal S. enterica serovars. Though SPI-1 and SPI-2 are essential for many aspects of the pathogenic traits of S. enterica serovars (including SE), there are many other factors that have been acquired by these serovars to facilitate these processes and become better adapted to a multitude of hosts and host evolution. Reprinted by Open Access from Boyd EF, Carpenter MR, and Chowdhury, N (2012). Mobile Effector Proteins on Phage Genomes. Bacteriophage 2: 139-148

Figure 1.4: Cellular Model of SPI-1 and SPI-2 Mechanisms and Contributions. SPI-1 and SPI-2 are critical contributors to invasion and to maintenance of the Salmonella-containing vacuole (SCV), respectively. The effectors listed above are not a complete list, but the general schematic shows that SPI-1 effectors are involved in actin remodeling and immune modulation as well as in hijacking cellular signaling pathways. Among these are signaling to the mitochondria (which house eukaryotic ATP stores), as well as the which is involved in trafficking of proteins and lipids inside the cell and to the membrane. SPI-2 has effectors that are predominantly involved in immune modulation as well as in movement of the SCV to the basement membrane. Adapted from Boyd EF, Carpenter MR, and Chowdhury, N (2012). Mobile Effector Proteins on Phage Genomes. Bacteriophage 2: 139-148

Figure 1.5: GIT Model of Salmonella enterica Infection and Role of the Immune System. As important as the immune system is in fighting off infection, in the case of inflammatory bacterial infections such as those caused by SE, the immune system plays into the pathogen’s hands. Above is a system view of the infection by an S. enterica (non-typhoidal) serovar, such as SE, and the host’s response to it. Adapted from Behnsen J, Perez-Lopez A, Nuccio S-P, Raffatellu M. (2015) Exploiting host immunity: the Salmonella paradigm. Trends in Immunology 36: 112-120

1.4 References

1. Groisman EA, Ochman H (1997) How Salmonella Became a Pathogen. Trends in Microbiology 5: 343-349.
2. Shelobolina ES, Sullivan SA, O'Neill KR, Nevin KP, Lovley DR (2004) Isolation, Characterization, and U(VI)-Reducing Potential of a Facultatively Anaerobic, Acid-Resistant Bacterium from Low-pH, Nitrate- and U(VI)-Contaminated Subsurface Sediment and Description of Salmonella subterranea sp. nov. Applied and Environmental Microbiology 70: 2959-2965.
3. Behnsen J, Perez-Lopez A, Nuccio S-P, Raffatellu M (2015) Exploiting host immunity: the Salmonella paradigm. Trends in Immunology 36: 112-120.
4. Heyndrickx M, Pasmans F, Ducatelle R, Decostere A, Haesebrouck F (2005) Recent changes in Salmonella nomenclature: The need for clarification. The Veterinary Journal 170: 275-277.
5. Foley SL, Johnson TJ, Ricke SC, Nayak R, Danzeisen J (2013) Salmonella Pathogenicity and Host Adaptation in Chicken-Associated Serovars. Microbiology and Molecular Biology Reviews 77: 582-607.
6. Chan K, Baker S, Kim CC, Detweiler CS, Dougan G, et al. (2003) Genomic Comparison of Salmonella enterica Serovars and Salmonella bongori by Use of an S. enterica Serovar Typhimurium DNA Microarray. Journal of Bacteriology 185: 553-563.
7. Betancor L, Yim L, Martínez A, Fookes M, Sasias S, et al. (2012) Genomic Comparison of the Closely Related Salmonella enterica Serovars Enteritidis and Dublin. The Open Microbiology Journal 6: 5-13.
8. Desai PT, Porwollik S, Long F, Cheng P, Wollam A, et al. (2013) Evolutionary Genomics of Salmonella enterica Subspecies. mBio 4.
9. Bäumler AJ, Tsolis RM, Ficht TA, Adams LG (1998) Evolution of Host Adaptation in Salmonella enterica. Infection and Immunity 66: 4579-4587.
10. CDC (2011) Vital Signs: Incidence and Trends of Infection with Pathogens Transmitted Commonly Through Food --- Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 1996--2010.
11. Majowicz SE, Musto J, Scallan E, Angulo FJ, Kirk M, et al. (2010) The Global Burden of Nontyphoidal Salmonella Gastroenteritis. Clinical Infectious Diseases 50: 882-889.
12. Santos RL (2014) Pathobiology of Salmonella, Intestinal Microbiota, and the Host innate Immune Response. Frontiers in Immunology 5.
13. Uzzau S, Brown DJ, Wallis T, Rubino S, Leori G, et al. (2000) Host Adapted Serotypes of Salmonella enterica. Epidemiol Infect 125: 229-255.
14. Suez J, Porwollik S, Dagan A, Marzel A, Schorr YI, et al. (2013) Virulence Gene Profiling and Pathogenicity Characterization of Non-Typhoidal Salmonella Accounted for Invasive Disease in Humans. PLoS ONE 8: e58449.
15. Bhan MK, Bahl R, Bhatnagar S (2005) Typhoid and Paratyphoid Fever. The Lancet 366: 749-762.
16. Barrow PA, Neto OCF (2011) Pullorum Disease and Fowl Typhoid—New Thoughts on Old Diseases: a Review. Avian Pathology 40: 1-13.
17. Acheson D, Hohmann EL (2001) Nontyphoidal Salmonellosis. Clinical Infectious Diseases 32: 263-269.

18. Gordon MA (2011) Invasive Non-typhoidal Salmonella Disease – epidemiology, pathogenesis and diagnosis. Current opinion in infectious diseases 24: 484-489.
19. Vieira Aea (2009) A Resource to Link Human and Non-Human Sources of Salmonella. WHO Global Foodborne Infections Network Country Databank
20. Chai SJ, White PL, Lathrop SL, Solghan SM, Medus C, et al. (2012) Salmonella enterica Serotype Enteritidis: Increasing Incidence of Domestically Acquired Infections. Clinical Infectious Diseases 54: S488-S497.
21. Demczuk W, Soule G, Clark C, Ackermann H-W, Easy R, et al. (2003) Phage-Based Typing Scheme for Salmonella enterica Serovar Heidelberg, a Causative Agent of Food Poisonings in Canada. Journal of Clinical Microbiology 41: 4279-4284.
22. Silva CA, Blondel CJ, Quezada CP, Porwollik S, Andrews-Polymenis HL, et al. (2012) Infection of Mice by Salmonella enterica Serovar Enteritidis Involves Additional Genes That Are Absent in the Genome of Serovar Typhimurium. Infection and Immunity 80: 839-849.
23. Nauerby B, Pedersen K, Dietz HH, Madsen M (2000) Comparison of Danish Isolates of Salmonella enterica Serovar Enteritidis PT9a and PT11 from Hedgehogs (Erinaceus europaeus) and Humans by Plasmid Profiling and Pulsed-Field Gel Electrophoresis. Journal of Clinical Microbiology 38: 3631-3635.
24. Suar M, Jantsch J, Hapfelmeier S, Kremer M, Stallmach T, et al. (2006) Virulence of Broad- and Narrow-Host-Range Salmonella enterica Serovars in the Streptomycin-Pretreated Mouse Model. Infection and Immunity 74: 632-644.
25. Guard-Petter J (2001) The Chicken, the Egg and Salmonella Enteritidis. Environmental Microbiology 3: 421-430.
26. Meerburg BG, Kijlstra A (2007) Role of Rodents in Transmission of Salmonella and Campylobacter. Journal of the Science of Food and Agriculture 87: 2774-2781.
27. Henzler DJ, Opitz HM (1992) The Role of Mice in the Epizootiology of Salmonella Enteritidis Infection on Chicken Layer Farms. Avian Diseases 36: 625-631.
28. CDC (2010) Salmonella Serotype Enteritidis. National Center for Emerging and Zoonotic Infectious Diseases.
29. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Gast R, et al. (2009) Mechanisms of Egg Contamination by Salmonella Enteritidis. FEMS Microbiology Reviews 33: 718-738.
30. Petra Videnska, Frantisek Sisak, Hana Havlickova, Faldynova M, Rychlik I (2013) Influence of Salmonella enterica Serovar Enteritidis Infection on the Composition of Chicken Cecal Microbiota. BMC Veterinary Research 9.
31. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Van Immerseel F (2008) Salmonella enterica Serovar Enteritidis Genes Induced during Oviduct Colonization and Egg Contamination in Laying Hens. Applied and Environmental Microbiology 74: 6616-6622.
32. Gast RK, Guraya R, Guard-Bouldin J, Holt PS, Moore RW (2007) Colonization of Specific Regions of the Reproductive Tract and Deposition at Different Locations Inside Eggs Laid by Hens Infected with Salmonella Enteritidis or Salmonella Heidelberg. Avian Diseases 51: 40-44.
33. Gast RK, Guraya R, Guard J (2013) Salmonella Enteritidis Deposition in Eggs after Experimental Infection of Laying Hens with Different Oral Doses. Journal of Food Protection 76: 108-113.
34. De Reu K, Grijspeerdt K, Messens W, Heyndrickx M, Uyttendaele M, et al. (2006) Eggshell factors influencing eggshell penetration and whole egg contamination by different bacteria, including Salmonella enteritidis. International Journal of Food Microbiology 112: 253-260.

28

35. Messens W, Grijspeerdt K, De Reu K, De Ketelaere B, Mertens K, et al. (2007) Eggshell Penetration of Various Types of Hens' Eggs by Salmonella enterica Serovar Enteritidis. Journal of Food Protection 70: 623-628. 36. Lunam CA, Ruiz J (2000) Ultrastructural Analysis of the Eggshell: Contribution of the Individual Calcified Layers and the Cuticle to Hatchability and Egg Viability in Broiler Breeders. British Poultry Science 41: 584-592. 37. Liong JWW, Frank JF, Bailey S (1997) Visualization of Eggshell Membranes and Their Interaction with Salmonella enteritidis Using Confocal Scanning Laser Microscopy. Journal of Food Protection 60: 1022-1028. 38. Hincke MT, Gautron J, Panheleux M, Garcia-Ruiz J, McKee MD, et al. (2000) Identification and Localization of Lysozyme as a Component of Eggshell Membranes and Eggshell Matrix. Matrix Biology 19: 443-453. 39. Gautron J, Murayama E, Vignal A, Morisson M, McKee MD, et al. (2007) Cloning of Ovocalyxin-36, a Novel Chicken Eggshell Protein Related to Lipopolysaccharide- Binding Proteins, Bactericidal Permeability-Increasing Proteins, and Plunc Family Proteins. Journal of Biological Chemistry 282: 5273-5286. 40. Ibrahim HR, Sugimoto Y, Aoki T (2000) Ovotransferrin Antimicrobial (OTAP-92) Kills Bacteria Through a Membrane Damage Mechanism. Biochimica et Biophysica Acta (BBA) - General Subjects 1523: 196-205. 41. Mageed AMA, Isobe N, Yoshimura Y (2008) Expression of Avian β-Defensins in the Oviduct and Effects of Lipopolysaccharide on Their Expression in the Vagina of Hens. Poultry Science 87: 979-984. 42. Subedi K, Isobe N, Nishibori M, Yoshimura Y (2007) Changes in the Expression of Gallinacins, Antimicrobial Peptides, in Ovarian Follicles During Follicular Growth and in Response to Lipopolysaccharide in Laying Hens (Gallus domesticus). Reproduction 133: 127-133. 43. 
Li S, Zhang Z, Pace L, Lillehoj H, Zhang S (2009) Functions Exerted by the Virulence- Associated Type-Three Secretion Systems During Salmonella enterica Serovar Enteritidis Invasion Into and Survival Within Chicken Oviduct Epithelial Cells and Macrophages. Avian Pathology 38: 97-106. 44. De Buck J, Immerseel FV, Haesebrouck F, Ducatelle R (2004) Effect of Type 1 Fimbriae of Salmonella enterica Serotype Enteritidis on Bacteraemia and Reproductive Tract Infection in Laying Hens. Avian Pathology 33: 314-320. 45. De Buck J, Pasmans F, Van Immerseel F, Haesebrouck F, Ducatelle R (2004) Tubular Glands of the Isthmus are the Predominant Colonization Site of Salmonella Enteritidis in the Upper Oviduct of Laying Hens. Poultry Science 83: 352-358. 46. De Buck J, Van Immerseel F, Meulemans G, Haesebrouck F, Ducatelle R (2003) Adhesion of Salmonella enterica Serotype Enteritidis Isolates to Chicken Isthmal Glandular Secretions. Veterinary Microbiology 93: 223-233. 47. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Van Immerseel F (2009) The Salmonella Enteritidis Lipopolysaccharide Biosynthesis Gene rfbH is Required for Survival in Egg Albumen. Zoonoses and Public Health 56: 145-149. 48. Baron F, Gautier M, Brule G (1997) Factors Involved in the Inhibition of Growth of Salmonella Enteritidis in Liquid Egg White. Journal of Food Protection 60: 1318-1323. 49. Cogan TA, Jørgensen F, Lappin-Scott HM, Benson CE, Woodward MJ, et al. (2004) Flagella and Curli Fimbriae are Important for the Growth of Salmonella enterica Serovars in Hen Eggs. Microbiology 150: 1063-1071. 50. Mariconda S, Wang Q, Harshey RM (2006) A Mechanical Role for the Chemotaxis System in Swarming Motility. Molecular Microbiology 60: 1590-1602.

29

51. Lock JL, Board RG (1992) Persistence of Contamination of Hens' Egg Albumen in vitro with Salmonella Serotypes. Epidemiology and Infection 108: 389-396. 52. Guan J, Grenier C, Brooks BW (2006) In Vitro Study of Salmonella enteritidis and Salmonella typhimurium Definitive Type 104: Survival in Egg Albumen and Penetration through the Vitelline Membrane. Poultry Science 85: 1678-1681. 53. Gast RK, Guraya R, Guard-Bouldin J, Holt PS (2008) Multiplication of Salmonella Enteritidis on the Yolk Membrane and Penetration to the Yolk Contents at 30 C in an In Vitro Egg Contamination Model. Journal of Food Protection 71: 1905-1909. 54. Schoeni JL, Glass KA, McDermott JL, Wong ACL (1995) Growth and Penetration of Salmonella Enteritidis, Salmonella Heidelberg and Salmonella Typhimurium in Eggs. International Journal of Food Microbiology 24: 385-396. 55. Clavijo RI, Loui C, Andersen GL, Riley LW, Lu S (2006) Identification of Genes Associated with Survival of Salmonella enterica Serovar Enteritidis in Chicken Egg Albumen. Applied and Environmental Microbiology 72: 1055-1064. 56. Messens W, Duboccage L, Grijspeerdt K, Heyndrickx M, Herman L (2004) Growth of Salmonella serovars in hens’ egg albumen as affected by storage prior to inoculation. Food Microbiology 21: 25-32. 57. Sandt CH, Fedorka-Cray PJ, Tewari D, Ostroff S, Joyce K, et al. (2013) A Comparison of Non-Typhoidal Salmonella from Humans and Food Animals Using Pulsed-Field Gel Electrophoresis and Antimicrobial Susceptibility Patterns. PLoS ONE 8: e77836. 58. Biton M, Levin A, Slyper M, Alkalay I, Horwitz E, et al. (2011) Epithelial microRNAs Regulate Gut Mucosal Immunity Via Epithelium-T Cell Crosstalk. Nat Immunol 12: 239- 246. 59. Foley SL, Nayak R, Hanning IB, Johnson TJ, Han J, et al. (2011) Population Dynamics of Salmonella enterica Serotypes in Commercial Egg and Poultry Production. Applied and Environmental Microbiology 77: 4273-4279. 60. 
Bearson BL, Wilson L, Foster JW (1998) A Low pH-Inducible, PhoPQ-Dependent Acid Tolerance Response Protects Salmonella Typhimurium Against Inorganic Acid Stress. Journal of Bacteriology 180: 2409-2417. 61. Bearson SMD, Bearson BL, Rasmussen MA (2006) Identification of Salmonella enterica Serovar Typhimurium Genes Important for Survival in the Swine Gastric Environment. Applied and Environmental Microbiology 72: 2829-2836. 62. Pruteanu M, Neher SB, Baker TA (2007) Ligand-Controlled Proteolysis of the Escherichia coli Transcriptional Regulator ZntR. Journal of Bacteriology 189: 3017-3025. 63. Laarmann S, Schmidt MA (2003) The Escherichia coli AIDA Autotransporter Adhesin Recognizes an Integral Membrane Glycoprotein as Receptor. Microbiology 149: 1871- 1882. 64. Wells TJ, Totsika M, Schembri MA (2010) Autotransporters of Escherichia coli: a Sequence- Based Characterization. Microbiology 156: 2459-2469. 65. Dorsey CW, Laarakker MC, Humphries AD, Weening EH, Bäumler AJ (2005) Salmonella enterica serotype Typhimurium MisL is an intestinal colonization factor that binds fibronectin. Molecular Microbiology 57: 196-211. 66. Torres AG, Jeter C, Langley W, Matthysse AG (2005) Differential Binding of Escherichia coli O157:H7 to Alfalfa, Human Epithelial Cells, and Plastic Is Mediated by a Variety of Surface Structures. Applied and Environmental Microbiology 71: 8008-8015. 67. Torres AG, Li Y, Tutt CB, Xin L, Eaves-Pyles T, et al. (2006) Outer Membrane Protein A of Escherichia coli O157:H7 Stimulates Dendritic Cell Activation. Infection and Immunity 74: 2676-2685.

30

68. Ma Q, Wood TK (2009) OmpA Influences Escherichia coli Biofilm Formation by Repressing Cellulose Production Through the CpxRA Two-Component System. Environmental Microbiology 11: 2735-2746. 69. Carter JA, Blondel CJ, Zaldivar M, Ãlvarez SA, Marolda CL, et al. (2007) O-Antigen Modal Chain Length in Shigella flexneri 2a is Growth-Regulated Through RfaH-Mediated Transcriptional Control of the wzy Gene. Microbiology 153: 3499-3507. 70. Reis R, Horn F (2010) Enteropathogenic Escherichia coli, Samonella, Shigella and Yersinia: Cellular Aspects of Host-Bacteria Interactions in Enteric Diseases. Gut Pathogens 2: 8. 71. Velge P, Wiedemann A, Rosselin M, Abed N, Boumart Z, et al. (2012) Multiplicity of Salmonella Entry Mechanisms, a New Paradigm for Salmonella Pathogenesis. MicrobiologyOpen 1: 243-258. 72. Chassaing B, Kumar M, Baker M, Singh V, Vijay-Kumar M (2014) Mammalian gut immunity. 246-258 p. 73. Amavisit P, Lightfoot D, Browning GF, Markham PF (2003) Variation Between Pathogenic Serovars Within Salmonella Pathogenicity Islands. Journal of Bacteriology 185: 3624- 3635. 74. Blanc-Potard A-B, Solomon F, Kayser J, Groisman EA (1999) The SPI-3 Pathogenicity Island of Salmonella enterica. Journal of Bacteriology 181: 998-1004. 75. Rychlik I, Karasova D, Sebkova A, Volf J, Sisak F, et al. (2009) Virulence potential of five major pathogenicity islands (SPI-1 to SPI-5) of Salmonella enterica serovar Enteritidis for chickens. BMC Microbiology 9: 268. 76. Gerlach RG, Jäckel D, Geymeier N, Hensel M (2007) Salmonella Pathogenicity Island 4- Mediated Adhesion Is Coregulated with Invasion Genes in Salmonella enterica. Infection and Immunity 75: 4697-4709. 77. Boyd EF, Carpenter MR, Chowdhury N (2012) Mobile Effector Proteins on Phage Genomes. Bacteriophage 2: 139-148. 78. Sturm A, Heinemann M, Arnoldini M, Benecke A, Ackermann M, et al. (2011) The Cost of Virulence: Retarded Growth of Salmonella Typhimurium Cells Expressing Type III Secretion System 1. PLoS Pathog 7: e1002143. 
79. Main-Hester KL, Colpitts KM, Thomas GA, Fang FC, Libby SJ (2008) Coordinate Regulation of Salmonella Pathogenicity Island 1 SPI1 and SPI4 in Salmonella enterica Serovar Typhimurium. Infection and Immunity 76: 1024-1035. 80. Phoebe Lostroh C, Lee CA (2001) The Salmonella Pathogenicity Island-1 Type III Secretion System. Microbes and Infection 3: 1281-1291. 81. Wallis TS, Galyov EE (2000) Molecular Basis of Salmonella-Induced Enteritis. Molecular Microbiology 36: 997-1005. 82. Collier-Hyams LS, Zeng H, Sun J, Tomlinson AD, Bao ZQ, et al. (2002) Cutting Edge: Salmonella AvrA Effector Inhibits the Key Proinflammatory, Anti-Apoptotic NF-κB Pathway. The Journal of Immunology 169: 2846-2850. 83. Chen ZJ (2005) Ubiquitin Signaling in the NF-κB Pathway. Nature cell biology 7: 758-765. 84. Steele-Mortimer O (2008) The Salmonella-Containing Vacuole – Moving with the Times. Current Opinion in Microbiology 11: 38-45. 85. Waterman SR, Holden DW (2003) Functions and Effectors of the Salmonella Pathogenicity Island 2 Type III Secretion System. Cellular Microbiology 5: 501-511. 86. Brumell JH, Tang P, Zaharik ML, Finlay BB (2002) Disruption of the Salmonella-Containing Vacuole Leads to Increased Replication of Salmonella enterica Serovar Typhimurium in the Cytosol of Epithelial Cells. Infection and Immunity 70: 3264-3270.

31

87. Abrahams GL, Hensel M (2006) Manipulating cellular transport and immune responses: dynamic interactions between intracellular Salmonella enterica and its host cells. Cellular Microbiology 8: 728-737. 88. Holden DW (2002) Trafficking of the Salmonella Vacuole in Macrophages. Traffic 3: 161- 169. 89. Xu X, Hensel M (2010) Systematic Analysis of the SsrAB Virulon of Salmonella enterica. Infection and Immunity 78: 49-58. 90. Tsolis RM, Young GM, Solnick JV, Baumler AJ (2008) From bench to bedside: stealth of enteroinvasive pathogens. Nat Rev Micro 6: 883-892. 91. Godinez I, Raffatellu M, Chu H, Paixão TA, Haneda T, et al. (2009) Interleukin-23 Orchestrates Mucosal Responses to Salmonella enterica Serotype Typhimurium in the Intestine. Infection and Immunity 77: 387-398. 92. Damsker JM, Hansen AM, Caspi RR (2010) Th1 and Th17 cells: Adversaries and collaborators. Annals of the New York Academy of Sciences 1183: 211-221. 93. McGeachy MJ, McSorley SJ (2012) Microbial-induced Th17: Superhero or Supervillain? Journal of immunology (Baltimore, Md : 1950) 189: 3285-3291. 94. Liu JZ, Pezeshki M, Raffatellu M (2009) Th17 cytokines and host-pathogen interactions at the mucosa: dichotomies of help and harm. Cytokine 48: 156-160. 95. Raffatellu M, Bäumler AJ (2010) Salmonella's iron armor for battling the host and its microbiota. Gut Microbes 1: 70-72. 96. Müller S, Valdebenito M, Hantke K (2009) Salmochelin, The Long-Overlooked Catecholate Siderophore of Salmonella. BioMetals 22: 691-695. 97. Corbin BD, Seeley EH, Raab A, Feldmann J, Miller MR, et al. (2008) Metal Chelation and Inhibition of Bacterial Growth in Tissue Abscesses. Science 319: 962-965. 98. Liu JZ, Jellbauer S, Poe A, Ton V, Pesciaroli M, et al. (2012) Zinc sequestration by the neutrophil protein calprotectin enhances Salmonella growth in the inflamed gut. Cell host & microbe 11: 227-239. 99. Zhang Y, Wang H, Ren J, Tang X, Jing Y, et al. 
(2012) IL-17A Synergizes with IFN-γ to Upregulate iNOS and NO Production and Inhibit Chlamydial Growth. PLoS ONE 7: e39214. 100. Bhattacharya K, Naha PC, Naydenova I, Mintova S, Byrne HJ (2012) Reactive oxygen species mediated DNA damage in human lung alveolar epithelial (A549) cells from exposure to non-cytotoxic MFI-type zeolite nanoparticles. Toxicology Letters 215: 151- 160. 101. Arnold JW, Niesel DW, Annable CR, Hess CB, Asuncion M, et al. (1993) Tumor necrosis factor-α mediates the early pathology in Salmonella infection of the gastrointestinal tract. Microbial Pathogenesis 14: 217-227. 102. Sly LM, Guiney DG, Reiner NE (2002) Salmonella enterica Serovar Typhimurium Periplasmic Superoxide Dismutases SodCI and SodCII Are Required for Protection against the Phagocyte Oxidative Burst. Infection and Immunity 70: 5312-5315. 103. Stecher B, Robbiani R, Walker AW, Westendorf AM, Barthel M, et al. (2007) Salmonella enterica Serovar Typhimurium Exploits Inflammation to Compete with the Intestinal Microbiota. PLoS Biology 5: e244. 104. Behnsen J, Jellbauer S, Wong Christina P, Edwards Robert A, George Michael D, et al. (2014) The Cytokine IL-22 Promotes Pathogen Colonization by Suppressing Related Commensal Bacteria. Immunity 40: 262-273. 105. Kamada N, Seo S-U, Chen GY, Nunez G (2013) Role of the Gut Microbiota in Immunity and Inflammatory Disease. Nat Rev Immunol 13: 321-335.

32

106. Ferreira RBR, Gill N, Willing BP, Antunes LCM, Russell SL, et al. (2011) The Intestinal Microbiota Plays a Role in Salmonella Induced Colitis Independent of Pathogen Colonization. PLoS ONE 6: e20338. 107. Thiennimitr P, Winter SE, Winter MG, Xavier MN, Tolstikov V, et al. (2011) Intestinal Inflammation Allows Salmonella to Use Ethanolamine to Compete With the Microbiota. Proceedings of the National Academy of Sciences of the United States of America 108: 17480-17485. 108. Lopez CA, Winter SE, Rivera-Chávez F, Xavier MN, Poon V, et al. (2012) Phage-Mediated Acquisition of a Type III Secreted Effector Protein Boosts Growth of Salmonella by Nitrate Respiration. mBio 3: e00143-00112. 109. Hapfelmeier S, Stecher B, Barthel M, Kremer M, Müller AJ, et al. (2005) The Salmonella Pathogenicity Island (SPI)-2 and SPI-1 Type III Secretion Systems Allow Salmonella Serovar typhimurium to Trigger Colitis via MyD88-Dependent and MyD88-Independent Mechanisms. The Journal of Immunology 174: 1675-1685. 110. Weiss DS, Raupach B, Takeda K, Akira S, Zychlinsky A (2004) Toll-Like Receptors Are Temporally Involved in Host Defense. The Journal of Immunology 172: 4463-4469. 111. Arpaia N, Godec J, Lau L, Sivick KE, McLaughlin LM, et al. (2011) TLR signaling is required for virulence of an intracellular pathogen. Cell 144: 675-688. 112. Fink SL, Cookson BT (2005) Apoptosis, Pyroptosis, and Necrosis: Mechanistic Description of Dead and Dying Eukaryotic Cells. Infection and Immunity 73: 1907-1916.

33

Chapter 2

Evolution of Pathogenicity Revealed by Whole Genome Sequencing and Comparative Genomics of Two Egg Isolates of Salmonella Enteritidis


2.1 Abstract

Salmonella Enteritidis (SE) is the most common cause of foodborne salmonellosis acquired through the consumption of contaminated shell eggs. To date, over 300 SE sequences from various sources are available in the GenBank database, but none originate from shell eggs. In this study, the genomes of two egg-derived SE strains belonging to the PFGE pattern JEGX01.0004 were sequenced, annotated, and analyzed. In line with other sequenced SE isolates, both genomes were approximately 4.67 Mb with a GC content of 52% and ~4,800 open reading frames. A virulence gene profile representing the PFGE type JEGX01.0004, constructed by manual annotation of both genomes, revealed that these isolates contain over 600 genes related to virulence, the majority of which encode Salmonella pathogenicity island (SPI) proteins or non-SPI secretion systems, toxins, and effectors. Comparative genomics of human and egg isolates of SE provided evidence of microevolution of nine virulence-associated genes. This study should aid the current understanding of the molecular mechanisms and genes involved in SE pathogenesis, as well as of the pathoadaptive mutations that may occur to optimize SE colonization of different hosts.


2.2 Background

Salmonella enterica subspecies enterica serovar Enteritidis (SE) is one of the most prevalent causes of bacterial foodborne morbidity and mortality in the United States and around the world [1-4]. It is a non-typhoidal Salmonella enterica serovar that can infect many different hosts, including humans, hedgehogs, mice, and various types of birds. Birds, especially poultry, are considered the largest reservoir of asymptomatic carriage of SE [5]. Thus, it is not surprising that the greatest contributors to human infection are eggs, egg products, and poultry meat contaminated by SE [5-7]. Shell egg contamination can occur either by trans-ovarian passage of SE following infection of the reproductive organs of the laying hen or through fecal contamination of the egg after lay [6,8]. Symptoms of human illness appear 12-72 hours after consumption of SE-contaminated eggs. Infection in humans is usually limited to gastroenteritis, with symptoms such as abdominal cramping, vomiting, and diarrhea [2]. In rare cases, especially in children, the elderly, and immunocompromised individuals, SE can cause a typhoid-like infection with symptoms of enteric fever that may lead to death [2,9].

Outbreaks and cases of SE have remained steady or have even risen in some countries. According to the World Health Organization, SE was either the first or second most frequently reported serovar among human cases of Salmonella in approximately 93% of the geographical regions investigated. In the same study, almost half of these regions reported SE as one of the top two causes of salmonellosis in animals. In the United States alone, the number of human illnesses due to SE has increased by 44% over the last 30 years [1]. All these data highlight the importance of SE as a cause of human foodborne illness as well as its high cost to healthcare systems and animal agriculture [7]. Recent advances in sequencing techniques have offered the opportunity to examine the entire genome landscape of different isolates of a bacterial species recovered from various places, times, and sources, facilitating molecular epidemiological investigations. Despite this ability, pulsed-field gel electrophoresis (PFGE) is still regarded as the ‘gold standard’ typing method for Salmonella. Many isolates share close to 95% sequence homology, and therefore PFGE may fail to differentiate between closely related isolates [13]. Although over 300 SE genomes have been deposited in GenBank, none of these SE were isolated from eggs.

In light of this, we first set out to characterize the genome structure and composition of two SE strains isolated from shell eggs. This sequence information was then used to develop a virulence gene profile for SE. Finally, using comparative genomics of egg and human isolates of SE, we identified genomic evidence of microevolution that might have occurred at the nucleotide level. Understanding these nucleotide changes may provide evidence of pathoadaptive mutations that occurred in SE during the course of human infection.

2.3 Methods

2.3.1 Bacterial Strains. Two strains of SE isolated from shell eggs at the Animal Diagnostic Laboratory at the Pennsylvania State University (University Park, PA) were used in the study. These two isolates were designated as SEE1 and SEE2 (Salmonella Enteritidis from Egg).

2.3.2 PFGE Profiling. Pulsed-field gel electrophoresis (PFGE) was performed as previously described [10]. Briefly, chromosomal DNA was digested with XbaI and electrophoresed using the CHEF-DR II System (Bio-Rad, Hercules, CA) under the following conditions: initial time, 2.2 s; final time, 54.2 s; gradient, 6 V/cm at an included angle of 120°. Gels were electrophoresed for 24 hours, with Salmonella Braenderup strain H9812 (ATCC BAA-664, Manassas, VA) as a standard.

2.3.3 Genomic DNA Purification. The Wizard Genomic DNA Purification Kit (Promega Corp., Madison, WI) was used to isolate genomic DNA. Briefly, a colony of each isolate was grown overnight and pelleted. The pellet was resuspended in 50 mM ethylenediaminetetraacetic acid (EDTA), and DNA was extracted according to the manufacturer’s instructions. Purified DNA was dissolved in distilled water.

2.3.4 Sequencing and Assembly. Both SEE1 and SEE2 were sequenced at the Genomics Core Facility at the Pennsylvania State University (University Park, PA) by Ion Torrent PGM sequencing (Life Technologies, Grand Island, NY) using a 318 chip to provide over 100-fold coverage of the genome. Approximately 100 ng of high-purity genomic DNA was used for sequencing. The output of approximately six million reads was assembled with the DNASTAR NGen program by reference-guided assembly using SE strain P125109 (GenBank accession AM933172.1) as the reference. Each assembly was compared with both the reference genome and the NcoI optical map (OpGen, Inc., Gaithersburg, MD) obtained from each isolate. Once all gaps were closed and greater than 100-fold coverage was obtained, the genomes were considered closed.
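The coverage figure quoted above can be related to the read count through the usual depth estimate, C = N x L / G (reads times mean read length, divided by genome size). The following is a minimal sketch of that arithmetic only. The read count and genome size are the approximate values given in the text; the mean read length is not reported here, so ASSUMED_READ_LEN_BP is a hypothetical placeholder and the function names are illustrative.

```python
def mean_coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: int) -> float:
    """Expected average depth: total sequenced bases divided by genome length."""
    return n_reads * mean_read_len_bp / genome_size_bp


def min_read_length(target_coverage: float, n_reads: int, genome_size_bp: int) -> float:
    """Mean read length needed to reach a target depth with a given read count."""
    return target_coverage * genome_size_bp / n_reads


N_READS = 6_000_000          # approximate value from the text
GENOME_SIZE_BP = 4_670_000   # approximate value from the text
ASSUMED_READ_LEN_BP = 200    # hypothetical; not reported in the text

depth = mean_coverage(N_READS, ASSUMED_READ_LEN_BP, GENOME_SIZE_BP)
needed = min_read_length(100, N_READS, GENOME_SIZE_BP)
print(f"depth at {ASSUMED_READ_LEN_BP} bp reads: {depth:.0f}x")
print(f"mean read length needed for 100x: {needed:.0f} bp")
```

Under these assumptions, six million reads reach 100-fold depth on a 4.67 Mb genome once the mean read length is roughly 78 bp, which is consistent with the statement that more than 100-fold coverage was obtained.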

2.3.5 Construction of the whole genome optical map (OpMap). The whole genome restriction optical map was generated by OpGen, Inc. using NcoI digestion and imaged in silico [11]. In brief, high molecular weight DNA was extracted, linearized, and immobilized on a Mapcard containing microchannels that hold single chromosomes, and was subsequently digested with the NcoI restriction enzyme. The resulting DNA fragments were stained with a fluorescent dye, and their lengths were measured using fluorescence microscopy. Overlapping fragments were then assembled to generate the restriction cut site map. The restriction map was visualized using the MapSolver software version 3.0 (OpGen).
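Comparing an assembly with an optical map amounts to comparing ordered restriction fragment lengths. As an illustration only, and not the MapSolver procedure, the sketch below derives an in silico NcoI fragment map from a sequence. NcoI recognizes CCATGG and cuts after the first C; the function names and the toy sequence are hypothetical, and sites that span the origin of a circular sequence are not detected by this simple linear scan.

```python
import re
from typing import List

NCOI_SITE = "CCATGG"   # NcoI recognition sequence (palindromic, so one strand suffices)
NCOI_CUT_OFFSET = 1    # NcoI cuts after the first C: C^CATGG


def cut_positions(seq: str, site: str = NCOI_SITE, offset: int = NCOI_CUT_OFFSET) -> List[int]:
    """Return 0-based cut coordinates for every occurrence of the site in seq."""
    seq = seq.upper()
    # The lookahead pattern reports every start position of the site.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def fragment_lengths(seq: str, circular: bool = True) -> List[int]:
    """Ordered fragment lengths produced by a complete digest of seq."""
    cuts = cut_positions(seq)
    if not cuts:
        return [len(seq)]
    # Fragments between consecutive cut sites.
    frags = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # On a circular chromosome the last fragment wraps around the origin.
        frags.append(len(seq) - cuts[-1] + cuts[0])
    else:
        frags = [cuts[0]] + frags + [len(seq) - cuts[-1]]
    return frags


toy = "AAACCATGGTTTTTCCATGGAA"  # made-up 22-bp sequence with two NcoI sites
print(cut_positions(toy))
print(fragment_lengths(toy, circular=True))
print(fragment_lengths(toy, circular=False))
```

Whatever the topology, the fragment lengths sum to the sequence length, which is a convenient sanity check before comparing an in silico map with a measured one.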

2.3.6 Genome Annotation. The completed genomes were submitted to Rapid Annotation using Subsystem Technology (RAST) from NMPDR [12]. Genome outputs were viewed in the Artemis genome browser and manually annotated to assign gene names and locus numbers [13]. All open reading frames (ORFs) were analyzed by BLASTp and BLASTn to determine the correct size of each ORF and to find ORFs that were not recognized by RAST.


2.3.7 Comparative Genomics and SNP Analysis. All 13 Salmonella enterica subsp. enterica serovar Enteritidis genomes (Table 2.2) were shredded into 54-bp-long DNA sequences and mapped onto the SEE1 genome using SSAHA2 v2.5.4 [14]. The pan-genome matrix was constructed with each column containing the genome of a single isolate and each row containing a single gene. SNPs that were shared by all human isolates but absent from the egg isolates were considered evidence of sequence-based microevolution. Only variable-in-all-genomes SNP sites were considered for building a phylogenetic tree. The output SNP alignment file served as the input for construction of a phylogenetic tree with RAxML v7.0.4 (Stamatakis, 2006), using a General Time Reversible (GTR) model with a gamma correction for among-site rate variation and ten random starting trees. Single nucleotide polymorphism (SNP) analysis was performed by using the ‘read-in’ function (Artemis) to overlay the genomes of the other SE isolates onto the genomes of SEE1 and SEE2. The SNPs of interest were non-synonymous SNPs (nsSNPs), as these change the encoded amino acid. A BRIG analysis was performed with the BRIG software [15].
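The steps described above (shredding genomes into 54-bp sequences, selecting SNPs shared by all human isolates but absent from egg isolates, and retaining only nsSNPs) can be sketched as follows. This is a simplified illustration under stated assumptions, not the SSAHA2 or Artemis workflow: the isolate names and the toy alignment are invented, the shredding step size is not given in the text and is assumed to be 1 bp, and the coding sequence is assumed to lie on the forward strand and to start in frame at position 0.

```python
from typing import Dict, List

# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def shred(seq: str, k: int = 54, step: int = 1) -> List[str]:
    """Cut a sequence into k-bp pseudo-reads (k = 54 as in the text; step is assumed)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, step)]


def host_specific_sites(alignment: Dict[str, str], human: List[str], egg: List[str]) -> List[int]:
    """Alignment columns where all human isolates share a base that no egg isolate carries."""
    length = len(next(iter(alignment.values())))
    sites = []
    for pos in range(length):
        human_bases = {alignment[name][pos] for name in human}
        egg_bases = {alignment[name][pos] for name in egg}
        if len(human_bases) == 1 and human_bases.isdisjoint(egg_bases):
            sites.append(pos)
    return sites


def nonsynonymous_sites(ref_cds: str, alt_cds: str, sites: List[int]) -> List[int]:
    """Keep the sites whose codon translates differently in alt_cds than in ref_cds."""
    kept = []
    for pos in sites:
        start = pos - pos % 3  # first base of the codon containing pos
        if CODON_TABLE[ref_cds[start:start + 3]] != CODON_TABLE[alt_cds[start:start + 3]]:
            kept.append(pos)
    return kept


# Toy in-frame alignment; names and sequences are hypothetical.
alignment = {
    "egg_1": "ATGGCTGAA",
    "egg_2": "ATGGCTGAA",
    "human_1": "ATGGCCGAT",
    "human_2": "ATGGCCGAT",
}
sites = host_specific_sites(alignment, ["human_1", "human_2"], ["egg_1", "egg_2"])
ns_sites = nonsynonymous_sites(alignment["egg_1"], alignment["human_1"], sites)
print("host-specific sites:", sites)
print("non-synonymous sites:", ns_sites)
```

In the toy alignment, the change at position 5 leaves the alanine codon synonymous (GCT to GCC), whereas the change at position 8 replaces glutamate with aspartate (GAA to GAT), so only the latter is retained as an nsSNP.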

2.4 Results

2.4.1 General Overview of SEE1 and SEE2 Genomes

The genomes of SEE1 and SEE2, which belonged to the PFGE pattern JEGX01.0004 (Figure 2.1), were almost identical in nucleotide composition (>99% similarity) and size. As Table 2.1 and Figure 2.2 (A and B) show, the two genomes were approximately 4.67 Mb long and had a GC content of approximately 52%, both of which are in line with many other SE isolates previously sequenced. Both have 22 ribosomal RNA (rRNA) genes and 84 transfer RNA (tRNA) genes. The genome of SEE1 has a total of 4,677 open reading frames (ORFs) with an average density of 0.999 genes per Kb, whereas the genome of SEE2 consists of 4,671 ORFs with an average density of 0.998 genes per Kb.
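For reference, the two summary statistics reported above can be computed as in the following minimal sketch. The ORF counts are taken from the text, but the exact genome lengths from Table 2.1 are not reproduced here, so the rounded value of 4.67 Mb is used as an assumption and the result only approximates the reported densities. The function names are illustrative.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def gene_density_per_kb(n_orfs: int, genome_size_bp: int) -> float:
    """Number of ORFs per kilobase of genome."""
    return n_orfs / (genome_size_bp / 1000)


ASSUMED_GENOME_SIZE_BP = 4_670_000  # rounded from the text; exact lengths are in Table 2.1

for name, n_orfs in (("SEE1", 4677), ("SEE2", 4671)):
    density = gene_density_per_kb(n_orfs, ASSUMED_GENOME_SIZE_BP)
    print(f"{name}: {density:.3f} genes per Kb")

print(f"GC content of a toy sequence: {gc_content('GGCCAATT'):.1f}%")
```

Both isolates come out at about one gene per Kb, in agreement with the reported densities; the small difference in the third decimal place reflects the rounded genome size assumed here.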

Submission of both genomes to RAST gave both an automated annotation and a metabolic model with a functional breakdown of the genome by the percentage of genes related to a particular subsystem. Each subsystem is defined by a pathway of interconnected genes directed toward a common function, such as metabolism, stress response, or virulence. The functional breakdown of each genome can be viewed in Figure 2.2. Notably, the two largest subsystems by number of genes were carbohydrate utilization and amino acid metabolism and synthesis. "Virulence, disease, and defense" and "stress response" were the two largest pathogen-related subsystems. About 60% of the genes in each genome were involved in the subsystems shown in Figure 2.2. However, 40% of the genes could not be assigned to a particular subsystem; of these, 750 are hypothetical and 1,050 are non-hypothetical.

2.4.2 Phage Regions in SEE1 and SEE2

Both SEE1 and SEE2 have five major phage regions (phage regions 1 to 5) consisting of multiple remnants of phage integration (such as mobile genetic elements) or full prophage regions with a full complement of genes involved in phage assembly. The genes contained within each region are listed in Table 2.4. Phage region 1 in both SEE1 and SEE2 contains features of a pathogenicity island. Most of the genes at the start of this phage region are uncharacterized, but some have been implicated in the pathogenicity of SE, such as the genes for the type 6 secretion system (T6SS) effector VgrG (SEE1_0283/SEE2_0283) and for other effectors, such as ImpA and ImpH. The terminal end of this phage region contains the saf- fimbrial operon. The middle of the region is primarily composed of phage genes and genes encoding hypothetical proteins.


Phage region 2 of SEE1 and SEE2 contains mostly phage genes. The two non-phage genes found among them are sseI, which encodes a secreted effector protein, and msgA, which has been identified as a virulence gene in S. enterica. Phage region 3 contains multiple genes that are potentially involved in SE pathogenesis, such as SEE1_1234, SEE1_1241, sopE2, and pagO in SEE1 (SEE2 only has SEE2_1236, which is homologous to SEE1_1234). This region also contains sodC, which encodes superoxide dismutase. Phage region 4 contains only phage genes and remnants of an integrated prophage. This region does not have genes with a definitive function but contains many genes that encode hypothetical proteins.

Phage region 5 consists of genes associated with the packaging and replication of a phage. It also contains genes that are normally found in SE, such as SEE1_2759, SEE1_2761, and SEE1_2762 in SEE1 and SEE2_2755, SEE2_2757, and SEE2_2758 in SEE2. The first of these genes in each isolate encodes a large repetitive protein that is similar in sequence to the RTX toxin found in pathogenic E. coli. The latter two in each isolate encode a putative type 1 secretion system (T1SS) and a putative HlyD family secretion protein.

2.4.3 Genes That Exhibit Frameshifts

Nine genes of SEE1 and 11 genes of SEE2 contain frameshifts (Table 2.3). Approximately half of these genes have putative functions. Among the genes identified by the RAST functional annotation, slrP encodes a putative internalin. SlrP is a T3SS effector that acts as an E3 ubiquitin ligase targeting mammalian thioredoxin [16].


2.4.4 Pathogenomics-Based Virulence Gene Profiling of SEE1 and SEE2

The virulence profile of SEE1 and SEE2 is based on the principle that these genes and their products influence SE’s virulence potential either directly (e.g. effectors, toxins, and adhesins) or indirectly (e.g. nutrient acquisition systems and signaling). Based on these criteria, both SEE1 and SEE2 possess over 600 genes related to virulence, the various functions of which can be found in Figure 2.3.

2.4.4-A. Adhesins

SEE1 and SEE2 contain 15 fimbrial operons, which comprise 79 genes (Table 2.5). Eight of these operons are considered putative or probable fimbrial operons, whereas the remaining seven are known fimbrial operons. Twelve of the operons contain fimbrial usher, chaperone, and anchoring proteins but do not contain an identified fimbrial adhesin. The other three operons have an identified fimbrial adhesin, with one locus (fimA-fimW) containing three different fimbriae-like adhesins (fimA, fimI, and fimH). One operon, the csg- operon, contains the curli-like fimbrial adhesins [17].

There are also non-fimbrial adhesins, which play an important role in the adherence of bacteria to a variety of host molecules, which may in turn lead to invasion. Like fimbrial adhesins, non-fimbrial adhesins exhibit ligand/receptor binding specificity, which is determined by the structure and composition of the adhesin. As listed in Table 2.6, 22 non-fimbrial adhesins are present in SEE1 and SEE2. Many of these adhesins are outer membrane proteins and are typically type 5 secretion systems (T5SS), which are also known as autotransporters [18,19]. Autotransporters are single-subunit proteins that consist of three important domains: the β-barrel domain, which anchors the adhesin in the membrane; the linker domain; and the effector or passenger domain. The linker domain aids in the movement of the passenger domain, which confers the activity and specificity of a particular adhesin [20]. Among the non-fimbrial adhesins are five omp genes, namely ompA, ompF, ompC, ompW, and ompX, the last of which is considered to be an attachment and invasion locus precursor.

Beyond the omp genes, both SEE1 and SEE2 contain many other adherence genes that play a multitude of roles in adherence and eventual invasion. The gene yidE encodes a mediator of hyperadherence, whereas SEE_1461/SEE2_1464 and yhcP have been identified as invasins. SEE1_0228 in SEE1 and SEE2_0228 in SEE2 encode an enhancin.

2.4.4-B. Salmonella Pathogenicity Island-Related Genes

Five known Salmonella Pathogenicity Islands (SPIs) consisting of 101 genes were present in both SEE1 and SEE2 (Tables 2.7-2.11). Each SPI is fully intact, with no frameshifts occurring in any of the genes. The previously identified operons critical for the function of each SPI, such as the spa-, sip-, pip-, sse-, and ssa- operons, are present in both SEE1 and SEE2. Some SPI-associated genes, such as avrA (inhibits NF-κB signaling), sopB (inositol phosphate phosphatase), and hilA (transcriptional activator of the T3SS) [3,21], might play a role in the ability of SEE1 and SEE2 to colonize and cause disease in multiple host species.

2.4.4-C. Non-SPI Effectors, SS, Toxins, and Virulence Factors

There were 92 genes that have been identified as being involved in or directly responsible for the synthesis and expression of non-SPI effectors, secretion systems (SS), toxins, and virulence factors. As shown in Table 2.12, both SEE1 and SEE2 possess the lpx-, rfb-, rfa-, and wzz-/wec- operons, which are required for the biosynthesis of lipopolysaccharide (LPS), an immunogenic component of the Gram-negative bacterial outer membrane that elicits a strong immune response by the host. They also contain over 40 genes that encode non-LPS virulence factors (Table 2.12).

The genes sifB, srfA, sopA, and sopD all encode putative effectors. The TolC protein may also play a role in the secretion of large effectors or toxins. These isolates also contain toxin genes, such as yraP, yafQ, SEE1_2762/SEE2_2758, SEE1_3099/SEE2_3096, and yfjD, all of which have previously been identified in SE. Most of these gene products are related to hemolysins based on their structure. SEE1_2759/SEE2_2755 encodes a large repetitive protein, possibly involved in the secretion of toxins. Some other genes (e.g., sppA, pagD, and pgtE) produce other surface-exposed, membrane-bound virulence factors.

2.4.4-D. Iron Sequestration Genes

Table 2.13 shows that SEE1 and SEE2 contain the ent- and iro- operons, which are involved in the biosynthesis of enterobactin (enterochelin) and salmochelin, respectively. Enterobactin and salmochelin are the only two siderophores identified in the genomes of these two SE isolates.

To acquire siderophore-bound iron, these genomes also contain genes for receptors specific to these complexes and for releasing the iron from the siderophores. In SEE1 and SEE2, there are six genes (fhuA, fepA, foxA, iroN, fhuA, and btuB) that encode receptors for different siderophores, all of which acquire iron in the ferric form. In addition to these systems, the FeoA, FeoB, and FeoC systems, which acquire iron in the ferrous form, are also present in these two genomes.


2.4.4-E. Motility and Chemotaxis

The analysis of the genomes of SEE1 and SEE2 revealed 61 genes related to motility and chemotaxis. For example, three flagella-expressing operons fli-, flg- and flh- were identified.

The motor and stator of the flagella are encoded by motA and motB, respectively, which are present in both SEE genomes. The SEE1 and SEE2 genomes also contain the cheA-cheZ chemotaxis operon. The cheV gene is not located on this operon but lies near the flg- operon.

The gene yggR, present in both genomes, encodes PilT, a protein involved in twitching motility. The full list of these genes can be found in Table 2.14.

2.4.4-F. Two Component Signaling Systems and Quorum Sensing

There are approximately 40 genes recognized as being responsible for the expression of two-component signaling and quorum sensing in SEE1 and SEE2, as shown in Table 2.16. The major two-component sensor that plays an important role in the regulation of virulence genes in SE is the phoP/Q two-component regulatory system, which SEE1 and SEE2 contain within their genomes.

Aside from two-component regulatory systems, many bacteria also have quorum sensing mechanisms, which play a significant role in regulating a multitude of genes in a cell-density-dependent manner. Quorum sensing works through the secretion of autoinducers (AIs) or quorum molecules (such as acyl homoserine lactones), the secretion of which increases with cell density. Among the quorum sensing genes in SEE1 and SEE2 are ygiX and ygiY, which encode QseB and QseC, and the yde- operon, which is involved in AI-2 signaling (AI-2 synthesis and sensing). The QseB/QseC two-component system responds to type 3 AIs and to hormone- and neurotransmitter-like molecules such as epinephrine and norepinephrine [22].


2.4.4-G. Antimicrobial Resistance Genes

As Table 2.15 shows, over 70 antimicrobial resistance genes, including operons that harbor multidrug resistance genes, such as mar- and yeg-, are present in SEE1 and SEE2. Both SEE1 and SEE2 have genes dedicated to resistance to polymyxin (prm- operon), tellurite (tehA, tehB), streptomycin (aadA), macrolides (ybjY, ybjZ), and beta-lactams (SEE1_3829/SEE2_3822). In addition to genes for resisting the effects of antibiotics, they also possess genes that protect against other adversities encountered in the environment and the host, such as superoxides (sodC, katE, katG, and oxyR), acid shock (SEE1_1684/SEE2_1677), and heavy metals (ybgR, zntA, SEE1_0575/SEE2_0574, and copA). The genes pgtE and SEE1_2653/SEE2_2651 encode an omptin-like protease and an alpha-2-macroglobulin-like protein, respectively, both shown to protect the cell against antimicrobial factors such as cationic antimicrobial molecules [23,24].

2.4.4-H. Miscellaneous Virulence-Associated Genes

Both SEE1 and SEE2 have the wca- and mdo- operons for biofilm production through colanic acid biosynthesis (Table 2.17). They also contain genes that are essential for anaerobic respiration and growth competitiveness of SE in the gut through the utilization of ethanolamine as a carbon source (eut- operon) and of tetrathionate as a final electron acceptor (ttr- operon). Both SEE1 and SEE2 have genes for entericidin B and colicin V production, which can target and kill susceptible microbiota, thus creating niche space for SE. These genomes also have a stage V sporulation protein involved in the survival of SE. Approximately 16 putative exported proteins are predicted within the genomes of SEE1 and SEE2.


2.4.5 Bioinformatic Evidence of Genomic Microevolution during Human Infection

To understand any sequence-based microevolution that may contribute to the switch between survival and growth in or on the egg and inside the human host, we performed whole-genome comparisons between SEE1 and SEE2 and the genomes of 11 human clinical isolates of SE that had been completed and uploaded to NCBI GenBank. Analysis of all 11 isolates compared to SEE1 and SEE2 revealed that most isolates were extremely similar in both gene content and nucleotide sequence identity (Figures 2.5 and 2.6). A total of 116 single nucleotide polymorphisms (SNPs) that were differentially conserved between the human isolates and the egg isolates (SEE1 and SEE2) were identified (Table 2.18). Most loci that contained SNPs were in coding sequences, and only eight were found in intergenic regions. Of those located in intergenic regions, only half were present in regions that could potentially serve as promoters of genes proximal to that intergenic region.
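The partitioning of SNPs into coding, intergenic, and potential promoter positions can be illustrated with a short Python sketch. It is not the pipeline used in this study: the gene coordinates, the forward-strand assumption, and the 100 bp upstream window (`promoter_window`) are hypothetical values chosen only for illustration. The counts at the top restate the figures reported above (116 SNPs, eight of them intergenic).

```python
from typing import List, Tuple

# Counts reported in the text above.
TOTAL_SNPS = 116
INTERGENIC_SNPS = 8
CODING_SNPS = TOTAL_SNPS - INTERGENIC_SNPS  # 108 SNPs fall in coding sequences

# Hypothetical gene coordinates (name, start, end), 1-based and inclusive.
# They are illustrative only and do not correspond to SEE1 or SEE2 loci.
GENES: List[Tuple[str, int, int]] = [
    ("geneA", 100, 400),
    ("geneB", 650, 1200),
]


def classify_snp(position: int,
                 genes: List[Tuple[str, int, int]],
                 promoter_window: int = 100) -> str:
    """Label one SNP position as coding, putative promoter, or intergenic."""
    # A SNP inside any annotated gene is treated as coding.
    for _name, start, end in genes:
        if start <= position <= end:
            return "coding"
    # Assumption: all genes lie on the forward strand, so "upstream"
    # means the promoter_window bases immediately before the gene start.
    for _name, start, _end in genes:
        if start - promoter_window <= position < start:
            return "putative promoter"
    return "intergenic"


if __name__ == "__main__":
    for pos in (150, 600, 450):
        print(pos, classify_snp(pos, GENES))
```

A real analysis would also need strand information and experimentally supported promoter boundaries, which this sketch deliberately leaves out.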

Among the four promoter regions that were discovered to have SNPs, three could potentially be virulence associated. The first is the promoter region for bioA/bioB of the biotin synthesis genes, which are not essential for virulence but are upregulated at least 10-fold during invasion of epithelial cells [25]. The next is nmpC, which does not have a well-defined function but encodes an outer membrane protein or porin. The third is the promoter of emrD, a multidrug resistance gene. The last is the promoter of the T3SS-2 effector gene sseB.

Of the remaining 108 SNPs discovered in coding sequences, 37 were non-synonymous (nsSNPs). One example is the T:C (SEE to human) missense mutation that changes an asparagine (polar, uncharged) to a histidine (polar, positively charged) in the fimbrial usher protein StiC. This is a non-conservative change from a polar, uncharged side chain to a charged, ring-shaped R group. Such a change affects the biochemical and physical forces previously associated with that R group and thus could potentially alter the shape of the protein and therefore its function. Beyond this nsSNP, there are two others that may play a direct role in virulence. One is a C:G missense mutation in the gene SEE1_2760 that changes an alanine to a glycine, although this is a conservative change. SEE1_2760 is found in a region of the genome containing the genes for an HlyD-family protein, a large repeat protein, and a T1SS ATPase. The final example is rffG, in which an A:G substitution causes a non-conservative amino acid change from an aspartic acid (negatively charged, hydrophilic) to a glycine (non-polar, uncharged).
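The distinction drawn above between synonymous, conservative, and non-conservative changes can be sketched in Python. The sketch uses the standard genetic code and a simple four-class grouping of side chains (non-polar, polar uncharged, positively charged, negatively charged). This grouping is one common textbook convention and an assumption here, not a scheme stated in this study. The example codons are hypothetical, since the actual codons affected in stiC, SEE1_2760, and rffG are not given in the text.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed side-chain classes (a simplified textbook grouping).
SIDE_CHAIN_CLASS = {}
for cls, residues in {
    "nonpolar": "GAVLIMPFW",
    "polar uncharged": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}.items():
    for residue in residues:
        SIDE_CHAIN_CLASS[residue] = cls


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if "*" in (ref_aa, alt_aa):
        return "stop gained or lost"
    if SIDE_CHAIN_CLASS[ref_aa] == SIDE_CHAIN_CLASS[alt_aa]:
        return "conservative missense"
    return "non-conservative missense"


if __name__ == "__main__":
    # Hypothetical codons chosen to reproduce the amino acid changes in the text.
    print(classify_substitution("AAT", "CAT"))  # Asn -> His
    print(classify_substitution("GCT", "GGT"))  # Ala -> Gly
    print(classify_substitution("GAT", "GGT"))  # Asp -> Gly
```

Under this grouping, the alanine-to-glycine change is labeled conservative and the other two are labeled non-conservative, in line with the descriptions above. A different grouping of side chains could shift borderline cases.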

2.5 Discussion and Conclusions

Salmonella enterica serovar Enteritidis remains a major cause of human morbidity and mortality due to foodborne illness worldwide, including in the United States [3,4,26]. Pulsed-field gel electrophoresis (PFGE) is considered the gold standard method of typing Salmonella to study genetic relatedness [27]. The CDC has recently shown that the most common PFGE type of SE associated with human disease is JEGX01.0004, with only five PFGE types of SE representing 85% of all outbreaks [10,28]. To date, full genome sequencing together with virulence gene profiling of any isolates of SE belonging to the JEGX01.0004 PFGE type, or of any SE isolated from shell eggs, has not been performed. In this study, the genomes of two strains of SE isolated from shell eggs and belonging to the JEGX01.0004 PFGE type were sequenced, assembled, and annotated to understand their genetic structure. The genomes of SEE1 and SEE2 are almost identical with respect to size, number of open reading frames, GC content, and number of rRNA and tRNA loci (Table 2.1). With the exception of one isolate (P125109), the 10 remaining human isolates mapped closely to SEE1 and SEE2 based on gene content and SNP analysis. Although P125109 and 77_1427 both belong to the PFGE type JEGX01.0004, strain 77_1427 is phylogenetically closer to SEE1/2 than P125109 (Figure 2.5).


In concert with this observation, previous studies have shown that different SE isolates with this particular PFGE pattern exhibit a certain degree of genomic diversity [28]. Despite this diversity, this study presents a virulence gene profile representative of SE isolates belonging to the JEGX01.0004 PFGE type. The relatedness of the human isolates of SE to SEE1 and SEE2 (Figures 2.5 and 2.6) suggests that SEE1 and SEE2 contain the genes important for causing human disease. Having a virulence gene profile for SE of this PFGE type is critical not only for understanding the genetic basis of the high prevalence of human SE infection linked to this PFGE type but also for broadening the current understanding of the genes and mechanisms involved in SE’s overall pathobiology. Analysis of the genomes of SEE1 and SEE2 revealed that most of the genes are part of subsystems dedicated to multiple pathways of metabolism and cellular homeostasis. However, over 600 genes have been linked to a virulence mechanism, such as motility, adherence, competition with microbiota, invasion, and intracellular growth and survival.

Flagella-based motility is important for SE’s ability to move through different environments and hosts [17,29]. In SEE1/2, three operons encoding flagellar biosynthesis were identified. Flagellar movement is regulated through chemotaxis, and in most enteric bacteria this is accomplished through the Che- system [30]. This system is present in both SEE1 and SEE2. Flagellar motility and function are among the important factors determining the ability of Salmonella enterica serovars to infect humans or penetrate the various membranes of a shell egg. Flagella not only mediate motility but can also act as adhesins in some enteric bacteria [31].

Adherence and invasion are important functions in SE pathogenesis, and almost 20% of all virulence genes of SEE1 and SEE2 are related to adherence. Most of these are involved in fimbriae-mediated adherence, but there are many non-fimbrial adhesins previously shown to be important for mediating adherence of SE to different epithelial cells. The abundance of fimbrial and non-fimbrial adhesins in SE demonstrates the importance of functional redundancy of adherence in SE colonization and infection of different hosts.

Iron is an essential element for most bacteria, as it is a cofactor of many bacterial proteins. As a result, most pathogenic bacteria possess multiple iron acquisition systems, such as the iut- aerobactin operon or the chu- binding operon found in different pathotypes of Escherichia coli [32,33]. For example, extracellular enteric pathogens like Escherichia coli contain a multitude of genes related to the biosynthesis, export, and import of iron-chelating compounds known as siderophores. SEE1 and SEE2 have genes to produce only two siderophores, enterobactin and salmochelin, the latter of which is a modified enterobactin. The disparity in the number of genes related to iron acquisition between similar species (such as E. coli and S. enterica) may be due to differences in their pathogenesis, as it is likely that more bioavailable iron is present inside cells than in the extracellular milieu. The presence of the iro- operon, which is used for salmochelin biosynthesis and uptake, is important for SE survival in the host because salmochelin cannot be bound by the host siderophore-binding protein lipocalin-2 [34].

There are many two-component systems and quorum sensing systems present in both genomes. Like chemotaxis, two-component signaling systems are important for sensing changes in micro-environmental cues and host-mediated responses, allowing SE to control the expression of certain genes. The phosphate-sensing PhoP/Q two-component system, for example, is involved in the regulation of SPI-2 genes. Aside from chemical signaling, many bacteria have shown the ability to quorum sense, that is, to regulate gene expression through cell-density and stress hormone sensing. The quorum sensing systems regulating LsrR (via autoinducer-2 quorum sensing) and LuxS/R have both been linked to the expression of SPI-1-related genes [35,36]. Recent evidence has shown that S. Typhimurium can sense and respond to norepinephrine through the QseB/C system, similar to enterohemorrhagic E. coli [22,37].


Almost 25% of the virulence genes identified in SEE1 and SEE2 were related to the production of resistance and miscellaneous proteins. Most of these genes go hand in hand with the pathogenic strategies used by SE in the human host. To gain access to molecules for anaerobic respiration, SE depends on the release of reactive oxygen species (ROS) to deplete obligate anaerobic microbiota, thereby releasing thiosulfate, which is converted to tetrathionate and then used as a final electron acceptor through the ttr- operon. Salmonella Enteritidis is able to survive these ROS by producing detoxifying enzymes such as the superoxide dismutase SodC and the catalase KatG. In addition, SE uses toxins to kill epithelial cells, which, in combination with the natural turnover of gastrointestinal tract (GIT) cells, releases ethanolamine. Ethanolamine is then imported and utilized through expression of the eut- operon. There are many other genes aside from these, but these are examples of genes found in SEE1 and SEE2 that may give them an advantage in the competitive environment of the GIT [38].

Full complements of all five SPIs were also found in both isolates, as well as 92 genes implicated in the biosynthesis and function of non-SPI-regulated toxins, effectors, and secretion systems. Because the five SPIs are more or less ubiquitous in isolates of SE of different PFGE types, it is likely that the non-SPI-related factors are important in PFGE type JEGX01.0004 being the type most commonly associated with human disease, in particular disease linked to the consumption of shell eggs and egg products. One such effector gene, slrP, has been shown to be undergoing genetic degeneration through the acquisition of two nonsense mutations [16]. This gene encodes an E3 ubiquitin ligase that targets thioredoxin. In SEE1 and SEE2, the seeI gene also encodes an E3 ubiquitin ligase, suggesting that this enzyme may be the more suitable E3 ubiquitin ligase compared to SlrP. Aside from this gene, three hemolysin-like proteins and HlyD-family transporter genes were found within both genomes. These and the other non-SPI-regulated genes may play an important role in SE pathogenesis, but their roles in those processes are still being studied.


This study also shed light on the role of microevolution of SE at the genomic level, if any, between the source (egg) and the host (in this case, human). As shown in Table 2.18, there is evidence of sequence-based microevolution that occurs during human infection as compared to egg contamination. For genes bearing synonymous SNPs, it is possible that the adaptation is due to codon bias; otherwise, these genes carry no important changes. Though this has not yet been established, codon bias may be more important for bacteria in changing environments because, for example, the availability of the materials required to make certain tRNA molecules may change. Among the genes with an nsSNP, only a few could truly be pathoadaptive (depending on their role in certain hosts), such as stiC, which encodes a fimbrial usher protein, or rffG, which encodes the dTDP-glucose 4,6-dehydratase involved in LPS biosynthesis. Lipopolysaccharides and fimbriae have been shown to be very important in the ability of SE to colonize hosts, invade cells, and survive in extracellular environments, among other functions [3,39,40]. These changes can alter the structure proximal to the affected amino acid, which can affect the function of the gene product. Charge can have a direct effect on interactions and hydrogen bonding capabilities because charge allows for electrostatic interactions and the possibility of covalent bonds, whereas uncharged interactions are based on other forces such as van der Waals forces. To understand how these changes may lead to pathoadaptation of SE during human infection, it is first necessary to elucidate the role these genes play in SE pathogenesis.

Genes that have changes in their promoter sequences can cause a change in the interaction of the cis and trans elements involved in gene expression. Subsequently, this may increase or decrease their expression, both of which can affect the phenotype of the cell.

Examples are the T:C SNP in the promoter region of the nmpC gene and the T:G SNP in the promoter region of the sseB gene. The nmpC gene encodes an outer membrane protein, whereas the gene product of sseB is a T3SS effector that is secreted through the SPI-2 T3SS and thus could be involved in survival and maintenance of the Salmonella-containing vacuole (SCV). Further studies are necessary to understand the effect of these promoter SNPs on their respective genes.

Because some of these SNPs occur in genes that could potentially play a role in a pathogenic trait (such as adhesion), these mutations provide bioinformatics-based evidence for microevolution occurring at the genomic level, leading to pathoadaptive mutations as SE passes between different hosts and environments. The next step will be to determine through in vitro and in vivo analyses whether these mutations are truly pathoadaptive. An alternative interpretation is that, despite the evidence of microevolution, all isolates tested in this study remain very similar in sequence content and SNP-based relationship.

In conclusion, this study provided the first potential virulence gene profile of SE, and more specifically of SE belonging to the JEGX01.0004 PFGE type, which displays over 600 genes dedicated to virulence-associated facets of SE. These genes are good evidence as to why SE is such a well-adapted pathogen in a multitude of hosts, either causing disease or infecting asymptomatically. With the information presented in this study, many unanswered questions can now begin to be addressed. It is likely that the genome of SE isolated from shell eggs represents the mutations accumulated over time for SE to survive in the egg environment, thus allowing for the study of pathoadaptive mutations during human infection. This study demonstrated that SE of the JEGX01.0004 PFGE type appears to be a very adaptable pathogen based on the number of genes dedicated to various aspects of SE’s virulence. These findings can further the understanding of how SE infects a multitude of hosts and help refine the set of genes, and their regulation, responsible for disease.


2.6 Author Contributions

Matthew R. Moreau1,2, Yury V. Ivanov1, Bhushan M. Jayarao1,3, and Subhashinie Kariyawasam1,3

1Veterinary and Biomedical Sciences Department, The Pennsylvania State University

2Pathobiology Graduate Program, The Pennsylvania State University

3Animal Diagnostic Laboratory, Pennsylvania State University

Conceived and Designed Experiments: MRM, YVI, BMJ, SK

Performed Experiments: MRM, YVI

Analyzed Data: MRM

Wrote the Paper: MRM, YVI, BMJ, SK

Figure 2.1: PFGE Profiles of SEE1 and SEE2. Pulsed-field gel electrophoresis was performed on both isolates prior to sequencing. Based on the fingerprint patterns, both SEE1 and SEE2 were assigned the PFGE profile JEGX01.0004. Lane 1, PFGE profile of SEE1; Lane 2, PFGE profile of SEE2; M, molecular weight marker.


Figure 2.2A: The Circular Map of SEE1 Genome with its GC Content and GC Skew.


Figure 2.2B: The Circular Map of SEE2 Genome with its GC Content and GC Skew.
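The GC content and GC skew tracks drawn on circular genome maps such as Figures 2.2A and 2.2B are typically computed in sliding windows along the sequence, with GC skew defined as (G - C)/(G + C). The following Python sketch shows that calculation. The window and step sizes and the toy sequence are assumptions for illustration, as the parameters used to draw the figures are not stated here.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 when the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, GC content, GC skew) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        yield start, gc_content(chunk), gc_skew(chunk)


if __name__ == "__main__":
    toy_sequence = "GGGGCCCCATATGCGC"  # hypothetical sequence, not genome data
    for start, content, skew in sliding_windows(toy_sequence, window=4, step=4):
        print(start, round(content, 2), round(skew, 2))
```

For a genome of about 4.67 Mb, windows of several kilobases would be a more realistic choice than the four-base windows used in this toy run.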


Figure 2.3: RAST-Predicted Genome Composition of SEE1 and SEE2. After the genomes were closed, they were submitted for automated annotation and composition breakdown through Rapid Annotation using Subsystem Technology (RAST). Gene product functions were predicted and assigned to subsystems, if known. Subsystem coverage shows the percentage of genes assigned to a subsystem in green (the subsystems are listed on the far right and represented in the pie chart) and the percentage with unknown assignment in blue.


Figure 2.4: Relative Distribution of Virulence-Associated Genes in SEE1 and SEE2. Over 600 genes that are potentially associated, directly or indirectly, with SE virulence were discovered in the genomes of SEE1 and SEE2. The percentages overlaid on the pie graph depict the relative percentages of those 600 genes belonging to each category of virulence. The total number of genes per category is also listed around the center.


Figure 2.5: SNP-Based Phylogenetic Analysis of 11 Human Isolates of SE and SEE1. This phylogenetic tree was constructed using whole genome SNP analysis. Eleven human isolates were chosen at random and compared to SEE1 for SNP differences.
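A whole-genome SNP comparison of this kind generally starts from pairwise SNP differences between isolates. The Python sketch below counts those differences for aligned SNP profiles. The isolate names and sequences are hypothetical placeholders, and the tree-building method actually used for Figure 2.5 is not specified in this caption, so no tree is constructed.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(a: str, b: str) -> int:
    """Number of positions at which two aligned SNP profiles differ."""
    if len(a) != len(b):
        raise ValueError("SNP profiles must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(profiles: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Symmetric pairwise SNP distances keyed by (isolate, isolate)."""
    names = sorted(profiles)
    dist = {(name, name): 0 for name in names}
    for a, b in combinations(names, 2):
        d = snp_distance(profiles[a], profiles[b])
        dist[(a, b)] = d
        dist[(b, a)] = d
    return dist


if __name__ == "__main__":
    # Hypothetical SNP profiles: one base per variable site, same site order.
    profiles = {
        "isolate_A": "ACGTA",
        "isolate_B": "ACGTA",
        "isolate_C": "ATGCA",
    }
    matrix = distance_matrix(profiles)
    for pair in sorted(matrix):
        print(pair, matrix[pair])
```

A distance matrix like this could be passed to a clustering or tree-building tool; isolates separated by few SNPs would group together, as described for most of the human isolates and SEE1/2.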


Figure 2.6: BRIG Analysis of SE Genomes Used in this Study. Each genome sequence was downloaded from the NCBI GenBank database and input into the BLAST Ring Image Generator for ORF alignment. Places that are discontinuous within each ring are the regions of dissimilarity as compared to SEE1 and SEE2. Each color ring and percent identity based on shading is indicated by the legend on the right.


Table 2.1: Genome Statistics for SEE1 and SEE2.

| Statistic | SEE1 | SEE2 |
|---|---|---|
| Length of Genome (Mb) | 4.67 | 4.67 |
| GC Ratio (%) | 52 | 52 |
| Open Reading Frames (ORF) | 4677 | 4671 |
| Density (Genes/Kb) | 0.999 | 0.998 |
| Number of rRNA | 22 | 22 |
| Number of tRNA | 84 | 84 |

Table 2.2: Strain Identification and Source of Genome.

| Strain Designation | Accession # |
|---|---|
| 77-1427 | CP007598.1 |
| Durban | CP007507.1 |
| CDC_2010K_0968 | CP007528.1 |
| EC20110223 | CP007266.1 |
| EC20110356 | CP007262.1 |
| EC20110357 | CP007261.1 |
| EC20110358 | CP007260.1 |
| EC20110359 | CP007259.1 |
| EC20110360 | CP007258.1 |
| EC20110361 | CP007263.1 |
| P125109 | AM933172.1 |
| SEE1 | this study |
| SEE2 | this study |


Table 2.3: Genes that Exhibit Frameshifts.

| SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description |
|---|---|---|---|
| slrP | invasion plasmid antigen / internalin, putative | slrP | invasion plasmid antigen / internalin, putative |
| agp | Glucose-1-phosphatase | agp | Glucose-1-phosphatase |
| SEE1_2169 | putative cytoplasmic protein | SEE2_2166 | putative cytoplasmic protein |
| SEE1_3845 | ATP binding protein | SEE2_3838 | ATP binding protein |
| yihO | Glucuronide transport protein YihP | yihO | Glucuronide transport protein YihP |
| idnT | L-idonate, D-gluconate, 5-keto-D-gluconate transporter | idnT | L-idonate, D-gluconate, 5-keto-D-gluconate transporter |
| mrr | FIG01047896: hypothetical protein | mrr | FIG01047896: hypothetical protein |
| ubiC | 4-hydroxybenzoate polyprenyltransferase | ubiC | 4-hydroxybenzoate polyprenyltransferase |
| SEE1_3889 | Putative periplasmic protein | SEE2_3882 | Putative periplasmic protein |

ybiF Putative DMT superfamily metabolite efflux protein precursor


Table 2.4: Phage Regions and Phage-Associated Loci in SEE1 and SEE2.

SEE1 SEE2 Phage Phage Description Phage Locus Description Region Locus SEE2_0279 Uncharacterized protein ImpA SEE2_0279 Uncharacterized protein ImpA

SEE2_0280 Uncharacterized protein ImpH/VasB SEE2_0280 Uncharacterized protein ImpH/VasB

SEE2_0281 Probable secreted protein SEE2_0281 Probable secreted protein SEE2_0282 Putative cytoplasmic protein SEE2_0282 Putative cytoplasmic protein

SEE2_0283 VgrG protein SEE2_0283 VgrG protein SEE2_0284 Rhs-family protein SEE2_0284 Rhs-family protein SEE2_0285 FIG01046258: hypothetical protein SEE2_0285 FIG01046258: hypothetical protein

SEE2_0286 FIG01046370: hypothetical protein SEE2_0286 FIG01046370: hypothetical protein SEE2_0287 FIG01046306: hypothetical protein SEE2_0287 FIG01046306: hypothetical protein

SEE2_0288 rhs core protein with extension SEE2_0288 rhs core protein with extension

SEE2_0289 hypothetical protein SEE2_0289 hypothetical protein 1 SEE2_0290 FIG01046337: hypothetical protein SEE2_0290 FIG01046337: hypothetical protein SEE2_0291 putative cytoplasmic protein SEE2_0291 putative cytoplasmic protein SEE2_0292 FIG01046190: hypothetical protein SEE2_0292 FIG01046190: hypothetical protein SEE2_0293 Putative SEE2_0293 Putative transposase SEE2_0294 No product SEE2_0294 No product SEE2_0295 Mobile element protein SEE2_0295 Mobile element protein safA Fimbrial Lipoprotein safA Fimbrial Lipoprotein Periplasmic fimbrial chaperone Periplasmic fimbrial chaperone safB safB protein protein FIG034929: Fimbriae usher protein FIG034929: Fimbriae usher protein safC safC SafC SafC safD Putative fimbrial structural subunit safD Putative fimbrial structural subunit ybeJ Polysaccharide deacetylase ybeJ Polysaccharide deacetylase SEE2_0301 Putative protein SEE2_0301 Putative protein SEE2_0302 TnpA SEE2_0302 TnpA SEE1_0975 Gifsy-2 prophage RecT SEE2_0973 Gifsy-2 prophage RecT

SEE1_0976 FIG01046854: hypothetical protein SEE2_0974 FIG01046854: hypothetical protein

SEE1_0977 FIG01046508: hypothetical protein SEE2_0975 FIG01046508: hypothetical protein SEE1_0978 Phage tail fiber protein SEE2_0976 Phage tail fiber protein

SEE1_0979 Phage tail fibers SEE2_0977 Phage tail fibers 2 sseI Secreted effector protein sseI Secreted effector protein SEE1_0981 Transposase SEE2_0979 Transposase SEE1_0982 Gifsy-2 prophage protein SEE2_0980 Gifsy-2 prophage protein SEE1_0983 FIG01047756: hypothetical protein SEE2_0981 FIG01047756: hypothetical protein SEE1_0984 FIG01046696: hypothetical protein SEE2_0982 FIG01046696: hypothetical protein SEE1_0985 hypothetical protein SEE2_0983 hypothetical protein

64

SEE1_0986 Virulence protein msgA SEE2_0984 Virulence protein msgA SEE1_0987 hypothetical protein SEE2_0985 hypothetical protein SEE1_1216 Copper resistance protein D SEE2_1211 Copper resistance protein D Putative periplasmic or exported Putative periplasmic or exported SEE1_1217 SEE2_1212 protein protein

SEE1_1218 Mobile element protein SEE2_1213 Mobile element protein

SEE1_1219 excisionase SEE2_1214 excisionase SEE1_1220 Putative SEE2_1215 Putative hydrolase

SEE1_1221 hypothetical protein SEE2_1216 hypothetical protein SEE1_1222 FIG01047714: hypothetical protein SEE2_1217 FIG01047714: hypothetical protein SEE1_1223 Phage protein SEE2_1218 Phage protein

SEE1_1224 Gifsy-2 prophage protein SEE2_1219 Gifsy-2 prophage protein SEE1_1225 antiterminator-like protein SEE2_1220 antiterminator-like protein

SEE1_1226 hypothetical protein SEE2_1221 hypothetical protein

SEE1_1227 FIG01048890: hypothetical protein SEE2_1222 FIG01048890: hypothetical protein SEE1_1228 hypothetical protein SEE2_1223 hypothetical protein

SEE1_1229 GtgA SEE2_1224 GtgA

Phage holin #Lambda-like group I SEE1_1230 Phage holin #Lambda-like group I holin SEE2_1225 holin SEE1_1231 Phage lysozyme SEE2_1226 Phage lysozyme Phage outer membrane lytic protein Phage outer membrane lytic protein SEE1_1232 SEE2_1227 Rz; Endopeptidase Rz; Endopeptidase SEE1_1233 FIG01049225: hypothetical protein SEE2_1228 FIG01049225: hypothetical protein 3 Attachment invasion locus protein Attachment invasion locus protein SEE1_1234 SEE2_1229 precursor precursor Superoxide dismutase [Cu-Zn] Superoxide dismutase [Cu-Zn] sodC sodC precursor precursor SEE1_1236 Phage minor tail protein SEE2_1231 Phage minor tail protein SEE1_1237 Phage tail assembly protein SEE2_1232 Phage tail assembly protein SEE1_1238 Phage tail fiber protein SEE2_1233 Phage tail fiber protein SEE1_1239 Phage tail fiber protein SEE2_1234 Phage tail fiber protein SEE1_1240 Phage tail fibers SEE2_1235 Phage tail fibers invasion-associated secreted SEE1_1241 invasion-associated secreted protein. SEE2_1236 protein. DNA invertase from prophage CP- DNA invertase from prophage CP- SEE1_1242 SEE2_1237 933H 933H SEE1_1243 FIG01045615: hypothetical protein SEE2_1238 FIG01045615: hypothetical protein SEE1_1244 Mobile element protein SEE2_1239 Mobile element protein SEE1_1245 FIG01047716: hypothetical protein SEE2_1240 FIG01047716: hypothetical protein Phage lysin, 1,4-beta-N- Phage lysin, 1,4-beta-N- SEE1_1246 SEE2_1241 acetylmuramidase acetylmuramidase Homology to phage-tail assembly Homology to phage-tail assembly SEE1_1247 SEE2_1242 proteins proteins SEE1_1248 lytic enzyme SEE2_1243 lytic enzyme SEE1_1249 FIG01046582: hypothetical protein SEE2_1244 FIG01046582: hypothetical protein SEE1_1250 FIG01045807: hypothetical protein SEE2_1245 FIG01045807: hypothetical protein

65

SEE1_1251 FIG01047586: hypothetical protein SEE2_1246 FIG01047586: hypothetical protein SEE1_1252 FIG01046232: hypothetical protein SEE2_1247 FIG01046232: hypothetical protein SEE1_1253 Phage tail fiber protein SEE2_1248 Phage tail fiber protein mig-3 phage tail assembly-like protein mig-3 phage tail assembly-like protein SEE1_1255 Phage tail fiber protein SEE2_1250 Phage tail fiber protein SEE1_1256 Hypothetical protein SEE2_1251 Hypothetical protein SEE1_1257 Hypothetical protein SEE2_1252 hypothetical protein SEE1_1258 hypothetical protein SEE2_1253 hypothetical protein SEE1_1259 FIG01045658: hypothetical protein SEE2_1254 FIG01045658: hypothetical protein pagO Inner membrane protein pagO Inner membrane protein SEE1_1261 FIG01045706: hypothetical protein SEE2_1256 FIG01045706: hypothetical protein SEE1_1262 Hypothetical protein SEE2_1257 hypothetical protein SEE1_1263 Hypothetical protein SEE2_1258 Hypothetical protein SEE1_1264 Mobile element protein SEE2_1259 Mobile element protein Conserved secreted hypothetical Conserved secreted hypothetical SEE1_1265 SEE2_1260 protein protein SEE1_1266 hypothetical protein SEE2_1261 hypothetical protein SEE1_1267 Putative acetyltransferase SEE2_1262 Putative acetyltransferase SEE1_1268 FIG01045929: hypothetical protein SEE2_1263 FIG01045929: hypothetical protein SEE1_1269 Putative cytoplasmic protein SEE2_1264 Putative cytoplasmic protein SEE1_1270 FIG01045215: hypothetical protein SEE2_1265 FIG01045215: hypothetical protein SEE1_1271 G-nucleotide exchange factor SopE sopE2 G-nucleotide exchange factor SopE SEE1_1272 FIG01046404: hypothetical protein SEE2_1267 FIG01046404: hypothetical protein STY1986 from Accession AL513382: STY1986 from Accession AL513382: SEE1_1273 SEE2_1268 Salmonella typhi CT18 Salmonella typhi CT18 SEE1_1274 FIG01046004: hypothetical protein SEE2_1269 FIG01046004: hypothetical protein SEE1_1275 Ren protein prpA Ren protein SEE1_1487 Phage SEE2_1481 Phage integrase

SEE1_1488 hypothetical protein SEE2_1482 hypothetical protein

SEE1_1489 hypothetical protein SEE2_1483 hypothetical protein SEE1_1490 Kil protein SEE2_1484 Kil protein

SEE1_1491 hypothetical protein SEE2_1485 hypothetical protein SEE1_1492 hypothetical protein SEE2_1486 hypothetical protein SEE1_1493 Mobile element protein SEE2_1487 Mobile element protein

SEE1_1494 hypothetical protein SEE2_1488 hypothetical protein SEE1_1495 hypothetical protein SEE2_1489 hypothetical protein

SEE1_1496 Putative protein SEE2_1490 Putative protein

SEE1_1497 Unknown function SEE2_1491 Unknown function SEE1_1498 conserved hypothetical protein SEE2_1492 conserved hypothetical protein 4 SEE1_1499 Phage protein SEE2_1493 Phage protein SEE1_1500 FIG00639062: hypothetical protein SEE2_1494 FIG00639062: hypothetical protein SEE1_1501 Phage antitermination protein Q SEE2_1495 Phage antitermination protein Q


SEE1_1502 hypothetical protein SEE2_1496 hypothetical protein SEE1_1503 hypothetical protein SEE2_1497 hypothetical protein SEE1_1504 FIG00638630: hypothetical protein SEE2_1498 FIG00638630: hypothetical protein putative prophage membrane SEE1_1505 putative prophage membrane protein SEE2_1499 protein SEE1_1506 hypothetical protein SEE2_1500 hypothetical protein SEE1_1507 hypothetical protein SEE2_1501 hypothetical protein SEE1_1508 site-specific recombination SEE2_1502 site-specific recombination SEE1_1509 FIG01046477: hypothetical protein SEE2_1503 FIG01046477: hypothetical protein SEE1_2064 putative transposase SEE2_2058 putative transposase

SEE1_2065 Mobile element protein SEE2_2059 Mobile element protein

SEE1_2066 FIG01046422: hypothetical protein SEE2_2060 FIG01046422: hypothetical protein 5 SEE1_2067 Hypothetical protein SEE2_2061 Hypothetical protein SEE1_2068 FIG01046824: hypothetical protein SEE2_2062 FIG01046824: hypothetical protein tRNA tRNA-Ser-CGA tRNA tRNA-Ser-CGA yeeI FIG01220476: hypothetical protein yeeI FIG01220476: hypothetical protein tRNA tRNA-Asn-GTT tRNA tRNA-Asn-GTT SEE1_2070 FIG01048042: hypothetical protein SEE1_2064 FIG01048042: hypothetical protein tRNA tRNA-Asn-GTT tRNA tRNA-Asn-GTT SEE1_2071 integrase SEE1_2065 integrase SEE1_2758 FIG01045174: hypothetical protein SEE2_2754 FIG01045174: hypothetical protein

SEE1_2759 Large repetitive protein SEE2_2755 Large repetitive protein

SEE1_2760 FIG01045638: hypothetical protein SEE2_2756 FIG01045638: hypothetical protein Putative type I secretion protein, ATP- Putative type I secretion protein, SEE1_2761 SEE2_2757 binding protein ATP-binding protein Putative HlyD family secretion SEE1_2762 Putative HlyD family secretion protein SEE2_2758 protein SEE1_2763 FIG01049483: hypothetical protein SEE2_2759 FIG01049483: hypothetical protein

SEE1_2764 Gene D protein SEE2_2760 Gene D protein SEE1_2765 Phage tail protein SEE2_2761 Phage tail protein STY4603 from Accession AL513382: STY4603 from Accession AL513382: SEE1_2766 SEE2_2762 Salmonella typhi CT18 Salmonella typhi CT18 SEE1_2767 putative phage tail protein SEE2_2763 putative phage tail protein

SEE1_2768 Tail protein SEE2_2764 Tail protein SEE1_2769 Phage major tail tube protein SEE2_2765 Phage major tail tube protein

SEE1_2770 Phage major tail sheath protein SEE2_2766 Phage major tail sheath protein

SEE1_2771 hypothetical protein SEE2_2767 hypothetical protein SEE1_2772 hypothetical protein SEE2_2768 hypothetical protein

SEE1_2773 Phage tail fiber protein SEE2_2769 Phage tail fibers 6 SEE1_2774 Tail fiber protein SEE2_2770 Tail fiber protein SEE1_2775 Phage tail fibers SEE2_2771 Phage tail fibers SEE1_2776 Baseplate assembly protein J SEE2_2772 Baseplate assembly protein J SEE1_2777 Phage baseplate assembly protein SEE2_2773 Phage baseplate assembly protein


SEE1_2778 Baseplate assembly protein V SEE2_2774 Baseplate assembly protein V SEE1_2779 FIG01047449: hypothetical protein SEE2_2775 FIG01047449: hypothetical protein SEE1_2780 Phage tail completion protein SEE2_2776 Phage tail completion protein SEE1_2781 Phage tail protein SEE2_2777 Phage tail protein SEE1_2782 Phage spanin Rz SEE2_2778 Phage spanin Rz SEE1_2783 Phage lysin SEE2_2779 Phage lysin SEE1_2784 possible secretory protein SEE2_2780 possible secretory protein SEE1_2785 Phage tail X SEE2_2781 Phage tail X Phage head completion-stabilization Phage head completion- SEE1_2786 SEE2_2782 protein stabilization protein Phage terminase, Phage terminase, endonuclease SEE1_2787 SEE2_2783 subunit subunit SEE1_2788 Phage major capsid protein SEE2_2784 Phage major capsid protein SEE1_2789 Phage capsid scaffolding protein SEE2_2785 Phage capsid scaffolding protein

SEE1_2790 Phage terminase, ATPase subunit SEE2_2786 Phage terminase, ATPase subunit

SEE1_2791 Phage capsid and scaffold SEE2_2787 Phage capsid and scaffold SEE1_2792 hypothetical protein SEE2_2788 hypothetical protein

SEE1_2793 Hypothetical protein SEE2_2789 Hypothetical protein SEE1_2794 FIG00641226: hypothetical protein SEE2_2790 FIG00641226: hypothetical protein SEE1_2795 hypothetical protein SEE2_2791 hypothetical protein

SEE1_2796 Phage replication protein SEE2_2792 Phage replication protein SEE1_2797 Phage replication protein SEE2_2793 Phage replication protein Methyl-directed repair DNA adenine Methyl-directed repair DNA SEE1_2798 SEE2_2794 methylase adenine methylase

SEE1_2799 Phage protein SEE2_2795 Phage protein

STY3665 from Accession AL513382: STY3665 from Accession AL513382: SEE1_2800 SEE2_2796 Salmonella typhi CT18 Salmonella typhi CT18 SEE1_2801 FIG01045453: hypothetical protein SEE2_2797 FIG01045453: hypothetical protein 6 SEE1_2802 FIG00640946: hypothetical protein SEE2_2798 FIG00640946: hypothetical protein SEE1_2803 Regulatory protein CII SEE2_2799 Regulatory protein CII SEE1_2804 Phage regulatory protein SEE2_2800 Phage regulatory protein SEE1_2805 Phage repressor protein cI SEE2_2801 Phage repressor protein cI SEE1_2806 Phage integrase SEE2_2802 Phage integrase SEE1_2807 hypothetical protein SEE2_2803 hypothetical protein SEE1_2808 membrane protein SEE2_2804 membrane protein SEE1_2809 FIG01047617: hypothetical protein SEE2_2805 FIG01047617: hypothetical protein SEE1_2810 Hypothetical protein SEE2_2806 Hypothetical protein SEE1_2811 Mobile element protein SEE2_2807 Mobile element protein SEE1_2812 Mobile element protein SEE2_2808 Mobile element protein SEE1_2813 Mobile element protein SEE2_2809 Mobile element protein SEE1_2814 Hypothetical protein SEE2_2810 Hypothetical protein SEE1_2815 hypothetical protein SPUL_2764 SEE2_2811 hypothetical protein SPUL_2764


Table 2.5: Fimbrial Adherence-Related Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
bcfA | Type 1 fimbriae major subunit FimA | bcfA | Type 1 fimbriae major subunit FimA
bcfB | Chaperone FimC | bcfB | Chaperone FimC
papC | Type 1 fimbriae anchoring protein FimD | papC | Type 1 fimbriae anchoring protein FimD
bcfD | Fimbriae-like adhesin SfmH | bcfD | Fimbriae-like adhesin SfmH
bcfE | Type 1 fimbrae adaptor subunit FimF | bcfE | Type 1 fimbrae adaptor subunit FimF
bcfF | Type 1 fimbrae adaptor subunit FimF | bcfF | Type 1 fimbrae adaptor subunit FimF
bcfG | Hypothetical fimbrial chaperone ycbF precursor | bcfG | Hypothetical fimbrial chaperone ycbF precursor
hofC | Type IV fimbrial assembly protein PilC | hofC | Type IV fimbrial assembly protein PilC
hofB | Type IV fimbrial assembly, ATPase PilB | hofB | Type IV fimbrial assembly, ATPase PilB
ppd | Type IV pilin PilA | ppd | Type IV pilin PilA
stiH | Putative fimbriae | stiH | Putative fimbriae
stiC | FIG100795: Fimbriae usher protein StiC | stiC | FIG100795: Fimbriae usher protein StiC
stiB | Chaperone protein EcpD | stiB | Chaperone protein EcpD
stiA | Putative fimbrial subunit | stiA | Putative fimbrial subunit
stfA | Major fimbrial subunit StfA | stfA | Major fimbrial subunit StfA
stfC | Fimbriae usher protein StfC | stfC | Fimbriae usher protein StfC
stfD | Periplasmic fimbrial chaperone StfD | stfD | Periplasmic fimbrial chaperone StfD
stfE | Minor fimbrial subunit StfE | stfE | Minor fimbrial subunit StfE
stfF | Minor fimbrial subunit StfF | stfF | Minor fimbrial subunit StfF
stfG | Minor fimbrial subunit StfG | stfG | Minor fimbrial subunit StfG
safA | Putative fimbrial lipoprotein | safA | Putative fimbrial lipoprotein
safB | Periplasmic fimbrial chaperone protein | safB | Periplasmic fimbrial chaperone protein
safC | FIG034929: Fimbriae usher protein SafC | safC | FIG034929: Fimbriae usher protein SafC
safD | Putative fimbrial structural subunit | safD | Putative fimbrial structural subunit
crl | Curlin genes transcriptional activator | crl | Curlin genes transcriptional activator
stbE | Putative pilus chaperone, PapD family | stbE | Putative pilus chaperone, PapD family
stbD | Putative exported protein precursor | stbD | Putative exported protein precursor
stbC | outer membrane fimbrial usher protein | stbC | outer membrane fimbrial usher protein
stbB | Putative fimbrial chaperone | stbB | Putative fimbrial chaperone
stbA | Fimbrial protein precursor | stbA | Fimbrial protein precursor
fimA | Fimbriae-like adhesin SfmA | fimA | Fimbriae-like adhesin SfmA
fimI | Fimbriae-like adhesin FimI | fimI | Fimbriae-like adhesin FimI
fimC | Chaperone FimC | fimC | Chaperone FimC
fimD | Outer membrane usher protein SfmD | fimD | Outer membrane usher protein SfmD
fimH | Fimbriae-like adhesin SfmH | fimH | Fimbriae-like adhesin SfmH
fimF | Fimbriae-like periplasmic protein SfmF | fimF | Fimbriae-like periplasmic protein SfmF


fimZ | Transcriptional regulator of fimbriae expression FimZ | fimZ | Transcriptional regulator of fimbriae expression FimZ
fimY | Transcriptional regulator of fimbriae expression FimY | fimY | Transcriptional regulator of fimbriae expression FimY
fimW | Fimbriae W protein | fimW | Fimbriae W protein
csgC | Putative curli production protein CsgC | csgC | Putative curli production protein CsgC
csgA | Major curlin subunit precursor CsgA | csgA | Major curlin subunit precursor CsgA
csgB | Minor curlin subunit CsgB | csgB | Minor curlin subunit CsgB
csgD | Transcriptional regulator CsgD for 2nd curli operon | csgD | Transcriptional regulator CsgD for 2nd curli operon
csgE | Curli production assembly/transport component CsgE | csgE | Curli production assembly/transport component CsgE
csgF | Curli production assembly/transport component CsgF | csgF | Curli production assembly/transport component CsgF
csgG | Curli production assembly/transport component CsgG | csgG | Curli production assembly/transport component CsgG
SEE1_2078 | PilV-like protein | SEE2_2074 | PilV-like protein
SEE1_2079 | Putative type IV pilin protein precursor | SEE2_2075 | Putative type IV pilin protein precursor
SEE1_2080 | Hypothetical protein | SEE2_2076 | Hypothetical protein
SEE1_2081 | Conjugal transfer protein TraA | SEE2_2077 | Conjugal transfer protein TraA
pegD | Uncharacterized protein YehA precursor | pegD | Uncharacterized protein YehA precursor
pegC | Fimbriae usher protein StcC | pegC | Fimbriae usher protein StcC
pegB | Uncharacterized fimbrial chaperone YehC precursor | pegB | Uncharacterized fimbrial chaperone YehC precursor
pegA | Probable fimbrial chain protein stcA | pegA | Probable fimbrial chain protein stcA
SEE1_3011 | Fimbriae usher protein StfC | SEE2_3008 | Fimbriae usher protein StfC
SEE1_3012 | Periplasmic fimbrial chaperone | SEE2_3009 | Periplasmic fimbrial chaperone
SEE1_3013 | MrfF | SEE2_3010 | MrfF
SEE1_3014 | MrfF | SEE2_3011 | MrfF
SEE1_3015 | Fimbrial subunit | SEE2_3012 | Fimbrial subunit
stdC | Probable fimbrial chaperone protein | stdC | Probable fimbrial chaperone protein
stdB | FIG036507: Fimbriae usher protein StdB | stdB | FIG036507: Fimbriae usher protein StdB
stdA | Putative fimbrial-like protein | stdA | Putative fimbrial-like protein
hofQ | Type IV pilus biogenesis protein PilQ | hofQ | Type IV pilus biogenesis protein PilQ
yrfA | Type IV pilus biogenesis protein PilP | yrfA | Type IV pilus biogenesis protein PilP
yrfB | Type IV pilus biogenesis protein PilO | yrfB | Type IV pilus biogenesis protein PilO
yrfC | Type IV pilus biogenesis protein PilN | yrfC | Type IV pilus biogenesis protein PilN
yrfD | Type IV pilus biogenesis protein PilM | yrfD | Type IV pilus biogenesis protein PilM
lpfE | Putative fimbrial protein precursor | lpfE | Putative fimbrial protein precursor
lpfD | Putative fimbrial protein | lpfD | Putative fimbrial protein
lpfC | Type 1 fimbriae anchoring protein FimD | lpfC | Type 1 fimbriae anchoring protein FimD
lpfB | Chaperone protein lpfB precursor | lpfB | Chaperone protein lpfB precursor
lpfA | Long polar fimbria protein A precursor | lpfA | Long polar fimbria protein A precursor
SEE1_4547 | Fimbrial protein precursor | SEE2_4541 | Fimbrial protein precursor
SEE1_4548 | Fimbrial chaperone protein | SEE2_4542 | Fimbrial chaperone protein


SEE1_4549 | Hypothetical protein | SEE2_4543 | Hypothetical protein
SEE1_4550 | Outer membrane fimbrial usher protein | SEE2_4544 | Outer membrane fimbrial usher protein
sthE | Putative major fimbrial subunit | sthE | Putative major fimbrial subunit
sthD | Putative fimbrial subunit | sthD | Putative fimbrial subunit
sthB | Type 1 fimbriae anchoring protein FimD | sthB | Type 1 fimbriae anchoring protein FimD
sthA | Putative fimbrial chaperone protein | sthA | Putative fimbrial chaperone protein
SEE1_4673 | Putative fimbrial protein | SEE2_4667 | Putative fimbrial protein
pegD | Uncharacterized protein YehA precursor | pegD | Uncharacterized protein YehA precursor
pegC | Fimbriae usher protein StcC | pegC | Fimbriae usher protein StcC
pegB | Uncharacterized fimbrial chaperone YehC precursor | pegB | Uncharacterized fimbrial chaperone YehC precursor
pegA | Probable fimbrial chain protein stcA | pegA | Probable fimbrial chain protein stcA
ppdC | FIG004136: Prepilin peptidase dependent protein C precursor | ppdC | FIG004136: Prepilin peptidase dependent protein C precursor
ygdB | FIG006270: hypothetical protein | ygdB | FIG006270: hypothetical protein
ppdB | FIG004819: Prepilin peptidase dependent protein B precursor | ppdB | FIG004819: Prepilin peptidase dependent protein B precursor
ppdA | Prepilin peptidase dependent protein A precursor | ppdA | Prepilin peptidase dependent protein A precursor
stdC | Probable fimbrial chaperone protein | stdC | Probable fimbrial chaperone protein
stdB | FIG036507: Fimbriae usher protein StdB | stdB | FIG036507: Fimbriae usher protein StdB
stdA | Putative fimbrial-like protein | stdA | Putative fimbrial-like protein


Table 2.6: Non-Fimbrial Adhesin Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
SEE1_0229 | Enhancin | SEE2_0228 | Enhancin
yaeT | Outer membrane protein assembly factor YaeT | yaeT | Outer membrane protein assembly factor YaeT
hlpA | Outer membrane protein H precursor | hlpA | Outer membrane protein H precursor
SEE1_0348 | Attachment invasion locus protein precursor | SEE2_0347 | Attachment invasion locus protein precursor
ompX | Attachment invasion locus protein precursor | ompX | Attachment invasion locus protein precursor
ompF | Outer membrane protein F precursor | ompF | Outer membrane protein F precursor
ompA | Outer membrane protein A precursor | ompA | Outer membrane protein A precursor
ompC | Outer membrane protein C precursor | ompC | Outer membrane protein C precursor
ychP | Invasin | ychP | Invasin
ompW | Outer membrane protein W precursor | ompW | Outer membrane protein W precursor
SEE1_1469 | Invasin-like protein | SEE2_1464 | Invasin-like protein
SEE1_1638 | Outer membrane protein C precursor | SEE2_1631 | Outer membrane protein C precursor
ompN | Outer membrane protein N precursor | ompN | Outer membrane protein N precursor
pagC | Attachment invasion locus protein precursor | pagC | Attachment invasion locus protein precursor
ompC | Outer membrane protein C precursor | ompC | Outer membrane protein C precursor
shdA | AIDA autotransporter-like protein | shdA | AIDA autotransporter-like protein
sinH | Adherence and invasion outermembrane protein | sinH | Adherence and invasion outermembrane protein
yfgL | Outer membrane protein YfgL, lipoprotein component | yfgL | Outer membrane protein YfgL, lipoprotein component
SEE1_3097 | Attachment invasion locus protein precursor | SEE2_3094 | Attachment invasion locus protein precursor
yiaD | Outer membrane protein A precursor | yiaD | Outer membrane protein A precursor
misL | Autotransporter | misL | Autotransporter
yidE | Mediator of hyperadherence YidE | yidE | Mediator of hyperadherence YidE
yidQ | Outer membrane lipoprotein YidQ | yidQ | Outer membrane lipoprotein YidQ


Table 2.7: Salmonella Pathogenicity Island 1 (SPI-1) Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
avrA | Type III secretion injected virulence protein-NF-κB Inhibition | avrA | Type III secretion injected virulence protein-NF-κB Inhibition
sprB | SPI1-associated transcriptional regulator SprB | sprB | SPI1-associated transcriptional regulator SprB
hilC | Type III secretion transcriptional regulator HilC | hilC | Type III secretion transcriptional regulator HilC
SEE1_2918 | Putative effector protein OrgC of SPI-1 T3SS | SEE2_2915 | Putative effector protein OrgC of SPI-1 T3SS
ogrA | OrgB protein, associated with ATPase of T3SS | ogrA | OrgB protein, associated with ATPase of T3SS
SEE1_2920 | Oxygen-regulated invasion protein OrgA | SEE2_2917 | Oxygen-regulated invasion protein OrgA
prgK | Type III secretion bridge lipoprotein | prgK | Type III secretion bridge lipoprotein
prgJ | Type III secretion system protein | prgJ | Type III secretion system protein
prgI | Type III secretion cytoplasmic protein (YscF) | prgI | Type III secretion cytoplasmic protein (YscF)
prgH | Type III secretion protein EprH | prgH | Type III secretion protein EprH
hilD | Type III secretion transcriptional regulator HilD | hilD | Type III secretion transcriptional regulator HilD
hilA | Type III secretion transcriptional activator HilA | hilA | Type III secretion transcriptional activator HilA
iagB | Invasion protein IagB precursor | iagB | Invasion protein IagB precursor
sptP | Type III secretion injected virulence protein | sptP | Type III secretion injected virulence protein
sicP | secretion chaparone | sicP | secretion chaparone
SEE1_2930 | Found within S. typhi pathogenicity island | SEE2_2927 | Found within S. typhi pathogenicity island
iacP | Probable acyl carrier protein iacP | iacP | Probable acyl carrier protein iacP
sipA | Type III secretion injected virulence protein (YopE) | sipA | Type III secretion injected virulence protein (YopE)
sipD | Cell invasion protein SipD (Salmonella invasion protein D) | sipD | Cell invasion protein SipD (Salmonella invasion protein D)
sipC | Cell invasion protein sipC (Effector protein SipC) | sipC | Cell invasion protein sipC (Effector protein SipC)
sipB | Cell invasion protein SipB | sipB | Cell invasion protein SipB
sicA | Type III secretion chaperone protein for YopD (SycD) | sicA | Type III secretion chaperone protein for YopD (SycD)
spaS | Type III secretion inner membrane protein | spaS | Type III secretion inner membrane protein
spaR | Type III secretion inner membrane protein | spaR | Type III secretion inner membrane protein
spaQ | Type III secretion inner membrane protein | spaQ | Type III secretion inner membrane protein
spaP | Type III secretion inner membrane protein | spaP | Type III secretion inner membrane protein
spaO | Type III secretion inner membrane protein | spaO | Type III secretion inner membrane protein
invJ | Type III secretion host injection and negative regulator protein | invJ | Type III secretion host injection and negative regulator protein


invI | Surface presentation of antigens protein SpaM | invI | Surface presentation of antigens protein SpaM
invC | Probable ATP synthase SpaL (Invasion protein InvC) | invC | Probable ATP synthase SpaL (Invasion protein InvC)
invB | Type III secretion system protein BsaR | invB | Type III secretion system protein BsaR
invA | Type III secretion inner membrane channel protein | invA | Type III secretion inner membrane channel protein
invE | Type III secretion outermembrane contact sensing protein | invE | Type III secretion outermembrane contact sensing protein
invG | Type III secretion outermembrane pore forming protein | invG | Type III secretion outermembrane pore forming protein
invF | Type III secretion thermoregulatory protein | invF | Type III secretion thermoregulatory protein
invH | Invasion protein invH precursor | invH | Invasion protein invH precursor

Table 2.8: Salmonella Pathogenicity Island 2 (SPI-2) Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
ssaU | Type III secretion inner membrane protein | ssaU | Type III secretion inner membrane protein
ssaT | Type III secretion inner membrane protein | ssaT | Type III secretion inner membrane protein
ssaS | Type III secretion inner membrane protein | ssaS | Type III secretion inner membrane protein
ssaR | Type III secretion inner membrane protein | ssaR | Type III secretion inner membrane protein
ssaQ | Type III secretion inner membrane protein | ssaQ | Type III secretion inner membrane protein
ssaP | Type III secretion protein (YscP) | ssaP | Type III secretion protein (YscP)
ssaO | Type III secretion spans bacterial envelope protein (YscO) | ssaO | Type III secretion spans bacterial envelope protein (YscO)
ssaN | Flagellum-specific ATP synthase FliI | ssaN | Flagellum-specific ATP synthase FliI
ssaV | Type III secretion inner membrane channel protein | ssaV | Type III secretion inner membrane channel protein
ssaM | Secretion system apparatus protein SsaM | ssaM | Secretion system apparatus protein SsaM
ssaL | Type III secretion cytoplasmic protein (YscL) | ssaL | Type III secretion cytoplasmic protein (YscL)
ssaK | Type III secretion protein SsaK | ssaK | Type III secretion protein SsaK
SEE1_1759 | FIG029138: Type III secretion protein | SEE2_1752 | FIG029138: Type III secretion protein
ssaJ | Type III secretion bridge | ssaJ | Type III secretion bridge
ssaI | Type III secretion protein SsaI | ssaI | Type III secretion protein SsaI
ssaH | Type III secretion protein SsaH | ssaH | Type III secretion protein SsaH
ssaG | Type III secretion protein SsaG | ssaG | Type III secretion protein SsaG
sseG | Secretion system effector SseG | sseG | Secretion system effector SseG
sseF | Type III secretion effector SseF | sseF | Type III secretion effector SseF
sscB | Secretion system chaparone SscB | sscB | Secretion system chaparone SscB
sseE | Secretion system effector SseE | sseE | Secretion system effector SseE
sseD | Secretion system effector SseD | sseD | Secretion system effector SseD
sseC | Secretion system effector SseC | sseC | Secretion system effector SseC
sscA | Secretion system chaparone SscA | sscA | Secretion system chaparone SscA
sseB | Secretion system effector SseB | sseB | Secretion system effector SseB
sseA | Type III secretion system chaperone SseA | sseA | Type III secretion system chaperone SseA
ssaE | Secretion system effector SsaE | ssaE | Secretion system effector SsaE
ssaD | Secretion system apparatus SsaD | ssaD | Secretion system apparatus SsaD
ssaC | Type III secretion outermembrane pore forming protein | ssaC | Type III secretion outermembrane pore forming protein
ssaB | Type III secretion system effector protein | ssaB | Type III secretion system effector protein
ssrA | Secretion system regulator: Sensor component | ssrA | Secretion system regulator: Sensor component


ssrB | Secretion system regulator of DegU/UvrY/BvgA type | ssrB | Secretion system regulator of DegU/UvrY/BvgA type
orf242 | Transcriptional regulator associated with photolyase | orf242 | Transcriptional regulator associated with photolyase
orf319 | COG1683: Uncharacterized conserved protein | orf319 | COG1683: Uncharacterized conserved protein
orf70 | FIG01045422: hypothetical protein | orf70 | FIG01045422: hypothetical protein


Table 2.9: Salmonella Pathogenicity Island 3 (SPI-3) Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
SEE1_3848 | FIG01046146: hypothetical protein | SEE2_3841 | FIG01046146: hypothetical protein
SEE1_3849 | Putative DNA-binding protein in cluster with Type I RMS | SEE2_3842 | Putative DNA-binding protein in cluster with Type I RMS
SEE1_3850 | FIG01046502: hypothetical protein | SEE2_3843 | FIG01046502: hypothetical protein
rmbA | RmbA | rmbA | RmbA
misL | autotransporter | misL | autotransporter
fidL | YqeJ protein | fidL | YqeJ protein
marT | Putative sensory transducer | marT | Putative sensory transducer
SEE1_3855 | FIG01046505: hypothetical protein | SEE2_3848 | FIG01046505: hypothetical protein
slsA | Nicotinamidase family protein YcaC | slsA | Nicotinamidase family protein YcaC
cigR | Putative inner membrane protein | cigR | Putative inner membrane protein
mgtB | Mg(2+) transport ATPase, P-type | mgtB | Mg(2+) transport ATPase, P-type
SEE1_3859 | FIG01045269: hypothetical protein | SEE2_3852 | FIG01045269: hypothetical protein
mgtC | Mg(2+) transport ATPase protein C | mgtC | Mg(2+) transport ATPase protein C

Table 2.10: Salmonella Pathogenicity Island 4 (SPI-4) Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
SEE1_4324 | Putative inner membrane protein or exported protein SiiA | SEE2_4317 | Putative inner membrane protein or exported protein SiiA
SEE1_4325 | Putative integral membrane protein SiiB | SEE2_4318 | Putative integral membrane protein SiiB
SEE1_4326 | Agglutination protein SiiC | SEE2_4319 | Agglutination protein SiiC
SEE1_4327 | Putative type-I secretion protein SiiD | SEE2_4320 | Putative type-I secretion protein SiiD
SEE1_4328 | Large repetitive protein SiiE | SEE2_4321 | Large repetitive protein SiiE
SEE1_4329 | Putative type-1 secretion protein SiiF | SEE2_4322 | Putative type-1 secretion protein SiiF
yjcB | YjcB protein | yjcB | YjcB protein
yjcC | FIG00638940: hypothetical protein | yjcC | FIG00638940: hypothetical protein
soxS | Regulatory protein SoxS | soxS | Regulatory protein SoxS
soxR | Redox-sensitive transcriptional activator SoxR | soxR | Redox-sensitive transcriptional activator SoxR


Table 2.11: Salmonella Pathogenicity Island 5 (SPI-5) Genes.

SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description
pipA | Pathogenicity island encoded protein: SPI3 | pipA | Pathogenicity island encoded protein: SPI3
pipB | FIG01046201: hypothetical protein | pipB | FIG01046201: hypothetical protein
pipC | Invasion gene E protein | SEE2_1016 | Hypothetical Protein
sopB | Inositol phosphate phosphatase sopB | pipC | Invasion gene E protein (Pathogenicity island encoded protein)
pipD | Probable dipeptidase | sopB | Inositol phosphate phosphatase sopB
copR | Putative two component system histidine YedV | SEE2_1019 | FIG01045843: hypothetical protein
copS | Putative two-component system response regulator YedW | pipD | Probable dipeptidase
 | | copR | Putative two component system YedV
 | | copS | Putative two-component system response regulator YedW


Table 2.12: Non-SPI Effectors, Toxins, and Secretion Systems

SEE1 SEE2 Genes Description Genes Description UDP-3-O-[3-hydroxymyristoyl] UDP-3-O-[3-hydroxymyristoyl] lpxD lpxD glucosamine N-acyltransferase glucosamine N-acyltransferase (3R)-hydroxymyristoyl-[acyl carrier (3R)-hydroxymyristoyl-[acyl carrier fabZ fabZ protein] dehydratase protein] dehydratase Acyl-[acyl-carrier-protein]--UDP-N- Acyl-[acyl-carrier-protein]--UDP-N- lpxA acetylglucosamine O- lpxA acetylglucosamine O- acyltransferase acyltransferase lpxB Lipid-A-disaccharide synthase lpxB Lipid-A-disaccharide synthase SEE1_0280 Uncharacterized protein ImpA SEE2_0279 Uncharacterized protein ImpA SEE1_0281 Uncharacterized protein ImpH/VasB SEE2_0280 Uncharacterized protein ImpH/VasB SEE1_0282 Probable secreted protein SEE2_0281 Probable secreted protein SEE1_0283 Putative cytoplasmic protein SEE2_0282 Putative cytoplasmic protein SEE1_0284 VgrG protein SEE2_0283 VgrG protein ybjX Virulence factor VirK ybjX Virulence factor VirK SEE1_0937 FIG01046987: hypothetical protein SEE2_0935 FIG01046987: hypothetical protein himD Integration host factor beta subunit himD Integration host factor beta subunit SEE1_0986 Virulence protein msgA SEE2_0983 Virulence protein msgA SEE1_1074 Secreted protein Hcp SEE2_1070 Secreted protein Hcp SEE1_1075 IcmF-related protein SEE2_1072 IcmF-related protein SEE1_1229 GtgA SEE2_1224 GtgA Lipid A biosynthesis (KDO) 2- Lipid A biosynthesis (KDO) 2- msbB msbB (lauroyl)-lipid IVA acyltransferase (lauroyl)-lipid IVA acyltransferase Invasion-associated secreted Invasion-associated secreted SEE1_1241 SEE2_1236 protein. protein. 
sopE2 G-nucleotide exchange factor SopE sopE2 G-nucleotide exchange factor SopE sseJ Secreted effector J SseJ sseJ Secreted effector J SseJ ydcP Putative collagenase ydcP Putative collagenase sifB Secreted effector protein sifB Secreted effector protein srfC Putative virulence factor srfC Putative virulence factor srfB SrfB srfB SrfB srfA Putative virulence effector protein srfA Putative virulence effector protein sppA Protease IV sppA Protease IV Integration host factor alpha Integration host factor alpha himA himA subunit subunit rfc O-antigen rfc O-antigen polymerase Putative outer membrane virulence Putative outer membrane virulence pagD pagD protein protein Probable lipoprotein envE envE envE Probable lipoprotein envE precursor precursor msgA Virulence protein MsgA msgA Virulence protein MsgA envF Probable lipoprotein envF precursor envF Probable lipoprotein envF precursor


sifA SifA protein sifA SifA protein Proposed peptidoglycan lipid II Proposed peptidoglycan lipid II mviN mviN flippase MurJ flippase MurJ mviM Virulence factor MviM mviM Virulence factor MviM Lipid A biosynthesis lauroyl Lipid A biosynthesis lauroyl htrB htrB acyltransferase acyltransferase sopA Secreted effector protein sopA Secreted effector protein Regulator of length of O-antigen Regulator of length of O-antigen wzzB wzzB component of LPS chains component of LPS chains udg UDP-glucose dehydrogenase udg UDP-glucose dehydrogenase 6-phosphogluconate 6-phosphogluconate gnd gnd dehydrogenase, decarboxylating dehydrogenase, decarboxylating Undecaprenyl-phosphate Undecaprenyl-phosphate rfbP rfbP galactosephosphotransferase galactosephosphotransferase rfbK Phosphomannomutase rfbK Phosphomannomutase Mannose-1-phosphate Mannose-1-phosphate rfbM rfbM (GDP) guanylyltransferase (GDP) O antigen biosynthesis O antigen biosynthesis rfbN rfbN rhamnosyltransferase rfbN rhamnosyltransferase rfbN rfbU O-antigen flippase Wzx rfbU O-antigen flippase Wzx rfbV Putative rfbV Putative glycosyltransferase rfbX O-antigen flippase Wzx rfbX O-antigen flippase Wzx rfbE dTDP-glucose 4,6-dehydratase rfbE dTDP-glucose 4,6-dehydratase rfbS UDP-glucose 4-epimerase rfbS UDP-glucose 4-epimerase CDP-4-dehydro-6-deoxy-D-glucose CDP-4-dehydro-6-deoxy-D-glucose rfbH rfbH 3-dehydratase 3-dehydratase Similar to CDP-glucose 4,6- Similar to CDP-glucose 4,6- rfbG rfbG dehydratase dehydratase Glucose-1-phosphate Glucose-1-phosphate rfbF rfbF cytidylyltransferase cytidylyltransferase CDP-6-deoxy-delta-3,4-glucoseen CDP-6-deoxy-delta-3,4-glucoseen rfbI rfbI reductase-like reductase-like dTDP-4-dehydrorhamnose 3,5- dTDP-4-dehydrorhamnose 3,5- rfbC rfbC epimerase epimerase Glucose-1-phosphate Glucose-1-phosphate rfbA rfbA thymidylyltransferase thymidylyltransferase dTDP-4-dehydrorhamnose dTDP-4-dehydrorhamnose rfbD rfbD reductase reductase rfbB dTDP-glucose 4,6-dehydratase rfbB dTDP-glucose 4,6-dehydratase sspH2 
Secreted effector protein sspH2 Secreted effector protein wzc Tyrosine- Wzc wzc Tyrosine-protein kinase Wzc Low molecular weight protein- Low molecular weight protein- wzb wzb tyrosine-phosphatase Wzb tyrosine-phosphatase Wzb Polysaccharide export lipoprotein Polysaccharide export lipoprotein wza wza Wza Wza SEE1_2350 Homolog of virulence protein msgA SEE2_2348 Homolog of virulence protein msgA Von Willebrand factor type A Von Willebrand factor type A yfbK yfbK domain protein domain protein pgtE Protease VII (Omptin) precursor pgtE Protease VII (Omptin) precursor Lipid A biosynthesis lauroyl Lipid A biosynthesis lauroyl ddg ddg acyltransferase acyltransferase


Hemolysins and related proteins Hemolysins and related proteins yfjD yfjD containing CBS domains containing CBS domains SEE1_2759 Large repetitive protein SEE2_2755 Large repetitive protein SEE1_2760 FIG01045638: hypothetical protein SEE2_2756 FIG01045638: hypothetical protein Putative type I secretion protein, Putative type I secretion protein, SEE1_2761 SEE2_2757 ATP-binding protein ATP-binding protein Putative HlyD family secretion Putative HlyD family secretion SEE1_2762 SEE2_2758 protein protein SEE1_2822 Similar to pipB SEE2_2819 Similar to pipB virK Virulence protein VirK virK Virulence protein VirK sopD Secreted protein yfjD Secreted protein SEE1_3099 VapC toxin protein SEE2_3096 VapC toxin protein COG1272: Predicted membrane COG1272: Predicted membrane yqfA yqfA protein hemolysin III homolog protein hemolysin III homolog yqfB Protein HI1394 yqfB Protein HI1394 Type I secretion outer membrane Type I secretion outer membrane tolC tolC protein, TolC precursor protein, TolC precursor yraP 21 kDa hemolysin precursor yraP 21 kDa hemolysin precursor Putative surface-exposed virulence Putative surface-exposed virulence bigA bigA protein protein SEE1_3595 YafQ toxin protein SEE2_3589 YafQ toxin protein ADP-L-glycero-D-manno-heptose-6- ADP-L-glycero-D-manno-heptose-6- rfaD rfaD epimerase epimerase ADP-heptose--lipooligosaccharide ADP-heptose--lipooligosaccharide rfaF rfaF heptosyltransferase II heptosyltransferase II rfaC LPS heptosyltransferase I rfaC LPS heptosyltransferase I Oligosaccharide repeat unit Oligosaccharide repeat unit rfaL rfaL polymerase Wzy; O-antigen ligase polymerase Wzy; O-antigen ligase LPS 1,2-N- LPS 1,2-N- rfaK rfaK acetylglucosaminetransferase acetylglucosaminetransferase rfaZ LPS core biosynthesis protein RfaZ rfaZ LPS core biosynthesis protein RfaZ rfaY LPS core biosynthesis protein RfaY rfaY LPS core biosynthesis protein RfaY UDP-glucose:(glucosyl)LPS alpha- UDP-glucose:(glucosyl)LPS alpha- rfaJ rfaJ 1,2- 1,2-glucosyltransferase 
UDP-glucose:(glucosyl)LPS alpha- UDP-glucose:(glucosyl)LPS alpha- rfaI rfaI 1,3-glucosyltransferase 1,3-glucosyltransferase rfaB LPS 1,6- rfaB LPS 1,6-galactosyltransferase LPS core biosynthesis protein LPS core biosynthesis protein WaaP, rfaP rfaP WaaP, heptosyl-I-kinase heptosyl-I-kinase UDP-glucose:(heptosyl) LPS UDP-glucose:(heptosyl) LPS rfaG rfaG alpha1,3-glucosyltransferase WaaG alpha1,3-glucosyltransferase WaaG rfaQ LPS heptosyltransferase III rfaQ LPS heptosyltransferase III Regulator of length of O-antigen Regulator of length of O-antigen wzzE wzzE component of LPS chains component of LPS chains UDP-N-acetylglucosamine 2- UDP-N-acetylglucosamine 2- wecB wecB epimerase epimerase wecC UDP-glucose dehydrogenase wecC UDP-glucose dehydrogenase rffG dTDP-glucose 4,6-dehydratase rffG dTDP-glucose 4,6-dehydratase Glucose-1-phosphate Glucose-1-phosphate SEE1_4006 SEE2_3999 thymidylyltransferase thymidylyltransferase

81

rffC LPS biosynthesis protein RffC rffC LPS biosynthesis protein RffC 4-keto-6-deoxy-N-Acetyl-D- 4-keto-6-deoxy-N-Acetyl-D- wecE hexosaminyl-(Lipid carrier) wecE hexosaminyl-(Lipid carrier) aminotransferase aminotransferase wzxE WzxE protein wzxE WzxE protein Virulence regulon transcriptional Virulence regulon transcriptional virF virF activator virF activator virF 4-alpha-L- (EC SEE1_4010 4-alpha-L-fucosyltransferase SEE2_4003 2.4.1.-) wecF Putative ECA polymerase wecF Putative ECA polymerase Probable UDP-N-acetyl-D- Probable UDP-N-acetyl-D- wecG wecG mannosaminuronic acid mannosaminuronic acid transferase


Table 2.13: Iron Acquisition Genes

| SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description |
|---|---|---|---|
| fhuA | Ferric hydroxamate outer membrane receptor FhuA | fhuA | Ferric hydroxamate outer membrane receptor FhuA |
| fhuC | Ferric hydroxamate ABC transporter, ATP-binding protein FhuC | fhuC | Ferric hydroxamate ABC transporter, ATP-binding protein FhuC |
| fhuD | Ferric hydroxamate ABC transporter, periplasmic substrate binding protein FhuD | fhuD | Ferric hydroxamate ABC transporter, periplasmic substrate binding protein FhuD |
| fhuB | Ferric hydroxamate ABC transporter, permease component FhuB | fhuB | Ferric hydroxamate ABC transporter, permease component FhuB |
| foxA | Ferrichrome-iron receptor | foxA | Ferrichrome-iron receptor |
| entD | 4'-phosphopantetheinyl transferase/[enterobactin] siderophore | entD | 4'-phosphopantetheinyl transferase/[enterobactin] siderophore |
| fepA | TonB-dependent receptor; Outer membrane receptor for ferric enterobactin | fepA | TonB-dependent receptor; Outer membrane receptor for ferric enterobactin |
| fes | Enterobactin esterase | fes | Enterobactin esterase |
| ybdZ | FIG005032: Putative cytoplasmic protein YbdZ in enterobactin biosynthesis operon | ybdZ | FIG005032: Putative cytoplasmic protein YbdZ in enterobactin biosynthesis operon |
| entF | Enterobactin synthetase component F, serine activating enzyme | entF | Enterobactin synthetase component F, serine activating enzyme |
| fepE | Ferric enterobactin uptake protein FepE | fepE | Ferric enterobactin uptake protein FepE |
| fepC | Ferric enterobactin transport ATP-binding protein FepC | fepC | Ferric enterobactin transport ATP-binding protein FepC |
| fepG | Ferric enterobactin transport system permease protein FepG | fepG | Ferric enterobactin transport system permease protein FepG |
| fepD | Ferric enterobactin transport system permease protein FepD | fepD | Ferric enterobactin transport system permease protein FepD |
| entS | Enterobactin exporter EntS | entS | Enterobactin exporter EntS |
| fepB | Ferric enterobactin-binding periplasmic protein FepB | fepB | Ferric enterobactin-binding periplasmic protein FepB |
| entC | Isochorismate synthase/[enterobactin] siderophore | entC | Isochorismate synthase/[enterobactin] siderophore |
| entE | 2,3-dihydroxybenzoate-AMP ligase/[enterobactin] siderophore | entE | 2,3-dihydroxybenzoate-AMP ligase/[enterobactin] siderophore |
| entB | Isochorismatase/[enterobactin] siderophore/Apo-aryl carrier domain of EntB | entB | Isochorismatase/[enterobactin] siderophore/Apo-aryl carrier domain of EntB |
| entA | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase/[enterobactin] siderophore | entA | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase/[enterobactin] siderophore |
| ybdB | Proofreading thioesterase in enterobactin biosynthesis EntH | ybdB | Proofreading thioesterase in enterobactin biosynthesis EntH |
| fur | Ferric uptake regulation protein FUR | fur | Ferric uptake regulation protein FUR |
| ftn | Ferritin-like protein 2 | ftn | Ferritin-like protein 2 |
| ftnB | Ferritin-like protein 2 | ftnB | Ferritin-like protein 2 |
| tonB | Ferric siderophore transport system, periplasmic binding protein TonB | tonB | Ferric siderophore transport system, periplasmic binding protein TonB |
| fhuE | Putative OMR family iron-siderophore receptor precursor | fhuE | Putative OMR family iron-siderophore receptor precursor |
| iroB | Glycosyltransferase IroB | iroB | Glycosyltransferase IroB |
| iroC | ABC transporter protein IroC | iroC | ABC transporter protein IroC |
| iroD | Trilactone hydrolase IroD | iroD | Trilactone hydrolase IroD |
| iroE | Periplasmic esterase IroE | iroE | Periplasmic esterase IroE |
| iroN | Outer Membrane Siderophore Receptor IroN | iroN | Outer Membrane Siderophore Receptor IroN |
| btuB | Outer membrane vitamin B12 receptor BtuB | btuB | Outer membrane vitamin B12 receptor BtuB |
| yqjH | Iron-chelator utilization protein | yqjH | Iron-chelator utilization protein |
| bfr | Bacterioferritin | bfr | Bacterioferritin |
| feoA | Ferrous iron transport protein A | feoA | Ferrous iron transport protein A |
| feoB | Ferrous iron transport protein B | feoB | Ferrous iron transport protein B |
| yhgH | Ferrous iron-sensing transcriptional regulator FeoC | yhgH | Ferrous iron-sensing transcriptional regulator FeoC |


Table 2.14: Motility and Chemotaxis Genes

| SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description |
|---|---|---|---|
| yaiU | Putative flagellin structural protein | yaiU | Putative flagellin structural protein |
| fliR | Flagellar biosynthesis protein FliR | fliR | Flagellar biosynthesis protein FliR |
| fliQ | Flagellar biosynthesis protein FliQ | fliQ | Flagellar biosynthesis protein FliQ |
| fliP | Flagellar biosynthesis protein FliP | fliP | Flagellar biosynthesis protein FliP |
| fliO | Flagellar biosynthesis protein FliQ | fliO | Flagellar biosynthesis protein FliQ |
| fliN | Flagellar motor switch protein FliN | fliN | Flagellar motor switch protein FliN |
| fliM | Flagellar motor switch protein FliM | fliM | Flagellar motor switch protein FliM |
| fliL | Flagellar biosynthesis protein FliL | fliL | Flagellar biosynthesis protein FliL |
| fliK | Flagellar hook-length control protein FliK | fliK | Flagellar hook-length control protein FliK |
| fliJ | Flagellar protein FliJ | fliJ | Flagellar protein FliJ |
| fliI | Flagellum-specific ATP synthase FliI | fliI | Flagellum-specific ATP synthase FliI |
| fliH | Flagellar assembly protein FliH | fliH | Flagellar assembly protein FliH |
| fliG | Flagellar motor switch protein FliG | fliG | Flagellar motor switch protein FliG |
| fliF | Flagellar M-ring protein FliF | fliF | Flagellar M-ring protein FliF |
| fliE | Flagellar hook-basal body complex protein FliE | fliE | Flagellar hook-basal body complex protein FliE |
| yedF | UPF0033 protein YedF | yedF | UPF0033 protein YedF |
| yedE | Putative transport system permease protein | yedE | Putative transport system permease protein |
| fliT | Flagellar biosynthesis protein FliT | fliT | Flagellar biosynthesis protein FliT |
| fliS | Flagellar biosynthesis protein FliS | fliS | Flagellar biosynthesis protein FliS |
| fliD | Flagellar hook-associated protein FliD | fliD | Flagellar hook-associated protein FliD |
| fljB | Flagellar biosynthesis protein FliC | fljB | Flagellar biosynthesis protein FliC |
| fliB | Lysine-N-methylase | fliB | Lysine-N-methylase |
| fliA | RNA polymerase sigma factor for flagellar operon | fliA | RNA polymerase sigma factor for flagellar operon |
| fliZ | Flagellar biosynthesis protein FliZ | fliZ | Flagellar biosynthesis protein FliZ |
| fliY | Cystine ABC transporter, periplasmic cystine-binding protein FliY | fliY | Cystine ABC transporter, periplasmic cystine-binding protein FliY |
| flhD | Flagellar transcriptional activator FlhD | flhD | Flagellar transcriptional activator FlhD |
| flhC | Flagellar transcriptional activator FlhC | flhC | Flagellar transcriptional activator FlhC |
| motA | Flagellar motor rotation protein MotA | motA | Flagellar motor rotation protein MotA |
| motB | Flagellar motor rotation protein MotB | motB | Flagellar motor rotation protein MotB |
| cheA | Signal transduction histidine kinase CheA | cheA | Signal transduction histidine kinase CheA |
| cheW | Positive regulator of CheA protein activity (CheW) | cheW | Positive regulator of CheA protein activity (CheW) |
| cheM | Methyl-accepting chemotaxis protein II | cheM | Methyl-accepting chemotaxis protein II |
| cheR | Chemotaxis protein methyltransferase CheR | cheR | Chemotaxis protein methyltransferase CheR |
| cheB | Chemotaxis response regulator protein-glutamate methylesterase CheB | cheB | Chemotaxis response regulator protein-glutamate methylesterase CheB |
| cheY | Chemotaxis regulator - transmits chemoreceptor signals to CheY | cheY | Chemotaxis regulator - transmits chemoreceptor signals to CheY |
| cheZ | Chemotaxis response - phosphatase CheZ | cheZ | Chemotaxis response - phosphatase CheZ |
| flhB | Flagellar biosynthesis protein FlhB | flhB | Flagellar biosynthesis protein FlhB |
| flhA | Flagellar biosynthesis protein FlhA | flhA | Flagellar biosynthesis protein FlhA |
| flhE | Flagellar protein FlhE | flhE | Flagellar protein FlhE |
| trg | Methyl-accepting chemotaxis protein III | trg | Methyl-accepting chemotaxis protein III |
| flgL | Flagellar hook-associated protein FlgL | flgL | Flagellar hook-associated protein FlgL |
| flgK | Flagellar hook-associated protein FlgK | flgK | Flagellar hook-associated protein FlgK |
| flgJ | Flagellar protein FlgJ [peptidoglycan hydrolase] | flgJ | Flagellar protein FlgJ [peptidoglycan hydrolase] |
| flgI | Flagellar P-ring protein FlgI | flgI | Flagellar P-ring protein FlgI |
| flgH | Flagellar L-ring protein FlgH | flgH | Flagellar L-ring protein FlgH |
| flgG | Flagellar basal-body rod protein FlgG | flgG | Flagellar basal-body rod protein FlgG |
| flgF | Flagellar basal-body rod protein FlgF | flgF | Flagellar basal-body rod protein FlgF |
| flgE | Flagellar hook protein FlgE | flgE | Flagellar hook protein FlgE |
| flgD | Flagellar basal-body rod modification protein FlgD | flgD | Flagellar basal-body rod modification protein FlgD |
| flgC | Flagellar basal-body rod protein FlgC | flgC | Flagellar basal-body rod protein FlgC |
| flgB | Flagellar basal-body rod protein FlgB | flgB | Flagellar basal-body rod protein FlgB |
| flgA | Flagellar basal-body P-ring formation protein FlgA | flgA | Flagellar basal-body P-ring formation protein FlgA |
| flgM | Negative regulator of flagellin synthesis FlgM | flgM | Negative regulator of flagellin synthesis FlgM |
| flgN | Flagellar biosynthesis protein FlgN | flgN | Flagellar biosynthesis protein FlgN |
| cheV | Chemotaxis protein CheV | cheV | Chemotaxis protein CheV |
| yggR | Twitching motility protein PilT | yggR | Twitching motility protein PilT |
| SEE1_3213 | Putative methyl-accepting chemotaxis protein | SEE2_3209 | Putative methyl-accepting chemotaxis protein |
| SEE1_3228 | Methyl-accepting chemotaxis protein | SEE2_3224 | Methyl-accepting chemotaxis protein |
| SEE1_3761 | Putative chemotaxis protein, resembles cheA | SEE2_3755 | Putative chemotaxis protein, resembles cheA |
| katG | Catalase/Peroxidase | katG | Catalase/Peroxidase |
| oxyR | Hydrogen peroxide-inducible genes activator | oxyR | Hydrogen peroxide-inducible genes activator |
| tsr | Methyl-accepting chemotaxis protein | tsr | Methyl-accepting chemotaxis protein |
| SEE1_4382 | Flagellar regulon repressor RtsB | SEE2_4375 | Flagellar regulon repressor RtsB |
| SEE1_4383 | Type III secretion and flagellar regulator RtsA | SEE2_4376 | Type III secretion and flagellar regulator RtsA |


Table 2.15: Resistance Genes

| SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description |
|---|---|---|---|
| yadG | ABC-type multidrug transport system, ATPase component | yadG | ABC-type multidrug transport system, ATPase component |
| yadH | ABC-type multidrug transport system, permease component | yadH | ABC-type multidrug transport system, permease component |
| SEE1_0272 | Putative drug efflux protein | SEE2_0271 | Putative drug efflux protein |
| mod | Type III restriction-modification system methylation subunit | mod | Type III restriction-modification system methylation subunit |
| res | Type III restriction-modification | res | Type III restriction-modification |
| SEE1_0352 | RND efflux system, outer membrane lipoprotein CmeC | SEE2_0351 | RND efflux system, outer membrane lipoprotein CmeC |
| SEE1_0353 | RND efflux system, inner membrane transporter CmeB | SEE2_0352 | RND efflux system, inner membrane transporter CmeB |
| SEE1_0354 | RND efflux system, membrane fusion protein CmeA | SEE2_0353 | RND efflux system, membrane fusion protein CmeA |
| ampG | AmpG permease | ampG | AmpG permease |
| acrB | RND efflux system, inner membrane transporter CmeB | acrB | RND efflux system, inner membrane transporter CmeB |
| acrA | Membrane fusion protein of RND family multidrug efflux pump | acrA | Membrane fusion protein of RND family multidrug efflux pump |
| acrR | Transcription repressor of multidrug efflux pump acrAB operon | acrR | Transcription repressor of multidrug efflux pump acrAB operon |
| fsr | Fosmidomycin resistance protein | fsr | Fosmidomycin resistance protein |
| copA | Lead, cadmium, zinc and mercury transporting ATPase; Copper-translocating P-type ATPase | copA | Lead, cadmium, zinc and mercury transporting ATPase; Copper-translocating P-type ATPase |
| yfdH | Polymyxin resistance protein ArnC, glycosyl transferase | yfdH | Polymyxin resistance protein ArnC, glycosyl transferase |
| SEE1_0388 | Possible efflux pump | SEE2_0387 | Possible efflux pump |
| SEE1_0575 | Cobalt-zinc-cadmium resistance protein CzcA; Cation efflux system protein CusA | SEE2_0574 | Cobalt-zinc-cadmium resistance protein CzcA; Cation efflux system protein CusA |
| ybgL | Lactam utilization protein LamB | ybgL | Lactam utilization protein LamB |
| ybgR | Cobalt-zinc-cadmium resistance protein CzcD | ybgR | Cobalt-zinc-cadmium resistance protein CzcD |
| ybhF | ABC transporter multidrug efflux pump, fused ATP-binding domains | ybhF | ABC transporter multidrug efflux pump, fused ATP-binding domains |
| SEE1_0812 | Predicted membrane fusion protein component of efflux pump | SEE2_0809 | Predicted membrane fusion protein component of efflux pump |
| ybiH | Transcriptional regulator YbiH, TetR family | ybiH | Transcriptional regulator YbiH, TetR family |
| mdfA | Multidrug translocase MdfA | mdfA | Multidrug translocase MdfA |
| ybjY | Macrolide-specific efflux protein MacA | ybjY | Macrolide-specific efflux protein MacA |
| ybjZ | Macrolide export ATP-binding/permease protein MacB | ybjZ | Macrolide export ATP-binding/permease protein MacB |
| yobA | Copper resistance protein C precursor | yobA | Copper resistance protein C precursor |
| SEE1_1216 | Copper resistance protein D | SEE2_1211 | Copper resistance protein D |
| SEE1_1461 | Probable transcription regulator protein of MDR efflux pump cluster | SEE2_1456 | Probable transcription regulator protein of MDR efflux pump cluster |
| SEE1_1510 | Ethidium bromide-methyl viologen resistance protein EmrE | SEE2_1504 | Ethidium bromide-methyl viologen resistance protein EmrE |
| yedA | Permease of the drug/metabolite transporter | yedA | Permease of the drug/metabolite transporter |
| tehA | Tellurite resistance protein TehA | tehA | Tellurite resistance protein TehA |
| tehB | Tellurite resistance protein TehB | tehB | Tellurite resistance protein TehB |
| smvA | Methyl viologen resistance protein smvA | smvA | Methyl viologen resistance protein smvA |
| yddG | Permease of the drug/metabolite transporter | yddG | Permease of the drug/metabolite transporter |
| SEE1_1623 | 12-TMS multidrug efflux protein homolog | SEE2_1616 | 12-TMS multidrug efflux protein homolog |
| marC | Multiple antibiotic resistance protein MarC | marC | Multiple antibiotic resistance protein MarC |
| marR | Multiple antibiotic resistance protein MarR | marR | Multiple antibiotic resistance protein MarR |
| marA | Multiple antibiotic resistance protein MarA | marA | Multiple antibiotic resistance protein MarA |
| marB | Multiple antibiotic resistance protein MarB | marB | Multiple antibiotic resistance protein MarB |
| ydeD | Permease of the drug/metabolite transporter | ydeD | Permease of the drug/metabolite transporter |
| SEE1_1684 | Acid shock protein precursor | SEE2_1677 | Acid shock protein precursor |
| katE | Catalase | katE | Catalase |
| yceE | Multidrug-efflux transporter, major facilitator superfamily | yceE | Multidrug-efflux transporter, major facilitator superfamily |
| msyB | Acidic protein msyB | msyB | Acidic protein msyB |
| yegN | Multidrug transporter MdtB | yegN | Multidrug transporter MdtB |
| yegO | Multidrug transporter MdtC | yegO | Multidrug transporter MdtC |
| yegB | Multidrug transporter MdtD | yegB | Multidrug transporter MdtD |
| yohG | RND efflux system, outer membrane lipoprotein, NodT family | yohG | RND efflux system, outer membrane lipoprotein, NodT family |
| bcr | MFS family multidrug transport protein, bicyclomycin resistance protein | bcr | MFS family multidrug transport protein, bicyclomycin resistance protein |
| ais | Polymyxin resistance protein PmrG; Ais protein | ais | Polymyxin resistance protein PmrG; Ais protein |
| pmrF | Polymyxin resistance protein ArnC, glycosyl transferase | pmrF | Polymyxin resistance protein ArnC, glycosyl transferase |
| yfbG | Polymyxin resistance protein ArnA_DH, UDP-glucuronic acid decarboxylase | yfbG | Polymyxin resistance protein ArnA_DH, UDP-glucuronic acid decarboxylase |
| pmrJ | Polymyxin resistance protein PmrJ, predicted deacetylase | pmrJ | Polymyxin resistance protein PmrJ, predicted deacetylase |
| pqaB | Polymyxin resistance protein ArnT, undecaprenyl phosphate-alpha-L-Ara4N transferase | pqaB | Polymyxin resistance protein ArnT, undecaprenyl phosphate-alpha-L-Ara4N transferase |
| pmrL | Polymyxin resistance protein PmrL | pmrL | Polymyxin resistance protein PmrL |
| pmrM | Polymyxin resistance protein PmrM | pmrM | Polymyxin resistance protein PmrM |
| pmrD | Polymyxin resistance protein PmrD | pmrD | Polymyxin resistance protein PmrD |
| emrR | Transcription repressor | emrR | Transcription repressor |
| emrA | Multidrug resistance protein A | emrA | Multidrug resistance protein A |
| emrB | Inner membrane component of tripartite multidrug resistance system | emrB | Inner membrane component of tripartite multidrug resistance system |
| mdaB | Modulator of drug activity B | mdaB | Modulator of drug activity B |
| ygjT | Integral membrane protein TerC | ygjT | Integral membrane protein TerC |
| envR | Transcription repressor of multidrug efflux pump acrAB operon | envR | Transcription repressor of multidrug efflux pump acrAB operon |
| acrE | RND efflux system, membrane fusion protein CmeA | acrE | RND efflux system, membrane fusion protein CmeA |
| acrF | RND efflux system, inner membrane transporter CmeB | acrF | RND efflux system, inner membrane transporter CmeB |
| damX | DamX, protein involved in bile resistance | damX | DamX, protein involved in bile resistance |
| emrD | Multidrug resistance protein D | emrD | Multidrug resistance protein D |
| zntA | Lead, cadmium, zinc and mercury transporting ATPase; Copper-translocating P-type ATPase | zntA | Lead, cadmium, zinc and mercury transporting ATPase; Copper-translocating P-type ATPase |
| aadA | Streptomycin 3''-O-adenylyltransferase/Spectinomycin 9-O-adenylyltransferase | aadA | Streptomycin 3''-O-adenylyltransferase/Spectinomycin 9-O-adenylyltransferase |
| SEE1_3829 | Putative beta-lactamase | SEE2_3822 | Putative beta-lactamase |
| katG | Catalase/Peroxidase | katG | Catalase/Peroxidase |
| oxyR | Hydrogen peroxide-inducible genes activator | oxyR | Hydrogen peroxide-inducible genes activator |
| sugE | Quaternary ammonium compound-resistance protein SugE | sugE | Quaternary ammonium compound-resistance protein SugE |
| SEE1_2653 | Alpha-2-macroglobulin | SEE2_2651 | Alpha-2-macroglobulin |


Table 2.16: Signalling Genes

| SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description |
|---|---|---|---|
| phoB | Phosphate regulon transcriptional regulatory protein PhoB | phoB | Phosphate regulon transcriptional regulatory protein PhoB |
| phoR | Phosphate regulon sensor protein PhoR | phoR | Phosphate regulon sensor protein PhoR |
| copR | Putative two-component system histidine kinase YedV | copR | Putative two-component system histidine kinase YedV |
| copS | Putative two-component system response regulator YedW | copS | Putative two-component system response regulator YedW |
| sdiA | N-3-oxohexanoyl-L-homoserine lactone quorum-sensing transcriptional activator | sdiA | N-3-oxohexanoyl-L-homoserine lactone quorum-sensing transcriptional activator |
| SEE1_0172 | Cellular communication/signal transduction | SEE2_0171 | Cellular communication/signal transduction |
| ynaI | Mechanosensitive ion channel | ynaI | Mechanosensitive ion channel |
| ynaJ | Putative inner membrane protein | ynaJ | Putative inner membrane protein |
| trg | Methyl-accepting chemotaxis protein III | trg | Methyl-accepting chemotaxis protein III |
| ydcI | LysR family transcriptional regulator YdcI | ydcI | LysR family transcriptional regulator YdcI |
| rstA | Transcriptional regulatory protein RstA | rstA | Transcriptional regulatory protein RstA |
| rstB | Sensory histidine kinase two-component regulatory system with RstA | rstB | Sensory histidine kinase two-component regulatory system with RstA |
| phoP | Transcriptional regulatory protein PhoP | phoP | Transcriptional regulatory protein PhoP |
| phoQ | Sensor protein PhoQ | phoQ | Sensor protein PhoQ |
| baeS | Sensory histidine kinase BaeS | baeS | Sensory histidine kinase BaeS |
| baeR | Response regulator BaeR | baeR | Response regulator BaeR |
| yehT | Hypothetical response regulatory protein yehT | yehT | Hypothetical response regulatory protein yehT |
| yehU | Autolysin sensor kinase | yehU | Autolysin sensor kinase |
| yojN | Two-component sensor protein RcsD | yojN | Two-component sensor protein RcsD |
| rcsC | Two-component sensor protein RcsC | rcsC | Two-component sensor protein RcsC |
| rcsB | DNA-binding capsular synthesis response regulator RcsB | rcsB | DNA-binding capsular synthesis response regulator RcsB |
| luxS | S-ribosylhomocysteine lyase/Autoinducer-2 production protein LuxS | luxS | S-ribosylhomocysteine lyase/Autoinducer-2 production protein LuxS |
| barA | BarA sensory histidine kinase | barA | BarA sensory histidine kinase |
| ygiX | Two-component system response regulator QseB | ygiX | Two-component system response regulator QseB |
| ygiY | Sensory histidine kinase QseC | ygiY | Sensory histidine kinase QseC |
| SEE1_3296 | Methyl-accepting chemotaxis protein I | SEE2_3291 | Methyl-accepting chemotaxis protein I |
| aer | Aerotaxis sensor receptor protein | aer | Aerotaxis sensor receptor protein |
| mscL | Large-conductance mechanosensitive channel | mscL | Large-conductance mechanosensitive channel |
| envZ | Osmolarity sensory histidine kinase EnvZ | envZ | Osmolarity sensory histidine kinase EnvZ |
| ompR | Two-component system response regulator OmpR | ompR | Two-component system response regulator OmpR |
| rhtB | Homoserine/homoserine lactone efflux protein | rhtB | Homoserine/homoserine lactone efflux protein |
| aer | Aerotaxis sensor receptor protein | aer | Aerotaxis sensor receptor protein |
| ego | Autoinducer 2 (AI-2) ABC transport system, ATP-binding component | ego | Autoinducer 2 (AI-2) ABC transport system, ATP-binding component |
| ydeY | Autoinducer 2 (AI-2) ABC transport system, membrane channel protein LsrC | ydeY | Autoinducer 2 (AI-2) ABC transport system, membrane channel protein LsrC |
| ydeZ | Autoinducer 2 (AI-2) ABC transport system, membrane channel protein LsrD | ydeZ | Autoinducer 2 (AI-2) ABC transport system, membrane channel protein LsrD |
| yneA | Autoinducer 2 (AI-2) ABC transport system, periplasmic AI-2 binding protein LsrB | yneA | Autoinducer 2 (AI-2) ABC transport system, periplasmic AI-2 binding protein LsrB |
| yneB | Autoinducer 2 (AI-2) aldolase LsrF | yneB | Autoinducer 2 (AI-2) aldolase LsrF |
| yneC | Autoinducer 2 (AI-2) modifying protein LsrG | yneC | Autoinducer 2 (AI-2) modifying protein LsrG |
| encR | Putative luxR family bacterial regulatory protein | encR | Putative luxR family bacterial regulatory protein |
| ydeV | Autoinducer 2 (AI-2) kinase LsrK | ydeV | Autoinducer 2 (AI-2) kinase LsrK |
| ydeW | LsrR, transcriptional repressor of lsr operon | ydeW | LsrR, transcriptional repressor of lsr operon |
| tcp | Methyl-accepting chemotaxis protein I | tcp | Methyl-accepting chemotaxis protein I |
| tsr | Methyl-accepting chemotaxis protein I | tsr | Methyl-accepting chemotaxis protein I |
| SEE1_3213 | Putative methyl-accepting chemotaxis protein | SEE2_3209 | Putative methyl-accepting chemotaxis protein |
| SEE1_3228 | Methyl-accepting chemotaxis protein I | SEE2_3224 | Methyl-accepting chemotaxis protein I |


Table 2.17: Miscellaneous Virulence-Associated Genes

| SEE1 Gene | SEE1 Description | SEE2 Gene | SEE2 Description |
|---|---|---|---|
| SEE1_0088 | Probable secreted protein | SEE2_0087 | Probable secreted protein |
| SEE1_0089 | Probable secreted protein | SEE2_0088 | Probable secreted protein |
| SEE1_0336 | Probable secreted protein | SEE2_0307 | Probable secreted protein |
| SEE1_0350 | Probable secreted protein | SEE2_0349 | Probable secreted protein |
| hha | Haemolysin expression modulating protein | hha | Haemolysin expression modulating protein |
| ybaJ | FIG00948312: hypothetical protein | ybaJ | FIG00948312: hypothetical protein |
| yliH | Biofilm regulator BssR | yliH | Biofilm regulator BssR |
| SEE1_0512 | Probable secreted protein | SEE2_0511 | Probable secreted protein |
| SEE1_1070 | Putative exported protein | SEE2_1066 | Putative exported protein |
| SEE1_1072 | Putative exported protein | SEE2_1068 | Putative exported protein |
| SEE1_1073 | Secreted protein Hcp | SEE2_1070 | Secreted protein Hcp |
| rcsA | Colanic acid capsular biosynthesis activation accessory protein RcsA | rcsA | Colanic acid capsular biosynthesis activation accessory protein RcsA |
| ycgB | Stage V sporulation protein involved in spore cortex synthesis (SpoVR) | ycgB | Stage V sporulation protein involved in spore cortex synthesis (SpoVR) |
| opgD | Glucans biosynthesis protein D precursor | opgD | Glucans biosynthesis protein D precursor |
| ybgS | Probable secreted protein | ybgS | Probable secreted protein |
| mdoH | Glucans biosynthesis glucosyltransferase H | mdoH | Glucans biosynthesis glucosyltransferase H |
| mdoG | Glucans biosynthesis protein G precursor | mdoG | Glucans biosynthesis protein G precursor |
| mdoC | Glucans biosynthesis protein C | mdoC | Glucans biosynthesis protein C |
| galF | UTP--glucose-1-phosphate uridylyltransferase | galF | UTP--glucose-1-phosphate uridylyltransferase |
| wcaM | Colanic acid biosynthesis protein wcaM | wcaM | Colanic acid biosynthesis protein wcaM |
| wcaL | Colanic acid biosynthesis glycosyl transferase WcaL | wcaL | Colanic acid biosynthesis glycosyl transferase WcaL |
| wcaK | Colanic acid biosynthesis protein WcaK | wcaK | Colanic acid biosynthesis protein WcaK |
| wzxZ | Lipopolysaccharide biosynthesis protein WzxC | wzxZ | Lipopolysaccharide biosynthesis protein WzxC |
| wcaJ | Colanic acid biosynthesis UDP-glucose lipid carrier transferase WcaJ | wcaJ | Colanic acid biosynthesis UDP-glucose lipid carrier transferase WcaJ |
| cpsG | Phosphomannomutase | cpsG | Phosphomannomutase |
| manC | Mannose-1-phosphate guanylyltransferase/Mannose-6-phosphate isomerase | manC | Mannose-1-phosphate guanylyltransferase/Mannose-6-phosphate isomerase |
| wcaI | Colanic acid biosynthesis glycosyl transferase WcaI | wcaI | Colanic acid biosynthesis glycosyl transferase WcaI |
| wcaH | GDP-mannose mannosyl hydrolase | wcaH | GDP-mannose mannosyl hydrolase |
| wcaG | GDP-L-fucose synthetase/Colanic acid biosynthesis protein wcaG | wcaG | GDP-L-fucose synthetase/Colanic acid biosynthesis protein wcaG |
| gmd | GDP-mannose 4,6-dehydratase | gmd | GDP-mannose 4,6-dehydratase |
| wcaF | Colanic acid biosynthesis acetyltransferase WcaF | wcaF | Colanic acid biosynthesis acetyltransferase WcaF |
| wcaE | Colanic acid biosynthesis glycosyl transferase WcaE | wcaE | Colanic acid biosynthesis glycosyl transferase WcaE |
| wcaD | Colanic acid polymerase WcaD | wcaD | Colanic acid polymerase WcaD |
| wcaC | Colanic acid biosynthesis glycosyl transferase WcaC | wcaC | Colanic acid biosynthesis glycosyl transferase WcaC |
| wcaB | Colanic acid biosynthesis acetyltransferase WcaB | wcaB | Colanic acid biosynthesis acetyltransferase WcaB |
| wcaA | Putative N-acetylgalactosaminyl-diphosphoundecaprenol glucuronosyltransferase | wcaA | Putative N-acetylgalactosaminyl-diphosphoundecaprenol glucuronosyltransferase |
| wzc | Tyrosine-protein kinase Wzc | wzc | Tyrosine-protein kinase Wzc |
| wzb | Low molecular weight protein-tyrosine-phosphatase Wzb | wzb | Low molecular weight protein-tyrosine-phosphatase Wzb |
| wza | Polysaccharide export lipoprotein Wza | wza | Polysaccharide export lipoprotein Wza |
| yegH | Putative capsular polysaccharide transport protein YegH | yegH | Putative capsular polysaccharide transport protein YegH |
| SEE1_4402 | Entericidin B precursor | SEE2_4396 | Entericidin B precursor |
| cvpA | Colicin V production protein | cvpA | Colicin V production protein |
| SEE1_3870 | Probable secreted protein STY4010 | SEE2_3863 | Probable secreted protein STY4010 |
| yiaF | Probable exported protein YPO4070 | yiaF | Probable exported protein YPO4070 |
| SEE1_2892 | Putative exported protein | SEE2_2889 | Putative exported protein |
| SEE1_2893 | Putative exported protein | SEE2_2890 | Putative exported protein |
| yfbK | Von Willebrand factor type A domain protein | yfbK | Von Willebrand factor type A domain protein |
| yebW | Putative secreted protein YebW | yebW | Putative secreted protein YebW |
| ynfB | Putative secreted protein | ynfB | Putative secreted protein |
| SEE1_1375 | Putative secreted protein | SEE2_1370 | Putative secreted protein |
| SEE1_1545 | Putative exported protein | SEE2_1539 | Putative exported protein |
| eutR | Ethanolamine operon regulatory protein | eutR | Ethanolamine operon regulatory protein |
| eutK | Ethanolamine utilization polyhedral-body-like protein EutK | eutK | Ethanolamine utilization polyhedral-body-like protein EutK |
| eutL | Ethanolamine utilization polyhedral-body-like protein EutL | eutL | Ethanolamine utilization polyhedral-body-like protein EutL |
| eutC | Ethanolamine ammonia-lyase light chain | eutC | Ethanolamine ammonia-lyase light chain |
| eutB | Ethanolamine ammonia-lyase heavy chain | eutB | Ethanolamine ammonia-lyase heavy chain |
| eutA | Ethanolamine utilization protein EutA | eutA | Ethanolamine utilization protein EutA |
| eutH | Ethanolamine permease | eutH | Ethanolamine permease |
| eutG | Ethanolamine utilization protein EutG | eutG | Ethanolamine utilization protein EutG |
| eutJ | Ethanolamine utilization protein EutJ | eutJ | Ethanolamine utilization protein EutJ |
| eutE | Acetaldehyde dehydrogenase, ethanolamine utilization cluster | eutE | Acetaldehyde dehydrogenase, ethanolamine utilization cluster |
| eutN | Ethanolamine utilization polyhedral-body-like protein EutN | eutN | Ethanolamine utilization polyhedral-body-like protein EutN |
| eutM | Ethanolamine utilization polyhedral-body-like protein EutM | eutM | Ethanolamine utilization polyhedral-body-like protein EutM |
| eutD | Phosphate acetyltransferase, ethanolamine utilization-specific | eutD | Phosphate acetyltransferase, ethanolamine utilization-specific |
| eutT | ATP:Cob(I)alamin adenosyltransferase, ethanolamine utilization | eutT | ATP:Cob(I)alamin adenosyltransferase, ethanolamine utilization |
| eutQ | Ethanolamine utilization protein EutQ | eutQ | Ethanolamine utilization protein EutQ |
| eutP | Ethanolamine utilization protein EutP | eutP | Ethanolamine utilization protein EutP |
| eutS | Ethanolamine utilization polyhedral-body-like protein EutS | eutS | Ethanolamine utilization polyhedral-body-like protein EutS |


Table 2.18: Single Nucleotide Polymorphisms in Human vs. Egg Isolates of SE

| Position: SEE1 | Gene | SEE1 nt | H.I. nt | AA | CDS/Int | Description |
|---|---|---|---|---|---|---|
| 31848 | dsbA | A | G | - | CDS | Putative exported protein |
| 57022 | lspA | T | C | - | CDS | Lipoprotein signal peptidase |
| 58215 | lytB | A | G | - | CDS | 4-hydroxy-3-methylbut-2-enyl diphosphate reductase |
| 92942 | yaaU | A | C | - | CDS | Putative metabolite transport protein |
| 109715 | rluA | T | G | - | CDS | Ribosomal large subunit pseudouridine synthase A |
| 125734 | tbpA | A | G | - | CDS | Iron(III)-binding periplasmic protein |
| 207513 | stiC | T | C | Asn:His | CDS | Fimbriae usher protein stiC |
| 223793 | mcrB | A | G | - | CDS | Multimodular transpeptidase-transglycosylase |
| 238930 | SEE1_0219 | A | G | - | CDS | Hypothetical Protein of Unknown Function |
| 252896 | yaeH | G | C | Thr:Ser | CDS | Chromosome Segregation ATPase |
| 401919 | prpE | T | C | - | CDS | Propionate-CoA Ligase |
| 474954 | cyoB | T | G | - | CDS | Cytochrome O ubiquinol oxidase subunit I |
| 497239 | mdlA | T | C | - | CDS | ATP-binding component of a transport system |
| 527064 | hemH | T | C | Val:Ala | CDS | Protoheme ferro-lyase |
| 534504 | ushA | G | T | - | CDS | UDP-Sugar hydrolase |
| 572861 | ylbE | A | C | - | CDS | Putative cytoplasmic protein |
| 599729 | SEE1_0576 | A | G | - | CDS | Putative HTH-Type transcriptional regulator |
| 599730 | SEE1_0576 | G | C | - | CDS | Putative HTH-Type transcriptional regulator |
| 604067 | pheP | G | C | - | CDS | Phenylalanine-Specific permease |
| 606966 | apeE | T | C | - | CDS | Outer membrane esterase |
| 606968 | apeE | A | T | - | CDS | Outer membrane esterase |
| 669952 | lipB | A | G | - | CDS | Octanoate-protein-N-octanoyltransferase |
| 673376 | rodA | T | C | Thr:Ala | CDS | 2-octaprenyl-3-methyl-6-methoxy-1,4-benzoquinol hydroxylase |
| 708966 | ubiF | A | G | Ser:Gly | CDS | DNA-Binding Response Regulator |
| 732371 | pgm | T | G | Cys:Gly | CDS | Phosphoglucomutase |
| 738956 | kdpE | T | G | Glu:Gly | CDS | DNA-Binding Response Regulator |
| 745304 | kdpA | A | G | - | CDS | Potassium-Transporting ATPase A chain |
| 801585 | SEE1_0770 | T | A | - | CDS | Putative Inner membrane protein |
| 801586 | SEE1_0770 | A | G | - | CDS | Putative Inner membrane protein |
| 819444 | bioA-bioB | T | C | - | Int | Potential Promoter Region for Biotin Synthesis |
| 982365 | rpsA | G | A | Gly:Asp | CDS | SSU ribosomal protein S1p |
| 1079123 | scsB | C | G | - | CDS | Membrane protein, suppressor for copper-sensitivity |
| 1096948 | SEE1_1062 | G | C | - | CDS | N-acetylmannosamine-6-phosphate 2-epimerase |
| 1103143 | SEE1_1069 | T | C | - | CDS | Hypothetical Protein |
| 1103145 | SEE1_1069 | G | T | Leu:Phe | CDS | Hypothetical Protein |
| 1135612 | fliI | T | C | - | CDS | Flagellum-specific ATP synthase |
| 1321809 | SEE1_1336 | C | G | Ala:Pro | CDS | NAD-specific glutamate dehydrogenase |
| 1321810 | SEE1_1336 | T | C | - | CDS | NAD-specific glutamate dehydrogenase |
| 1395937 | trpD | A | G | - | CDS | Anthranilate synthase, amidotransferase component |
| 1431667 | sapA | A | G | - | CDS | Peptide transport periplasmic protein |
| 1505468 | - | C | T | - | Int | - |
| 1575493 | smvA | T | G | Leu:Trp | CDS | Methyl Viologen Resistance Protein |
| 1578524 | nmpC | T | C | - | Int | Promoter of nmpC Outer membrane porin |
| 1598423 | SEE1_1614 | T | C | Ser:Gly | CDS | Aspartate aminotransferase |
| 1598424 | SEE1_1614 | G | T | - | CDS | Aspartate aminotransferase |
| 1677082 | rstA | G | C | Ala:Pro | CDS | Transcriptional regulatory protein |
| 1777939 | SEE1_1801 | G | C | - | CDS | L-Cysteine uptake protein TcyP |
| 1823463 | pfkB | A | G | - | CDS | 6- |
| 1921614 | ycfD | A | G | Ser:Gly | CDS | FIG002776: hypothetical protein |
| 1966222 | rluC/rne | C | A | - | CDS | Promoter Region of rluC or rne |
| 1982608 | mviN | A | G | - | CDS | Proposed peptidoglycan lipid II flippase |
| 2033997 | SEE1_2087 | C | G | - | CDS | FIG01047494: hypothetical protein |
| 2043542 | SEE1_2100 | A | C | Val:Trp | CDS | c-hypothetical protein APECO1_2271 |
| 2043543 | SEE1_2100 | C | A | Val:Trp | CDS | c-hypothetical protein APECO1_2271 |
| 2084404 | pduG | T | C | - | CDS | Propanediol Dehydratase reactivation factor large subunit |
| 2085366 | pduJ | A | G | Arg:Gly | CDS | Propanediol Utilization polyhedral body protein |
| 2113563 | hisD | T | C | - | CDS | Histidinol dehydrogenase |
| 2118766 | hisI | T | G | Cys:Gly | CDS | Phosphoribosyl-AMP cyclohydrolase |
| 2147088 | wcaK | A | T | Leu:Gln | CDS | Colanic acid biosynthesis protein |
| 2345899 | glpB | T | C | - | CDS | Anaerobic glycerol-3-phosphate dehydrogenase subunit B |
| 2345901 | glpB | G | T | - | CDS | Anaerobic glycerol-3-phosphate dehydrogenase subunit B |
| 2399238 | yfbT | C | G | Arg:Pro | CDS | Putative Phosphatase |
| 2469762 | SEE1_2515 | T | G | - | CDS | FIG01047539: hypothetical protein |
| 2480995 | SEE1_2525 | T | C | - | CDS | Putative ion-channel protein |
| 2517829 | yfeZ | A | G | - | CDS | Inner membrane protein |
| 2521821 | eutR | C | G | - | CDS | Ethanolamine operon regulatory protein |
| 2528566 | eutG | G | C | - | CDS | Ethanolamine utilization protein |
| 2539928 | tktB | C | G | Ala:Gly | CDS | Transketolase |
| 2539929 | tktB | G | C | Ala:Gly | CDS | Transketolase |
| 2565737 | sdaR | A | C | Phe:Leu | CDS | Sugar diacid utilization regulator SdaR |
| 2635731 | sseB | T | G | - | Int | Promoter region of sseB |
| 2638140 | hscA | A | G | - | CDS | Chaperone protein hscA |
| 2669431 | purG | A | C | - | CDS | Phosphoribosylformylglycinamidine synthase |
| 2677829 | yfhH | T | G | Ser:Ala | CDS | Sialic acid utilization regulator |
| 2750214 | SEE1_2760 | C | G | Ala:Gly | CDS | FIG01045638: hypothetical protein |
| 2794512 | iroB | T | C | - | CDS | Glycosyltransferase IroB |
| 2794514 | iroB | G | T | - | CDS | Glycosyltransferase IroB |
| 2808516 | - | G | A | - | Int | - |
| 2810756 | tctE | G | C | Ala:Gly | CDS | Tricarboxylate transport sensor protein TctE |
| 2810757 | tctE | C | T | Ala:Gly | CDS | Tricarboxylate transport sensor protein TctE |
| 2810758 | tctE | T | G | - | CDS | Tricarboxylate transport sensor protein TctE |
| 2865256 | - | A | G | - | Int | - |
| 2962759 | ygcB | A | G | - | CDS | CRISPR-Associated Cas3 |
| 2975198 | pyrG | T | G | Thr:Pro | Int | CTP synthase |
| 2998317 | yqcC | G | A | Pro:Ser | Int | Hypothetical protein YqcC |
| 3226440 | parE | G | C | - | CDS | Topoisomerase IV subunit B |
| 3397938 | SEE1_3433 | A | G | - | CDS | L(+)-Tartrate dehydratase alpha subunit |
| 3641662 | SEE1_3678 | C | G | - | CDS | Protein involved in catabolism of external DNA |
| 3745343 | selB | A | G | - | CDS | Selenocysteine-specific translation elongation factor |
| 3773761 | yibP | C | G | Ala:Gly | CDS | Periplasmic septal ring factor, murein hydrolase |
| 3773762 | yibP | T | C | Ala:Gly | CDS | Periplasmic septal ring factor, murein hydrolase |
| 3791013 | rfaG | C | G | - | CDS | LPS alpha1,3-glucosyltransferase WaaG |
| 3842086 | SEE1_3864 | A | G | - | CDS | c-L-seryl-tRNA, selenium transferase-related protein |
| 3867194 | emrD | T | C | - | Int | Promoter of emrD Multidrug resistance protein D |
| 3994930 | rffG | A | G | Asp:Gly | CDS | dTDP-glucose 4,6-dehydratase |
| 4023990 | pldA | T | C | - | CDS | A1 precursor |
| 4056624 | trkH | T | C | - | CDS | Potassium uptake protein TrkH |
| 4134988 | sbp | T | C | Tyr:His | CDS | Sulfate-binding protein Sbp |
| 4207410 | argH | T | G | - | CDS | Argininosuccinate lyase |
| 4223041 | birA | T | C | - | CDS | Biotin operon repressor / Biotin-protein ligase |
| 4266472 | - | T | C | - | Int | - |
| 4268839 | aceA | T | C | - | CDS | Isocitrate Lyase |
| 4283581 | SEE1_4278 | T | G | Met:Leu | CDS | FIG01045250: hypothetical protein |
| 4300480 | malE | G | C | - | CDS | Maltose/maltodextrin ABC transporter |
| 4376759 | fdhF | T | C | Ser:Arg | CDS | Formate dehydrogenase H |

4536773
SEE1_4517 T C - CDS Carbamate kinase

97

4555658 idnO T C - CDS 5-keto-D-gluconate 5-reductase 4584081 SEE1_4579 A C Gln:His CDS Putative cytoplasmic protein USSDB7A 4599877 hsdR G C Ala:Gly CDS Type I restriction-modification system 4630621 - T C - Int - 4637677 holD C T - CDS DNA polymerase III psi subunit 4637678 holD T C - CDS DNA polymerase III psi subunit 4659052 yjjK T G - CDS ABC transporter, ATP-binding protein 4659053 yjjK G C - CDS ABC transporter, ATP-binding protein 4669607 sthE C G - CDS Putative major fimbrial subunit 4672838 sthB T C - CDS type 1 fimbriae anchoring protein FimD

H.I: Human isolate. CDS: Coding DNA Sequence. Int: Intergenic Region. nt: nucleotide


2.7 References

1. Vieira A, et al. (2009) A Resource to Link Human and Non-Human Sources of Salmonella. WHO Global Foodborne Infections Network Country Databank.
2. CDC (2010) Salmonella Serotype Enteritidis. National Center for Emerging and Zoonotic Infectious Diseases.
3. Foley SL, Johnson TJ, Ricke SC, Nayak R, Danzeisen J (2013) Salmonella Pathogenicity and Host Adaptation in Chicken-Associated Serovars. Microbiology and Molecular Biology Reviews 77: 582-607.
4. Chai SJ, White PL, Lathrop SL, Solghan SM, Medus C, et al. (2012) Salmonella enterica Serotype Enteritidis: Increasing Incidence of Domestically Acquired Infections. Clinical Infectious Diseases 54: S488-S497.
5. Guard-Petter J (2001) The Chicken, the Egg and Salmonella Enteritidis. Environmental Microbiology 3: 421-430.
6. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Gast R, et al. (2009) Mechanisms of Egg Contamination by Salmonella Enteritidis. FEMS Microbiology Reviews 33: 718-738.
7. Gast RK, Guraya R, Guard J (2013) Salmonella Enteritidis Deposition in Eggs after Experimental Infection of Laying Hens with Different Oral Doses. Journal of Food Protection 76: 108-113.
8. Gast RK, Guraya R, Guard-Bouldin J, Holt PS, Moore RW (2007) Colonization of Specific Regions of the Reproductive Tract and Deposition at Different Locations Inside Eggs Laid by Hens Infected with Salmonella Enteritidis or Salmonella Heidelberg. Avian Diseases 51: 40-44.
9. CDC (2011) Vital Signs: Incidence and Trends of Infection with Pathogens Transmitted Commonly Through Food --- Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 1996--2010.
10. Sandt CH, Fedorka-Cray PJ, Tewari D, Ostroff S, Joyce K, et al. (2013) A Comparison of Non-Typhoidal Salmonella from Humans and Food Animals Using Pulsed-Field Gel Electrophoresis and Antimicrobial Susceptibility Patterns. PLoS ONE 8: e77836.
11. Aston C, Mishra B, Schwartz DC (1999) Optical mapping and its potential for large-scale sequencing projects. Trends in Biotechnology 17: 297-302.
12. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O (2008) The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics 9: 1-15.
13. Rutherford K, Parkhill J, Crook J, Horsnell T, Rice P, et al. (2000) Artemis: sequence visualization and annotation. Bioinformatics (Oxford, England) 16: 944-945.
14. Ning Z, Cox AJ, Mullikin JC (2001) SSAHA: A Fast Search Method for Large DNA Databases. Genome Research 11: 1725-1729.
15. Alikhan N-F, Petty N, Ben Zakour N, Beatson S (2011) BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genomics 12: 402.
16. Bernal-Bayard J, Ramos-Morales F (2009) Salmonella Type III Secretion Effector SlrP Is an E3 Ubiquitin Ligase for Mammalian Thioredoxin. The Journal of Biological Chemistry 284: 27587-27595.
17. Cogan TA, Jørgensen F, Lappin-Scott HM, Benson CE, Woodward MJ, et al. (2004) Flagella and Curli Fimbriae are Important for the Growth of Salmonella enterica Serovars in Hen Eggs. Microbiology 150: 1063-1071.
18. Oberhettinger P, Schutz M, Leo JC, Heinz N, Berger J, et al. (2012) Intimin and Invasin Export Their C-Terminus to the Bacterial Cell Surface Using an Inverse Mechanism Compared to Classical Autotransport. PLoS ONE 7: e47069.
19. Wells TJ, Totsika M, Schembri MA (2010) Autotransporters of Escherichia coli: a Sequence-Based Characterization. Microbiology 156: 2459-2469.
20. Torres AG, Perna NT, Burland V, Ruknudin A, Blattner FR, et al. (2002) Characterization of Cah, a -Binding and Heat-Extractable Autotransporter Protein of Enterohaemorrhagic Escherichia coli. Molecular Microbiology 45: 951-966.
21. Collier-Hyams LS, Zeng H, Sun J, Tomlinson AD, Bao ZQ, et al. (2002) Cutting Edge: Salmonella AvrA Effector Inhibits the Key Proinflammatory, Anti-Apoptotic NF-κB Pathway. The Journal of Immunology 169: 2846-2850.
22. Bansal T, Englert D, Lee J, Hegde M, Wood TK, et al. (2007) Differential Effects of Epinephrine, Norepinephrine, and Indole on Escherichia coli O157:H7 Chemotaxis, Colonization, and Gene Expression. Infection and Immunity 75: 4597-4607.
23. Hritonenko V, Stathopoulos C (2007) Omptin Proteins: an Expanding Family of Outer Membrane Proteases in Gram-negative Enterobacteriaceae (Review). Molecular Membrane Biology 24: 395-406.
24. Wong SG, Dessen A (2014) Structure of a Bacterial α2-Macroglobulin Reveals Mimicry of Eukaryotic Innate Immunity. Nat Commun 5.
25. Shi L, Ansong C, Smallwood H, Rommereim L, McDermott JE, et al. (2009) Proteome of Salmonella enterica Serotype Typhimurium Grown in a Low Mg(2+)/pH Medium. Journal of Proteomics & Bioinformatics 2: 388-397.
26. Majowicz SE, Musto J, Scallan E, Angulo FJ, Kirk M, et al. (2010) The Global Burden of Nontyphoidal Salmonella Gastroenteritis. Clinical Infectious Diseases 50: 882-889.
27. Deng X, Desai PT, den Bakker HC, Mikoleit M, Tolar B, et al. (2014) Genomic Epidemiology of Salmonella enterica Serotype Enteritidis based on Population Structure of Prevalent Lineages. Emerging Infectious Diseases 20: 1481-1489.
28. Allard MW, Luo Y, Strain E, Pettengill J, Timme R, et al. (2013) On the Evolutionary History, Population Genetics and Diversity among Isolates of Salmonella Enteritidis PFGE Pattern JEGX01.0004. PLoS ONE 8: e55254.
29. Knodler LA, Vallance BA, Celli J, Winfree S, Hansen B, et al. (2010) Dissemination of Invasive Salmonella via Bacterial-Induced Extrusion of Mucosal Epithelia. Proceedings of the National Academy of Sciences 107: 17733-17738.
30. Mariconda S, Wang Q, Harshey RM (2006) A Mechanical Role for the Chemotaxis System in Swarming Motility. Molecular Microbiology 60: 1590-1602.
31. Abu-Ali GS, Ouellette LM, Henderson ST, Whittam TS, Manning SD (2010) Differences in adherence and virulence gene expression between two outbreak strains of enterohaemorrhagic Escherichia coli O157:H7. Microbiology 156: 408-419.
32. Tong Y, Guo M (2009) Bacterial Heme-Transport Proteins and Their Heme-Coordination Modes. Archives of Biochemistry and Biophysics 481: 1-15.
33. Gao Q, Wang X, Xu H, Xu Y, Ling J, et al. (2012) Roles of iron acquisition systems in virulence of extraintestinal pathogenic Escherichia coli: salmochelin and aerobactin contribute more to virulence than heme in a chicken infection model. BMC Microbiology 12: 143-143.
34. Müller S, Valdebenito M, Hantke K (2009) Salmochelin, The Long-Overlooked Catecholate Siderophore of Salmonella. BioMetals 22: 691-695.
35. Choi J, Shin D, Ryu S (2007) Implication of Quorum Sensing in Salmonella enterica Serovar Typhimurium Virulence: the luxS Gene Is Necessary for Expression of Genes in Pathogenicity Island 1. Infection and Immunity 75: 4885-4890.
36. Choi J, Shin D, Kim M, Park J, Lim S, et al. (2012) LsrR-Mediated Quorum Sensing Controls Invasiveness of Salmonella Typhimurium by Regulating SPI-1 and Flagella Genes. PLoS ONE 7: e37059.
37. Moreira CG, Weinshenker D, Sperandio V (2010) QseC Mediates Salmonella enterica Serovar Typhimurium Virulence In Vitro and In Vivo. Infection and Immunity 78: 914-926.
38. Behnsen J, Perez-Lopez A, Nuccio S-P, Raffatellu M (2015) Exploiting host immunity: the Salmonella paradigm. Trends in Immunology 36: 112-120.
39. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Van Immerseel F (2009) The Salmonella Enteritidis Lipopolysaccharide Biosynthesis Gene rfbH is Required for Survival in Egg Albumen. Zoonoses and Public Health 56: 145-149.
40. De Buck J, Immerseel FV, Haesebrouck F, Ducatelle R (2004) Effect of Type 1 Fimbriae of Salmonella enterica Serotype Enteritidis on Bacteraemia and Reproductive Tract Infection in Laying Hens. Avian Pathology 33: 314-320.

Chapter 3

Patho-Pan-Genomics of Salmonella enterica Serovars Reveals Host-Specific Factors and Potential Vaccine Targets of Salmonella Enteritidis


3.1 Abstract

Salmonella enterica subspecies enterica (S. enterica) is one of the most successful foodborne bacterial pathogens in the world, with over 2,600 different serovars having been described. The serovar S. Enteritidis (SE) is responsible for the highest burden of all the S. enterica serovars. Salmonella Enteritidis has an extremely broad host range and has the potential to cause both gastrointestinal and systemic infections. The lack of understanding of the molecular mechanisms underlying its pathogenic strategies has limited the ability to reduce the prevalence of S. Enteritidis infections in humans as well as in animals. To elucidate the genes involved in SE pathogenesis, a pan-genome analysis compared 10 different serovars of S. enterica to three S. Enteritidis isolates. The resulting genes were categorized as important for colonizing and/or infecting poultry, important for colonizing and/or infecting humans, or as core virulence genes involved in central pathogenic strategies most likely common to most serovars of S. enterica. These common genes were then analyzed for single nucleotide polymorphisms (SNPs) and gene presence/absence to better understand the relatedness of the different serovars. The analysis revealed 276 clusters of orthologous genes (COGs) that were found in all poultry-colonizing S. enterica serovars and 162 in all human-colonizing serovars. The core, which comprises genes shared amongst all serovars, contained 2,817 COGs. Among the virulence-associated COGs, over 200 were present in the core, whereas 10 and 12 COGs were present in S. enterica serovars colonizing poultry and humans, respectively. This study offers the first comprehensive analysis of S. enterica genes that may be essential for colonization or causing disease in various hosts. This information can also be used to identify genes unique to certain serovars. This study will also aid in identifying potential vaccine and antimicrobial targets that can be used to mitigate human foodborne illness due to SE.


3.2 Background

Salmonella enterica comprises Gram-negative, facultatively anaerobic, intracellular bacteria which are predominantly motile [1,2]. There are six subspecies of Salmonella enterica: enterica, salamae, arizonae, diarizonae, houtenae, and indica, representing subspecies I, II, IIIa, IIIb, IV, and VI, respectively [3]. Salmonella enterica subspecies enterica is a very well adapted subspecies which can infect a wide range of avian and mammalian species. This particular subspecies is responsible for over 99% of all human cases of salmonellosis and is a major driver of bacterial foodborne morbidity and mortality both in the United States and around the world [2,4-6]. The diseases caused by subspecies I in both humans and animals are divided into two classes based on their clinical phenotypes. Typhoid and (para)typhoid-like diseases are systemic and are not confined to the gastrointestinal tract (GIT). Non-typhoidal diseases are generally confined to the GIT but in rare cases can become invasive and extraintestinal [7-9].

In humans, the most common forms of typhoid and paratyphoid illness are caused by Salmonella enterica serovar Typhi (STy) or Paratyphi (SPt). In chickens and other poultry, the Salmonella enterica serovars most commonly associated with invasive disease are S. Gallinarum (SG) and S. Pullorum (SP). These serovars are host-restricted, but the mechanisms involved in this host specificity are currently not known [10]. Most serovars of Salmonella enterica, including Typhimurium (STym), Enteritidis (SE), and Heidelberg (SH), exhibit broad host ranges [1]. At present, STym and SE are the two most common causes of human foodborne salmonellosis in the world, although SH is becoming an emerging issue [11,12]. In particular, recent epidemiological studies by the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC) suggest that SE is the predominant serovar associated with human illness compared to other Salmonella serovars [4,13].

Human illness due to SE often results in a self-limiting gastroenteritis with vomiting, diarrhea, abdominal cramping, and fever. Generally, symptoms appear approximately 12-72 hours after infection and usually resolve within a couple of weeks at the latest. Immunocompromised individuals, as well as individuals with at-risk immune systems such as the extremely young and the elderly, may develop typhoid-like symptoms. This is because a competent immune system is required to limit SE to the GIT, and individuals lacking a robust immune response fail to prevent SE from becoming systemic. In all cases, if dehydration is not controlled or if the bacteria become systemic, the illness can be fatal, though this is a relatively rare event [4,5]. Most human infections have been traced back to contaminated shell eggs, poultry meat, and their products.

Despite a number of countermeasures and an overall decline in the prevalence of SE in poultry farms, SE still poses a significant threat to human health. Recent data have shown an increase in SE in broiler chicken populations [4,14]. However, contaminated shell eggs and their products remain the major source of SE infections.

In laying hens, infection is mostly asymptomatic, and therefore colonization and persistence of SE in hens may go unnoticed. Estimates now suggest that 1 in every 20,000 eggs may be contaminated with SE. With over 65 billion eggs produced annually in the United States alone, this corresponds to approximately 3.25 million contaminated eggs each year [4,15]. As with many other foodborne pathogens, it is widely considered that if a vaccine or other preventative measure could prevent hens from becoming colonized with SE, there would be a significant reduction in the incidence of human foodborne illness due to SE.

Beyond the burden of SE specifically, many other serovars of Salmonella enterica are starting to show the capacity to cross their original host barriers. For example, SPt, previously regarded as a human-restricted serovar, has recently been reported in poultry meat [16]. Similarly, S. Choleraesuis (SC), which primarily infects swine, has gained attention as the non-typhoidal S. enterica serovar that displays the greatest propensity to become systemic [17]. It is believed that SE and SG share a common ancestor or that one evolved from the other [18,19]. With over 2,600 different serovars, S. enterica is constantly evolving and branching out to infect new hosts. There is a high degree of relatedness between SE and other serovars of S. enterica which exhibit a broad host range. However, to our knowledge, no study has been conducted to compare S. enterica serovars which can colonize and/or cause disease in multiple host species. In this study, the genomes of SE isolates were compared to 10 other S. enterica serovars infecting different host species. This may help identify genes and molecular mechanisms employed by SE to colonize, infect, and/or cause disease. Finally, this knowledge can be used to generate a list of potential vaccine and antimicrobial targets against SE and other S. enterica serovars.


3.3 Materials and Methods

3.3.1 Orthologous gene clusters and pan-genome matrices. Extracted coding sequences from 11 Salmonella genomes, listed in Table 3.1, were compared with each other using reciprocal all-against-all BLASTp [20]. Gene families, or Clusters of Orthologous Genes (COGs), were determined using OrthoMCL with an E-value cutoff of 10^-5, over 75% length coverage, and at least 50% protein identity [21]. COGs were then clustered using the Markov cluster algorithm. Unclustered gene families, which also did not have any BLAST hits, were considered strain-specific gene families (unique genes). The results of OrthoMCL were converted to redundant and non-redundant pan-genome matrices; the latter was used for further analysis. The non-redundant pan-genome matrix was constructed with each column as a genome and each row as a gene family. Cell (i,j) in the matrix is 1 when gene family “i” is present in genome “j”, or 0 when the gene family is absent. This sortable table allows querying of: (i) “core gene families” that are present in all genomes compared, (ii) “shared gene families” that are present in more than one genome but fewer than all, and (iii) the “unique genes” described above. *Note: The pan-genome data were too large to include in this document. A copy can be requested.
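The matrix construction and the three queries described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the family names, the reduced set of genome labels, and the helper functions build_matrix and classify are hypothetical, and the input is assumed to be a mapping from each gene family to the set of genomes in which a member was found.

```python
from typing import Dict, List, Set


def build_matrix(families: Dict[str, Set[str]], genomes: List[str]) -> Dict[str, List[int]]:
    """One row per gene family, one column per genome; 1 = present, 0 = absent."""
    return {
        family: [1 if genome in members else 0 for genome in genomes]
        for family, members in families.items()
    }


def classify(row: List[int]) -> str:
    """Label a matrix row as core, shared, or unique from its presence count."""
    present = sum(row)
    if present == 0:
        raise ValueError("gene family is absent from every genome")
    if present == len(row):
        return "core"      # present in all genomes compared
    if present > 1:
        return "shared"    # present in more than one genome but fewer than all
    return "unique"        # present in exactly one genome


# Hypothetical toy input (assumption, not data from this study).
genomes = ["SEE1", "SEE2", "SG", "STy"]
families = {
    "family_A": {"SEE1", "SEE2", "SG", "STy"},
    "family_B": {"SEE1", "SEE2"},
    "family_C": {"SG"},
}

matrix = build_matrix(families, genomes)
labels = {family: classify(row) for family, row in matrix.items()}
for family, row in matrix.items():
    print(family, row, labels[family])
```

Storing each row as a list in a fixed genome order keeps the table sortable by any column, which mirrors the way the matrix is queried in the text.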

3.3.2 Pan-Genome Analysis. The sortable table of orthologous gene clusters was first filtered so that all serovars had a “1” in the matrix, resulting in a table of orthologs that were present in all serovars. For genes considered for poultry infection, the “1” rule was applied to SEE1 and SEE2, the “1 or 0” rule was applied to SG, SP, STym, SPt-B, and SE, and the “0” rule was applied to SPt-A and ST. For human infection, the same matrix rules were applied, except that SPt-A and ST were assigned rule “1” and SG and SP were assigned rule “0.” Once the tables were assembled into (i) core gene families, (iia) shared gene families of poultry-infecting Salmonella enterica serovars, and (iib) shared gene families of human-infecting Salmonella enterica serovars, the genes were compared side by side to ensure there was no overlap. The resulting tables were then analyzed for virulence-associated genes as previously described in Chapter 2.

3.4 Results

3.4.1 Genetic Relatedness of the Various Salmonella enterica Serovars.

As Figure 3.1 shows, most of the serovars share a very high degree of identity in gene composition, except for SH. Because SH is now becoming an emerging problem in poultry, this similar/dissimilar approach to the pan-genome analysis can be used to reveal a true core set of genes as well as genes that are truly unique to an infectious phenotype or serovar. Figure 3.3 shows the general relatedness of each serovar based on SNPs in genes shared between SEE1/2 and the serovar listed. Based on SNP-based phylogenetic analysis of shared genes, the SE isolates map closest to SP and SG, both of which have fewer than 8,000 SNPs compared to SE. The broad host range serovars all have approximately 39,000 to 42,000 SNPs compared to the SE isolates. The human-restricted serovars had approximately 52,000 SNPs compared to SE. These data contrast with the mapping of SE in comparison to these serovars based on gene presence/absence, in which SE was phylogenetically closest to the broad host range, human-restricted, and poultry-restricted serovars, in descending order of relatedness, as shown in Figure 3.2.
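The two notions of relatedness contrasted above, SNP differences within shared genes and differences in gene content, can be made concrete with a short Python sketch. This is an illustration only and is not the method used to produce Figures 3.2 and 3.3: it assumes the shared genes have already been aligned to equal length, it uses the Jaccard distance as one possible gene-content measure, and the sequences and family names are invented.

```python
from typing import Set


def count_snps(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions in two aligned sequences, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in bases and b in bases and a != b
    )


def jaccard_distance(families_a: Set[str], families_b: Set[str]) -> float:
    """Gene-content distance: 1 - |intersection| / |union| of the two gene family sets."""
    union = families_a | families_b
    if not union:
        return 0.0
    return 1.0 - len(families_a & families_b) / len(union)


# Hypothetical aligned fragment of one shared gene in two genomes.
snps = count_snps("ATGGCCTTA", "ATGACCTTG")

# Hypothetical gene family content of the same two genomes.
distance = jaccard_distance({"fam1", "fam2", "fam3"}, {"fam2", "fam3", "fam4"})
print(snps, distance)
```

Two genomes can be close by one measure and distant by the other, since the SNP count only considers genes that both genomes carry, whereas the gene-content distance only considers which genes are carried.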

3.4.2 Virulence-Associated COGs Present in Salmonella enterica Serovars that Infect Poultry

The analysis of COGs specific to S. enterica serovars that are able to infect poultry revealed 276 COGs that were also conserved among the SE isolates. Of these, only 10 cross-referenced to the virulence gene profile developed in Chapter 2. These COGs can be found in Table 3.2. Among these are two sets of fimbrial operons containing the genes stiHBA and lpfDBA. There were four hypothetical genes that have some structural similarities to genes important for virulence. Among them, two encode outer membrane proteins: SEE1_0554 and SEE1_1013. One gene encodes a predicted SopD-like protein, an effector of invasion (SEE1_0938). The last hypothetical gene was a methyl-accepting chemotaxis protein-encoding gene, SEE1_1540. This gene is a receptor for and ribose.

3.4.3 Virulence-Associated COGs Present in Salmonella enterica Serovars that Infect Humans

There were a total of 162 genes that were shown to be conserved in the genomes of S. enterica serovars that infect humans, including SE. As Table 3.3 shows, 12 of these COGs were designated as virulence-related. Among these, four were related to flagellar biosynthesis and assembly. There was only one two-component gene, torS, which encodes a hybrid histidine kinase. Interestingly, iroD (which encodes a salmochelin-specific esterase) was present only in the serovars that infect or colonize humans and was not found in SG or SP. A second TonB-dependent iron transporter gene, yncD, was also maintained amongst these serovars. Two of the tetrathionate utilization genes, ttrB/C, were also conserved in this group. The sthA gene, which encodes a fimbrial chaperone protein, was the only fimbrial gene present in these serovars. One other gene, the SsrAB-activated gene srfB, was conserved only in the isolates that can infect humans, but not poultry.

3.4.4 Virulence-Associated COGs in All Serovars of Salmonella enterica and Use for Rational Vaccine Design

A total of 2,817 COGs define the ‘core’ of the orthologous genes conserved in all serovars tested in this study. The complete lists from each of the pan-genome analyses can be found in Appendix Tables A-1 through A-3. When these COGs were analyzed using the definitions of virulence association established in Chapter 2, there were 247 COGs (or ~9% of the whole core) that make up the ‘core virulence gene profile’ of the serovars tested. These most likely represent the core virulence gene profile of Salmonella enterica as a whole, given the breadth of host and disease phenotypes exhibited by the serovars tested in this study.

Of these 247 COGs, 70 were related to signaling and motility, 18 were involved in LPS biosynthesis, 59 encoded outer membrane proteins and adhesins, 20 were transport-related, 26 were miscellaneous virulence-associated, and 61 were involved in the biosynthesis of SPI-related proteins (Tables 3.4 – 3.9). In this list of virulence-associated COGs, the genes that encode bacterial surface or secreted proteins are the best candidates for future vaccine and antimicrobial development, as these proteins are likely to interact with the host and its immune system. For example, many of the SPI-related COGs would be good vaccine targets, except the effectors that are translocated from the bacteria to the host cell directly via a type 3 secretion system (T3SS). The two-component regulators, on the other hand, such as PhoP/Q, are excellent targets, as they have been shown to be important for regulation of SPI-related genes and the ‘sensor’ component faces the extracellular space [22]. These genes have been highlighted in their respective tables.

In general, the vaccine candidates elucidated in this study can be categorized into three main classes: (i) two-component signaling systems, (ii) fimbrial and non-fimbrial adhesins, and (iii) nutrient acquisition systems. All of the COGs under consideration are listed in Tables 3.4 through 3.9 and are highlighted in blue. Although these three categories of virulence genes contain the most suitable targets for vaccine development, there are many other projected targets, including COGs that are conserved in all serovars and are classified as miscellaneous virulence-associated genes, whether SPI- or non-SPI-regulated.

3.5 Conclusions and Discussion

Salmonella enterica is one of the most successful bacterial pathogens in the world and benefits from having over 2,600 different serovars, all with different host ranges and pathogenic strategies [1,2,12]. One serovar in particular, Salmonella enterica serovar Enteritidis (SE), is one of the most common human foodborne pathogens in the world and exhibits an extremely broad host range. It is important to note that genotype ultimately affects phenotype and, in the instance of SE or any other bacterium, that phenotype includes the host range. For the first time to our knowledge, we compared the genomes of human-restricted serovars (STy and SPt), poultry-restricted serovars (SG and SP), and serovars that exhibit multi- and broad host ranges (STym, SH, SC, and SE, including the two genomes sequenced in this study) to identify genes that may aid SE in its ability to colonize and cause disease in certain hosts.

A pan-genome analysis was performed, and the genes that are specific to S. enterica serovars able to infect or cause disease in humans or poultry were determined by the presence/absence of SE genes in these serovars. Interestingly, the similarities between SE and other serovars in relation to presence or absence of genes did not correlate with the phylogenetic analysis based on SNPs in conserved genes, providing evidence of microevolution of genes between the serovars. The evolution of SE appears to have occurred through gene acquisition and loss in relation to the poultry-restricted serovars, and through acquisition of SNPs in relation to the human-restricted and broad host range serovars. These differences may play an important role in defining the host range or disease phenotype exhibited by SE and other serovars. These data also suggest that SE is unique, based on the over 39,000 SNPs in genes that are shared by SE and the model non-typhoidal S. enterica serovar STym.

Overall, there were very few genes that were specific to poultry-infecting isolates, but some of these genes have the potential to shed light on the bacterial mechanisms involved in SE’s ability to colonize the poultry host. There were two sets of fimbrial genes, the sti and lpf operons, present in SEE1, SEE2, and other serovars of S. enterica infecting poultry. In particular, the latter encodes long polar fimbriae, which may be important for attachment of SE to specific niches in poultry and/or shell eggs. In fact, previous work has shown that lpf fimbriae-deficient STym were completely unable to form biofilms and failed to attach to chicken epithelial cells, but showed only an intermediate loss of these characteristics with human HEp-2 cells [23]. It is our belief that this is one example of how these in silico analyses can prove to be biologically relevant and are therefore a good starting point for understanding these processes. Similarly, if in silico and in vivo evidence correlate, these may be novel vaccine targets that are specific for poultry-infecting or poultry-colonizing S. enterica serovars. Other than these fimbrial genes, there were four hypothetical genes. Like the lpf genes, these genes and their protein products may also be important for different aspects of colonizing poultry and shell eggs.

Unlike the genes associated with poultry colonization and infection, all of the virulence-associated COGs and their respective genes in S. enterica isolates infecting humans had known functions. Salmonella enterica is known to produce salmochelin, a modified enterochelin which cannot be bound by lipocalin-2 [24]. The esterase component of this operon, iroD, is found only in the human-infectious isolates. This would suggest that salmochelin is more important for acquiring iron in the human host than in the poultry host. There is also a predicted TonB-dependent iron receptor COG among the human-infecting serovars, though its function remains unknown. However, because of its association with human-infecting serovars, it may be an important redundant pathway involved in iron acquisition in the mammalian host.

There are three flagellar biosynthesis and assembly COGs shared between SE and the other human-infecting isolates. One gene is located on the flg operon and two on the flh operon. There are known frameshifts in the flagellar genes of poultry-restricted isolates of S. enterica, making them non-motile [18]. Although a complete operon for the tetrathionate reductase complex is present in all human-infectious serovars, two of these genes are absent in poultry-restricted serovars [25]. This suggests tetrathionate utilization is most likely important for S. enterica survival in the mammalian gut.

It is known that SPI-1 and SPI-2 are required for most S. enterica to colonize various hosts and cause disease in poultry and humans [26-28]. SPI-2 is regulated by a number of stimuli, some of which remain unknown, but PhoP/Q and the SsrAB sensor kinase system are two well-established regulators of this SPI [29]. However, the pan-genome analysis revealed that the COG associated with srfB (an SsrAB-activated gene) is conserved only in human-infecting isolates. This would suggest one of two scenarios: either srfB is not required for SPI-2 induction in the poultry-infecting serovars of Salmonella, or an alternative mechanism activates SPI-2 without the induction of SsrAB within the poultry host.

An understanding of the ‘core’ genome helps to clarify the core strategies employed by S. enterica to promote infection and disease. However, because S. enterica has multiple serovars with different host predilections and disease forms within those hosts, it is most likely that setting aside the core genes and analyzing the unique and “shared family COGs” more carefully will give a more coherent picture of how S. enterica colonizes and causes disease in multiple hosts. To this end, it is not surprising that many virulence-associated ‘gene families’ identified in Chapter 2 were also found to be conserved amongst all of these different serovars. The functional relevance of these genes has been previously discussed in Chapter 2. These core genes will also be used in the future for analysis of potentially novel therapeutics and vaccine targets. However, these targets are only hypothetical and would need to show promise in vivo. This is especially true for enteric pathogens, as there are many confounding factors in vaccine development, including disruption of the microbiota, which may lead to more damage [30]. Nevertheless, we hypothesize that some of these COGs will not only serve as good vaccine candidates but also aid in a better understanding of the molecular mechanisms of SE pathogenesis.

The roles of many two-component signaling systems of S. enterica in signaling and motility are still poorly understood. Because the ‘sensor’ protein faces the extracellular space and requires signals from the microenvironment to respond, these proteins may be good candidates for vaccines. One example is the QseB/C regulator. In STym and enterohemorrhagic E. coli, this quorum-sensing regulator can be activated by binding autoinducer-3 (AI-3) and/or norepinephrine to activate virulence genes [31,32]. It may be an evolutionary advantage for S. enterica to sense these molecules, which are in high abundance in the GIT (due to the parasympathetic nervous system) [31].

Most of the COGs identified as outer membrane proteins or adhesins may be good vaccine candidates. Conserved orthologs may be more important than the adhesins with a redundant function, as they are under positive selection. Results from Chapter 2 revealed that there are very few nutrient acquisition and transport systems, especially for iron, in SE. Similarly, few iron transporter COGs are encoded by S. enterica. These are most likely essential for the survival of Salmonella within a host [33,34].

Because iroD is present only in S. enterica serovars that can infect humans, the iron-binding and receptor genes present in the core virulence genome may play a role in the acquisition of iron from the host by all Salmonella. One such example is bacterioferritin, which uses molecular mimicry to sequester iron bound to ferritin within the eukaryotic cell. Therefore, vaccines targeting iron sequestration systems could inhibit the bacterial ability to acquire iron and prevent bacterial colonization. Also, siderophores that are specific to receptors on Salmonella (such as salmochelin) could be modified for use as antimicrobial compounds. These modified siderophores would be taken up by the bacterial cell, allowing the compound to inhibit bacterial growth.

This study is novel because it uses pan-genome analyses to predict virulence genes and to identify potential therapeutic and vaccine targets for a diverse pathogen like S. enterica. These data will help in identifying the roles of genes with hypothetical or unknown functions, improving the current understanding of Salmonella pathogenesis. Finally, this study will address some of the critical knowledge gaps in relation to Salmonella disease pathogenesis in poultry and human hosts, which will aid in formulating new intervention strategies for disease mitigation.


3.6 Author Contributions

Matthew R. Moreau1,2, Yury V. Ivanov1, Bhushan M. Jayarao1,3, and Subhashinie Kariyawasam1,3

1Veterinary and Biomedical Sciences Department, The Pennsylvania State University

2Pathobiology Graduate Program, The Pennsylvania State University

3Animal Diagnostic Laboratory, Pennsylvania State University

Conceived and Designed Experiments: MRM, YVI, BMJ, SK

Performed Experiments: MRM, YVI

Analyzed Data: MRM

Wrote the Paper: MRM, YVI, BMJ, SK


Figure 3.1: Genome Composition Similarity of Different Serovars of Salmonella enterica

This image was created with BRIG. The genomes of all serovars indicated on the right were aligned to compare gene composition by percent identity. Although the genomes are dissimilar to SEE1 and SEE2 in many areas, there are also many areas where they are very similar. These alignments were used to run the BLAST searches for the pan-genome analysis.


Figure 3.2: Cluster Dendrogram of Salmonella enterica Serovars

This figure is a bootstrap comparison of all serovars tested against SEE1 and SEE2, based on the presence or absence of conserved COG families in each genome. AU: approximately unbiased p-value, in percent. BP: bootstrap score, in percent.
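As a rough illustration of how presence/absence profiles can be turned into the pairwise distances that a dendrogram is built from, the sketch below computes Jaccard distances between COG sets. The caption does not state which distance measure or software was used, so the Jaccard measure, the genome labels, and the COG names here are all assumptions made for illustration only. The bootstrap support values (AU/BP) are not reproduced by this sketch.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| for two sets of COG identifiers."""
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)


# Hypothetical presence profiles: genome label -> COG families present.
profiles: Dict[str, Set[str]] = {
    "genome_1": {"COG_a", "COG_b", "COG_c"},
    "genome_2": {"COG_a", "COG_b"},
    "genome_3": {"COG_c", "COG_d"},
}

# Distance for every unordered pair of genomes.
distances: Dict[Tuple[str, str], float] = {
    (x, y): jaccard_distance(profiles[x], profiles[y])
    for x, y in combinations(sorted(profiles), 2)
}

# The pair with the smallest distance would be joined first by an
# agglomerative clustering method.
closest_pair = min(distances, key=distances.get)
```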


Figure 3.3: SNP-Based Phylogenetic Tree of Serovars Compared to SEE1/2

The figure shows the distance of each serovar from SEE1 and SEE2 relative to the other serovars tested in this study. Each serovar is labeled in blue with the actual number of SNPs in conserved genes.
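The SNP counts shown on the tree can be thought of as the number of mismatched positions between aligned copies of a conserved gene. The sketch below is a minimal illustration under that assumption. The function name and the example sequences are hypothetical, and positions containing gaps or ambiguous bases are skipped, which is a design choice of this sketch and not a documented step of the study.

```python
def count_snps(reference: str, query: str) -> int:
    """Count mismatched A/C/G/T positions between two aligned sequences.

    Both sequences must come from the same alignment and therefore have
    equal length. Columns with gaps or ambiguity codes are ignored.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have the same length")
    valid = set("ACGT")
    return sum(
        1
        for ref_base, query_base in zip(reference.upper(), query.upper())
        if ref_base in valid and query_base in valid and ref_base != query_base
    )


# Hypothetical aligned fragments of one conserved gene.
snps_plain = count_snps("ATGCCGTA", "ATGTCGTT")
snps_with_gap = count_snps("ATG-CGTA", "ATGACGTT")
```

Summing such counts over all conserved genes would give one per-serovar total of the kind overlaid on the tree.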


Table 3.1: Strains and Accession Numbers Used in this Study

Serovar and strain designation  Genome accession #
SEE1                            This Study
SEE2                            This Study
Enteritidis str. P125109        AM933172.1
Typhimurium str. LT2            AE006468.1
Heidelberg str. SL476           CP001120.1
Choleraesuis str. SC-B67        AE017220.1
Gallinarum str. 287/91          AM933173.1
Pullorum str. S06004            CP006575.1
Typhi str. CT18                 AL513382.1
Paratyphi A str. ATCC 9150      CP000026.1
Paratyphi B str. SPB7           CP000886.1

Accession numbers are from NCBI GenBank Database.


Table 3.2: Poultry-Infectious Serovar COGs and Genes

COG#     Protein ID  Description                                          Gene
COG4532  62178744    fimbriae                                             stiH
COG4533  62178746    fimbrial chaperone                                   stiB
COG4534  62178747    fimbrial subunit                                     stiA
COG4554  62179147    outer membrane protein                               SEE1_0554
COG4576  62179496    secreted protein SopD-like protein                   SEE1_0938
COG4578  62179603    outer membrane protein                               SEE1_1013
COG4623  62180191    chemotaxis protein-ribose-galactose sensor receptor  SEE1_1540
COG4717  62182139    long polar fimbrial operon protein                   lpfD
COG4718  62182140    long polar fimbrial chaperone                        lpfB
COG4719  62182141    long polar fimbria                                   lpfA

Table 3.3: Human-Infectious Serovar COGs and Genes

COG#     Protein ID  Description                                Gene
COG4526  162139586   Predicted TonB-Dependent Iron Receptor     yncD
COG4581  62179700    flagellar hook-associated protein FlgK     flgK
COG4600  62179975    tetrathionate reductase complex subunit C  ttrC
COG4616  62180161    ssrAB activated gene                       srfB
COG4640  62180490    flagellar biosynthesis protein FlhA        fhlA
COG4641  62180491    flagellar biosynthesis protein FlhB        fhlB
COG4656  62180735    outer membrane protein                     pegC
COG4672  62181276    enterochelin esterase                      iroD
COG4727  62182313    hybrid sensory histidine kinase TorS       torS
COG4761  62183009    fimbrial chaperone protein                 sthA
COG4843  62179976    tetrathionate reductase complex subunit B  ttrB
COG4816  162139584   secreted effector protein                  sifB


Table 3.4: Signaling and Motility COGs from Core With Potential Vaccine Targets

COG#     Protein ID  Description
COG1047  162139591   methyl-accepting chemotaxis protein
COG1048  62178617    transcription regulator, histidine kinase for citrate
COG1128  162139574   chemotaxis regulator CheZ
COG1464  62179085    hemolysin expression-modulating protein
COG1845  62179688    flagellar biosynthesis chaperone
COG1846  62179689    anti-sigma-28 factor FlgM
COG1847  62179690    flagellar basal body P-ring biosynthesis protein FlgA
COG1848  62179691    flagellar basal-body rod protein FlgB
COG1849  62179692    flagellar basal body rod protein FlgC
COG1850  62179693    flagellar basal body rod modification protein
COG1851  62179694    flagellar hook protein FlgE
COG1852  62179695    flagellar basal body rod protein FlgF
COG1853  62179696    flagellar basal body rod protein FlgG
COG1854  62179697    flagellar basal body L-ring protein
COG1855  62179699    flagellar rod assembly protein/muramidase FlgJ
COG1856  62179701    flagellar hook-associated protein FlgL
COG1894  62179751    sensor protein PhoQ
COG1895  62179752    DNA-binding transcriptional regulator PhoP
COG2051  62180058    sensor protein RstB
COG2286  62180498    chemotaxis protein CheA
COG2308  62180529    flagella biosynthesis protein FliZ
COG2309  62180530    flagellar biosynthesis sigma factor
COG2310  62180534    flagellar capping protein
COG2311  62180535    flagellar protein FliS
COG2312  62180536    flagellar biosynthesis protein FliT
COG2318  62180542    flagellar hook-basal body protein FliE
COG2319  62180544    flagellar MS-ring protein
COG2320  62180545    flagellar motor switch protein G
COG2321  62180546    flagellar assembly protein H
COG2322  62180547    flagellum-specific ATP synthase
COG2323  62180548    flagellar biosynthesis chaperone
COG2324  62180549    flagellar hook-length control protein
COG2325  62180551    flagellar motor switch protein FliM
COG2326  62180553    flagellar biosynthesis protein FliO
COG2327  62180554    flagellar biosynthesis protein FliP
COG2328  62180555    flagellar biosynthesis protein FliQ
COG2329  62180556    flagellar biosynthesis protein FliR
COG2408  62180702    signal transduction histidine-protein kinase BaeS
COG2409  62180703    DNA-binding transcriptional regulator BaeR
COG2424  62180744    two-component response-regulatory protein YehT
COG2425  62180745    sensor/kinase in regulatory system
COG2521  62180884    chemotaxis signal transduction protein
COG2687  62181127    transcriptional regulator of two-component regulator protein
COG2689  62181129    sensory kinase in regulatory system
COG2736  62181258    SsrA-binding protein
COG2808  62181371    flagellar biosynthesis/type III secretory pathway protein
COG2866  62181468    hybrid sensory histidine kinase BarA
COG3004  62181691    DNA-binding transcriptional regulator QseB
COG3005  62181692    sensor protein QseC
COG3035  62181731    transcriptional regulator
COG3036  62181732    methyl-accepting chemotaxis protein
COG3037  62181733    aerotaxis sensor receptor
COG3234  62182003    osmolarity sensor protein
COG3235  62182004    osmolarity response regulator
COG3546  62182519    two-component sensor protein
COG3547  62182520    DNA-binding transcriptional regulator CpxR
COG3559  62182538    autoinducer-2 (AI-2) modifying protein LsrG
COG3613  62182624    sensor protein ZraS
COG3614  62182625    transcriptional regulatory protein ZraR
COG3653  62182707    methyl-accepting chemotaxis protein
COG3654  62182708    ABC transporter outer membrane protein
COG3656  62182711    bacteriocin/lantibiotic ABC transporter
COG3680  62182740    sensor protein BasS/PmrB
COG3681  62182741    DNA-binding transcriptional regulator BasR
COG3731  62182823    biofilm stress and motility protein A
COG3831  62183003    DNA-binding response regulator CreB
COG3832  62183004    sensory histidine kinase CreC
COG3837  62183013    two-component response regulator

*Blue highlighted COGs are potential vaccine targets


Table 3.5: LPS COGs from Core With Potential Vaccine Targets

COG#     Protein ID  Description
COG1258  62178690    UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase
COG1259  62178691    UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase
COG1260  62178692    phospho-N-acetylmuramoyl-pentapeptide-transferase
COG1261  62178693    UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase
COG1262  62178696    UDP-N-acetylmuramate--L-alanine ligase
COG1324  62178796    UDP-3-O-
COG1325  62178797    (3R)-hydroxymyristoyl-ACP dehydratase
COG1326  62178798    UDP-N-acetylglucosamine acyltransferase
COG1327  62178799    lipid-A-disaccharide synthase
COG1504  62179144    UDP-2,3-diacylglucosamine hydrolase
COG2265  62180467    lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase
COG2595  62180972    lipid A biosynthesis palmitoleoyl acyltransferase
COG3468  62182394    lipopolysaccharide biosynthesis protein WzzE
COG3469  62182395    UDP-N-acetylglucosamine 2-epimerase
COG3470  62182397    dTDP-glucose 4,6-dehydratase
COG3471  62182399    TDP-fucosamine acetyltransferase
COG3472  62182400    TDP-4-oxo-6-deoxy-D-glucose transaminase
COG3473  62182401    O-antigen translocase in LPS biosynthesis


Table 3.6: OMPs and Adherence COGs from Core With Potential Vaccine Targets

COG#     Protein ID  Description
COG1178  229037912   autoagglutination protein
COG1193  62178590    fimbrial subunit
COG1194  62178591    fimbrial chaperone
COG1195  62178592    fimbrial subunit
COG1196  62178593    fimbrial subunit
COG1197  62178595    fimbrial chaperone
COG1227  62178645    outer membrane lipoprotein
COG1273  62178711    type IV pilin biogenesis protein
COG1275  62178713    major pilin subunit
COG1308  62178776    vitamin B12-transporter protein BtuF
COG1339  62178813    outer membrane lipoprotein
COG1352  62178877    adhesin
COG1368  62178952    fimbriae chaperone
COG1369  62178953    fimbriae major subunit
COG1374  62178962    outer membrane lipoprotein
COG1390  62178988    outer membrane lipoprotein
COG1509  62179155    outer membrane usher protein
COG1510  62179156    minor fimbrial subunit
COG1511  62179157    fimbrial protein
COG1514  62179160    fimbrial protein
COG1521  62179178    N-acetyl phenylalanine beta-naphthyl ester-cleaving esterase
COG1587  62179278    outer membrane protein
COG1672  62179399    outer membrane protein X
COG1766  62179526    outer membrane protein 1a (IA;b;f), porin
COG1777  62179587    outer membrane protein
COG1781  62179592    outer membrane protein OmpA
COG1798  62179613    outer protein
COG1821  62179651    outer membrane protein
COG1871  62179726    outer membrane lipoprotein
COG1872  62179727    outer membrane lipoprotein
COG1878  62179734    outer membrane protein
COG1906  62179836    outer membrane lipoprotein
COG1959  62179919    outer membrane protein
COG2034  62180033    outer membrane lipoprotein
COG2053  62180060    outer membrane protein N, non-specific porin
COG2082  62180115    outer membrane protein
COG2093  62180174    outer membrane lipoprotein
COG2110  62180210    outer membrane lipoprotein
COG2220  62180383    outer membrane protein
COG2247  62180433    PhoPQ-activated integral membrane protein
COG2294  62180510    outer membrane lipoprotein
COG2419  62180737    fimbrial-like protein
COG2464  62180800    outer membrane lipoprotein
COG2590  62180966    outer membrane protease
COG2652  62181083    outer membrane protein
COG2925  62181563    outer membrane protein
COG3003  62181690    outer membrane protein
COG3134  62181869    outer membrane protein
COG3156  62181900    outer membrane lipoprotein
COG3484  62182415    outer membrane lipoprotein
COG3572  62182558    outer membrane lipoprotein
COG3627  62182672    outer membrane lipoprotein
COG3628  62182674    outer membrane lipoprotein
COG3649  62182702    outer membrane lipoprotein
COG3706  62182788    outer membrane lipoprotein Blc
COG3834  62183007    fimbrial subunit
COG3835  62183010    fimbrial chaperone protein

*Blue highlighted COGs are potential vaccine targets


Table 3.7: Transporter COGs from Core With Potential Vaccine Targets

COG#     Protein ID  Description
COG1159  162139617   iron-hydroxamate transporter ATP-binding subunit
COG1308  62178776    vitamin B12-transporter protein BtuF
COG1379  62178976    transporter
COG1389  62178987    transporter
COG1528  62179187    enterobactin/ferric enterobactin esterase
COG1530  62179191    iron-enterobactin transporter ATP-binding protein
COG1531  62179193    iron-enterobactin transporter membrane protein
COG1532  62179194    enterobactin exporter EntS
COG1533  62179195    enterobactin transporter periplasmic binding protein
COG2293  62180508    ferritin
COG2801  62181364    iron transporter: fur regulated
COG2802  62181365    iron transporter: fur regulated
COG2803  62181366    iron transporter: fur regulated
COG2804  62181367    iron transporter: fur regulated
COG3237  62182007    ferrous iron transport protein A
COG3238  62182008    ferrous iron transport protein B
COG3477  62182405    transporter
COG3654  62182708    ABC transporter outer membrane protein
COG3656  62182711    bacteriocin/lantibiotic ABC transporter

*Blue highlighted COGs are potential vaccine targets


Table 3.8: SPI COGs from Core With Potential Vaccine Targets

COG#     Protein ID  Description
COG1795  62179609    pathogenicity island encoded protein: SPI3
COG1796  62179610    pathogenicity island encoded protein: SPI3
COG1988  62179981    MerR family transcriptional regulator
COG1989  62179982    secretion system transcriptional activator
COG1990  62179983    secretion system regulator: sensor component
COG1991  62179985    secretion system apparatus protein SsaC
COG1992  62179986    secretion system apparatus protein SsaD
COG1993  62179987    secretion system effector protein SsaE
COG1994  62179989    secretion system effector protein SseB
COG1995  62179990    secretion system chaperone protein SscA
COG1996  62179991    secretion system effector protein SseC
COG1997  62179992    secretion system effector protein SseD
COG1998  62179993    secretion system effector SseE
COG1999  62179994    secretion system chaperone protein SscB
COG2000  62179995    secretion system effector protein SseF
COG2001  62179996    secretion system effector protein SseG
COG2002  62179997    secretion system apparatus protein SsaG
COG2003  62179998    secretion system apparatus protein SsaH
COG2004  62179999    secretion system apparatus protein SsaI
COG2005  62180000    secretion system apparatus protein SsaJ
COG2006  62180001    hypothetical protein SC1431
COG2007  62180002    secretion system apparatus protein SsaK
COG2008  62180003    secretion system apparatus protein SsaL
COG2009  62180004    secretion system apparatus protein SsaM
COG2010  62180005    secretion system apparatus protein SsaV
COG2011  62180006    type III secretion system ATPase
COG2012  62180007    secretion system apparatus protein SsaO
COG2013  62180008    secretion system apparatus protein SsaP
COG2014  62180009    type III secretion system protein
COG2015  62180010    type III secretion system protein
COG2016  62180011    secretion system apparatus protein SsaS
COG2017  62180012    secretion system apparatus protein SsaT
COG2018  62180013    secretion system apparatus protein SsaU
COG2810  62181373    cell invasion protein
COG2811  62181374    cell invasion protein
COG2812  62181375    cell invasion protein
COG2813  62181376    cell invasion protein
COG2814  62181377    regulatory protein
COG2815  62181378    invasion protein regulator
COG2816  62181379    cell invasion protein
COG2817  62181381    virulence associated chaperone
COG2818  62181383    acyl carrier protein
COG2819  62181384    cell invasion protein
COG2820  62181385    cell invasion protein
COG2821  62181386    cell invasion protein
COG2822  62181387    cell invasion protein
COG2823  62181388    surface presentation of antigens; secretory proteins
COG2824  62181389    surface presentation of antigens protein SpaS
COG2825  62181390    surface presentation of antigens; secretory proteins
COG2826  62181391    surface presentation of antigens; secretory proteins
COG2827  62181392    surface presentation of antigens protein SpaP
COG2828  62181393    surface presentation of antigens protein SpaO
COG2829  62181394    surface presentation of antigens; secretory proteins
COG2830  62181395    surface presentation of antigens; secretory proteins
COG2831  62181396    ATP synthase SpaL
COG2832  62181397    surface presentation of antigens; secretory proteins
COG2833  62181398    invasion protein
COG2834  62181399    invasion protein
COG2835  62181400    invasion protein; outer membrane
COG2836  62181401    invasion protein
COG2837  62181402    invasion protein

*Blue highlighted COGs are potential vaccine targets


Table 3.9: Misc. Virulence-Associated COGs from Core With Potential Vaccine Targets

COG#     Protein ID  Description
COG1647  62179362    biotin synthetase
COG1724  62179465    virK
COG1844  62179687    virulence factor
COG1888  62179744    secreted effector protein
COG1903  62179822    macrophage survival gene; reduced mouse virulence
COG1923  62179856    hemolysin
COG1965  62179928    integration host factor subunit alpha
COG1985  62179978    tetrathionate reductase complex: response regulator
COG2025  62180020    superoxide dismutase
COG2031  62180029    superoxide dismutase
COG2330  62180557    capsular/exo-polysaccharide synthesis transcriptional regulator
COG2390  62180673    colanic acid exporter
COG2391  62180677    glycosyl transferase family protein
COG2392  62180678    glycosyl transferase in colanic acid biosynthesis
COG2393  62180680    GDP-D-mannose dehydratase
COG2394  62180681    colanic acid biosynthesis acetyltransferase WcaF
COG2395  62180682    glycosyl transferase family protein
COG2396  62180684    glycosyl transferase family protein
COG2397  62180685    colanic acid biosynthesis acetyltransferase WcaB
COG2623  62181027    transport protein in ethanolamine utilization
COG2627  62181034    ethanolamine utilization protein
COG2628  62181035    ethanolamine utilization protein
COG2738  62181265    HlyD family secretion protein
COG2742  62181282    virulence protein VirK
COG3544  62182516    superoxide dismutase
COG3731  62182823    biofilm stress and motility protein A

*Blue highlighted COGs are potential vaccine targets


3.7 References

1. Foley SL, Johnson TJ, Ricke SC, Nayak R, Danzeisen J (2013) Salmonella Pathogenicity and Host Adaptation in Chicken-Associated Serovars. Microbiology and Molecular Biology Reviews 77: 582-607.
2. Behnsen J, Perez-Lopez A, Nuccio S-P, Raffatellu M (2015) Exploiting host immunity: the Salmonella paradigm. Trends in Immunology 36: 112-120.
3. Desai PT, Porwollik S, Long F, Cheng P, Wollam A, et al. (2013) Evolutionary Genomics of Salmonella enterica Subspecies. mBio 4.
4. CDC (2010) Salmonella Serotype Enteritidis. National Center for Emerging and Zoonotic Infectious Diseases.
5. CDC (2011) Vital Signs: Incidence and Trends of Infection with Pathogens Transmitted Commonly Through Food - Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 1996-2010.
6. Majowicz SE, Musto J, Scallan E, Angulo FJ, Kirk M, et al. (2010) The Global Burden of Nontyphoidal Salmonella Gastroenteritis. Clinical Infectious Diseases 50: 882-889.
7. Bhan MK, Bahl R, Bhatnagar S (2005) Typhoid and Paratyphoid Fever. The Lancet 366: 749-762.
8. Acheson D, Hohmann EL (2001) Nontyphoidal Salmonellosis. Clinical Infectious Diseases 32: 263-269.
9. Gordon MA (2011) Invasive Non-typhoidal Salmonella Disease – epidemiology, pathogenesis and diagnosis. Current Opinion in Infectious Diseases 24: 484-489.
10. Uzzau S, Brown DJ, Wallis T, Rubino S, Leori G, et al. (2000) Host Adapted Serotypes of Salmonella enterica. Epidemiol Infect 125: 229-255.
11. Gast RK, Guraya R, Guard-Bouldin J, Holt PS, Moore RW (2007) Colonization of Specific Regions of the Reproductive Tract and Deposition at Different Locations Inside Eggs Laid by Hens Infected with Salmonella Enteritidis or Salmonella Heidelberg. Avian Diseases 51: 40-44.
12. Foley SL, Nayak R, Hanning IB, Johnson TJ, Han J, et al. (2011) Population Dynamics of Salmonella enterica Serotypes in Commercial Egg and Poultry Production. Applied and Environmental Microbiology 77: 4273-4279.
13. Vieira A, et al. (2009) A Resource to Link Human and Non-Human Sources of Salmonella. WHO Global Foodborne Infections Network Country Databank.
14. Chai SJ, White PL, Lathrop SL, Solghan SM, Medus C, et al. (2012) Salmonella enterica Serotype Enteritidis: Increasing Incidence of Domestically Acquired Infections. Clinical Infectious Diseases 54: S488-S497.
15. Ebel E, Schlosser W (2000) Estimating the annual fraction of eggs contaminated with Salmonella enteritidis in the United States. International Journal of Food Microbiology 61: 51-62.
16. Van Immerseel F, Meulemans L, De Buck J, Pasmans F, Velge P, et al. (2004) Bacteria–Host Interactions of Salmonella Paratyphi B dT+ in Poultry. Epidemiology & Infection 132: 239-243.
17. Suez J, Porwollik S, Dagan A, Marzel A, Schorr YI, et al. (2013) Virulence Gene Profiling and Pathogenicity Characterization of Non-Typhoidal Salmonella Accounted for Invasive Disease in Humans. PLoS ONE 8: e58449.
18. Thomson NR, Clayton DJ, Windhorst D, Vernikos G, Davidson S, et al. (2008) Comparative genome analysis of Salmonella Enteritidis PT4 and Salmonella Gallinarum 287/91 provides insights into evolutionary and host adaptation pathways. Genome Research 18: 1624-1637.
19. Suar M, Jantsch J, Hapfelmeier S, Kremer M, Stallmach T, et al. (2006) Virulence of Broad- and Narrow-Host-Range Salmonella enterica Serovars in the Streptomycin-Pretreated Mouse Model. Infection and Immunity 74: 632-644.
20. Altschul SF, Gish W, Miller W, Myers E, Lipman D (1990) Basic Local Alignment Search Tool. J Mol Biol 215: 403-410.
21. Chen F, Mackey AJ, Stoeckert CJ, Roos DS (2006) OrthoMCL-DB: Querying a Comprehensive Multi-Species Collection of Ortholog Groups. Nucleic Acids Research 34: D363-D368.
22. Holden DW (2002) Trafficking of the Salmonella Vacuole in Macrophages. Traffic 3: 161-169.
23. Ledeboer NA, Frye JG, McClelland M, Jones BD (2006) Salmonella enterica Serovar Typhimurium Requires the Lpf, Pef, and Tafi Fimbriae for Biofilm Formation on HEp-2 Tissue Culture Cells and Chicken Intestinal Epithelium. Infection and Immunity 74: 3156-3169.
24. Müller S, Valdebenito M, Hantke K (2009) Salmochelin, The Long-Overlooked Catecholate Siderophore of Salmonella. BioMetals 22: 691-695.
25. Lopez CA, Winter SE, Rivera-Chávez F, Xavier MN, Poon V, et al. (2012) Phage-Mediated Acquisition of a Type III Secreted Effector Protein Boosts Growth of Salmonella by Nitrate Respiration. mBio 3: e00143-00112.
26. Rychlik I, Karasova D, Sebkova A, Volf J, Sisak F, et al. (2009) Virulence potential of five major pathogenicity islands (SPI-1 to SPI-5) of Salmonella enterica serovar Enteritidis for chickens. BMC Microbiology 9: 268.
27. Lostroh CP, Lee CA (2001) The Salmonella Pathogenicity Island-1 Type III Secretion System. Microbes and Infection 3: 1281-1291.
28. Waterman SR, Holden DW (2003) Functions and Effectors of the Salmonella Pathogenicity Island 2 Type III Secretion System. Cellular Microbiology 5: 501-511.
29. Xu X, Hensel M (2010) Systematic Analysis of the SsrAB Virulon of Salmonella enterica. Infection and Immunity 78: 49-58.
30. Chassaing B, Kumar M, Baker M, Singh V, Vijay-Kumar M (2014) Mammalian gut immunity. 246-258 p.
31. Sperandio V, Torres AG, Kaper JB (2002) Quorum sensing Escherichia coli regulators B and C (QseBC): A Novel Two-Component Regulatory System Involved in the Regulation of Flagella and Motility by Quorum Sensing in E. coli. Molecular Microbiology 43: 809-821.
32. Moreira CG, Weinshenker D, Sperandio V (2010) QseC Mediates Salmonella enterica Serovar Typhimurium Virulence In Vitro and In Vivo. Infection and Immunity 78: 914-926.
33. Santos RL (2014) Pathobiology of Salmonella, Intestinal Microbiota, and the Host Innate Immune Response. Frontiers in Immunology 5.
34. Raffatellu M, Bäumler AJ (2010) Salmonella's iron armor for battling the host and its microbiota. Gut Microbes 1: 70-72.

Chapter 4

Growth in Egg Yolk Enhances Salmonella Enteritidis Colonization and Virulence in a Mouse Model of Salmonella Colitis


4.1 Abstract

Salmonella Enteritidis (SE) is one of the major bacterial agents that cause foodborne illness in the US and many other countries. Infection is caused by eating improperly prepared or stored food contaminated with SE, with eggs and egg products being the most common sources. Despite this strong association between eggs and human SE infection, the effect of the egg microenvironment on SE virulence is currently unknown. In this study, we compared the virulence of SE grown in egg yolk with that of SE grown in Luria-Bertani broth (LB) and SE passed through mice, using a streptomycin-treated mouse model of human SE infection. Mice infected with SE grown in yolk displayed more severe disease, as determined by mortalities and clinical signs, than mice infected with SE from the other two sources. Furthermore, SE grown in yolk displayed higher rates of colonization in both intestinal and extra-intestinal tissues of infected mice than the other two groups. Overall, these results suggest that reservoir-pathogen dynamics may be critical in determining the ability of SE to establish colonization and in priming its virulence potential. In conclusion, this study indicates that the source of SE may contribute to overall disease pathogenesis in a second host.


4.2 Introduction

Salmonella enterica serovar Enteritidis (SE) is a Gram-negative, peritrichous, facultatively anaerobic enteric bacterial pathogen. It is the most common cause of foodborne salmonellosis in the world and, as such, is a major driving force of foodborne morbidity and mortality [1-3]. Despite many preventative methods, SE remains an important pathogen for many species, including humans, broiler chickens, and smaller mammals (such as hedgehogs and mice) [4-6]. Over the last two decades, the number of SE cases and outbreaks has risen steadily, with an increase of up to 44% in the United States alone [7]. Salmonella Enteritidis contains many genes important for pathogenesis that are also present in other Salmonella enterica serovars, such as Typhimurium [3,8].

In most cases, SE causes a self-limiting gastroenteritis that presents with signs such as watery diarrhea, abdominal cramps, and dehydration [1,3,9]. However, immunocompromised individuals, as well as the very young and the elderly, are at high risk for systemic spread of the bacteria to internal organs and for complications from dehydration. If left untreated, these complications can be fatal. The most common source of SE infection is eggs and egg products contaminated with SE, followed by poultry, pork, beef, and dairy. It is estimated that approximately 1 in 20,000 eggs is contaminated with SE. With ~65 billion eggs produced each year in the U.S. alone, this equals about 3.25 million SE-contaminated shell eggs in the market and eggs diverted for pasteurization. In the 1970s, surveillance systems and stringent cleaning practices in commercial layer operations helped limit the number of illnesses due to eggshell contamination; however, illnesses due to internally infected eggs are currently on the rise [9].
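The contaminated-egg estimate above follows directly from the two figures quoted in the text. The short sketch below reproduces the arithmetic. The variable names are illustrative, and integer arithmetic is used so that no floating-point rounding enters the result.

```python
# Figures quoted in the text.
EGGS_PER_YEAR = 65_000_000_000   # ~65 billion eggs produced per year (U.S.)
EGGS_PER_CONTAMINATED = 20_000   # roughly 1 contaminated egg per 20,000

# Expected number of SE-contaminated eggs per year.
contaminated_eggs = EGGS_PER_YEAR // EGGS_PER_CONTAMINATED

print(f"{contaminated_eggs:,} contaminated eggs per year")
```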

Laying hens themselves can be infected via the fecal-oral route [10,11]. It is well established that hens can pass SE to one another through fecal shedding after initial inoculation. Despite the high prevalence of SE in the environment and in animals that are in constant contact with humans, shell eggs remain the single most important source of SE infection in people. If a hen has a disseminated infection in which SE colonizes the reproductive tract, the site of colonization along the tract determines where SE is deposited in the developing egg before it is laid [11,12]. If this occurs in a breeder hen, SE can be passed to the chick and reside within the newborn after hatching. Eggs can also be infected post-lay if they come into contact with SE in the vagina or feces [13,14]. Unlike Salmonella serovar Typhi, there is no significant evidence that human-to-human spread of SE occurs. The question that remains is why shell eggs and their products remain the major source of SE transmission to humans.

In most studies of pathogens, the main focus has been on the host or the reservoir, with little focus on the transmission event. It has been shown that the pulsed-field gel electrophoresis (PFGE) profile JEGX01.0004 is the most common PFGE type of SE associated with human infection [15]. Not surprisingly, many of the SE isolates of this PFGE type have come from clinical samples, and genome sequencing and comparative genomics reveal very little diversity among the isolates tested. Much of the diversity that was found was in the form of phage and plasmids [15,16]. Not coincidentally, one study showed that this PFGE pattern was found in 99% of all poultry samples tested, thus establishing another link between SE infection of hens and human disease [17]. A virulence gene profile analysis revealed that this PFGE type contains multiple two-component regulatory systems and quorum-sensing systems. In light of this, it can be speculated that SE uses its complement of sensors to identify differences in the microenvironment and respond accordingly. A phenotypic microarray experiment has suggested that the nutrients within the egg yolk may trigger the expression of certain virulence genes through linkage with certain metabolic pathways [18]. Furthermore, a study performed by Gantois and colleagues indicated that growth inside the egg causes SE to express Pef fimbriae, which had previously been shown in S. Typhimurium to mediate, in part, the colonization of the mouse gut [19].

Considering all of this, the study presented here examined the impact of egg yolk on SE virulence in a second host. The overall pathogenesis and colonization potential of SE from three different sources (grown in egg yolk, grown in Luria-Bertani (LB) broth, or recovered from the feces of mice infected with SE) were assessed in vivo using a mouse colitis model of human foodborne Salmonella infection. This will provide insight into the genetic and phenotypic states of SE within the reservoir or source during the process of transmission to a second host, which may define the differences in the overall disease outcome exhibited by the host. This study has far-reaching implications for host-pathogen dynamics in relation to the transmission event and may help us better understand the population genetics, epidemiology, and virulence strategies employed by SE.

4.3 Materials and Methods

4.3.1 Bacteria, Media Used and Inocula. Two recently characterized shell egg isolates of Salmonella Enteritidis, SEE1 and SEE2, were the only bacteria used in this study. The media used to grow the bacteria were Luria Bertani (LB) broth and LB agar (LA) (both from BD Biosciences, Sparks, MD). To collect SEE1 from feces, feces from each mouse were collected separately, then pooled by experimental group and homogenized before plating. An aliquot was plated on xylose lysine deoxycholate (XLD) agar without antibiotics (Remel, Lenexa, KS) to count the colony forming units (cfu) of each inoculum. Egg yolk was separated from egg albumen and placed in a sterile 45 mL Falcon conical tube. Egg yolk was inoculated with a single colony of SEE1 or SEE2 and incubated for 24 hr at 37°C, and serial dilutions were plated on XLD agar without antibiotics. Phosphate-buffered saline (PBS) was used as the diluent to make serial dilutions, to prepare bacterial preparations for mouse inoculations, and for homogenization.

4.3.2 Animal Experiments. All animal experiments were approved by the Institutional Animal Care and Use Committee (IACUC) of the Pennsylvania State University (University Park, PA). All of the procedures and experiments have been described previously, with a few exceptions [20,21]. Briefly, C57BL/6 mice were treated with 20 mg/50 μl/mouse of streptomycin by oral gavage. Twenty-four hours after streptomycin treatment, each mouse was inoculated by oral gavage with approximately 5 × 10^7 cells of SEE1 or SEE2 in 50 μl of PBS (SE from mouse feces or LB) or in 50 μl of egg yolk. Mice treated with PBS, egg yolk, or mouse feces in PBS served as negative controls. Mice were observed for 48 hours and then sacrificed by CO2 asphyxiation for necropsy and sample collection. The small intestines, large intestines (with the cecum processed separately), liver, and spleen were then resected and sectioned for enumeration by homogenization in Stomacher 80 bags with a 500 μm pore size filter (Seward, United Kingdom). Tissue sections were placed in 10% neutral buffered formalin for histopathology. Tissues were pooled and weighed by group, homogenized with 1 mL of sterile PBS, serially diluted and plated onto XLD agar in triplicate. The weights of the pooled tissues were used to normalize counts to CFU/g of organ. Feces were taken just prior to euthanasia by stripping from the large intestine and homogenized in PBS with a hand homogenizer. These samples were then serially diluted and plated in triplicate. Each group had a minimum of 3 mice and each experiment had a minimum of two replicates. Bacteria were plated onto XLD agar with no antibiotics and black colonies were selected for counting. All statistics were performed using GraphPad Prism 5 software. SE counts in each tissue type were analyzed by one-way ANOVA with Tukey’s pairwise comparison post-test, and fecal counts of SE were analyzed by the Kruskal-Wallis test coupled to Dunn’s pairwise comparison post-test. Statistical significance was set at p<0.05.
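The normalization of triplicate plate counts to CFU/g described above can be illustrated with a short calculation. This is only a minimal sketch under stated assumptions: the function name, the plated volume, the dilution factor, the tissue mass, and the colony counts are all hypothetical and do not come from this study, and the volume contributed by the tissue itself is neglected.

```python
from statistics import mean


def cfu_per_gram(colony_counts, dilution_factor, plated_volume_ml,
                 homogenate_volume_ml, tissue_mass_g):
    """Estimate CFU per gram of tissue from replicate plate counts.

    All parameter names are hypothetical; the source does not state the
    plated volume or the dilutions that were counted.
    """
    if plated_volume_ml <= 0 or tissue_mass_g <= 0:
        raise ValueError("plated volume and tissue mass must be positive")
    # CFU per mL of the undiluted homogenate, from the mean replicate count.
    cfu_per_ml = mean(colony_counts) * dilution_factor / plated_volume_ml
    # Total CFU in the homogenate divided by the mass of the pooled tissue.
    return cfu_per_ml * homogenate_volume_ml / tissue_mass_g


# Hypothetical example: triplicate counts of 42, 38 and 40 colonies on the
# 10^-4 dilution, 0.1 mL plated, 1 mL of PBS homogenate, 0.5 g of tissue.
example_cfu_per_g = cfu_per_gram([42, 38, 40], 1e4, 0.1, 1.0, 0.5)
print(f"{example_cfu_per_g:.2e} CFU/g")
```

Dividing by the tissue mass is what makes counts comparable between organs and groups whose pooled weights differ.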

4.3.3 Histopathology. Tissues were formalin-fixed and paraffin-embedded (FFPE), and slides were stained with hematoxylin and eosin (H&E). Prepared slides were then read by a laboratory animal veterinarian with training in pathology (Dr. Mary Kennett) for inflammatory histopathological scoring of each tissue except the spleen (as it was too small to section). Scores ranged from 0 to 4 in each of three main areas of observation: inflammation, mucosal erosion, and distension. The scores for each mouse were summed and normalized to the average score of its respective control group.
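The scoring scheme can be sketched in a few lines. The text does not specify whether normalization was a difference or a ratio, so this sketch assumes that the control-group mean was subtracted from each mouse's total score. The function names and all scores are hypothetical illustrations, not data from this study.

```python
from statistics import mean


def total_score(inflammation, erosion, distension):
    """Sum the three 0-4 category scores for one mouse."""
    for score in (inflammation, erosion, distension):
        if not 0 <= score <= 4:
            raise ValueError("each category score must lie between 0 and 4")
    return inflammation + erosion + distension


def normalized_scores(infected, controls):
    """Subtract the control-group mean total from each infected mouse.

    ASSUMPTION: normalization by subtraction; the source does not say
    whether a difference or a ratio was used.
    """
    control_mean = mean([total_score(*mouse) for mouse in controls])
    return [total_score(*mouse) - control_mean for mouse in infected]


# Hypothetical scores as (inflammation, mucosal erosion, distension).
infected_group = [(3, 2, 2), (4, 3, 2), (2, 2, 1)]
control_group = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
example_scores = normalized_scores(infected_group, control_group)
print(example_scores)
```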

4.4 Results

4.4.1 SE Grown in Yolk Displays Increased Colonization of the GIT

First, it was established that the mouse model of human SE infection previously used by others could be replicated. As shown in Figure 4.1, SEE1 grown under all three conditions was able to produce an infection in mice, as measured by CFU/g of SE. These results also demonstrated that SEE1, which was isolated from shell eggs and belongs to PFGE type JEGX01.0004, was able to infect mice similarly to the SE strain P125109, which was used to develop the mouse infection model [22]. It was also evident that intraspecies passage of SE can occur, at least in the mouse. Previous studies have also demonstrated intraspecies passage of SE using mouse models, but these studies used the intraperitoneal (ip) route of SE administration as opposed to oral gavage, which corresponds to the natural route of SE infection in human foodborne illness [23]. In fact, infection with SE isolated from mouse feces was as successful as infection with LB-grown SE, with no significant difference in colonization of the GIT tissues between these two groups (Figure 4.1). These results indicated that, compared to SE isolated from mouse feces (a test of intraspecies transmission) and SE grown in LB (control), SE grown in yolk displayed a significant increase in colonization of the small and large intestines. The bacterial counts in the large intestines, including the cecum, were approximately 1 log higher for SE grown in egg yolk than for SE grown in LB or isolated from mouse feces. The greatest difference in colonization, approximately 2 logs, was observed in the small intestine.
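To make the "log" differences reported here concrete, the following minimal sketch converts a pair of CFU/g values into a log10 difference and back into a fold change. The CFU/g values are hypothetical round numbers chosen for illustration, not measurements from this study.

```python
import math


def log10_difference(cfu_a, cfu_b):
    """Return log10(cfu_a) - log10(cfu_b) for two positive CFU/g values."""
    if cfu_a <= 0 or cfu_b <= 0:
        raise ValueError("CFU/g values must be positive to take a logarithm")
    return math.log10(cfu_a) - math.log10(cfu_b)


def fold_change(log_difference):
    """Convert a log10 difference back into a multiplicative fold change."""
    return 10 ** log_difference


# Hypothetical values: 5e7 CFU/g in one group versus 5e5 CFU/g in another.
example_log_diff = log10_difference(5e7, 5e5)
print(f"{example_log_diff:.1f} log10, {fold_change(example_log_diff):.0f}-fold")
```

A 1-log difference therefore corresponds to a 10-fold difference in bacterial load, and a 2-log difference to a 100-fold difference.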

4.4.2 SE From Yolk Increases Fecal Shedding

Unlike the tissue samples, which were pooled over multiple experiments, fecal counts of SE were collected and enumerated for each mouse, allowing for correlation between the pooled results and the individual mouse results. Fecal shedding of SE is often used as a measure of GIT colonization. As with GIT colonization, mice infected with SEE1 grown in egg yolk displayed an increased fecal shedding rate of SE as measured by CFU/g (Figure 4.1). However, in contrast to GIT colonization, there was no statistically significant difference in fecal shedding between the groups infected with SEE1 isolated from mouse feces and SEE1 grown in egg yolk, even though the mice infected with SEE1 grown in egg yolk shed approximately 1 log greater CFU/g of SEE1 in feces. Also, shedding of SEE1 isolated from mouse feces was not significantly different from that of SEE1 grown in LB, which was approximately 2 logs lower than that of SEE1 grown in egg yolk.

Taken together, these data show a clear pattern of increased colonization and fecal shedding when SEE1 is grown in egg yolk. Thus, growth in yolk appears to increase fecal shedding of SE compared to the other two groups of SEE1.

4.4.3 SEE1 From Yolk and Mouse Show Variable Dissemination

In order to understand the complete colonization pattern of SE and the effect of source on systemic spread of SE, bacterial counts were also taken from the liver and the spleen. Mice infected with SEE1 grown in yolk and SEE1 isolated from mouse feces showed higher bacterial counts in the liver than mice infected with SEE1 grown in LB (Figure 4.2). However, in the spleen, the mice infected with SEE1 grown in yolk had higher bacterial counts than both the fecal and LB-grown SEE1-infected groups. Why the two major organs of dissemination differed is not currently known. A previous experiment by Mastroeni and colleagues revealed that Salmonella enterica serovar Typhimurium (STym) recovered from the liver after passage through mice and delivered via ip injection showed increased colonization of the liver and spleen [23]. Because that study used a different organism and a different route of infection (ip vs. oral gavage), it is hard to draw conclusions from these findings and the previous work; however, it cannot be ruled out that SE from yolk enhances total dissemination and that mouse-to-mouse transmission may have its limits.

4.4.4 Growth in Egg Yolk Enhances SE Virulence in Vivo

The first measure of the difference in virulence was the clinical presentation of the mice 48 hours after inoculation. Mice infected with yolk-grown SEE1 or SEE2 displayed signs of significant distress and were moribund. As Figure 4.3 shows, the mice inoculated with yolk-grown SEE1 or SEE2 showed greater gross lesions than the mice in the control groups or the mice infected with SEE1 or SEE2 grown in LB or isolated from mouse feces. The gross pathology revealed that SEE1 and SEE2 grown in yolk resulted in mice having greater accumulation of fluid and perfusion in the GIT. Mice infected with egg yolk-grown SE displayed signs of severe necrosis in many of the tissues including the liver and spleen, which was present in the groups infected with SE grown in LB or SE isolated from the mouse feces. Signs of toxemia and systemic disease, such as necrosis of the gallbladder and yellow discharge in the GI cavity commonly associated with sepsis, were found only in mice infected with SE grown in egg yolk. All mice displayed evidence of splenomegaly; however, spleens from mice infected with SE from egg yolk showed a higher degree of necrosis than those of the other groups (data not shown).

The final measure of virulence was the total histopathology of the tissues of the intestines and the liver. Mice infected with SEE1 grown in yolk showed greater total histopathology in the cecum (the organ most often used for histopathological scoring in SE mouse model infections) compared to mice infected with SEE1 from LB or mouse feces (Figure 4.4). Taken together, these data suggest that growth in yolk enhances the virulence of SE and its ability to cause severe disease in a mammalian host.

4.5 Conclusions and Discussion

For bacteria like SE, which can infect multiple hosts, there can be a multitude of sources by which hosts can become infected. Perhaps this is one of the reasons for the high incidence of human foodborne outbreaks due to SE [7]. Despite the multiple reservoirs and sources of SE, it is not known if SE virulence undergoes evolutionary changes in its various microenvironments (eggs, human intestines, mouse intestines, etc.), which may influence the disease outcome in a second host after SE transmission from one host to the other. In this study we examined whether SE changed its virulence in three different microenvironments: LB (laboratory media), egg yolk (shell eggs) and mouse feces (mouse or human intestines). The results of this study provide the first evidence for microevolution of SE in response to its environment and, in particular, for enhanced virulence potential of SE in egg yolk, which is the primary niche of SE in shell eggs following vertical transmission of SE. It is expected that this study will add new knowledge to the current understanding of SE pathogenesis and will provide the basis for future studies of functional transcriptomics to identify the genes/gene products, and their regulation, involved in SE transmission between hosts, SE persistence in different food vehicles, and disease pathology in different hosts.

Upregulation of genes involved in infection prior to the time of transmission would most likely facilitate the pathogenic process of bacteria. The ability of a bacterium to establish successful colonization and cause disease in a host that acquires it from a particular source may be due to the differential expression and regulation of a specific set of genes during growth within that source. Accordingly, any change in SE at the genetic level inside the egg prior to infecting the human host could help SE successfully colonize the human gut, with subsequent tissue damage and diarrhea. However, if SE is acquired through another source (e.g. peanuts or cheese), SE may either not be “programmed” to cause disease or be “programmed” with a different set of genes, resulting in a different disease outcome.

To date, the streptomycin-treated mouse colitis model is the best characterized experimental model system for human infection of SE [20,24]. This study shows convincing evidence that source plays an important role in the disease phenotype caused by SE, even though further studies are needed to determine the molecular mechanisms underlying this observation.

From these data, it can be concluded that SE has a fitness advantage when grown in yolk as opposed to mouse feces or LB, with respect to colonization and virulence in a second host such as the mouse in this study. We speculate that this increase in virulence, colonization and fecal shedding of SEE1 may be the result of expression, prior to infection, of genes that mediate enhanced colonization kinetics and/or virulence, thus allowing for enhanced growth and tissue damage. It has been shown previously that there is a fitness cost to STym in expressing virulence factors such as the T3SS-1 (of SPI-1) in nutrient-limiting conditions. This may be because many virulence factors and virulence-associated factors require a lot of energy for their synthesis and function, as is the case with the amount of ATP required to operate the type 3 (T3SS) and type 1 (T1SS) secretion systems [25]. Other experiments have shown through phenotypic microarray that STym is very metabolically active in the egg yolk [26]. Previous studies have also shown that extensive growth occurs within the egg yolk, which is a very nutrient-rich source [12].

In nutrient-rich conditions, the expression of virulence factors is generally decreased. However, our data seem to conflict with this paradigm, as SE grew extremely well in the egg yolk and appeared to have increased fitness in the mouse host in terms of both colonization and disease outcomes. This most likely is an effect of the yolk, as the same phenotypes in mice could not be replicated by SE grown in LB (a nutrient-rich medium). Pathogens cause damage not only to establish colonization through niche space creation or through invasion, but also to acquire nutrients under the harsh conditions presented by most hosts. So the question that remains to be answered is: why does SE increase its virulence when it can grow easily to high numbers in the egg yolk?

The answer may lie in two important aspects that need to be addressed in the future: (i) the regulation of virulence factors that are involved in SE’s pathogenic strategy in another host to ensure its survival and subsequent spread to other hosts, and (ii) the ability of the nutrients in yolk to provide the energy required for the production and function of virulence factors that prime SE for nutrient-scarce environments such as the mammalian host. One other important point in the context of evolution is how these genes are under selection to allow SE to persist in such a broad host range. This has been addressed to a certain extent by Sturm and colleagues, who showed that T3SS-1 can be induced by low oxygen environments [25]. However, they acknowledged that there are many other cues which may also induce SPI-1, and one such cue may be the microenvironment of the egg.

SPI-1 and T3SS-1 have been linked to the ability of SE to colonize the GIT of many hosts, including poultry and humans, and defects within these genes significantly inhibit SE colonization [27,28]. Coupled with the finding that SPI-1 responds to low oxygen tension and the anaerobic environment of the GIT, this presents a biologically relevant mechanism. Although oxygen tension may be a signal for SPI-1 activation, this does not establish whether oxygen sensing is epistatic to all other incoming signals. Considering the role of SPI-1 in SE invasion of GIT epithelial cells, and the fitness cost to SE of producing SPI-1 gene products, SE should express SPI genes only when they are required (such as in the GIT) [25]. The dense liquid of the egg yolk is also extremely limited in oxygen, thus presenting an intriguing scenario. The low oxygen tension in the egg yolk may couple to other signals, either in the egg yolk or from the bacteria growing in the egg yolk, and the resulting expression of SPI-1 will then depend on the epistatic relationship of those incoming signals. For example, it has also been shown that SPI-1 is regulated through autoinducer (AI) types 1 and 2 via quorum sensing [29,30]. Since SE grew to greater than 10^9 CFU/mL in egg yolk, there would be an abundance of AIs in the medium, creating a scenario where SPI-1 can be synergistically upregulated through the combined incoming signals of AI-1, AI-2 and low oxygen tension. Thus, it can be speculated that SPI-1 and all other co-regulated genes (such as those on SPI-4) may also be induced within the egg yolk, as shown in Figure 4.5 [31].

One protein that has a dual function as it relates to SE virulence is AvrA, encoded by the most upstream gene of SPI-1. AvrA is an anti-NF-κB protein that inhibits both epithelial cell cytokine production and the anti-apoptotic pathways [32]. If this gene is overexpressed prior to infection, it could cause an added delay in the immune response, thus allowing for enhanced colonization. Because many SE isolates carry genes that encode different types of exotoxins and effectors which can damage host cells, these would only enhance this phenotype. The energy provided by the egg yolk will also allow for the production of any other basally expressed virulence gene, thus allowing SE access to all of the genes and gene products that may aid in facilitating the disease process. Taken together, it is likely that SE would receive the low oxygen tension, AI signaling, and possibly other signals from the microenvironment, and would have the energy stores from the egg yolk to continually produce these virulence-associated gene products, thus allowing for enhanced colonization and virulence in vivo. After colonization has been established and damage has begun, the transcriptional program of SE would most likely adapt to the change in microenvironment, which should only perpetuate these phenotypes through signaling and expression of other factors, for example in response to norepinephrine from the mesenteric nerve in the GIT [33].

Given that over 600 genes were identified as being virulence-associated within the genomes of SEE1 and SEE2 (Chapter 2), and given the similarity of these genomes to those of other clinical isolates of SE belonging to the PFGE type JEGX01.0004, it is possible that there are other genes under regulation here. Many signals, such as quorum sensing, nutrients, and biochemical pathways, could all play a critical role within the egg yolk, especially if the bacterial numbers reach a critical threshold before deprivation of the nutrients. Other metabolites and/or nutrients within the egg yolk may also be involved in the regulation of virulence genes. However, these are only hypothetical scenarios at this point, as most aspects of the regulation of SE virulence have yet to be elucidated.

This study shows that the source can influence the outcome of an infection, as indicated by SE growth in egg yolk enhancing its ability to colonize and cause more severe disease in a mouse model of human infection. Future research on the transmission dynamics of SE from one host to another may help us understand the disease kinetics of many different pathogens based on their transcriptional programs in one host (or source) prior to infection of a second host, and may also help to answer questions surrounding the epidemiology of certain pathogens. Finally, this study will strengthen the current understanding of the pathogenic mechanisms employed by SE and SE-like enteric pathogens to cause infection and/or disease in their corresponding host/s.


4.6 Author Contributions

Matthew R. Moreau1,2, Megan L. Bailey1,2, Sudharsan R. Gongati1,2, Dona Saumya S. Wijetunge1, Eranda Mangala K. Kurundu Hewage3, Mary J. Kennett1,4, Yury V. Ivanov1, Bhushan M. Jayarao1,5 and Subhashinie Kariyawasam1,5.

1Department of Veterinary and Biomedical Sciences, Pennsylvania State University

2Pathobiology Graduate Program, Pennsylvania State University

3Department of Food Science, Pennsylvania State University

4Centralized Biological Laboratory, Pennsylvania State University

5Animal Diagnostic Laboratory, Pennsylvania State University

Conceived and Designed Experiments: MRM, BMJ, SK

Performed Experiments: MRM, MLB, SRG, DSSW, EMKKH, MJK, YVI

Analyzed Data: MRM, MLB, SRG, MJK, SK, BMJ

Wrote the Paper: MRM, SK, BMJ


Figure 4.1: Effect of Source on Colonization and Fecal Shedding of SE In Vivo. Small intestines, large intestines, cecum (processed separate from the rest of the large intestines) and feces of mice were harvested and bacteria were enumerated on XLD agar. Bacterial counts were statistically analyzed by 1-Way ANOVA and Tukey’s Pairwise comparison post-test (for organs) and Kruskal-Wallis ANOVA and Dunn’s pairwise comparison post-test for feces. Significant differences indicated by * with p<0.05. Y – Egg yolk grown SE, M - Mouse passed SE, and LB - SE grown in Luria Bertani broth.
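The Kruskal-Wallis comparison named in this caption can be illustrated with a small pure-Python sketch that computes the H statistic with a tie correction. The study itself used GraphPad Prism 5, so this is not the authors' implementation, and Prism may compute p values differently for small samples. The group values are hypothetical log10 CFU/g numbers invented for illustration. The p value uses the chi-square approximation with 2 degrees of freedom, which applies only to three groups and is crude for small group sizes.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Compute the tie-corrected Kruskal-Wallis H statistic."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    if n < 2:
        raise ValueError("at least two observations are required")
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    tie_counts = {}
    for value in pooled:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# HYPOTHETICAL log10 CFU/g fecal counts for the three sources (not study data).
yolk = [7.9, 8.4, 8.1]
mouse = [6.8, 7.2, 7.0]
lb = [5.9, 6.3, 6.1]

h_stat = kruskal_wallis_h([yolk, mouse, lb])
# With three groups (2 degrees of freedom) the chi-square survival
# function reduces to exp(-H / 2).
p_value = math.exp(-h_stat / 2)
print(f"H = {h_stat:.2f}, approximate p = {p_value:.4f}")
```

A rank-based test is a natural choice for per-mouse fecal counts, which can span several orders of magnitude and need not be normally distributed. A significant H would then be followed by pairwise comparisons, such as the Dunn's post-test named in the caption.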


Figure 4.2: Dissemination of SE Grown in Various Sources. SE from each source (Y - egg yolk grown SE, M - mouse passed SE, and LB - SE grown in Luria Bertani broth) was also examined for the ability to disseminate into the extra-intestinal organs, the liver and spleen. Bacteria from each tissue type were enumerated on XLD plates and the bacterial counts were subjected to 1-way ANOVA with Tukey’s pairwise comparison post-test. Significant differences are indicated by * with p<0.05.


Figure 4.3: Necropsy Evidence of Enhanced Virulence of Egg Yolk-Grown SEE1 and SEE2. Representative photographs taken from an array of mice at the time of necropsy. The first column represents mice inoculated with SE grown in Luria Bertani Broth, second column represents mice inoculated with SE grown in egg yolk, and third column represents mice inoculated with SE isolated from the feces of mice experimentally infected with SE. Panels A-C: SEE1 and Panels D-F: SEE2.


Figure 4.4: Average Total Histopathological Scores of SEE1 in the Cecum. This graph represents the average total histopathology scores for 3 mice per source group. As the legend shows, the blue column represents SE grown in egg yolk, red for SE grown in Luria Bertani Broth, and green for SE isolated from mouse feces post-infection with SE. Error bars represent standard error of the mean. The averages were calculated as detailed in the Materials and Methods Section. * Denotes significance considering error.


Figure 4.5: A Hypothetical Pathway for Gene Regulation of SE during Growth in Egg Yolk. The regulation of SPI-1, which co-regulates SPI-4, is under the control of many stimuli. Low O2 tension, high amounts of AI-2 and other signals can contribute to the upregulation of SPI-1. This upregulation will in turn upregulate the regulatory protein, HilA. HilA then acts on SPI-4 to upregulate the Sii proteins involved in hemolysin-like protein export, which are also encoded by SPI-4.


4.7 References

1. CDC (2011) Vital Signs: Incidence and Trends of Infection with Pathogens Transmitted Commonly Through Food - Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 1996-2010.
2. Santos RL (2014) Pathobiology of Salmonella, Intestinal Microbiota, and the Host Innate Immune Response. Frontiers in Immunology 5.
3. Foley SL, Johnson TJ, Ricke SC, Nayak R, Danzeisen J (2013) Salmonella Pathogenicity and Host Adaptation in Chicken-Associated Serovars. Microbiology and Molecular Biology Reviews 77: 582-607.
4. Henzler DJ, Opitz HM (1992) The Role of Mice in the Epizootiology of Salmonella Enteritidis Infection on Chicken Layer Farms. Avian Diseases 36: 625-631.
5. Nauerby B, Pedersen K, Dietz HH, Madsen M (2000) Comparison of Danish Isolates of Salmonella enterica Serovar Enteritidis PT9a and PT11 from Hedgehogs (Erinaceus europaeus) and Humans by Plasmid Profiling and Pulsed-Field Gel Electrophoresis. Journal of Clinical Microbiology 38: 3631-3635.
6. Meerburg BG, Kijlstra A (2007) Role of Rodents in Transmission of Salmonella and Campylobacter. Journal of the Science of Food and Agriculture 87: 2774-2781.
7. Chai SJ, White PL, Lathrop SL, Solghan SM, Medus C, et al. (2012) Salmonella enterica Serotype Enteritidis: Increasing Incidence of Domestically Acquired Infections. Clinical Infectious Diseases 54: S488-S497.
8. Silva CA, Blondel CJ, Quezada CP, Porwollik S, Andrews-Polymenis HL, et al. (2012) Infection of Mice by Salmonella enterica Serovar Enteritidis Involves Additional Genes That Are Absent in the Genome of Serovar Typhimurium. Infection and Immunity 80: 839-849.
9. CDC (2010) Salmonella Serotype Enteritidis. National Center for Emerging and Zoonotic Infectious Diseases.
10. Guard-Petter J (2001) The Chicken, the Egg and Salmonella Enteritidis. Environmental Microbiology 3: 421-430.
11. Gast RK, Guraya R, Guard-Bouldin J, Holt PS, Moore RW (2007) Colonization of Specific Regions of the Reproductive Tract and Deposition at Different Locations Inside Eggs Laid by Hens Infected with Salmonella Enteritidis or Salmonella Heidelberg. Avian Diseases 51: 40-44.
12. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Gast R, et al. (2009) Mechanisms of Egg Contamination by Salmonella Enteritidis. FEMS Microbiology Reviews 33: 718-738.
13. De Reu K, Grijspeerdt K, Messens W, Heyndrickx M, Uyttendaele M, et al. (2006) Eggshell factors influencing eggshell penetration and whole egg contamination by different bacteria, including Salmonella enteritidis. International Journal of Food Microbiology 112: 253-260.
14. Messens W, Grijspeerdt K, De Reu K, De Ketelaere B, Mertens K, et al. (2007) Eggshell Penetration of Various Types of Hens' Eggs by Salmonella enterica Serovar Enteritidis. Journal of Food Protection 70: 623-628.
15. Deng X, Desai PT, den Bakker HC, Mikoleit M, Tolar B, et al. (2014) Genomic Epidemiology of Salmonella enterica Serotype Enteritidis based on Population Structure of Prevalent Lineages. Emerging Infectious Diseases 20: 1481-1489.
16. Allard MW, Luo Y, Strain E, Pettengill J, Timme R, et al. (2013) On the Evolutionary History, Population Genetics and Diversity among Isolates of Salmonella Enteritidis PFGE Pattern JEGX01.0004. PLoS ONE 8: e55254.
17. Sandt CH, Fedorka-Cray PJ, Tewari D, Ostroff S, Joyce K, et al. (2013) A Comparison of Non-Typhoidal Salmonella from Humans and Food Animals Using Pulsed-Field Gel Electrophoresis and Antimicrobial Susceptibility Patterns. PLoS ONE 8: e77836.
18. Morales CA, Porwollik S, Frye JG, Kinde H, McClelland M, et al. (2005) Correlation of Phenotype with the Genotype of Egg-Contaminating Salmonella enterica Serovar Enteritidis. Applied and Environmental Microbiology 71: 4388-4399.
19. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Van Immerseel F (2008) Salmonella enterica Serovar Enteritidis Genes Induced during Oviduct Colonization and Egg Contamination in Laying Hens. Applied and Environmental Microbiology 74: 6616-6622.
20. Barthel M, Hapfelmeier S, Quintanilla-Martínez L, Kremer M, Rohde M, et al. (2003) Pretreatment of Mice with Streptomycin Provides a Salmonella enterica Serovar Typhimurium Colitis Model That Allows Analysis of Both Pathogen and Host. Infection and Immunity 71: 2839-2858.
21. Suar M, Jantsch J, Hapfelmeier S, Kremer M, Stallmach T, et al. (2006) Virulence of Broad- and Narrow-Host-Range Salmonella enterica Serovars in the Streptomycin-Pretreated Mouse Model. Infection and Immunity 74: 632-644.
22. Vishwakarma V, Periaswamy B, Bhusan Pati N, Slack E, Hardt W-D, et al. (2012) A Novel Phage Element of Salmonella enterica Serovar Enteritidis P125109 Contributes to Accelerated Type III Secretion System 2-Dependent Early Inflammation Kinetics in a Mouse Colitis Model. Infection and Immunity 80: 3236-3246.
23. Mastroeni P, Morgan FJE, McKinley TJ, Shawcroft E, Clare S, et al. (2010) Enhanced Virulence of Salmonella enterica Serovar Typhimurium after Passage Through Mice. Infection and Immunity 79: 636-643.
24. Ferreira RBR, Gill N, Willing BP, Antunes LCM, Russell SL, et al. (2011) The Intestinal Microbiota Plays a Role in Salmonella Induced Colitis Independent of Pathogen Colonization. PLoS ONE 6: e20338.
25. Sturm A, Heinemann M, Arnoldini M, Benecke A, Ackermann M, et al. (2011) The Cost of Virulence: Retarded Growth of Salmonella Typhimurium Cells Expressing Type III Secretion System 1. PLoS Pathog 7: e1002143.
26. Bochner BR (2009) Global phenotypic characterization of bacteria. FEMS Microbiology Reviews 33: 191-205.
27. Rychlik I, Karasova D, Sebkova A, Volf J, Sisak F, et al. (2009) Virulence potential of five major pathogenicity islands (SPI-1 to SPI-5) of Salmonella enterica serovar Enteritidis for chickens. BMC Microbiology 9: 268.
28. Phoebe Lostroh C, Lee CA (2001) The Salmonella Pathogenicity Island-1 Type III Secretion System. Microbes and Infection 3: 1281-1291.
29. Choi J, Shin D, Kim M, Park J, Lim S, et al. (2012) LsrR-Mediated Quorum Sensing Controls Invasiveness of Salmonella typhimurium by Regulating SPI-1 and Flagella Genes. PLoS ONE 7: e37059.
30. Choi J, Shin D, Ryu S (2007) Implication of Quorum Sensing in Salmonella enterica Serovar Typhimurium Virulence: the luxS Gene Is Necessary for Expression of Genes in Pathogenicity Island 1. Infection and Immunity 75: 4885-4890.
31. Main-Hester KL, Colpitts KM, Thomas GA, Fang FC, Libby SJ (2008) Coordinate Regulation of Salmonella Pathogenicity Island 1 SPI1 and SPI4 in Salmonella enterica Serovar Typhimurium. Infection and Immunity 76: 1024-1035.
32. Collier-Hyams LS, Zeng H, Sun J, Tomlinson AD, Bao ZQ, et al. (2002) Cutting Edge: Salmonella AvrA Effector Inhibits the Key Proinflammatory, Anti-Apoptotic NF-κB Pathway. The Journal of Immunology 169: 2846-2850.
33. Moreira CG, Weinshenker D, Sperandio V (2010) QseC Mediates Salmonella enterica Serovar Typhimurium Virulence In Vitro and In Vivo. Infection and Immunity 78: 914-926.

Chapter 5

Summary and Significance


5.1 Synopsis

Salmonella enterica as a species comprises six different subspecies, with subspecies 1 (Salmonella enterica subspecies enterica) containing 2,600 different serovars [1]. Many of these serovars are distinct from one another genotypically and phenotypically [2,3]. These serovars infect a diverse spectrum of host species, including humans, with some serovars having only one host and others having multiple hosts. Of all the Salmonella enterica serovars, Salmonella enterica serovar Enteritidis (SE) is the most common cause of human salmonellosis in the world [4,5]. Despite many intervention strategies to prevent outbreaks and spread of this bacterium, SE is still a major cause of human foodborne morbidity and mortality [6,7]. Despite this importance, the molecular mechanisms involved in SE pathogenesis are understudied [8-11]. The overall objective of this study was to better understand the molecular mechanisms involved in SE pathogenesis and related source dynamics using a combination of genomic, molecular and in vivo approaches. Elucidating these mechanisms will aid in the identification of novel targets for vaccine and antimicrobial design.

5.2 Evolution of Pathogenicity Revealed by Whole Genome Sequencing and Comparative Genomics of Two Egg Isolates of Salmonella Enteritidis

5.2A Summary and Significance

Many epidemiological studies and surveillance networks have been set up by the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO) to track and record outbreaks of SE [5-7]. These studies show that SE remains a very important threat to human health. Shell eggs and egg products remain the most common source of human foodborne salmonellosis worldwide. Although many SE genome sequences are available in the public genomic databases, to our knowledge, this is the first study that has sequenced, annotated and performed comparative genomics of SE isolated from shell eggs. These two strains of SE were designated SEE1 and SEE2. Their genomes were approximately 4.67 Mb long with 52% GC content and consisted of 22 rRNAs, 84 tRNAs and ~4670 open reading frames. Thus, SEE1 and SEE2 are very similar to other SE genomes that have been sequenced and characterized.

This study also provided the first virulence profiling of SE isolates belonging to the pulsed-field gel electrophoresis (PFGE) type JEGX01.0004, which is also the most common PFGE type associated with human disease (Tables 2.5-2.17) [12]. This analysis identified over 600 virulence-associated genes, approximately 12.5% of all genes present in the genomes of SEE1 and SEE2.

Genome-wide single nucleotide polymorphism (SNP) analysis identified SNPs conserved in egg isolates versus those conserved in human isolates, which may provide evidence of microevolution of SE from the microenvironment of the egg to transmission to and colonization of the human host. As Figures 2.5 and 2.6 show, there is high similarity among all SE in gene presence/absence and in the sequence of conserved genes, respectively. There are many genes containing SNPs differentially conserved between human and egg isolates (Table 2.18). Of these SNPs, two non-synonymous SNPs (nsSNPs) were found in virulence-associated genes. The first is a T:C missense mutation in stiC, which encodes a fimbrial usher protein, resulting in a non-conservative amino acid change from a polar, uncharged asparagine (Asn) to a polar, ring-bearing, positively charged histidine (His). Fimbrial adhesins are known to be extremely important for many enteric pathogens, including Salmonella enterica, for adherence to a multitude of surfaces and cell types [13,14].

The second is an A:G missense mutation in rffG that changes an aspartate (Asp), a polar, negatively charged amino acid, to a glycine (Gly), a non-polar, uncharged amino acid. This gene is involved in the production of LPS, which has a multitude of functions in many enteric bacterial pathogens, including stimulating host immune responses and adherence [15-18]. Charge, structure, and polarity all affect the intramolecular interactions of amino acids, which influence both the structure and function of a protein. Because stiC and rffG are potentially associated with SE virulence, nsSNPs in these genes may lead to structural and functional changes in the corresponding proteins. These changes may enhance the ‘fitness’ of SE within the respective host. Aside from coding SNPs, there were also intergenic SNPs, such as that associated with sseB, which encodes a type 3 secretion system effector; such SNPs could affect the regulation of genes in cis. To our knowledge, this study provides the first evidence of microevolution that occurs during the course of human infection. These findings may help to elucidate the important genomic changes that may occur to help SE better adapt to different hosts through the use of important virulence-associated genes and gene products.
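The distinction drawn above between synonymous and non-synonymous SNPs, and the accompanying change in side-chain properties, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. The function name classify_snp, the simplified four-way side-chain grouping, and the example codon GAT are assumptions chosen for illustration; the source reports only the nucleotide and amino acid changes, not the affected codons or strand.

```python
# Illustrative sketch only: classify a single-base change within one codon.
# The standard genetic code is built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Simplified textbook grouping of side chains (an assumption, not from the source).
SIDE_CHAIN_CLASS = {
    **dict.fromkeys("AVLIMFWPG", "nonpolar"),
    **dict.fromkeys("STCYNQ", "polar uncharged"),
    **dict.fromkeys("KRH", "positively charged"),
    **dict.fromkeys("DE", "negatively charged"),
}


def classify_snp(codon, position, alt_base):
    """Return the coding consequence of replacing codon[position] with alt_base.

    codon    -- reference codon on the coding strand, e.g. "GAT"
    position -- 0-based index (0, 1 or 2) of the changed base within the codon
    alt_base -- the substituted base
    """
    codon = codon.upper()
    mutant = codon[:position] + alt_base.upper() + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutant]
    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return {
        "mutant_codon": mutant,
        "ref_aa": ref_aa,
        "alt_aa": alt_aa,
        "effect": effect,
        "ref_class": SIDE_CHAIN_CLASS.get(ref_aa, "stop"),
        "alt_class": SIDE_CHAIN_CLASS.get(alt_aa, "stop"),
    }


# Hypothetical codon: an A-to-G change at the second position of GAT (Asp)
# yields GGT (Gly), the kind of Asp-to-Gly change described for rffG.
example = classify_snp("GAT", 1, "G")
```

In this sketch a change of side-chain class (here, negatively charged to nonpolar) is used as a rough proxy for a non-conservative substitution; whether such a change alters protein function would still need experimental confirmation.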

5.3 Patho-Pan-Genomics of Salmonella enterica Serovars Reveals Host-Specific Factors and Potential Vaccine Targets of Salmonella Enteritidis

5.3A Summary and Significance

Despite the use of poultry vaccines to minimize SE contamination of shell eggs, human foodborne illness due to the consumption of shell eggs remains a major public health issue [20]. Similarly, there has been mounting evidence of the emergence of other Salmonella serovars, such as Salmonella Heidelberg, and it has recently been indicated that the human-restricted serovar Salmonella Paratyphi B can cross the host-specific barrier [21,22]. Previous attempts to understand the evolution of the SE genome have been limited to pairwise comparisons of SE with other Salmonella enterica serovars [23-25]. To our knowledge, this is the first pan-genome analysis of SE against 10 other serovars with various host specificities and different disease outcomes in the hosts (Table 3.1). This study identified core genes present in all serovars, genes present in serovars that can infect a common host, and genes unique to a particular serovar. It is expected that this new information can be used to identify the genes or gene pools that determine the host and disease specificity of Salmonella serovars as well as to identify vaccine and therapeutic targets.

This study identified the clusters of orthologous genes (COGs) that are unique to SE and either poultry-infectious serovars or human-infectious serovars. This, in combination with the work in Chapter 2, will allow for the identification of virulence factors unique to SE or common to a few serovars. For example, the lpf operon was found to be conserved in all serovars that can infect poultry but not in serovars infecting humans (Table 3.2). Previous studies have shown that deletion of the lpf operon completely ablated the ability of Salmonella Typhimurium to attach to and form biofilms on poultry colonic epithelial cells but not human epithelial cells [26]. Two genes related to tetrathionate utilization and the salmochelin-specific enterobactin esterase were among the 12 virulence-associated genes conserved in all human-infecting serovars but not found in poultry-restricted serovars. This suggests that tetrathionate utilization and iron sequestration through salmochelin are more important for survival in the mammalian GIT than in the poultry GIT.
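The partitioning of COGs into a shared core, host-group-specific sets and serovar-unique sets can be sketched with set operations. The Python example below is a minimal illustration and not the software used in this study. The function names, the toy COG labels (c1, p1, h1, u1) and the three-serovar data set are hypothetical, and "host-specific" is taken here to mean present in every serovar that infects the host and absent from every serovar that does not.

```python
# Illustrative sketch only: partition COGs by presence/absence across serovars.

def core_cogs(presence):
    """COGs present in every serovar."""
    return set.intersection(*presence.values())


def host_specific_cogs(presence, hosts, host):
    """COGs shared by all serovars infecting `host` and absent from all others."""
    inside = [cogs for s, cogs in presence.items() if host in hosts[s]]
    outside = [cogs for s, cogs in presence.items() if host not in hosts[s]]
    if not inside:
        return set()
    return set.intersection(*inside) - set().union(*outside)


def unique_cogs(presence, serovar):
    """COGs found in `serovar` and in no other serovar."""
    others = [cogs for s, cogs in presence.items() if s != serovar]
    return presence[serovar] - set().union(*others)


# Hypothetical toy data: COG labels are placeholders, not real COG identifiers.
presence = {
    "Enteritidis": {"c1", "c2", "p1", "h1"},
    "Typhi": {"c1", "c2", "h1", "u1"},
    "Gallinarum": {"c1", "c2", "p1"},
}
hosts = {
    "Enteritidis": {"human", "poultry"},
    "Typhi": {"human"},
    "Gallinarum": {"poultry"},
}

core = core_cogs(presence)
poultry_only = host_specific_cogs(presence, hosts, "poultry")
human_only = host_specific_cogs(presence, hosts, "human")
typhi_unique = unique_cogs(presence, "Typhi")
```

With real data, the presence dictionary would be filled from ortholog clustering output, and the strictness of the "absent from all others" rule could be relaxed if annotation gaps are a concern.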

This study also identified COGs that are conserved in many of the serovars of Salmonella enterica. As Figure 3.1 shows, there is a high degree of similarity among most serovars (except for S. Heidelberg) based on whole-genome gene composition comparisons. For this reason, COGs that are retained in all serovars, including the distant SH serovar, would be excellent targets for the development of polyvalent vaccines and antimicrobials against SE as well as other Salmonella serovars. Tables 3.4-3.9 contain all of the COGs identified in the virulence gene core of the serovars tested. From these tables we were able to identify a list of potential targets for vaccine and antimicrobial development based on their functions as virulence factors and relative locations in the bacterial cell membrane.

5.4 Growth in Egg Yolk Enhances Salmonella Enteritidis Colonization and Virulence in a Mouse Colitis Model of Salmonella infection

5.4A Summary and Significance

The most important source of SE infection in humans is the consumption of contaminated shell eggs and egg products [27,28]. Despite the importance of shell eggs as a source of SE, no study has been conducted to determine whether SE changes its virulence inside the egg. This study was aimed at understanding the dynamics between microbe and source in the context of human infection using a mouse colitis model of human Salmonella infection [30,31]. To reflect three different sources of SE, mice were infected with SE grown in egg yolk, SE grown in Luria Bertani broth (LB), or SE isolated from the feces of mice infected with SE. The egg yolk was used to represent SE from shell eggs, which are the most common source of human foodborne SE illness. Mouse feces were used to represent outbreaks of SE infection due to other sources, such as peanuts, which are thought to be the result of rodent fecal transmission. LB, the medium most often used in laboratory experiments, served as the baseline control [32]. Salmonella Enteritidis grown in egg yolk demonstrated increased colonization and survival in the gastrointestinal tract (GIT) as well as increased dissemination to the liver and spleen, indicating that egg yolk may enhance the fitness of SE to colonize and disseminate within the host. This increase in colonization of mice with SEE1 was correlated with its enhanced virulence as determined by gross and histopathological lesions (Figures 4.3 and 4.4). Mice infected with SEE1 or SEE2 grown in egg yolk displayed more severe signs of edema and perfusion and evidence of necrosis in many organs as compared to the other groups.

These results suggest that SE grown in egg yolk may differ in its transcriptional programming at the time of infection, leading to an overall increase in fitness and virulence. This phenomenon may be important for determining the epidemiology and disease outcomes not only of SE but also of other enteric pathogens, such as Escherichia coli, which also infect humans through a variety of sources [33].


5.5 References

1. Foley SL, Johnson TJ, Ricke SC, Nayak R, Danzeisen J (2013) Salmonella Pathogenicity and Host Adaptation in Chicken-Associated Serovars. Microbiology and Molecular Biology Reviews 77: 582-607.
2. Suez J, Porwollik S, Dagan A, Marzel A, Schorr YI, et al. (2013) Virulence Gene Profiling and Pathogenicity Characterization of Non-Typhoidal Salmonella Accounted for Invasive Disease in Humans. PLoS ONE 8: e58449.
3. Behnsen J, Perez-Lopez A, Nuccio S-P, Raffatellu M (2015) Exploiting host immunity: the Salmonella paradigm. Trends in Immunology 36: 112-120.
4. Majowicz SE, Musto J, Scallan E, Angulo FJ, Kirk M, et al. (2010) The Global Burden of Nontyphoidal Salmonella Gastroenteritis. Clinical Infectious Diseases 50: 882-889.
5. Vieira Aea (2009) A Resource to Link Human and Non-Human Sources of Salmonella. WHO Global Foodborne Infections Network Country Databank.
6. CDC (2010) Salmonella Serotype Enteritidis. National Center for Emerging and Zoonotic Infectious Diseases.
7. CDC (2011) Vital Signs: Incidence and Trends of Infection with Pathogens Transmitted Commonly Through Food - Foodborne Diseases Active Surveillance Network, 10 U.S. Sites, 1996-2010.
8. Santos RL (2014) Pathobiology of Salmonella, Intestinal Microbiota, and the Host innate Immune Response. Frontiers in Immunology 5.
9. Rychlik I, Karasova D, Sebkova A, Volf J, Sisak F, et al. (2009) Virulence potential of five major pathogenicity islands (SPI-1 to SPI-5) of Salmonella enterica serovar Enteritidis for chickens. BMC Microbiology 9: 268.
10. Phoebe Lostroh C, Lee CA (2001) The Salmonella Pathogenicity Island-1 Type III Secretion System. Microbes and Infection 3: 1281-1291.
11. Hautefort I, Thompson A, Eriksson-Ygberg S, Parker ML, Lucchini S, et al. (2008) During Infection of Epithelial Cells Salmonella enterica Serovar Typhimurium Undergoes a Time-Dependent Transcriptional Adaptation That Results in Simultaneous Expression of Three Type 3 Secretion Systems. Cellular Microbiology 10: 958-984.
12. Allard MW, Luo Y, Strain E, Pettengill J, Timme R, et al. (2013) On the Evolutionary History, Population Genetics and Diversity among Isolates of Salmonella Enteritidis PFGE Pattern JEGX01.0004. PLoS ONE 8: e55254.
13. Nuccio S-P, Baumler AJ (2007) Evolution of the Chaperone/Usher Assembly Pathway: Fimbrial Classification Goes Greek. Microbiology and Molecular Biology Reviews 71: 551-575.
14. De Buck J, Immerseel FV, Haesebrouck F, Ducatelle R (2004) Effect of Type 1 Fimbriae of Salmonella enterica Serotype Enteritidis on Bacteraemia and Reproductive Tract Infection in Laying Hens. Avian Pathology 33: 314-320.
15. Raetz CRH, Reynolds CM, Trent MS, Bishop RE (2007) Lipid A Modification Systems in Gram-Negative Bacteria. Annual Review of Biochemistry 76: 295-329.
16. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Van Immerseel F (2009) The Salmonella Enteritidis Lipopolysaccharide Biosynthesis Gene rfbH is Required for Survival in Egg Albumen. Zoonoses and Public Health 56: 145-149.
17. Mageed AMA, Isobe N, Yoshimura Y (2008) Expression of Avian β-Defensins in the Oviduct and Effects of Lipopolysaccharide on Their Expression in the Vagina of Hens. Poultry Science 87: 979-984.
18. Carter JA, Blondel CJ, Zaldivar M, Álvarez SA, Marolda CL, et al. (2007) O-Antigen Modal Chain Length in Shigella flexneri 2a is Growth-Regulated Through RfaH-Mediated Transcriptional Control of the wzy Gene. Microbiology 153: 3499-3507.
19. Moxon R, Rappuoli R (2002) Bacterial pathogen genomics and vaccines. British Medical Bulletin 62: 45-58.
20. Chai SJ, White PL, Lathrop SL, Solghan SM, Medus C, et al. (2012) Salmonella enterica Serotype Enteritidis: Increasing Incidence of Domestically Acquired Infections. Clinical Infectious Diseases 54: S488-S497.
21. Schoeni JL, Glass KA, McDermott JL, Wong ACL (1995) Growth and Penetration of Salmonella Enteritidis, Salmonella Heidelberg and Salmonella Typhimurium in Eggs. International Journal of Food Microbiology 24: 385-396.
22. Van Immerseel F, Meulemans L, De Buck J, Pasmans F, Velge P, et al. (2004) Bacteria–Host Interactions of Salmonella Paratyphi B dT+ in Poultry. Epidemiology & Infection 132: 239-243.
23. Thomson NR, Clayton DJ, Windhorst D, Vernikos G, Davidson S, et al. (2008) Comparative genome analysis of Salmonella Enteritidis PT4 and Salmonella Gallinarum 287/91 provides insights into evolutionary and host adaptation pathways. Genome Research 18: 1624-1637.
24. Silva CA, Blondel CJ, Quezada CP, Porwollik S, Andrews-Polymenis HL, et al. (2012) Infection of Mice by Salmonella enterica Serovar Enteritidis Involves Additional Genes That Are Absent in the Genome of Serovar Typhimurium. Infection and Immunity 80: 839-849.
25. Betancor L, Yim L, Martínez A, Fookes M, Sasias S, et al. (2012) Genomic Comparison of the Closely Related Salmonella enterica Serovars Enteritidis and Dublin. The Open Microbiology Journal 6: 5-13.
26. Ledeboer NA, Frye JG, McClelland M, Jones BD (2006) Salmonella enterica Serovar Typhimurium Requires the Lpf, Pef, and Tafi Fimbriae for Biofilm Formation on HEp-2 Tissue Culture Cells and Chicken Intestinal Epithelium. Infection and Immunity 74: 3156-3169.
27. Guard-Petter J (2001) The Chicken, the Egg and Salmonella Enteritidis. Environmental Microbiology 3: 421-430.
28. Gantois I, Ducatelle R, Pasmans F, Haesebrouck F, Gast R, et al. (2009) Mechanisms of Egg Contamination by Salmonella Enteritidis. FEMS Microbiology Reviews 33: 718-738.
29. Ebel E, Schlosser W (2000) Estimating the annual fraction of eggs contaminated with Salmonella enteritidis in the United States. International Journal of Food Microbiology 61: 51-62.
30. Barthel M, Hapfelmeier S, Quintanilla-Martínez L, Kremer M, Rohde M, et al. (2003) Pretreatment of Mice with Streptomycin Provides a Salmonella enterica Serovar Typhimurium Colitis Model That Allows Analysis of Both Pathogen and Host. Infection and Immunity 71: 2839-2858.
31. Vishwakarma V, Periaswamy B, Bhusan Pati N, Slack E, Hardt W-D, et al. (2012) A Novel Phage Element of Salmonella enterica Serovar Enteritidis P125109 Contributes to Accelerated Type III Secretion System 2-Dependent Early Inflammation Kinetics in a Mouse Colitis Model. Infection and Immunity 80: 3236-3246.
32. Meerburg BG, Kijlstra A (2007) Role of Rodents in Transmission of Salmonella and Campylobacter. Journal of the Science of Food and Agriculture 87: 2774-2781.
33. Pruimboom-Brees IM, Morgan TW, Ackermann MR, Nystrom ED, Samuel JE, et al. (2000) Cattle Lack Vascular Receptors for Escherichia coli O157:H7 Shiga toxins. Proceedings of the National Academy of Sciences 97: 10325-10329.


Appendices


Appendix A

Pan-Genome Reference Tables

Table A-1: Full Human-Infectious COG List

COG# protein ID Description

COG1054: 16761555 DNA-invertase

COG3992: 62179781 Fels-1 prophage chitinase

COG4523: 16764408 host specificity protein J

COG4524: 162139526 transporter

COG4526: 162139586 outer membrane receptor

COG4539: 62178891 outer membrane phosphoporin protein E

COG4546: 62178951 fimbriae usher protein

COG4547: 62178954 inner membrane protein

COG4549: 62178968 DNA methylase; restriction system

COG4550: 62178974 AraC family transcriptional regulator

COG4551: 62179115 thioredoxin protein

COG4581: 62179700 flagellar hook-associated protein FlgK

COG4586: 62179869

COG4598: 62179972 hypothetical protein SC1402

COG4600: 62179975 tetrathionate reductase complex subunit C

COG4602: 62180072 protease

COG4603: 62180081 ABC transporter permease

COG4606: 62180114 inner membrane protein

COG4614: 62180153 outer membrane lipoprotein

COG4615: 62180154 hypothetical protein SC1584

COG4616: 62180161 ssrAB activated gene

COG4628: 62180215 pyruvate-flavodoxin

COG4629: 62180231 hypothetical protein SC1661

COG4630: 62180234 serine/threonine protein kinase

COG4636: 62180356 hydrogenase-1 protein

COG4640: 62180490 flagellar biosynthesis protein FlhA

COG4641: 62180491 flagellar biosynthesis protein FlhB

COG4648: 62180598 synthesis of vitamin B12 adenosyl cobalamide

COG4650: 62180614 propanediol utilization transcriptional regulator

COG4654: 62180705 inner membrane protein


COG4656: 62180735 outer membrane protein

COG4658: 62180931 amino acid transporter

COG4659: 62181016 hypothetical protein SC2446

COG4667: 62181112 anaerobic sulfide reductase

COG4671: 62181264 ABC transporter

COG4672: 62181276 enterochelin esterase

COG4674: 62181292 hydroxyglutarate oxidase

COG4675: 62181302 DNA binding protein

COG4678: 62181319 MFS superfamily, multidrug transport protein

COG4696: 62181651 glutathione S-transferase

COG4701: 62181740 dehydrogenase

COG4703: 62181911 DNA protecting protein DprA

COG4704: 62181961 glutathione-regulated potassium-efflux system protein KefB

COG4709: 62182038 glycogen branching protein

COG4716: 62182128 dipeptide transporter ATP-binding subunit

COG4727: 62182313 hybrid sensory histidine kinase TorS

COG4739: 62182527 aminoimidazole riboside kinase

COG4742: 62182578 hypothetical protein SC4008

COG4745: 62182777 hypothetical protein SC4207

COG4749: 62182872 hypothetical protein SC4302

COG4751: 62182879 trehalose-6-phosphate hydrolase

COG4753: 62182929 hypothetical protein SC4359

COG4761: 62183009 fimbrial chaperone protein


Table A-2: Full Poultry-Infectious COG List

COG# Protein ID Description
COG4532: 62178744 fimbriae
COG4533: 62178746 fimbrial chaperone
COG4534: 62178747 fimbrial subunit
COG4540: 62178942 hypothetical protein SC0372
COG4541: 62178943 permease
COG4542: 62178944 isopropylmalate isomerase large subunit
COG4543: 62178946 fumarylacetoacetate (FAA) hydrolase
COG4544: 62178948 LysR family transcriptional regulator
COG4548: 62178964 cation efflux pump
COG4552: 62179131 allantoin permease
COG4554: 62179147 outer membrane protein
COG4555: 62179148 inner membrane protein
COG4558: 62179253 hypothetical protein SC0683
COG4559: 62179254 hypothetical protein SC0684
COG4560: 62179255 DnaJ family molecular chaperone
COG4561: 62179256 hypothetical protein SC0686
COG4562: 62179257 molecular chaperone DnaK
COG4565: 62179330 fumarate hydratase class I anaerobic
COG4566: 62179331 fumarate hydratase
COG4567: 62179332 LysR family transcriptional regulator
COG4568: 62179337 hypothetical protein SC0767
COG4569: 62179377 inner membrane protein
COG4570: 62179418 electron transfer flavoprotein subunit beta
COG4571: 62179419 electron transfer flavoprotein alpha subunit
COG4572: 62179421 acyl-CoA dehydrogenase
COG4573: 62179422 dehydrogenase
COG4574: 62179424 LysR family transcriptional regulator, partial
COG4576: 62179496 secreted protein SopD-like protein
COG4578: 62179603 outer membrane protein
COG4582: 62179705 inner membrane lipoprotein
COG4588: 62179940 ferredoxin
COG4589: 62179942 electron transfer flavoprotein subunit YdiR
COG4590: 62179944 AraC family transcriptional regulator
COG4591: 62179945 acyl-CoA dehydrogenase
COG4592: 62179946 acetyl-CoA:acetoacetyl-CoA transferase subunit beta
COG4593: 62179948 quinate/shikimate dehydrogenase
COG4594: 62179950 MFS family transporter


COG4595: 62179951 hypothetical protein SC1381
COG4599: 62179973 DeoR family transcriptional regulator
COG4621: 62180189 hypothetical protein SC1619
COG4622: 62180190 LysR family transcriptional regulator
COG4623: 62180191 methyl-accepting chemotaxis protein III, ribose and galactose sensor receptor
COG4624: 62180192 alcohol dehydrogenase
COG4625: 62180193 hypothetical protein SC1623
COG4626: 62180196 translocated effector: regulated by SPI-2
COG4627: 62180204 SAM-dependent methyltransferase
COG4635: 62180352 hydrogenase-1 small subunit
COG4643: 62180509 hypothetical protein SC1939
COG4644: 62180531 flagellin methylation protein
COG4647: 62180581 hypothetical protein SC2011
COG4651: 62180641 D-alanyl-D-alanine carboxypeptidase
COG4653: 62180672 pyruvyl transferase
COG4665: 62181075 IS3-like transposase
COG4666: 62181097 anaerobic dimethylsulfoxide reductase
COG4685: 62181591 mannitol dehydrogenase
COG4686: 62181592 malate/L-lactate dehydrogenase
COG4687: 62181593 zinc-binding dehydrogenase
COG4688: 62181594 GntR family transcriptional regulator
COG4689: 62181595 outer membrane lipoprotein
COG4690: 62181628 acetyl-CoA hydrolase
COG4691: 62181629 monoamine oxidase
COG4692: 62181630 LysR family transcriptional regulator
COG4693: 62181631 LysR family transcriptional regulator
COG4694: 62181632 arylsulfatase
COG4695: 62181633 arylsulfatase regulator
COG4698: 62181683 dicarboxylate-binding periplasmic protein
COG4699: 62181684 inner membrane protein
COG4700: 62181685 integral membrane protein, transporter
COG4705: 62182028 hypothetical protein SC3458
COG4706: 62182030 glycerol dehydrogenase
COG4707: 62182032 dihydrodipicolinate synthetase
COG4708: 62182033 transcriptional regulator
COG4711: 62182094 phosphatase
COG4712: 62182103 anaerobic C4-dicarboxylate transporter
COG4713: 62182104 ribokinase family sugar kinase
COG4717: 62182139 long polar fimbrial operon protein


COG4718: 62182140 long polar fimbrial chaperone
COG4719: 62182141 long polar fimbria
COG4723: 62182266 hypothetical protein SC3696
COG4730: 62182317 2-oxo-3-deoxygalactonate kinase
COG4731: 62182318 galactonate operon transcriptional repressor
COG4738: 62182526 Na+:galactoside symporter family permease
COG4740: 62182564 hypothetical protein SC3994, partial
COG4741: 62182569 hypothetical protein SC3999
COG4743: 62182668 hypothetical protein SC4098
COG4754: 62182957 PTS permease
COG4755: 62182958 PTS permease
COG4756: 62182959 PTS permease
COG4757: 62182960 PTS permease
COG4758: 62182961 glucosamine-fructose-6-phosphate aminotransferase
COG4759: 62182962 glucosamine-fructose-6-phosphate aminotransferase


Table A-3: Complete Salmonella enterica Shared COG List

COG# Protein ID Description
COG1001: 62179488 anaerobic dimethyl sulfoxide reductase subunit A
COG1002: 62180390 L-serine deaminase I/L-threonine deaminase I
COG1003: 62178618 oxalacetate decarboxylase subunit beta
COG1004: 62178619 oxaloacetate decarboxylase
COG1005: 62179087 acridine efflux system protein
COG1006: 229037907 outer membrane receptor FepA
COG1007: 62178689 division specific transpeptidase, penicillin-binding protein 3 re
COG1008: 62178804 lysine decarboxylase 2, constitutive
COG1009: 62179201 carbon starvation protein
COG1010: 62179497 pyruvate formate lyase I
COG1011: 62180823 heme lyase disulfide oxidoreductase, cytocyhrome c-type biogenesis
COG1012: 62180824 cytochrome c-type biogenesis protein
COG1013: 62181040 transketolase
COG1014: 162139536 elongation factor Tu
COG1015: 62178625 citrate lyase subunit alpha/citrate-ACP transferase
COG1016: 62179240 penicillin-binding protein 2
COG1017: 62180147 nitrate reductase 2 subunit beta
COG1018: 62181045 oxidoreductase Fe-S binding subunit
COG1019: 62182254 Mg2+ transport protein
COG1020: 207857670 cytochrome c-type biogenesis protein CcmE
COG1021: 207857671 heme exporter protein D2
COG1022: 205353371 heme exporter protein B
COG1023: 62180055 A (fumarate hydratase class I), aerobic isozyme
COG1024: 207856184 cytochrome d ubiquinol oxidase subunit I
COG1025: 207856705 hydrogenase 1 large subunit
COG1026: 207856729 respiratory nitrate reductase 1 subunit alpha
COG1029: 205353370 heme exporter protein C2
COG1030: 62182478 GPH family transport protein
COG1031: 62178861 integrase core subunit
COG1032: 62180829 cytochrome c biogenesis protein CcmA
COG1034: 62178939 bactoprenol glucosyl transferase
COG1039: 194448978 flagellin
COG1041: 62179455 hypothetical protein SC0885
COG1042: 62182398 glucose-1-phosphate thymidylyltransferase
COG1043: 62179162 glycosyl translocase
COG1045: 194447792 dimethylsulfoxide reductase, B subunit
COG1047: 162139591 methyl-accepting chemotaxis protein


COG1048: 62178617 transcription regulator, histidine kinase for citrate
COG1050: 62179813 periplasmic murein peptide-binding protein precurs
COG1051: 62180663 phosphomannomutase
COG1053: 161613759 bifunctional acetaldehyde-CoA/alcohol dehydrogenase
COG1055: 62178620 oxaloacetate decarboxylase subunit gamma
COG1058: 62182710 inner membrane protein
COG1060: 162139494 lytic murein transglycosylase
COG1061: 162139495 thymidine
COG1062: 162139496 ribosomal-protein-alanine N-acetyltransferase
COG1063: 162139497 DNA-binding transcriptional activator BglJ
COG1064: 162139498 hypothetical protein SC4355
COG1065: 162139499 peptidase PmbA
COG1066: 162139500 peptidyl-prolyl cis-trans isomerase
COG1067: 162139501 3-keto-L-gulonate-6-phosphate decarboxylase
COG1068: 162139502 PTS system ascorbate-specific transporter subunit IIC
COG1069: 162139503 L-ascorbate 6-phosphate lactonase
COG1070: 162139505 transcriptional repressor NsrR
COG1071: 162139506 ribosome-associated GTPase
COG1072: 162139507 elongation factor P
COG1073: 162139508 co-chaperonin GroES
COG1074: 162139509 aromatic amino acid aminotransferase
COG1075: 162139510 maltose ABC transporter substrate-binding protein
COG1076: 162139511 B12-dependent
COG1077: 162139512 thiazole synthase
COG1078: 162139513 50S ribosomal protein L7/L12
COG1079: 162139514 50S ribosomal protein L31
COG1080: 162139515 DNA-binding transcriptional regulator CytR
COG1081: 162139518 regulation protein NR(I)
COG1082: 162139519 multifunctional fatty acid oxidation complex subunit alpha
COG1083: 162139520 hypothetical protein SC3855
COG1084: 162139521 diaminopimelate epimerase
COG1085: 162139522 UDP-N-acetyl-D-mannosamine dehydrogenase
COG1086: 162139523 F0F1 ATP synthase subunit I
COG1087: 162139524 hypothetical protein SC3763
COG1088: 162139525 hypothetical protein SC3733
COG1089: 162139528 bifunctional phosphopantothenoylcysteine decarboxylase/phosphopantothenate synthase
COG1090: 162139530 major facilitator superfamily transporter
COG1091: 162139531 sulfur transfer protein SirA
COG1092: 162139532 gluconate kinase


COG1093: 162139533 heat shock protein 33
COG1094: 162139534 ADP-ribose diphosphatase NudE
COG1095: 162139535 30S ribosomal protein S12
COG1096: 162139537 50S ribosomal protein L13
COG1097: 162139538 preprotein translocase subunit SecG
COG1098: 162139539 argininosuccinate synthase
COG1099: 162139540 hypothetical protein SC3229
COG1100: 162139541 ATP-dependent RNA helicase DeaD
COG1101: 162139542 hypothetical protein SC3206
COG1102: 162139543 putrescine--2-oxoglutarate aminotransferase
COG1103: 162139544 outer membrane channel protein
COG1104: 162139546 hypothetical protein SC3042
COG1105: 162139547 hypothetical protein SC3036
COG1106: 162139548 16S ribosomal RNA methyltransferase RsmE
COG1107: 162139549 fructose-bisphosphate aldolase
COG1108: 162139550 hypothetical protein SC3000
COG1109: 162139551 glycine dehydrogenase
COG1110: 162139552 flavodoxin FldB
COG1111: 162139553 acetyl-CoA acetyltransferase
COG1112: 162139554 hypothetical protein SC2721
COG1113: 162139555 heat shock protein GrpE
COG1114: 162139557 hypothetical protein SC2680
COG1115: 162139558 30S ribosomal protein S16
COG1116: 162139559 16S rRNA-processing protein RimM
COG1117: 162139560 aminopeptidase
COG1118: 162139561 VII large subunit
COG1119: 162139562 glycine cleavage system transcriptional repressor
COG1120: 162139563 acetyltransferase
COG1121: 162139564 cysteine synthase B
COG1122: 162139565 5-methylaminomethyl-2-thiouridine methyltransferase
COG1123: 162139566 acetyl-CoA carboxylase subunit beta
COG1124: 162139567 PTS system ascorbate-specific transporter subunit IIC
COG1125: 162139568 phosphatase
COG1126: 162139569 Z
COG1127: 162139570 methionyl-tRNA synthetase
COG1128: 162139574 chemotaxis regulator CheZ
COG1129: 162139575 high-affinity zinc transporter ATPase
COG1130: 162139576 high-affinity zinc transporter periplasmic protein
COG1131: 162139577 serine/threonine 1


COG1132: 162139579 hypothetical protein SC1804
COG1133: 162139580 formyltetrahydrofolate deformylase
COG1134: 162139582 fumarate/nitrate reduction transcriptional regulator
COG1135: 162139583 ATP-dependent RNA helicase HrpA
COG1136: 162139588 hypothetical protein SC1521
COG1137: 162139589 hypothetical protein SC1518
COG1138: 162139592 50S ribosomal protein L35
COG1139: 162139593 hypothetical protein SC1342
COG1140: 162139594 tRNA-specific 2-thiouridylase MnmA
COG1141: 162139595 23S rRNA pseudouridylate synthase C
COG1142: 162139596 lipid A biosynthesis lauroyl acyltransferase
COG1143: 162139598 pyruvate formate lyase-activating enzyme 1
COG1144: 162139599 adenosylmethionine-8-amino-7-oxononanoate aminotransferase
COG1145: 162139600 imidazolonepropionase
COG1146: 162139602 phospho-2-dehydro-3-deoxyheptonate aldolase
COG1147: 162139603 replication initiation regulator SeqA
COG1148: 162139604 cold shock protein CspE
COG1149: 162139606 hypothetical protein SC0580
COG1150: 162139607 potassium efflux protein KefA
COG1151: 162139608 ATP-dependent Clp protease proteolytic subunit
COG1152: 162139609 hypothetical protein SC0487
COG1153: 162139610 recombination associated protein
COG1154: 162139612
COG1155: 162139613 DL-methionine transporter ATP-binding subunit
COG1156: 162139614 prolyl-tRNA synthetase
COG1157: 162139615 CDP-diglyceride synthase
COG1158: 162139616 elongation factor Ts
COG1159: 162139617 iron-hydroxamate transporter ATP-binding subunit
COG1160: 162139618 RNA polymerase-binding transcription factor
COG1161: 162139619 glutamyl-Q tRNA(Asp) synthetase
COG1162: 162139620
COG1163: 162139621 hypoxanthine-guanine phosphoribosyltransferase
COG1164: 162139622 hypothetical protein SC0160
COG1165: 162139623 quinolinate phosphoribosyltransferase
COG1166: 162139625 transaldolase B
COG1167: 229037899 fumarate reductase flavoprotein subunit
COG1168: 229037900 vitamin B12/cobalamin outer membrane transporter
COG1169: 229037901 porin
COG1170: 229037902 regulatory ATPase RavA


COG1171: 229037903 peptidoglycan synthetase
COG1172: 229037904 rod shape-determining protein MreB
COG1173: 229037905 protease TldD
COG1174: 229037906 ligase
COG1175: 229037908 flagella biosynthesis regulator
COG1176: 229037909 long-chain-fatty-acid--CoA ligase
COG1177: 229037910 malate dehydrogenase
COG1178: 229037912 autoagglutination protein
COG1179: 229037913 aminopeptidase
COG1180: 342240211 50S ribosomal protein L22
COG1181: 62178572 bifunctional aspartokinase I/homoserine dehydrogenase I
COG1182: 62178574
COG1183: 62178575 hypothetical protein SC0005
COG1184: 62178578 molybdenum cofactor biosynthesis protein MogA
COG1185: 62178579 hypothetical protein SC0009
COG1186: 62178582 molecular chaperone DnaK
COG1187: 62178583 molecular chaperone DnaJ
COG1188: 62178584 LysR family transcriptional regulator
COG1189: 62178585 hypothetical protein SC0015
COG1190: 62178586 hypothetical protein SC0016
COG1191: 62178587 hypothetical protein SC0017, partial
COG1192: 62178589 hypothetical protein SC0019
COG1193: 62178590 fimbrial subunit
COG1194: 62178591 fimbrial chaperone
COG1195: 62178592 fimbrial subunit
COG1196: 62178593 fimbrial subunit
COG1197: 62178595 fimbrial chaperone
COG1198: 62178596 hypothetical protein SC0026
COG1199: 62178598 hypothetical protein SC0028
COG1200: 62178601 hypothetical protein SC0031
COG1201: 62178602 arylsulfatase regulatory protein
COG1202: 62178604 pH-dependent sodium/proton antiporter
COG1203: 62178605 transcriptional activator NhaR
COG1204: 62178606 glycosyl hydrolase
COG1205: 62178607 30S ribosomal protein S20
COG1206: 62178609 bifunctional kinase/FMN adenylyltransferase
COG1207: 62178610 isoleucyl-tRNA synthetase
COG1208: 62178611 lipoprotein signal peptidase
COG1209: 62178612 FKBP-type peptidylprolyl isomerase


COG1210: 62178613 4-hydroxy-3-methylbut-2-enyl diphosphate reductase
COG1211: 62178615 ribonucleoside hydrolase RihC
COG1212: 62178616 transcription regulator sensor for citrate
COG1213: 62178621 citrate-sodium symport
COG1214: 62178623 citrate lyase subunit gamma
COG1215: 62178626 hypothetical protein SC0056
COG1216: 62178627 modifier of citrate lyase
COG1217: 62178628 dihydrodipicolinate reductase
COG1218: 62178630 carbamoyl phosphate synthase small subunit
COG1219: 62178632 DNA-binding transcriptional activator CaiF
COG1220: 62178633 carnitine operon protein CaiE
COG1221: 62178634 carnitinyl-CoA dehydratase
COG1222: 62178636 crotonobetainyl-CoA:carnitine CoA-transferase
COG1223: 62178637 crotonobetainyl-CoA dehydrogenase
COG1224: 62178640 electron transfer flavoprotein FixA
COG1225: 62178643 ferredoxin, carnitine metabolism
COG1226: 62178644 MFS family transporter
COG1227: 62178645 outer membrane lipoprotein
COG1228: 62178646 hypothetical protein SC0076
COG1229: 62178647 hypothetical protein SC0077
COG1230: 62178651 glutathione-regulated potassium-efflux system protein KefC
COG1231: 62178652 dihydrofolate reductase
COG1232: 62178653 diadenosine tetraphosphatase
COG1233: 62178654 ApaG protein
COG1234: 62178655 dimethyladenosine transferase
COG1235: 62178656 4-hydroxythreonine-4-phosphate dehydrogenase
COG1236: 62178657 peptidyl-prolyl cis-trans isomerase SurA
COG1237: 62178658 organic solvent tolerance protein
COG1238: 62178659 Dna-J like membrane chaperone protein
COG1239: 62178660 23S rRNA/tRNA pseudouridine synthase A
COG1240: 62178661 ATP-dependent helicase HepA
COG1241: 62178662 DNA polymerase II
COG1242: 62178664 hypothetical protein SC0094
COG1243: 62178665 L-ribulose-5-phosphate 4-epimerase
COG1244: 62178666 L-arabinose isomerase
COG1245: 62178668 DNA-binding transcriptional regulator AraC
COG1246: 62178669 DedA family membrane protein
COG1247: 62178670 thiamine transporter ATP-binding subunit
COG1248: 62178671 thiamine transporter membrane protein

COG1249: 62178672 thiamine transporter substrate binding subunit COG1250: 62178673 transcriptional regulator SgrR COG1251: 62178676 isopropylmalate isomerase small subunit COG1252: 62178678 3-isopropylmalate dehydrogenase COG1253: 62178684 acetolactate synthase 3 regulatory subunit COG1254: 62178685 DNA-binding transcriptional regulator FruR COG1255: 62178686 cell division protein MraZ COG1256: 62178687 S-adenosyl-methyltransferase MraW COG1257: 62178688 cell division protein FtsL COG1258: 62178690 UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase COG1259: 62178691 UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-alanine ligase COG1260: 62178692 phospho-N-acetylmuramoyl-pentapeptide-transferase COG1261: 62178693 UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase COG1262: 62178696 UDP-N-acetylmuramate--L-alanine ligase COG1263: 62178698 cell division protein FtsQ COG1264: 62178699 cell division protein FtsA COG1265: 62178700 cell division protein FtsZ COG1266: 62178701 UDP-3-O- COG1267: 62178703 preprotein translocase subunit SecA COG1268: 62178704 nucleoside triphosphate pyrophosphohydrolase COG1269: 62178707 zinc-binding protein COG1270: 62178708 hypothetical protein SC0138 COG1271: 62178709 dephospho-CoA kinase COG1272: 62178710 guanosine 5'-monophosphate oxidoreductase COG1273: 62178711 type IV pilin biogenesis protein COG1274: 62178712 hypothetical protein SC0142 COG1275: 62178713 major pilin subunit COG1276: 62178715 N-acetyl-anhydromuranmyl-L-alanine amidase COG1277: 62178716 regulatory protein AmpE COG1278: 62178718 Na+:galactoside symporter family permease COG1279: 62178719 aromatic amino acid transporter COG1280: 62178720 transcriptional regulator PdhR COG1281: 62178721 pyruvate dehydrogenase subunit E1 COG1282: 62178722 dihydrolipoamide acetyltransferase COG1283: 62178723 dihydrolipoamide dehydrogenase COG1284: 62178724 outer membrane protein COG1285: 62178725 hypothetical protein SC0155 COG1286: 62178726 hypothetical protein SC0156 COG1287: 62178731 2-keto-3-deoxygluconate permease

COG1288: 62178734 LysR family transcriptional regulator COG1289: 62178735 S-adenosylmethionine decarboxylase COG1290: 62178736 spermidine synthase COG1291: 62178738 multicopper oxidase COG1292: 62178742 multidrug ABC transporter ATPase COG1293: 62178743 ABC transporter membrane protein COG1294: 62178749 hypothetical protein SC0179 COG1295: 62178750 aspartate alpha-decarboxylase COG1296: 62178751 pantoate--beta-alanine ligase COG1297: 62178752 3-methyl-2-oxobutanoate hydroxymethyltransferase COG1298: 62178753 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase COG1299: 62178754 poly(A) polymerase COG1300: 62178757 sugar fermentation stimulation protein A COG1301: 62178758 2'-5' RNA ligase COG1302: 62178759 ATP-dependent RNA helicase HrpB COG1303: 62178760 penicillin-binding protein 1b COG1304: 62178772 glutamate-1-semialdehyde aminotransferase COG1305: 62178773 chloride channel protein COG1306: 62178774 iron-sulfur cluster insertion protein ErpA COG1307: 62178775 hypothetical protein SC0205 COG1308: 62178776 vitamin B12-transporter protein BtuF COG1309: 62178777 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase COG1310: 62178778 deoxyguanosinetriphosphate triphosphohydrolase COG1311: 62178779 serine endoprotease COG1312: 62178781 hypothetical protein SC0211 COG1313: 62178783 2,3,4,5-tetrahydropyridine-2,6-carboxylate N-succinyltransferase COG1314: 62178784 PII uridylyl-transferase COG1315: 62178785 methionine aminopeptidase COG1316: 62178786 30S ribosomal protein S2 COG1317: 62178788 uridylate kinase COG1318: 62178789 ribosome recycling factor COG1319: 62178790 1-deoxy-D-xylulose 5-phosphate reductoisomerase COG1320: 62178791 undecaprenyl synthase COG1321: 62178793 zinc metallopeptidase RseP COG1322: 62178794 outer membrane protein assembly factor YaeT COG1323: 62178795 periplasmic chaperone COG1324: 62178796 UDP-3-O- COG1325: 62178797 (3R)-hydroxymyristoyl-ACP dehydratase COG1326: 62178798 UDP-N-acetylglucosamine acyltransferase

COG1327: 62178799 lipid-A-disaccharide synthase COG1328: 62178800 ribonuclease HII COG1329: 62178801 DNA polymerase III subunit alpha COG1330: 62178802 acetyl-CoA carboxylase carboxyltransferase subunit alpha COG1331: 62178803 endochitinase COG1332: 62178805 hypothetical protein SC0235 COG1333: 62178806 tRNA(Ile)-lysidine synthetase COG1334: 62178807 Rho-binding antiterminator COG1335: 62178808 hypothetical protein SC0238 COG1336: 62178809 hypothetical protein SC0239 COG1337: 62178810 peptidyl-tRNA hydrolase domain-containing protein COG1338: 62178812 hypothetical protein SC0242 COG1339: 62178813 outer membrane lipoprotein COG1340: 62178814 DL-methionine transporter substrate-binding subunit COG1341: 62178815 DL-methionine transporter permease COG1342: 62178817 D,D-heptose 1,7-bisphosphate phosphatase COG1343: 62178821 2,5-diketo-D-gluconate reductase B COG1344: 62178822 LysR family transcriptional regulator COG1345: 62178823 drug efflux protein COG1346: 62178824 hypothetical protein SC0254 COG1347: 62178825 methyltransferase in menaquinone/biotin biosynthesis COG1348: 62178826 membrane-bound lytic murein transglycosylase D COG1349: 62178827 hydroxyacylglutathione hydrolase COG1350: 62178828 hypothetical protein SC0258 COG1351: 62178830 DNA polymerase III subunit epsilon COG1352: 62178877 adhesin COG1353: 62178878 hypothetical protein SC0308 COG1354: 62178879 hypothetical protein SC0309 COG1355: 62178880 acyl-CoA dehydrogenase COG1356: 62178881 phosphoheptose isomerase COG1357: 62178882 hypothetical protein SC0312 COG1358: 62178883 hypothetical protein SC0313 COG1359: 62178884 DNA polymerase IV COG1360: 62178886 peptide chain release factor-like protein COG1361: 62178887 aminoacyl-histidine dipeptidase COG1362: 62178888 xanthine-guanine phosphoribosyltransferase COG1363: 62178889 fermentation/respiration switch protein COG1364: 62178890 DNA-binding transcriptional regulator Crl COG1365: 62178892 gamma-glutamyl kinase

COG1366: 62178893 gamma-glutamyl phosphate reductase COG1368: 62178952 fimbriae chaperone COG1369: 62178953 fimbriae major subunit COG1370: 62178956 diguanylate cyclase/ domain-containing protein COG1371: 62178957 response regulator COG1372: 62178959 hypothetical protein SC0389 COG1373: 62178960 response regulator COG1374: 62178962 outer membrane lipoprotein COG1375: 62178967 inner membrane protein COG1377: 62178971 cytochrome BD2 subunit I COG1378: 62178972 cytochrome BD2 subunit II COG1379: 62178976 transporter COG1380: 62178977 hypothetical protein SC0407 COG1381: 62178978 prp operon regulator COG1382: 62178979 2-methylisocitrate lyase COG1383: 62178980 methylcitrate synthase COG1384: 62178981 2-methylcitrate dehydratase COG1385: 62178983 delta-aminolevulinic acid dehydratase COG1386: 62178984 flagellar protein COG1387: 62178985 DNA-binding transcriptional regulator COG1388: 62178986 beta-lactam binding protein AmpH COG1389: 62178987 transporter COG1390: 62178988 outer membrane lipoprotein COG1391: 62178989 hypothetical protein SC0419 COG1392: 62178990 hypothetical protein SC0420 COG1393: 62178992 hypothetical protein SC0422 COG1394: 62178994 hypothetical protein SC0424 COG1395: 62178995 hypothetical protein SC0425 COG1396: 62178999 COG1397: 62179000 hypothetical protein SC0430 COG1398: 62179001 hypothetical protein SC0431 COG1399: 62179002 hypothetical protein SC0432 COG1400: 62179004 COG1401: 62179007 SbcD COG1402: 62179008 transcriptional regulator PhoB COG1403: 62179009 phosphate regulon sensor protein COG1404: 62179011 LIVCS family branched chain amino acid transporter system II (LIV-II) COG1405: 62179014 thiol - alkyl hydroperoxide reductase COG1406: 62179015 ACP phosphodieterase

COG1407: 62179016 S-adenosylmethionine--tRNA ribosyltransferase-isomerase COG1408: 62179017 queuine tRNA-ribosyltransferase COG1409: 62179018 preprotein translocase subunit YajC COG1410: 62179019 preprotein translocase subunit SecD COG1411: 62179020 preprotein translocase subunit SecF COG1412: 62179021 hypothetical protein SC0451 COG1413: 62179022 regulatory protein COG1414: 62179023 hypothetical protein SC0453 COG1415: 62179024 nucleoside channel phage T6/colicin K receptor COG1416: 62179025 hypothetical protein SC0455 COG1417: 62179027 bifunctional diaminohydroxyphosphoribosylaminopyrimidine deaminase COG1418: 62179028 6,7-dimethyl-8-ribityllumazine synthase COG1419: 62179029 transcription antitermination protein NusB COG1420: 62179030 thiamine monophosphate kinase COG1421: 62179031 phosphatidylglycerophosphatase A COG1422: 62179032 oxidoreductase / K + channel protein COG1423: 62179033 1-deoxy-D-xylulose-5-phosphate synthase COG1424: 62179034 geranyltranstransferase COG1425: 62179035 exodeoxyribonuclease VII small subunit COG1426: 62179036 thiamine biosynthesis protein ThiI COG1427: 62179037 2-aminoethylphosphonate ABC transporter permease COG1428: 62179039 2-aminoethylphosphonate ABC transporter ATPase COG1429: 62179040 2-aminoethylphosphonate ABC transporter substrate-binding protein COG1430: 62179041 2-aminoethylphosphonate transport, repressor COG1431: 62179042 2-aminoethylphosphonate--pyruvate transaminase COG1432: 62179043 phosphonoacetaldehyde hydrolase COG1433: 62179044 DJ-1 family protein COG1434: 62179045 2-dehydropantoate 2-reductase COG1435: 62179046 nucleotide-binding protein COG1436: 62179047 MFS family transporter COG1437: 62179051 protoheme IX farnesyltransferase COG1438: 62179052 cytochrome o ubiquinol oxidase subunit IV COG1439: 62179053 cytochrome o ubiquinol oxidase subunit III COG1440: 62179054 cytochrome o ubiquinol oxidase subunit I COG1441: 62179055 cytochrome o ubiquinol oxidase subunit II COG1442: 62179056 muropeptide transporter 
COG1443: 62179058 transcriptional regulator BolA COG1444: 62179059 trigger factor COG1445: 62179061 ATP-dependent protease ATP-binding subunit ClpX

COG1446: 62179062 DNA-binding ATP-dependent protease La COG1447: 62179063 transcriptional regulator HU subunit beta COG1448: 62179064 peptidyl-prolyl cis-trans isomerase COG1449: 62179065 hypothetical protein SC0495 COG1450: 62179067 queuosine biosynthesis protein QueC COG1451: 62179068 ABC transporter substrate-binding protein COG1452: 62179070 cysteine synthase/cystathionine beta-synthase COG1453: 62179071 transcriptional regulator (AsnC family) COG1454: 62179075 nitrogen regulatory protein P-II 2 COG1455: 62179076 ammonium transporter COG1456: 62179077 acyl-CoA thioesterase COG1457: 62179078 glycoprotein/polysaccharide metabolism COG1458: 62179079 methyltransferase COG1459: 62179080 diguanylate cyclase/phosphodiesterase domain-containing protein COG1460: 62179081 50S ribosomal protein L31 COG1461: 62179082 50S ribosomal protein L36 COG1462: 62179083 hypothetical protein SC0513 COG1463: 62179084 maltose O-acetyltransferase COG1464: 62179085 hemolysin expression-modulating protein COG1465: 62179086 hypothetical protein SC0516 COG1466: 62179088 acridine efflux pump COG1467: 62179091 hypothetical protein SC0521 COG1468: 62179093 hypothetical protein SC0523 COG1469: 62179094 adenine phosphoribosyltransferase COG1470: 62179096 hypothetical protein SC0526 COG1471: 62179097 recombination protein RecR COG1472: 62179098 heat shock protein 90 COG1473: 62179099 COG1474: 62179101 ferrochelatase COG1475: 62179103 inosine-guanosine kinase COG1476: 62179104 cation:proton antiport protein COG1477: 62179105 MFS family transporter COG1478: 62179107 hypothetical protein SC0537 COG1479: 62179108 hypothetical protein SC0538 COG1480: 62179109 copper exporting ATPase COG1481: 62179110 DNA-binding transcriptional regulator CueR COG1482: 62179111 hypothetical protein SC0541 COG1483: 62179113 ABC transporter ATP-binding protein COG1484: 62179114 hypothetical protein SC0544

COG1485: 62179116 short chain dehydrogenase COG1486: 62179117 multifunctional acyl-CoA thioesterase I/protease I/ L1 COG1487: 62179118 ABC transporter ATP-binding protein COG1488: 62179121 ABC-type transport system ATPase/cell division protein COG1489: 62179123 binding-protein-dependent transport system inner membrane protein COG1490: 62179124 tRNA 2-selenouridine synthase COG1491: 62179127 DNA-binding transcriptional repressor AllR COG1492: 62179128 hydroxypyruvate isomerase COG1493: 62179129 tartronic semialdehyde reductase COG1494: 62179130 permease COG1495: 62179132 allantoinase COG1496: 62179134 glycerate kinase COG1497: 62179135 hypothetical protein SC0565 COG1498: 62179136 allantoate COG1499: 62179139 hypothetical protein SC0569 COG1500: 62179140 hypothetical protein SC0570 COG1501: 62179141 carbamate kinase COG1502: 62179142 phosphoribosylaminoimidazole carboxylase ATPase subunit COG1503: 62179143 phosphoribosylaminoimidazole carboxylase catalytic subunit COG1504: 62179144 UDP-2,3-diacylglucosamine hydrolase COG1505: 62179145 peptidyl-prolyl cis-trans isomerase B COG1506: 62179146 cysteinyl-tRNA synthetase COG1507: 62179149 membrane-bound metal-dependent hydrolase COG1508: 62179151 bifunctional 5,10-methylene-tetrahydrofolate dehydrogenase COG1509: 62179155 outer membrane usher protein COG1510: 62179156 minor fimbrial subunit COG1511: 62179157 fimbrial protein COG1512: 62179158 transcriptional regulator FimZ COG1513: 62179159 regulatory protein COG1514: 62179160 fimbrial protein COG1515: 62179170 AraC family transcriptional regulator COG1516: 62179171 pyridine nucleotide-disulfide oxidoreductase COG1517: 62179172 hypothetical protein SC0602 COG1518: 62179173 hypothetical protein SC0603 COG1519: 62179175 phenylalanine transporter COG1520: 62179176 hypothetical protein SC0606 COG1521: 62179178 outer membrane N-acetyl phenylalanine beta-naphthyl ester-cleaving esterase COG1522: 62179179 dihydropteridine reductase COG1523: 62179180 hypothetical protein SC0610

COG1524: 62179181 regulatory protein COG1525: 62179182 regulatory protein COG1526: 62179183 hypothetical protein SC0613 COG1527: 62179185 phosphopantetheinyltransferase component of enterobactin synthase multienzyme complex COG1528: 62179187 enterobactin/ferric enterobactin esterase COG1529: 62179188 hypothetical protein SC0618 COG1530: 62179191 iron-enterobactin transporter ATP-binding protein COG1531: 62179193 iron-enterobactin transporter membrane protein COG1532: 62179194 enterobactin exporter EntS COG1533: 62179195 iron-enterobactin transporter periplasmic binding protein COG1534: 62179198 2,3-dihydro-2,3-dihydroxybenzoate synthetase COG1535: 62179199 2,3-dihydroxybenzoate-2,3-dehydrogenase COG1536: 62179200 hypothetical protein SC0630 COG1537: 62179202 hypothetical protein SC0632 COG1538: 62179203 hypothetical protein SC0633 COG1539: 62179204 aminotransferase COG1540: 62179205 hypothetical protein SC0635 COG1541: 62179207 LysR family transcriptional regulator COG1542: 62179208 disulfide isomerase/thiol-disulfide oxidase COG1543: 62179209 alkyl hydroperoxide reductase COG1544: 62179210 alkyl hydroperoxide reductase COG1545: 62179211 hypothetical protein SC0641 COG1546: 62179212 oxidoreductase COG1547: 62179213 hydrogenase COG1548: 62179214 hydrogenase COG1549: 62179215 universal stress protein UspA and related nucleotide-binding protein COG1550: 62179216 nucleoside diphosphate kinase regulator COG1551: 62179217 DASS family citrate:succinate transporter COG1552: 62179219 2-(5''-triphosphoribosyl)-3'-dephosphocoenzyme-A synthase COG1553: 62179221 citrate lyase subunit beta COG1554: 62179222 citrate lyase subunit gamma COG1555: 62179223 citrate lyase synthetase COG1556: 62179226 two-component response regulator DpiA COG1557: 62179227 C4-dicarboxylate transporter DcuC COG1558: 62179229 camphor resistance protein CrcB COG1559: 62179231 twin translocase E COG1560: 62179232 lipoyl synthase COG1561: 62179233 DNA-binding transcriptional regulator COG1562: 62179234 lipoate-protein ligase B

COG1563: 62179235 hypothetical protein SC0665 COG1564: 62179236 D-alanyl-D-alanine carboxypeptidase COG1565: 62179239 cell wall shape-determining protein COG1566: 62179241 rRNA large subunit methyltransferase COG1567: 62179242 hypothetical protein SC0672 COG1568: 62179243 alpha ribazole-5'-P phosphatase COG1569: 62179245 nicotinic acid mononucleotide adenylyltransferase COG1570: 62179246 DNA polymerase III subunit delta COG1571: 62179250 2-keto-3-deoxygluconate permease COG1572: 62179252 hypothetical protein SC0682 COG1573: 62179259 ribonucleoside hydrolase 1 COG1574: 62179260 glutamate/aspartate ABC transporter ATP-binding protein COG1575: 62179261 glutamate/aspartate ABC transporter COG1576: 62179262 glutamate/aspartate ABC transporter COG1577: 62179264 apolipoprotein N-acyltransferase COG1578: 62179265 hypothetical protein SC0695 COG1579: 62179266 metalloprotease COG1580: 62179267 phosphate starvation-inducible protein, ATP-binding COG1581: 62179268 (dimethylallyl)adenosine tRNA methylthiotransferase COG1582: 62179272 UMP phosphatase COG1583: 62179273 N-acetylglucosamine operon transcriptional repressor COG1584: 62179274 N-acetylglucosamine-6-phosphate deacetylase COG1585: 62179275 glucosamine-6-phosphate deaminase COG1586: 62179277 glutaminyl-tRNA synthetase COG1587: 62179278 outer membrane protein COG1588: 62179279 lipoprotein COG1589: 62179280 citrate-proton symporter COG1590: 62179281 tricarballylate dehydrogenase COG1591: 62179282 LysR family transcriptional regulator COG1592: 62179283 ferric uptake regulator COG1593: 62179284 flavodoxin FldA COG1594: 62179285 LexA regulated protein COG1595: 62179286 hypothetical protein SC0716 COG1596: 62179288 phosphoglucomutase COG1597: 62179289 hypothetical protein SC0719 COG1598: 62179290 putrescine transporter COG1599: 62179292 DNA-binding transcriptional activator KdpE COG1600: 62179293 sensor protein KdpD COG1601: 62179295 potassium-transporting ATPase subunit B

COG1602: 62179296 potassium-transporting ATPase subunit A COG1603: 62179298 deoxyribodipyrimidine photolyase COG1604: 62179299 POT family transport protein COG1605: 62179300 hydrolase-oxidase COG1606: 62179301 hypothetical protein SC0731 COG1607: 62179302 hypothetical protein SC0732 COG1608: 62179303 LamB/YcsF family protein COG1609: 62179305 hypothetical protein SC0735 COG1610: 62179306 type II citrate synthase COG1611: 62179307 succinate dehydrogenase cytochrome b556 large membrane subunit COG1612: 62179309 succinate dehydrogenase iron-sulfur subunit COG1613: 62179311 dihydrolipoamide succinyltransferase COG1614: 62179312 succinyl-CoA synthetase subunit beta COG1615: 62179313 succinyl-CoA synthetase subunit alpha COG1616: 62179315 cytochrome d terminal oxidase polypeptide subunit II COG1617: 62179316 hypothetical protein SC0746 COG1618: 62179317 hypothetical protein SC0747 COG1619: 62179318 acyl-CoA thioester hydrolase COG1620: 62179319 colicin uptake protein TolQ COG1621: 62179320 colicin uptake protein TolR COG1622: 62179321 cell envelope integrity inner membrane protein TolA COG1623: 62179322 translocation protein TolB COG1624: 62179323 peptidoglycan-associated outer membrane lipoprotein COG1625: 62179324 tol-pal system protein YbgF COG1626: 62179325 quinolinate synthetase COG1627: 62179327 zinc transporter ZitB COG1628: 62179328 homeobox protein COG1629: 62179338 ABC transporter COG1630: 62179339 ABC-type cobalamin/Fe3+-siderophores transport system, ATPase COG1631: 62179340 phosphoglyceromutase COG1632: 62179341 aldose 1-epimerase COG1633: 62179342 COG1634: 62179343 galactose-1-phosphate uridylyltransferase COG1635: 62179344 UDP-galactose-4-epimerase COG1636: 62179345 inner membrane protein COG1637: 62179347 DNA-binding transcriptional regulator ModE COG1638: 62179349 molybdate transporter periplasmic protein COG1639: 62179350 molybdate ABC transporter permease COG1640: 62179351 molybdate transporter ATP-binding protein

COG1641: 62179352 COG1642: 62179353 6-phosphogluconolactonase COG1643: 62179354 COG1644: 62179357 histidine utilization repressor COG1645: 62179359 histidine ammonia-lyase COG1646: 62179360 kinase inhibitor protein COG1647: 62179362 biotin synthetase COG1648: 62179363 8-amino-7-oxononanoate synthase COG1649: 62179365 dithiobiotin synthetase COG1650: 62179366 excinuclease ABC subunit B COG1651: 62179369 hypothetical protein SC0799 COG1652: 62179370 molybdenum cofactor biosynthesis protein A COG1653: 62179371 molybdopterin biosynthesis, protein B COG1654: 62179372 molybdenum cofactor biosynthesis protein MoaC COG1655: 62179373 molybdopterin synthase small subunit COG1656: 62179374 molybdopterin guanine dinucleotide biosynthesis protein MoaE COG1657: 62179376 integral membrane protein COG1658: 62179378 hypothetical protein SC0808 COG1659: 62179381 hypothetical protein SC0811 COG1660: 62179382 ABC transporter membrane protein COG1661: 62179383 ABC transporter membrane protein COG1662: 62179384 multidrug ABC transporter ATPase COG1663: 62179385 hypothetical protein SC0815 COG1664: 62179386 DNA-binding transcriptional regulator COG1665: 62179387 ATP-dependent RNA helicase RhlE COG1666: 62179390 hypothetical protein SC0820 COG1667: 62179391 hypothetical protein SC0821 COG1668: 62179392 hypothetical protein SC0822 COG1669: 62179393 glutamine ABC transporter ATP-binding protein COG1670: 62179394 glutamine ABC transporter permease COG1671: 62179396 DNA starvation/stationary phase protection protein Dps COG1672: 62179399 outer membrane protein X COG1673: 62179400 hypothetical protein SC0830 COG1674: 62179402 manganese transport regulator MntR COG1675: 62179403 hypothetical protein SC0833 COG1676: 62179406 hypothetical protein SC0836 COG1677: 62179407 HAD family hydrolase COG1678: 62179408 pyruvate formate lyase COG1679: 62179409 pyruvate formate lyase activating enzyme

COG1680: 62179410 molybdopterin biosynthesis protein MoeB COG1681: 62179411 molybdopterin biosynthesis protein MoeA COG1682: 62179412 L- COG1683: 62179413 glutathione transporter ATP-binding protein COG1684: 62179414 ABC transporter substrate-binding protein COG1685: 62179415 ABC transporter substrate-binding protein COG1686: 62179416 ABC transporter inner membrane component COG1687: 62179417 ribosomal protein S12 methylthiotransferase COG1688: 62179426 glutathione S-transferase COG1689: 62179427 D-alanyl-D-alanine carboxypeptidase COG1690: 62179428 DNA-binding transcriptional repressor DeoR COG1691: 62179429 undecaprenyl pyrophosphate phosphatase COG1692: 62179430 multidrug translocase COG1693: 62179431 hypothetical protein SC0861 COG1694: 62179432 hypothetical protein SC0862 COG1695: 62179433 paral regulator COG1696: 62179434 hypothetical protein SC0864 COG1697: 62179435 hypothetical protein SC0865 COG1698: 62179436 glutaredoxin COG1699: 62179437 hypothetical protein SC0867 COG1700: 62179438 ribosomal protein S6 modification protein COG1701: 62179439 hypothetical protein SC0869 COG1702: 62179440 putrescine ABC transporter periplasmic-binding protein COG1703: 62179441 putrescine ABC transporter ATP-binding protein COG1704: 62179442 putrescine ABC transporter membrane protein COG1705: 62179443 putrescine ABC transporter membrane protein COG1706: 62179444 hypothetical protein SC0874 COG1707: 62179446 PTS system ascorbate-specific transporter subunit IIC COG1708: 62179447 inner membrane protein COG1709: 62179448 COG1710: 62179449 arginine ABC transporter ATP-binding protein COG1711: 62179450 arginine transporter permease subunit ArtM COG1712: 62179451 arginine transporter permease subunit ArtQ COG1713: 62179452 arginine ABC transporter ATP-binding protein COG1714: 62179453 arginine transporter ATP-binding subunit COG1715: 62179454 lipoprotein COG1716: 62179457 nucleoside-diphosphate-sugar epimerase COG1717: 62179458 hypothetical protein SC0888 COG1718: 62179459 L-threonine aldolase

COG1719: 62179460 pyruvate dehydrogenase COG1720: 62179461 HCP oxidoreductase COG1721: 62179462 hydroxylamine reductase COG1722: 62179463 hypothetical protein SC0893 COG1723: 62179464 hypothetical protein SC0894 COG1724: 62179465 virK COG1725: 62179468 CspA-like protein COG1726: 62179469 ATP-dependent Clp protease adaptor protein ClpS COG1727: 62179470 ATP-dependent Clp protease ATP-binding subunit COG1728: 62179474 slsA in STM COG1729: 62179475 hypothetical protein SC0905 COG1730: 62179476 LysR family transcriptional regulator COG1731: 62179477 translation initiation factor IF-1 COG1732: 62179479 leucyl/phenylalanyl-tRNA--protein transferase COG1733: 62179480 cysteine/glutathione ABC transporter membrane/ATP-binding protein COG1734: 62179481 cysteine/glutathione ABC transporter membrane/ATP-binding protein COG1735: 62179482 thioredoxin reductase COG1736: 62179483 leucine-responsive transcriptional regulator COG1737: 62179484 DNA translocase FtsK COG1738: 62179485 outer-membrane lipoprotein carrier protein COG1739: 62179486 recombination factor protein RarA COG1740: 62179487 seryl-tRNA synthetase COG1741: 62179490 anaerobic dimethyl sulfoxide reductase subunit C COG1742: 62179492 MFS family transporter protein COG1743: 62179493 amino acid APC transporter COG1744: 62179498 formate transporter COG1745: 62179499 hypothetical protein SC0929 COG1746: 62179500 hypothetical protein SC0930 COG1747: 62179501 phosphoserine aminotransferase COG1748: 62179502 3-phosphoshikimate 1-carboxyvinyltransferase COG1749: 62179503 Zn-dependent protease with chaperone function COG1750: 62179504 cytidylate kinase COG1751: 62179505 30S ribosomal protein S1 COG1752: 62179506 integration host factor subunit beta COG1753: 62179511 lipid transporter ATP-binding protein/permease COG1754: 62179512 tetraacyldisaccharide 4'-kinase COG1755: 62179514 hypothetical protein SC0944 COG1756: 62179515 3-deoxy-manno-octulosonate cytidylyltransferase COG1757: 62179517 hypothetical protein SC0947

COG1758: 62179518 hypothetical protein SC0948 COG1759: 62179519 metallothionein SmtA COG1760: 62179520 condesin subunit F COG1761: 62179521 condesin subunit E COG1762: 62179522 hypothetical protein SC0952 COG1763: 62179523 hypothetical protein SC0953 COG1764: 62179524 hypothetical protein SC0954 COG1765: 62179525 aromatic amino acid aminotransferase COG1766: 62179526 outer membrane protein 1a (IA;b;f), porin COG1767: 62179527 asparaginyl-tRNA synthetase COG1768: 62179528 leucine response regulator COG1769: 62179531 Lrp family transcriptional regulator COG1770: 62179532 nicotinate phosphoribosyltransferase COG1771: 62179581 dihydroorotate dehydrogenase 2 COG1772: 62179582 hypothetical protein SC1012 COG1773: 62179583 hypothetical protein SC1013 COG1774: 62179584 23S rRNA m(2)G2445 methyltransferase COG1775: 62179585 paraquat-inducible protein A COG1776: 62179586 paraquat-inducible protein B COG1777: 62179587 outer membrane protein COG1778: 62179589 3-hydroxydecanoyl-ACP dehydratase COG1779: 62179590 hypothetical protein SC1020 COG1780: 62179591 hypothetical protein SC1021 COG1781: 62179592 outer membrane protein OmpA COG1782: 62179593 SOS cell division inhibitor COG1783: 62179594 hypothetical protein SC1024 COG1784: 62179595 efflux (PET) family transporter COG1785: 62179596 hypothetical protein SC1026 COG1786: 62179597 DNA helicase IV COG1787: 62179598 methylglyoxal synthase COG1788: 62179599 hypothetical protein SC1029 COG1789: 62179600 hypothetical protein SC1030 COG1790: 62179601 heat shock protein HspQ COG1791: 62179602 hypothetical protein SC1032 COG1792: 62179605 acylphosphatase COG1793: 62179606 sulfur transfer protein TusE COG1794: 62179607 hypothetical protein SC1037 COG1795: 62179609 pathogenicity island encoded protein: SPI3 COG1796: 62179610 pathogenicity island encoded protein: SPI3

COG1797: 62179612 hypothetical protein SC1042 COG1798: 62179613 outer protein COG1799: 62179615 copper resistance; histidine kinase COG1800: 62179617 hypothetical protein SC1047 COG1801: 62179618 4-hydroxyphenylacetate catabolism COG1802: 62179619 4-hydroxyphenylacetate catabolism COG1803: 62179620 4-hydroxyphenylacetate catabolism COG1804: 62179623 4-hydroxyphenylacetate catabolism COG1805: 62179624 4-hydroxyphenylacetate catabolism COG1806: 62179625 4-hydroxyphenylacetate catabolism COG1807: 62179627 4-hydroxyphenylacetate catabolism COG1808: 62179629 hypothetical protein SC1059 COG1809: 62179632 chaperone-modulator protein CbpM COG1810: 62179633 curved DNA-binding protein CbpA COG1811: 62179634 suppression of copper sensitivity copper binding protein COG1812: 62179635 hypothetical protein SC1065 COG1813: 62179636 hypothetical protein SC1066 COG1814: 62179638 hypothetical protein SC1068 COG1815: 62179639 trp-repressor binding protein COG1816: 62179643 hypothetical protein SC1073 COG1817: 62179646 SSS family major sodium/proline symporter COG1818: 62179647 hypothetical protein SC1077 COG1819: 62179648 transcriptional regulator COG1820: 62179649 N-acetylmannosamine-6-phosphate 2-epimerase COG1821: 62179651 outer membrane protein COG1822: 62179653 oxidoreductase COG1823: 62179656 hypothetical protein SC1086 COG1824: 62179657 transcriptional regulator in curly assembly/transport, 2nd curli operon COG1825: 62179659 curli assembly protein CsgE COG1826: 62179660 DNA-binding transcriptional regulator CsgD COG1827: 62179661 curlin minor subunit COG1828: 62179662 cryptic curlin major subunit COG1829: 62179664 hypothetical protein SC1094 COG1830: 62179665 hypothetical protein SC1095 COG1831: 62179666 glucans biosynthesis protein COG1832: 62179667 glucan biosynthesis protein G COG1833: 62179668 glucosyltransferase MdoH COG1834: 62179669 hypothetical protein SC1099 COG1835: 62179671 drug efflux system protein MdtG

COG1836: 62179673 hypothetical protein SC1103 COG1837: 62179674 hypothetical protein SC1104 COG1838: 62179678 N-methyltryptophan oxidase COG1839: 62179679 biofilm formation regulatory protein BssS COG1840: 62179681 COG1841: 62179683 glutaredoxin COG1842: 62179684 multidrug resistance protein MdtH COG1843: 62179685 ribosomal-protein-S5-alanine N-acetyltransferase COG1844: 62179687 virulence factor COG1845: 62179688 flagellar biosynthesis chaperone COG1846: 62179689 anti-sigma-28 factor FlgM COG1847: 62179690 flagellar basal body P-ring biosynthesis protein FlgA COG1848: 62179691 flagellar basal-body rod protein FlgB COG1849: 62179692 flagellar basal body rod protein FlgC COG1850: 62179693 flagellar basal body rod modification protein COG1851: 62179694 flagellar hook protein FlgE COG1852: 62179695 flagellar basal body rod protein FlgF COG1853: 62179696 flagellar basal body rod protein FlgG COG1854: 62179697 flagellar basal body L-ring protein COG1855: 62179699 flagellar rod assembly protein/muramidase FlgJ COG1856: 62179701 flagellar hook-associated protein FlgL COG1857: 62179707 Maf-like protein COG1858: 62179708 hypothetical protein SC1138 COG1859: 62179710 glycerol-3-phosphate acyltransferase PlsX COG1860: 62179711 3-oxoacyl-ACP synthase COG1861: 62179712 ACP S-malonyltransferase COG1862: 62179713 3-ketoacyl-ACP reductase COG1863: 62179714 acyl carrier protein COG1864: 62179718 4-amino-4-deoxychorismate lyase COG1865: 62179719 hypothetical protein SC1149 COG1866: 62179720 thymidylate kinase COG1867: 62179721 DNA polymerase III subunit delta' COG1868: 62179722 metallodependent hydrolase COG1869: 62179723 PTS system glucose-specific transporter subunit IIBC COG1870: 62179725 purine nucleoside phosphoramidase COG1871: 62179726 outer membrane lipoprotein COG1872: 62179727 outer membrane lipoprotein COG1873: 62179728 thiamine kinase COG1874: 62179729 beta-hexosaminidase

COG1875: 62179730 hypothetical protein SC1160 COG1876: 62179731 respiratory NADH dehydrogenase 2; cupric reductase COG1877: 62179733 TetR/AcrR family transcriptional regulator COG1878: 62179734 outer membrane protein COG1879: 62179735 hypothetical protein SC1165 COG1880: 62179736 transcription-repair coupling factor COG1881: 62179737 outer membrane-specific lipoprotein transporter subunit LolC COG1882: 62179738 lipoprotein transporter ATP-binding subunit COG1883: 62179739 outer membrane-specific lipoprotein transporter subunit LolE COG1884: 62179740 N-acetyl-D-glucosamine kinase COG1885: 62179741 NAD-dependent deacetylase COG1886: 62179742 spermidine/putrescine ABC transporter substrate-binding protein COG1887: 62179743 spermidine/putrescine ABC transporter membrane protein COG1888: 62179744 secreted effector protein COG1889: 62179746 spermidine/putrescine ABC transporter membrane protein COG1890: 62179747 putrescine/spermidine ABC transporter ATPase COG1891: 62179748 peptidase T COG1892: 62179749 hypothetical protein SC1179 COG1893: 62179750 hypothetical protein SC1180 COG1894: 62179751 sensor protein PhoQ COG1895: 62179752 DNA-binding transcriptional regulator PhoP COG1896: 62179753 adenylosuccinate lyase COG1897: 62179754 hypothetical protein SC1184 COG1898: 62179756 MutT-like protein COG1899: 62179757 hypothetical protein SC1187 COG1900: 62179758 ribosomal large subunit pseudouridine synthase COG1901: 62179759 isocitrate dehydrogenase COG1902: 62179818 hypothetical protein SC1248 COG1903: 62179822 macrophage survival gene; reduced mouse virulence COG1904: 62179823 envelope protein COG1905: 62179826 PhoP regulated protein: reduced macrophage survival COG1906: 62179836 outer membrane lipoprotein COG1907: 62179837 ABC transporter substrate-binding protein COG1908: 62179838 ABC transporter COG1909: 62179839 ABC transporter COG1910: 62179840 ABC transporter ATPase COG1911: 62179841 ABC transporter ATPase COG1912: 62179844 hypothetical protein SC1274 COG1913: 62179845 aminoglycoside resistance protein


COG1914: 62179846 response regulator COG1915: 62179847 transcriptional regulator COG1916: 62179848 hypothetical protein SC1278 COG1917: 62179849 chorismate mutase COG1918: 62179850 leucine export protein LeuE COG1919: 62179851 hypothetical protein SC1281 COG1920: 62179852 hypothetical protein SC1282 COG1921: 62179853 hypothetical protein SC1283 COG1922: 62179855 hypothetical protein SC1285 COG1923: 62179856 hemolysin COG1924: 62179857 hypothetical protein SC1287 COG1925: 62179858 hypothetical protein SC1288 COG1926: 62179859 MFS family transporter COG1927: 62179860 AraC family transcriptional regulator COG1928: 62179861 hypothetical protein SC1291 COG1929: 62179864 hypothetical protein SC1294 COG1930: 62179873 glyceraldehyde-3-phosphate dehydrogenase COG1931: 62179874 methionine sulfoxide reductase B COG1932: 62179875 hypothetical protein SC1305 COG1933: 62179885 nicotinamidase/pyrazinamidase COG1934: 62179886 asparaginase COG1935: 62179888 hypothetical protein SC1318 COG1936: 62179889 selenophosphate synthetase COG1937: 62179893 pyrimidine (deoxy)nucleoside triphosphate pyrophosphohydrolase COG1938: 62179894 exonuclease III COG1939: 62179896 bifunctional succinylornithine transaminase/acetylornithine transaminase COG1940: 62179898 succinylglutamate desuccinylase COG1941: 62179900 nucleotide excision repair endonuclease COG1942: 62179901 NAD synthetase COG1943: 62179902 DNA-binding transcriptional activator OsmE COG1944: 62179903 PTS system N,N'-diacetylchitobiose-specific transporter subunit IIB COG1945: 62179904 PTS system N,N'-diacetylchitobiose-specific transporter subunit IIC COG1946: 62179905 PTS system N,N'-diacetylchitobiose-specific transporter subunit IIA COG1947: 62179906 DNA-binding transcriptional regulator ChbR COG1948: 62179907 phospho-beta-glucosidase COG1949: 62179908 hypothetical protein SC1338 COG1950: 62179909 hydroperoxidase II COG1951: 62179910 cell division modulator COG1952: 62179911 hypothetical protein SC1341


COG1953: 62179913 2-deoxyglucose-6-phosphatase COG1954: 62179914 hypothetical protein SC1344 COG1955: 62179915 hypothetical protein SC1345 COG1956: 62179916 hypothetical protein SC1346 COG1957: 62179917 6-phosphofructokinase COG1958: 62179918 salt-induced outer membrane protein COG1959: 62179919 outer membrane protein COG1960: 62179922 threonyl-tRNA synthetase COG1961: 62179923 translation initiation factor IF-3 COG1962: 62179925 50S ribosomal protein L20 COG1963: 62179926 phenylalanyl-tRNA synthetase subunit alpha COG1964: 62179927 phenylalanyl-tRNA synthetase subunit beta COG1965: 62179928 integration host factor subunit alpha COG1966: 62179929 vitamin B12-transporter permease COG1967: 62179930 glutathione peroxidase COG1968: 62179931 vitamin B12-transporter ATPase COG1969: 62179932 lipoprotein COG1970: 62179935 hypothetical protein SC1365 COG1971: 62179936 phospho-2-dehydro-3-deoxyheptonate aldolase COG1972: 62179937 hypothetical protein SC1367 COG1973: 62179947 3-dehydroquinate dehydratase COG1974: 62179954 inner membrane protein COG1975: 62179955 oxidase COG1976: 62179956 hypothetical protein SC1386 COG1977: 62179958 Na+-dicarboxylate symporter COG1978: 62179959 iron-sulfur cluster assembly scaffold protein COG1979: 62179960 cysteine desulfurase COG1980: 62179961 cysteine desulfurase COG1981: 62179964 cysteine desulfuration protein SufE COG1982: 62179969 COG1983: 62179970 amino acid permease COG1984: 62179974 tetrathionate reductase complex subunit A COG1985: 62179978 tetrathionate reductase complex: response regulator COG1986: 62179979 hypothetical protein SC1409 COG1987: 62179980 inner membrane protein COG1988: 62179981 MerR family transcriptional regulator COG1989: 62179982 secretion system transcriptional activator COG1990: 62179983 secretion system regulator:sensor component COG1991: 62179985 secretion system apparatus protein SsaC


COG1992: 62179986 secretion system apparatus protein SsaD COG1993: 62179987 secretion system effector protein SsaE COG1994: 62179989 secretion system effector protein SseB COG1995: 62179990 secretion system chaperone protein SscA COG1996: 62179991 secretion system effector protein SseC COG1997: 62179992 secretion system effector protein SseD COG1998: 62179993 secretion system effector SseE COG1999: 62179994 secretion system chaperone protein SscB COG2000: 62179995 secretion system effector protein SseF COG2001: 62179996 secretion system effector protein SseG COG2002: 62179997 secretion system apparatus protein SsaG COG2003: 62179998 secretion system apparatus protein SsaH COG2004: 62179999 secretion system apparatus protein SsaI COG2005: 62180000 secretion system apparatus protein SsaJ COG2006: 62180001 hypothetical protein SC1431 COG2007: 62180002 secretion system apparatus protein SsaK COG2008: 62180003 secretion system apparatus protein SsaL COG2009: 62180004 secretion system apparatus protein SsaM COG2010: 62180005 secretion system apparatus protein SsaV COG2011: 62180006 type III secretion system ATPase COG2012: 62180007 secretion system apparatus protein SsaO COG2013: 62180008 secretion system apparatus protein SsaP COG2014: 62180009 type III secretion system protein COG2015: 62180010 type III secretion system protein COG2016: 62180011 secretion system apparatus protein SsaS COG2017: 62180012 secretion system apparatus protein SsaT COG2018: 62180013 secretion system apparatus protein SsaU COG2019: 62180014 multidrug efflux protein COG2020: 62180015 riboflavin synthase subunit alpha COG2021: 62180016 cyclopropane-fatty-acyl-phospholipid synthase COG2022: 62180017 inner membrane transport protein YdhC COG2023: 62180018 DNA-binding transcriptional regulator COG2024: 62180019 DNA-binding transcriptional repressor PurR COG2025: 62180020 superoxide dismutase COG2026: 62180021 cell wall-associated hydrolase COG2027: 62180022 hypothetical protein SC1452 COG2028: 62180023 COG2029: 62180024 glyoxalase I COG2030: 62180026 TetR/AcrR family transcriptional regulator


COG2031: 62180029 superoxide dismutase COG2032: 62180031 multidrug resistance efflux pump COG2033: 62180032 transcriptional regulator SlyA COG2034: 62180033 outer membrane lipoprotein COG2035: 62180034 anhydro-N-acetylmuramic acid kinase COG2036: 62180036 pyridoxamine 5'-phosphate oxidase COG2037: 62180037 tyrosyl-tRNA synthetase COG2038: 62180039 glutathione S-transferase COG2039: 62180040 tripeptide transporter permease COG2040: 62180041 endonuclease III COG2041: 62180042 electron transport complex protein RsxE COG2042: 62180043 electron transport complex protein RnfG COG2043: 62180044 electron transport complex protein RnfD COG2044: 62180045 electron transport complex protein RnfC COG2045: 62180046 electron transport complex protein RnfB COG2046: 62180047 Na(+)-translocating NADH-quinone reductase subunit E COG2047: 62180048 hypothetical protein SC1478 COG2048: 62180054 mannose-6-phosphate isomerase COG2049: 62180056 fumarate hydratase COG2050: 62180057 DNA replication terminus site-binding protein COG2051: 62180058 sensor protein RstB COG2052: 62180059 hypothetical protein SC1489 COG2053: 62180060 outer membrane protein N, non-specific porin COG2054: 62180063 DNA-binding transcriptional regulator RstA COG2055: 62180065 amino acid transporter COG2056: 62180066 hypothetical protein SC1496 COG2057: 62180067 NAD(P) transhydrogenase subunit alpha COG2058: 62180068 pyridine nucleotide transhydrogenase COG2059: 62180075 LysR family transcriptional regulator COG2060: 62180076 ptsG and ptsHI transcriptional repressor COG2061: 62180077 dithiobiotin synthetase COG2062: 62180079 binding-protein-dependent transport system, inner membrane component COG2063: 62180080 periplasmic component, ABC transport system COG2064: 62180083 dimethylsulfoxide reductase COG2065: 62180089 spermidine N1-acetyltransferase COG2066: 62180090 hypothetical protein SC1520 COG2067: 62180092 dehydratase COG2068: 62180093 dehydrogenase COG2069: 62180095 mannitol dehydrogenase


COG2070: 62180096 hypothetical protein SC1526 COG2071: 62180097 GntR family transcriptional regulator COG2072: 62180099 dipeptidyl carboxypeptidase II COG2073: 62180102 competence damage-inducible protein A COG2074: 62180103 hypothetical protein SC1533 COG2075: 62180104 MFS-type transporter YdeE COG2076: 62180106 hypothetical protein SC1536 COG2077: 62180107 DNA-binding transcriptional activator MarA COG2078: 62180108 DNA-binding transcriptional repressor MarR COG2079: 62180109 multiple drug resistance protein MarC COG2080: 62180110 sugar efflux transporter COG2081: 62180113 hypothetical protein SC1543 COG2082: 62180115 outer membrane protein COG2083: 62180119 dehydrogenase COG2084: 62180120 hydrogenase COG2085: 62180121 hydrogenase COG2086: 62180123 hydrogenase maturation protease COG2087: 62180124 Ni/Fe-hydrogenase 1 b-type cytochrome subunit COG2088: 62180133 acid-resistance protein COG2089: 62180134 resistance protein, osmotically inducible COG2090: 62180135 biofilm-dependent modulation protein COG2091: 62180136 30S ribosomal subunit S22 COG2092: 62180138 alcohol dehydrogenase COG2093: 62180174 outer membrane lipoprotein COG2094: 62180175 tellurite resistance protein TehB COG2095: 62180176 potassium-tellurite ethidium and proflavin transporter COG2096: 62180177 ribosomal-protein-L7/L12-serine acetyltransferase COG2097: 62180178 cellulase COG2098: 62180179 PTS system, enzymeIIB component COG2099: 62180180 PTS system enzyme IIC component COG2100: 62180181 nucleoside triphosphatase COG2101: 62180183 epimerase COG2102: 62180184 DeoR family transcriptional regulator COG2103: 62180185 cryptic aminoglycoside resistance gene COG2104: 62180187 hypothetical protein SC1617 COG2105: 62180188 glucan biosynthesis protein D COG2107: 62180207 azoreductase COG2108: 62180208 inner membrane protein COG2109: 62180209 hypothetical protein SC1639


COG2110: 62180210 outer membrane lipoprotein COG2111: 62180211 hypothetical protein SC1641 COG2112: 62180212 D-lactate dehydrogenase COG2113: 62180213 heat-inducible protein COG2114: 62180214 hypothetical protein SC1644 COG2115: 62180216 hypothetical protein SC1646 COG2116: 62180217 membrane transporter of cations COG2117: 62180219 C32 tRNA thiolase COG2118: 62180221 zinc transporter COG2119: 62180224 Smr domain-containing protein COG2120: 62180225 O-6-alkylguanine-DNA:cysteine-protein methyltransferase COG2121: 62180227 universal stress protein UspE COG2122: 62180228 hypothetical protein SC1658 COG2123: 62180229 hypothetical protein SC1659 COG2124: 62180240 aldo/keto reductase COG2125: 62180241 LysR family transcriptional regulator COG2126: 62180244 chloromuconate cycloisomerase (muconate cycloisomerase) COG2127: 62180245 thiol peroxidase COG2128: 62180246 DNA-binding transcriptional regulator TyrR COG2129: 62180247 hypothetical protein SC1677 COG2130: 62180249 thiosulfate:cyanide sulfurtransferase COG2131: 62180250 peripheral inner membrane phage-shock protein COG2132: 62180251 DNA-binding transcriptional activator PspC COG2133: 62180252 phage shock protein B COG2134: 62180253 phage shock protein PspA COG2135: 62180255 peptide ABC transporter substrate-binding protein COG2136: 62180256 peptide ABC transporter COG2137: 62180257 peptide ABC transporter COG2138: 62180258 peptide ABC transporter ATP-binding protein COG2139: 62180259 peptide ABC transporter ATP-binding protein COG2140: 62180264 enoyl-ACP reductase COG2141: 62180266 II COG2142: 62180267 RNase II stability modulator COG2143: 62180269 DeoR family transcriptional regulator COG2144: 62180271 translation initiation factor Sui1 COG2145: 62180272 orotidine 5'-phosphate decarboxylase COG2146: 62180273 tetratricopeptide repeat protein COG2147: 62180274 hypothetical protein SC1704 COG2148: 62180275 phosphatidylglycerophosphatase B


COG2149: 62180276 GTP cyclohydrolase II COG2150: 62180277 aconitate hydratase COG2151: 62180280 transcriptional regulator CysB COG2152: 62180281 DNA topoisomerase I COG2153: 62180282 hypothetical protein SC1712 COG2154: 62180283 periplasmic protease COG2155: 62180284 short chain dehydrogenase COG2156: 62180285 cob(I)yrinic acid a,c-diamide adenosyltransferase COG2157: 62180286 23S rRNA pseudouridylate synthase B COG2158: 62180287 hypothetical protein SC1717 COG2159: 62180288 hypothetical protein SC1718 COG2160: 62180289 anthranilate synthase component I COG2161: 62180290 bifunctional glutamine amidotransferase/anthranilate phosphoribosyltransferase COG2162: 62180292 subunit beta COG2163: 62180293 tryptophan synthase subunit alpha COG2164: 62180295 hypothetical protein SC1725 COG2165: 62180296 hypothetical protein SC1726 COG2166: 62180298 outer membrane protein W COG2167: 62180299 hypothetical protein SC1729 COG2168: 62180300 hypothetical protein SC1730 COG2169: 62180301 intracellular septation protein A COG2170: 62180302 acyl-CoA thioester hydrolase COG2171: 62180303 transporter COG2172: 62180304 hypothetical protein SC1734 COG2173: 62180305 cardiolipin synthetase COG2174: 62180307 voltage-gated potassium channel COG2175: 62180308 oligopeptide ABC transporter ATP-binding protein COG2176: 62180309 oligopeptide ABC transporter ATP-binding protein COG2177: 62180311 oligopeptide transporter permease COG2178: 62180312 oligopeptide ABC transporter substrate-binding protein COG2179: 62180313 hypothetical protein SC1743 COG2180: 62180315 COG2181: 62180316 global DNA-binding transcriptional dual regulator H-NS COG2182: 62180317 UTP-glucose-1-phosphate uridylyltransferase COG2183: 62180318 response regulator of RpoS COG2184: 62180319 hypothetical protein SC1749 COG2185: 62180325 nitrate reductase 1, cytochrome b(NR), gamma subunit COG2186: 62180326 nitrate reductase 1 subunit delta COG2187: 62180329 major facilitator superfamily nitrite extrusion protein


COG2188: 62180331 transcriptional regulator NarL COG2189: 62180332 hypothetical protein SC1762 COG2190: 62180333 hypothetical protein SC1763 COG2191: 62180336 2-dehydro-3-deoxyphosphooctonate aldolase COG2192: 62180337 transcriptional regulator COG2193: 62180338 transcriptional regulator COG2194: 62180339 N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase COG2195: 62180340 peptide chain release factor 1 COG2196: 62180341 glutamyl-tRNA reductase COG2197: 62180342 molecular chaperone LolB COG2198: 62180343 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase COG2199: 62180344 ribose-phosphate pyrophosphokinase COG2200: 62180345 sulfate transporter YchM COG2201: 62180347 peptidyl-tRNA hydrolase COG2202: 62180348 GTP-dependent nucleic acid-binding protein EngD COG2203: 62180353 hydrogenase 1 b-type cytochrome subunit COG2204: 62180354 hydrogenase 1 maturation protease COG2205: 62180360 transglycosylase-associated protein COG2206: 62180362 membrane-bound lytic murein transglycosylase E COG2207: 62180363 L,D-carboxypeptidase A COG2208: 62180364 potassium/proton antiporter COG2209: 62180367 SpoVR family protein COG2210: 62180368 fatty acid metabolism regulator COG2211: 62180369 sodium/proton antiporter COG2212: 62180370 disulfide bond formation protein B COG2213: 62180371 hypothetical protein SC1801 COG2214: 62180372 hypothetical protein SC1802 COG2215: 62180375 hypothetical protein SC1805 COG2216: 62180377 septum formation inhibitor COG2217: 62180378 cell division inhibitor MinD COG2218: 62180379 cell division topological specificity factor MinE COG2219: 62180380 ribonuclease D COG2220: 62180383 outer membrane protein COG2221: 62180384 molecular chaperone COG2222: 62180385 DNA helicase COG2223: 62180386 hypothetical protein SC1816 COG2224: 62180387 hypothetical protein SC1817 COG2225: 62180388 para-aminobenzoate synthase component I COG2226: 62180391 hypothetical protein SC1821


COG2227: 62180394 sugar specific PTS family mannose-specific enzyme IIAB COG2228: 62180395 sugar specific PTS family mannose-specific enzyme IIC COG2229: 62180396 PTS system mannose-specific transporter subunit IID COG2230: 62180398 hypothetical protein SC1828 COG2231: 62180399 23S rRNA methyltransferase COG2232: 62180401 cold shock-like protein CspC COG2233: 62180405 hypothetical protein SC1835 COG2234: 62180406 hypothetical protein SC1836 COG2235: 62180407 hypothetical protein SC1837 COG2236: 62180408 IclR family transcriptional regulator COG2237: 62180410 heat shock protein HtpX COG2238: 62180411 carboxy-terminal protease COG2239: 62180413 GAF domain-containing protein COG2240: 62180414 hypothetical protein SC1844 COG2241: 62180415 hypothetical protein SC1845 COG2242: 62180417 hypothetical protein SC1847 COG2243: 62180418 hypothetical protein SC1848 COG2244: 62180423 hypothetical protein SC1853 COG2245: 62180424 hypothetical protein SC1854 COG2246: 62180425 acetyltransferase COG2247: 62180433 PhoPQ-activated integral membrane protein COG2248: 62180434 inner membrane protein COG2249: 62180435 inner membrane protein COG2250: 62180448 hypothetical protein SC1878 COG2251: 62180449 inner membrane protein COG2252: 62180450 hypothetical protein SC1880 COG2253: 62180451 DNA polymerase III subunit theta COG2254: 62180452 hypothetical protein SC1882 COG2255: 62180453 exodeoxyribonuclease X COG2256: 62180454 protease 2 COG2257: 62180455 hypothetical protein SC1885 COG2258: 62180456 hypothetical protein SC1886 COG2259: 62180458 phosphoribosylglycinamide formyltransferase 2 COG2260: 62180459 keto-hydroxyglutarate-aldolase/keto-deoxy-phosphogluconate aldolase COG2261: 62180460 phosphogluconate dehydratase COG2262: 62180461 glucose-6-phosphate 1-dehydrogenase COG2263: 62180463 DNA-binding transcriptional regulator HexR COG2264: 62180464 pyruvate kinase COG2265: 62180467 lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA acyltransferase


COG2266: 62180468 hypothetical protein SC1898 COG2267: 62180471 high-affinity zinc transporter membrane protein COG2268: 62180472 Holliday junction DNA helicase RuvB COG2269: 62180473 Holliday junction DNA helicase RuvA COG2270: 62180476 Holliday junction resolvase COG2271: 62180477 hypothetical protein SC1907 COG2272: 62180478 dATP pyrophosphohydrolase COG2273: 62180479 aspartyl-tRNA synthetase COG2274: 62180481 hypothetical protein SC1911 COG2275: 62180482 hypothetical protein SC1912 COG2276: 62180483 hypothetical protein SC1913 COG2277: 62180484 hypothetical protein SC1914 COG2278: 62180485 hypothetical protein SC1915 COG2279: 62180486 arginyl-tRNA synthetase COG2280: 62180488 hypothetical protein SC1918 COG2281: 62180489 flagellar protein COG2282: 62180493 chemotaxis regulatory protein CheY COG2283: 62180494 chemotaxis-specific methylesterase COG2284: 62180495 chemotaxis methyltransferase CheR COG2285: 62180497 purine-binding chemotaxis protein COG2286: 62180498 chemotaxis protein CheA COG2287: 62180500 flagellar motor protein MotA COG2288: 62180501 transcriptional activator FlhC COG2289: 62180502 transcriptional activator FlhD COG2290: 62180505 trehalose-6-phosphate synthase COG2291: 62180506 trehalose-6-phosphate phosphatase COG2292: 62180507 hypothetical protein SC1937 COG2293: 62180508 ferritin COG2294: 62180510 outer membrane lipoprotein COG2295: 62180511 ferritin COG2296: 62180512 hypothetical protein SC1942 COG2297: 62180513 HAAAP family tyrosine-specific transport protein COG2298: 62180514 hypothetical protein SC1944 COG2299: 62180518 phosphatidylglycerophosphate synthetase COG2300: 62180519 excinuclease ABC subunit C COG2301: 62180520 response regulator COG2302: 62180523 hypothetical protein SC1953 COG2303: 62180524 DNA-binding transcriptional activator SdiA COG2304: 62180525 amino-acid ABC transporter ATP-binding protein YecC


COG2305: 62180526 ABC-type amino acid transporter permease component COG2306: 62180527 D-cysteine desulfhydrase COG2307: 62180528 cystine transporter subunit COG2308: 62180529 flagella biosynthesis protein FliZ COG2309: 62180530 flagellar biosynthesis sigma factor COG2310: 62180534 flagellar capping protein COG2311: 62180535 flagellar protein FliS COG2312: 62180536 flagellar biosynthesis protein FliT COG2313: 62180537 alpha-amylase COG2314: 62180538 hypothetical protein SC1968 COG2315: 62180539 inner membrane protein COG2316: 62180540 hypothetical protein SC1970 COG2317: 62180541 hypothetical protein SC1971 COG2318: 62180542 flagellar hook-basal body protein FliE COG2319: 62180544 flagellar MS-ring protein COG2320: 62180545 flagellar motor switch protein G COG2321: 62180546 flagellar assembly protein H COG2322: 62180547 flagellum-specific ATP synthase COG2323: 62180548 flagellar biosynthesis chaperone COG2324: 62180549 flagellar hook-length control protein COG2325: 62180551 flagellar motor switch protein FliM COG2326: 62180553 flagellar biosynthesis protein FliO COG2327: 62180554 flagellar biosynthesis protein FliP COG2328: 62180555 flagellar biosynthesis protein FliQ COG2329: 62180556 flagellar biosynthesis protein FliR COG2330: 62180557 capsular/exopolysaccharide synthesis transcriptional regulator COG2331: 62180558 hypothetical protein SC1988 COG2332: 62180559 hypothetical protein SC1989 COG2333: 62180560 mannosyl-3-phosphoglycerate phosphatase COG2334: 62180561 hypothetical protein SC1991 COG2335: 62180564 DNA mismatch endonuclease, patch repair protein COG2336: 62180565 DNA cytosine methylase COG2337: 62180566 hypothetical protein SC1996 COG2338: 62180568 porin COG2339: 62180573 hypothetical protein SC2003 COG2340: 62180587 AMP nucleosidase COG2341: 62180593 hypothetical protein SC2023 COG2342: 62180594 nicotinate-nucleotide--dimethylbenzimidazole phosphoribosyltransferase COG2343: 62180595 cobalamin synthase


COG2344: 62180596 adenosylcobinamide kinase COG2345: 62180597 cobyric acid synthase COG2346: 62180599 synthesis of vitamin B12 adenosyl cobalamide COG2347: 62180600 cobalt transport protein CbiN COG2348: 62180605 precorrin-3B C(17)-methyltransferase COG2349: 62180606 cobalamin biosynthesis protein CbiG COG2350: 62180607 synthesis of vitamin B12 adenosyl cobalamide COG2351: 62180608 cobalt-precorrin-6Y C(15)-methyltransferase COG2352: 62180609 cobalt-precorrin-6Y C(5)-methyltransferase COG2353: 62180612 cobalamin biosynthesis protein COG2354: 62180616 propanediol utilization polyhedral bodies protein COG2355: 62180617 propanediol utilization polyhedral bodies protein COG2356: 62180618 propanediol utilization dehydratase large subunit COG2357: 62180619 propanediol utilization dehydratase, medium subunit COG2358: 62180620 propanediol utilization dehydratase small subunit COG2359: 62180622 propanediol utilization diol dehydratase reactivation protein COG2360: 62180623 propanediol utilization polyhedral bodies protein COG2361: 62180624 propanediol utilization polyhedral bodies protein COG2362: 62180625 hypothetical protein SC2055 COG2363: 62180626 hypothetical protein SC2056 COG2364: 62180629 propanediol utilization CoA-dependent propionaldehyde dehydrogenase COG2365: 62180630 propanediol utilization propanol dehydrogenase COG2366: 62180631 propanediol utilization polyhedral bodies protein COG2367: 62180632 propanediol utilization polyhedral bodies protein COG2368: 62180635 propanediol utilization COG2369: 62180636 propionate kinase COG2370: 62180637 hypothetical protein SC2067 COG2371: 62180638 hypothetical protein SC2068 COG2372: 62180639 hypothetical protein SC2069 COG2373: 62180640 DNA gyrase inhibitor COG2374: 62180642 hydrogen sulfide production: membrane anchoring protein COG2375: 62180647 exonuclease I COG2376: 62180648 amino acid APC transporter COG2377: 62180649 LysR family transcriptional regulator COG2378: 62180652 histidinol dehydrogenase COG2379: 62180653 histidinol-phosphate aminotransferase COG2380: 62180654 imidazole glycerol-phosphate dehydratase/histidinol phosphatase COG2381: 62180655 imidazole glycerol phosphate synthase subunit HisH COG2382: 62180656 1-(5-phosphoribosyl)-5-


COG2383: 62180657 imidazole glycerol phosphate synthase subunit HisF COG2384: 62180658 bifunctional phosphoribosyl-AMP cyclohydrolase COG2385: 62180659 regulator of length of O-antigen component of lipopolysaccharide chains COG2386: 62180660 UDP-glucose/GDP-mannose dehydrogenase COG2387: 62180661 6-phosphogluconate dehydrogenase COG2388: 62180670 colanic acid biosynthesis protein COG2389: 62180671 glycosyl transferase family protein COG2390: 62180673 colanic acid exporter COG2391: 62180677 glycosyl transferase family protein COG2392: 62180678 glycosyl transferase in colanic acid biosynthesis COG2393: 62180680 GDP-D-mannose dehydratase COG2394: 62180681 colanic acid biosynthesis acetyltransferase WcaF COG2395: 62180682 glycosyl transferase family protein COG2396: 62180684 glycosyl transferase family protein COG2397: 62180685 colanic acid biosynthesis acetyltransferase WcaB COG2398: 62180688 tyrosine phosphatase COG2399: 62180691 inner membrane protein COG2400: 62180692 assembly protein COG2401: 62180693 deoxycytidine triphosphate deaminase COG2402: 62180694 uridine kinase COG2403: 62180696 PAS/PAC domain/diguanylate cyclase/phosphodiesterase domain-containing protein COG2404: 62180697 3-methyladenine DNA glycosylase COG2405: 62180698 chaperone COG2406: 62180700 multidrug efflux system subunit MdtB COG2407: 62180701 multidrug efflux system subunit MdtC COG2408: 62180702 signal transduction histidine-protein kinase BaeS COG2409: 62180703 DNA-binding transcriptional regulator BaeR COG2410: 62180704 hypothetical protein SC2134 COG2411: 62180721 protease COG2412: 62180726 fructose-bisphosphate aldolase COG2413: 62180728 glycohydrolase COG2414: 62180729 sugar kinase COG2415: 62180730 GntR family transcriptional regulator COG2416: 62180731 phosphomethylpyrimidine kinase COG2417: 62180732 hydroxyethylthiazole kinase COG2418: 62180733 hypothetical protein SC2163 COG2419: 62180737 fimbrial-like protein COG2420: 62180738 hypothetical protein SC2168 COG2421: 62180739 ATPase


COG2422: 62180741 lipoprotein COG2423: 62180743 hypothetical protein SC2173 COG2424: 62180744 two-component response-regulatory protein YehT COG2425: 62180745 sensor/kinase in regulatory system COG2426: 62180748 ABC-type proline/glycine betaine transport systems, permease component COG2427: 62180749 proline/glycine betaine ABC transporter ATPase COG2428: 62180750 ABC-type proline/glycine betaine transport systems, permease component COG2429: 62180751 ABC transporter substrate-binding protein COG2430: 62180752 beta-D-glucoside glucohydrolase, periplasmic COG2431: 62180753 D-lactate dehydrogenase COG2432: 62180755 hypothetical protein SC2185 COG2433: 62180756 DedA family membrane protein COG2434: 62180758 multidrug resistance outer membrane protein MdtQ COG2435: 62180761 tRNA-dihydrouridine synthase C COG2436: 62180762 salicylate hydroxylase COG2437: 62180763 glutathione S-transferase COG2438: 62180764 glutathione S-transferase COG2439: 62180765 1,2-dioxygenase COG2440: 62180766 sugar transporter COG2441: 62180768 hypothetical protein SC2198 COG2442: 62180769 hypothetical protein SC2199 COG2443: 62180771 hypothetical protein SC2201 COG2444: 62180772 hypothetical protein SC2202 COG2445: 62180773 oxidoreductase COG2446: 62180775 beta-methylgalactoside transporter inner membrane protein COG2447: 62180776 galactose ABC transporter permease COG2448: 62180777 DNA-binding transcriptional regulator GalS COG2449: 62180778 hypothetical protein SC2208 COG2450: 62180779 GTP cyclohydrolase I COG2451: 62180781 transcriptional regulator COG2452: 62180783 phosphoserine phosphatase COG2453: 62180785 colicin I receptor COG2454: 62180786 lysine transporter COG2455: 62180787 DNA-binding transcriptional regulator COG2456: 62180788 hypothetical protein SC2218 COG2457: 62180789 endonuclease IV COG2458: 62180790 PTS system fructose-specific transporter subunit IIBC COG2459: 62180791 1-phosphofructokinase COG2460: 62180792 bifunctional PTS system fructose-specific transporter subunit IIA/HPr protein


COG2461: 62180793 proton efflux pump COG2462: 62180795 hypothetical protein SC2225, partial COG2463: 62180798 hypothetical protein SC2228 COG2464: 62180800 outer membrane lipoprotein COG2465: 62180802 ABC transporter substrate-binding protein COG2466: 62180803 dipeptide/oligopeptide/nickel ABC transporter permease COG2467: 62180804 dipeptide/oligopeptide/nickel ABC transporter permease COG2468: 62180805 ABC transporter ATPase COG2469: 62180806 hypothetical protein SC2236 COG2470: 62180808 bicyclomycin/multidrug efflux system protein COG2471: 62180809 16S rRNA pseudouridylate synthase A COG2472: 62180810 ATP-dependent helicase COG2473: 62180811 50S ribosomal protein L25 COG2474: 62180812 inner membrane protein COG2475: 62180813 nucleoid-associated protein NdpA COG2476: 62180814 hypothetical protein SC2244 COG2478: 62180821 transcriptional regulator NarP COG2479: 62180830 cytochrome c-type protein NapC COG2480: 62180831 citrate reductase cytochrome c-type subunit COG2481: 62180832 quinol dehydrogenase membrane component COG2482: 62180834 nitrate reductase catalytic subunit COG2483: 62180835 periplasmic nitrate reductase COG2484: 62180836 ecotin COG2485: 62180838 DNA repair system specific for alkylated DNA COG2486: 62180840 thiamine biosynthesis lipoprotein ApbE COG2487: 62180841 porin COG2488: 62180842 phosphotransfer intermediate protein in two-component regulatory system with RcsBC COG2489: 62180843 transcriptional regulator RcsB COG2490: 62180845 DNA gyrase subunit A COG2491: 62180849 3-demethylubiquinone-9 3-methyltransferase COG2492: 62180850 ribonucleotide-diphosphate reductase subunit alpha COG2493: 62180851 ribonucleotide-diphosphate reductase subunit beta COG2494: 62180853 permease COG2495: 62180854 LysR family transcriptional regulator COG2496: 62180855 glycerophosphodiester phosphodiesterase COG2497: 62180856 sn-glycerol-3-phosphate transporter COG2498: 62180857 sn-glycerol-3-phosphate dehydrogenase subunit A COG2499: 62180858 anaerobic glycerol-3-phosphate dehydrogenase subunit B COG2500: 62180859 sn-glycerol-3-phosphate dehydrogenase subunit C


COG2501: 62180860 deubiquitinase COG2502: 62180862 aldolase COG2503: 62180864 galactonate dehydratase COG2504: 62180865 transcriptional regulator COG2505: 62180866 competence damage-inducible protein A COG2506: 62180867 hypothetical protein SC2297 COG2507: 62180868 aluminum resistance protein COG2508: 62180869 UDP-4-amino-4-deoxy-L-arabinose--oxoglutarate aminotransferase COG2509: 62180870 undecaprenyl phosphate 4-deoxy-4-formamido-L-arabinose transferase COG2510: 62180871 hypothetical protein SC2301 COG2511: 62180872 4-amino-4-deoxy-L-arabinose transferase COG2512: 62180873 inner membrane protein COG2513: 62180874 hypothetical protein SC2304 COG2514: 62180876 O-succinylbenzoate synthase COG2515: 62180877 naphthoate synthase COG2516: 62180878 acyl-CoA thioester hydrolase COG2517: 62180879 2-succinyl-5-enolpyruvyl-6-hydroxy-3-cyclohexene-1-carboxylate synthase COG2518: 62180880 menaquinone-specific isochorismate synthase COG2519: 62180881 hypothetical protein SC2311 COG2520: 62180882 hypothetical protein SC2312 COG2521: 62180884 chemotaxis signal transduction protein COG2522: 62180886 NADH dehydrogenase subunit N COG2523: 62180887 NADH dehydrogenase subunit M COG2524: 62180888 NADH dehydrogenase subunit L COG2525: 62180889 NADH dehydrogenase subunit K COG2526: 62180890 NADH dehydrogenase subunit J COG2527: 62180891 NADH dehydrogenase subunit I COG2528: 62180892 NADH dehydrogenase subunit H COG2529: 62180893 NADH dehydrogenase subunit G COG2530: 62180894 NADH dehydrogenase I subunit F COG2531: 62180895 NADH dehydrogenase subunit E COG2532: 62180896 bifunctional NADH:ubiquinone oxidoreductase subunit C/D COG2533: 62180897 NADH dehydrogenase subunit B COG2534: 62180898 NADH dehydrogenase subunit A COG2535: 62180900 LysR family transcriptional regulator COG2536: 62180902 aminotransferase COG2537: 62180903 hypothetical protein SC2333 COG2538: 62180904 response regulator COG2539: 62180906 hypothetical protein SC2336


COG2540: 62180907 hypothetical protein SC2337 COG2541: 62180908 acetate kinase COG2542: 62180909 phosphate acetyltransferase COG2543: 62180910 hypothetical protein SC2340 COG2544: 62180911 transketolase COG2545: 62180912 transketolase COG2546: 62180914 hypothetical protein SC2344 COG2547: 62180915 phosphotransferase system COG2548: 62180916 transcriptional regulator COG2549: 62180917 hypothetical protein SC2347 COG2550: 62180918 phosphodiesterase COG2551: 62180919 glutathione-S-transferase COG2552: 62180920 glutathione S-transferase COG2553: 62180921 hypothetical protein SC2351 COG2554: 62180922 histidine/lysine/arginine/ornithine transporter subunit COG2555: 62180923 histidine and lysine/arginine/ornithine ABC transporter COG2556: 62180924 lysine/arginine/ornithine ABC transporter COG2557: 62180926 lysine/arginine/ornithine ABC transporter ATP-binding protein COG2558: 62180928 3-octaprenyl-4-hydroxybenzoate carboxy-lyase COG2559: 62180929 amino acid transporter COG2560: 62180930 hypothetical protein SC2360 COG2561: 62180932 diaminopimelate decarboxylase COG2562: 62180934 amidophosphoribosyltransferase COG2563: 62180935 colicin V production protein COG2564: 62180936 hypothetical protein SC2366 COG2565: 62180937 bifunctional folylpolyglutamate synthase/dihydrofolate synthase COG2566: 62180939 hypothetical protein SC2369 COG2567: 62180940 tRNA pseudouridine synthase A COG2568: 62180941 semialdehyde dehydrogenase COG2569: 62180942 erythronate-4-phosphate dehydrogenase COG2570: 62180944 hypothetical protein SC2374 COG2571: 62180945 hypothetical protein SC2375 COG2572: 62180946 putative helix-turn-helix regulatory protein COG2573: 62180947 hypothetical protein SC2377 COG2574: 62180948 hypothetical protein SC2378 COG2575: 62180949 inner membrane protein COG2576: 62180950 3-oxoacyl-ACP synthase COG2577: 62180952 hypothetical protein SC2382 COG2578: 62180954 hypothetical protein SC2384


COG2579: 62180955 penicillin-insensitive murein endopeptidase COG2580: 62180956 chorismate synthase COG2581: 62180957 N5-glutamine S-adenosyl-L-methionine-dependent methyltransferase COG2582: 62180958 hypothetical protein SC2388 COG2583: 62180959 phosphohistidine phosphatase COG2584: 62180960 multifunctional fatty acid oxidation complex subunit alpha COG2585: 62180961 3-ketoacyl-CoA thiolase COG2586: 62180962 hypothetical protein SC2392 COG2587: 62180963 long-chain fatty acid outer membrane transporter COG2588: 62180964 lipoprotein COG2589: 62180965 hypothetical protein SC2395 COG2590: 62180966 outer membrane protease COG2591: 62180967 phosphoglycerate transport activator COG2592: 62180969 phosphoglycerate transporter COG2593: 62180970 phosphoglycerate transporter COG2594: 62180971 hypothetical protein SC2401 COG2595: 62180972 lipid A biosynthesis palmitoleoyl acyltransferase COG2596: 62180975 aminotransferase COG2597: 62180976 COG2598: 62180977 hypothetical protein SC2407 COG2599: 62180980 hypothetical protein SC2410 COG2600: 62180984 negative regulator COG2601: 62180985 negative regulator COG2602: 62180986 glutamyl-tRNA synthetase COG2603: 62180994 hypothetical protein SC2424 COG2604: 62180995 NAD-dependent DNA ligase LigA COG2605: 62180997 sulfate transport protein CysZ COG2606: 62180998 cysteine synthase A COG2607: 62180999 PTS system phosphohistidinoprotein-hexose phosphotransferase Hpr COG2608: 62181001 PTS system glucose-specific transporter COG2609: 62181002 hypothetical protein SC2432 COG2610: 62181003 pyridoxal kinase COG2611: 62181004 GntR family transcriptional regulator COG2612: 62181005 glutamine amidotransferase COG2613: 62181006 hypothetical protein SC2436 COG2614: 62181010 sulfate/thiosulfate transporter subunit COG2615: 62181011 sulfate/thiosulfate transporter permease subunit COG2616: 62181013 thiosulfate transporter subunit COG2617: 62181014 short chain dehydrogenase


COG2618: 62181015 hypothetical protein SC2445 COG2619: 62181017 hypothetical protein SC2447 COG2620: 62181019 N-acetylmuramoyl-L-alanine amidase COG2621: 62181023 transcriptional regulator EutR COG2622: 62181025 ethanolamine ammonia-lyase, heavy chain COG2623: 62181027 transport protein in ethanolamine utilization COG2624: 62181028 heat shock protein (Hsp70) COG2625: 62181032 phosphotransacetylase COG2626: 62181033 cobalamin adenosyltransferase COG2627: 62181034 ethanolamine utilization protein COG2628: 62181035 ethanolamine utilization protein COG2629: 62181036 carboxysome structural protein, ethanol utilization COG2630: 62181037 malic enzyme COG2631: 62181039 transaldolase A COG2632: 62181043 hypothetical protein SC2473 COG2633: 62181047 hypothetical protein SC2477 COG2634: 62181048 succinyl-diaminopimelate desuccinylase COG2635: 62181049 hypothetical protein SC2479 COG2636: 62181051 hypothetical protein SC2481 COG2637: 62181052 phosphoribosylaminoimidazole-succinocarboxamide synthase COG2638: 62181053 lipoprotein COG2639: 62181054 dihydrodipicolinate synthase COG2640: 62181056 thioredoxin-dependent thiol peroxidase COG2641: 62181061 hypothetical protein SC2491 COG2642: 62181062 arsenate reductase COG2643: 62181063 DNA replication initiation factor COG2644: 62181064 uracil transporter COG2645: 62181065 uracil phosphoribosyltransferase COG2646: 62181066 phosphoribosylaminoimidazole synthetase COG2647: 62181067 phosphoribosylglycinamide formyltransferase COG2648: 62181068 polyphosphate kinase COG2649: 62181069 exopolyphosphatase COG2650: 62181078 GMP synthase COG2651: 62181079 inosine 5'-monophosphate dehydrogenase COG2652: 62181083 outer membrane protein COG2653: 62181085 hypothetical protein SC2515 COG2654: 62181086 GTP-binding protein EngA COG2655: 62181087 outer membrane protein assembly complex subunit YfgL COG2656: 62181088 hypothetical protein SC2518


COG2657: 62181089 histidyl-tRNA synthetase COG2658: 62181090 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase COG2659: 62181091 cytoskeletal protein RodZ COG2660: 62181092 ribosomal RNA large subunit methyltransferase N COG2661: 62181093 nucleoside diphosphate kinase COG2662: 62181094 polyferredoxin COG2663: 62181095 dimethylsulfoxide reductase COG2664: 62181096 anaerobic dimethylsulfoxide reductase COG2665: 62181098 3-mercaptopyruvate sulfurtransferase COG2666: 62181099 hypothetical protein SC2529 COG2667: 62181100 enhanced serine sensitivity protein SseB COG2668: 62181102 hypothetical protein SC2532 COG2669: 62181103

COG2670: 62181104 co-chaperone HscB COG2671: 62181105 iron-sulfur cluster assembly protein COG2672: 62181106 scaffold protein COG2673: 62181107 cysteine desulfurase COG2674: 62181108 DNA-binding transcriptional regulator IscR COG2675: 62181109 rRNA methylase COG2676: 62181110 inositol monophosphatase COG2677: 62181111 hydrolase COG2678: 62181114 anaerobic sulfide reductase COG2679: 62181116 hypothetical protein SC2546 COG2680: 62181117 stationary phase inducible protein CsiE COG2681: 62181118 3-phenylpropionic acid transporter COG2682: 62181119 serine hydroxymethyltransferase COG2683: 62181121 nitric oxide dioxygenase COG2684: 62181123 lysine/cadaverine antiporter COG2685: 62181125 POT family di-/tripeptide transport protein COG2686: 62181126 nitrogen regulatory protein P-II 1 COG2687: 62181127 transcriptional regulator of two-component regulator protein (EBP family) COG2688: 62181128 hypothetical protein SC2558 COG2689: 62181129 sensory kinase in regulatory system COG2690: 62181130 phosphoribosylformylglycinamidine synthase COG2691: 62181133 tRNA-specific COG2692: 62181136 N-acetylmuramic acid 6-phosphate etherase COG2693: 62181138 2-dehydropantoate 2-reductase COG2694: 62181139 permease COG2695: 62181140 LysR family transcriptional regulator


COG2696: 62181141 ferredoxin COG2697: 62181142 4'-phosphopantetheinyl transferase COG2698: 62181143 pyridoxine 5'-phosphate synthase COG2699: 62181144 DNA repair protein RecO COG2700: 62181145 GTP-binding protein Era COG2701: 62181147 signal peptidase I COG2702: 62181148 GTP-binding protein LepA COG2703: 62181212 SoxR reducing system protein RseC COG2704: 62181213 periplasmic negative regulator of sigmaE COG2705: 62181214 anti-RNA polymerase sigma factor SigE COG2706: 62181215 RNA polymerase sigma factor RpoE COG2707: 62181216 L-aspartate oxidase COG2708: 62181217 hypothetical protein SC2647 COG2709: 62181218 ATP-dependent RNA helicase SrmB COG2710: 62181221 autonomous glycyl radical cofactor GrcA COG2711: 62181222 uracil-DNA glycosylase COG2712: 62181223 methyltransferase COG2713: 62181225 hypothetical protein SC2655 COG2714: 62181226 acetyl-CoA synthetase COG2715: 62181227 phosphatidylserine synthase COG2716: 62181228 hypothetical protein SC2658 COG2717: 62181229 alpha-ketoglutarate transporter COG2718: 62181230 hypothetical protein SC2660 COG2719: 62181233 protein disaggregation chaperone COG2720: 62181234 hypothetical protein SC2664 COG2721: 62181235 23S rRNA pseudouridine synthase D COG2722: 62181236 outer membrane protein assembly complex subunit YfiO COG2723: 62181238 translation inhibitor protein RaiA COG2724: 62181239 bifunctional chorismate mutase/prephenate dehydratase COG2725: 62181240 hypothetical protein SC2670 COG2726: 62181241 bifunctional chorismate mutase/prephenate dehydrogenase COG2727: 62181242 phospho-2-dehydro-3-deoxyheptonate aldolase COG2728: 62181243 hypothetical protein SC2673 COG2729: 62181244 50S ribosomal protein L19 COG2730: 62181245 tRNA (guanine-N(1)-)-methyltransferase COG2731: 62181248 signal recognition particle protein COG2732: 62181254 recombination and repair protein COG2733: 62181255 hypothetical protein SC2685 COG2734: 62181256 hypothetical protein SC2686


COG2735: 62181257 hypothetical protein SC2687 COG2736: 62181258 SsrA-binding protein COG2737: 62181263 outer membrane efflux protein COG2738: 62181265 HlyD family secretion protein COG2739: 62181274 glycosyl transferase family protein COG2740: 62181275 ABC transporter COG2741: 62181277 hypothetical protein SC2707 COG2742: 62181282 virulence protein VirK COG2743: 62181283 transcription activator COG2744: 62181285 tricarboxylic transport: regulatory protein COG2745: 62181286 tricarboxylic transport: regulatory protein COG2746: 62181288 tricarboxylic transport COG2747: 62181289 hypothetical protein SC2719 COG2748: 62181290 hypothetical protein SC2720 COG2749: 62181295 gamma-aminobutyrate transporter COG2750: 62181296 DNA-binding transcriptional regulator CsiR COG2751: 62181298 YqaE family transport protein COG2752: 62181299 ArsR family transcriptional regulator COG2753: 62181300 hypothetical protein SC2730 COG2754: 62181303 hypothetical protein SC2733 COG2755: 62181304 hypothetical protein SC2734 COG2756: 62181307 hypothetical protein SC2737 COG2757: 62181309 glutaredoxin-like protein COG2758: 62181310 ribonucleotide reductase stimulatory protein COG2759: 62181311 ribonucleotide-diphosphate reductase subunit alpha COG2760: 62181315 glycine betaine transporter periplasmic subunit COG2761: 62181316 inner membrane protein COG2762: 62181317 transcriptional repressor MprA COG2763: 62181318 multidrug resistance secretion protein COG2764: 62181322 S-ribosylhomocysteinase COG2765: 62181323 glutamate--cysteine ligase COG2766: 62181324 inner membrane protein COG2767: 62181325 fructose-1-phosphatase COG2768: 62181329 carbon storage regulator COG2769: 62181330 alanyl-tRNA synthetase COG2770: 62181331 recombination regulator RecX COG2771: 62181332 recombinase A COG2772: 62181333 competence damage-inducible protein A COG2773: 62181334 murein hydrolase B


COG2774: 62181335 PTS family, glucitol/sorbitol-specific enzyme IIC component, one of two IIC components COG2775: 62181336 PTS family, glucitol/sorbitol-specific IIB component, one of two IIC components COG2776: 62181337 PTS system glucitol/sorbitol-specific transporter subunit IIA COG2777: 62181338 sorbitol-6-phosphate dehydrogenase COG2778: 62181339 DNA-binding transcriptional activator GutM COG2779: 62181340 DNA-binding transcriptional repressor SrlR COG2780: 62181341 D-arabinose 5-phosphate isomerase COG2781: 62181342 anaerobic nitric oxide reductase transcriptional regulator COG2782: 62181343 anaerobic nitric oxide reductase flavorubredoxin COG2783: 62181344 nitric oxide reductase COG2784: 62181345 hydrogenase maturation protein COG2785: 62181346 electron transport protein HydN COG2786: 62181348 hydrogenase 3 maturation protease COG2787: 62181349 HycE processing protein COG2788: 62181350 hydrogenase activity COG2789: 62181351 formate hydrogenlyase complex iron-sulfur subunit COG2790: 62181352 formate hydrogenlyase subunit 5 COG2791: 62181353 hydrogenase 3, membrane subunit (part of FHL complex) COG2792: 62181355 hydrogenase-3, iron-sulfur subunit (part of FHL complex) COG2793: 62181356 formate hydrogenlyase regulatory protein HycA COG2794: 62181357 hydrogenase nickel incorporation protein COG2795: 62181358 hydrogenase nickel incorporation protein HypB COG2796: 62181359 hydrogenase assembly chaperone COG2797: 62181360 hydrogenase expression/formation protein COG2798: 62181361 hydrogenase expression/formation protein COG2799: 62181362 formate hydrogen-lyase transcriptional activator COG2800: 62181363 hypothetical protein SC2793 COG2801: 62181364 iron transporter: fur regulated COG2802: 62181365 iron transporter: fur regulated COG2803: 62181366 iron transporter: fur regulated COG2804: 62181367 iron transporter: fur regulated COG2805: 62181368 transcriptional regulator COG2806: 62181369 AraC family transcriptional regulator COG2807: 62181370 hypothetical protein SC2800 COG2808: 62181371 flagellar biosynthesis/type III secretory pathway protein COG2809: 62181372 inner membrane protein COG2810: 62181373 cell invasion protein COG2811: 62181374 cell invasion protein


COG2812: 62181375 cell invasion protein COG2813: 62181376 cell invasion protein COG2814: 62181377 regulatory protein COG2815: 62181378 invasion protein regulator COG2816: 62181379 cell invasion protein COG2817: 62181381 virulence associated chaperone COG2818: 62181383 acyl carrier protein COG2819: 62181384 cell invasion protein COG2820: 62181385 cell invasion protein COG2821: 62181386 cell invasion protein COG2822: 62181387 cell invasion protein COG2823: 62181388 surface presentation of antigens; secretory proteins COG2824: 62181389 surface presentation of antigens protein SpaS COG2825: 62181390 surface presentation of antigens; secretory proteins COG2826: 62181391 surface presentation of antigens; secretory proteins COG2827: 62181392 surface presentation of antigens protein SpaP COG2828: 62181393 surface presentation of antigens protein SpaO COG2829: 62181394 surface presentation of antigens; secretory proteins COG2830: 62181395 surface presentation of antigens; secretory proteins COG2831: 62181396 ATP synthase SpaL COG2832: 62181397 surface presentation of antigens; secretory proteins COG2833: 62181398 invasion protein COG2834: 62181399 invasion protein COG2835: 62181400 invasion protein; outer membrane COG2836: 62181401 invasion protein COG2837: 62181402 invasion protein COG2838: 62181403 ABC-type transport system COG2839: 62181404 acetyltransferase COG2840: 62181413 permease COG2841: 62181416 aldolase COG2842: 62181419 DeoR family transcriptional regulator COG2843: 62181420 transcriptional regulator COG2844: 62181422 hypothetical protein SC2852 COG2845: 62181423 hypothetical protein SC2853 COG2846: 62181427 lipoprotein NlpD COG2847: 62181428 protein-L-isoaspartate O-methyltransferase COG2848: 62181429 stationary phase survival protein SurE COG2849: 62181430 tRNA pseudouridine synthase D COG2850: 62181431 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase


COG2851: 62181432 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase COG2852: 62181433 cell division protein FtsB COG2853: 62181434 hypothetical protein SC2864 COG2854: 62181435 adenylylsulfate kinase COG2855: 62181436 sulfate adenylyltransferase subunit 1 COG2856: 62181446 hypothetical protein SC2876 COG2857: 62181447 phosphoadenosine phosphosulfate reductase COG2858: 62181448 sulfite reductase subunit beta COG2859: 62181449 sulfite reductase subunit alpha COG2860: 62181450 synthase COG2861: 62181454 hypothetical protein SC2884 COG2862: 62181456 phosphopyruvate hydratase COG2863: 62181457 CTP synthetase COG2864: 62181458 nucleoside triphosphate pyrophosphohydrolase COG2865: 62181466 GDP/GTP pyrophosphokinase COG2866: 62181468 hybrid sensory histidine kinase BarA COG2867: 62181473 flavodoxin COG2868: 62181474 tRNA pseudouridine synthase C COG2869: 62181475 hypothetical protein SC2905 COG2870: 62181476 SecY interacting protein Syd COG2871: 62181477 7-cyano-7-deazaguanine reductase COG2872: 62181478 hypothetical protein SC2908 COG2873: 62181479 HAAAP family, serine transport protein COG2874: 62181481 exonuclease IX COG2875: 62181482 L-1,2-propanediol oxidoreductase COG2876: 62181483 L-fuculose phosphate aldolase COG2877: 62181486 L-fucose isomerase COG2878: 62181487 L-fuculokinase COG2879: 62181488 fucose operon protein COG2880: 62181489 DNA-binding transcriptional activator FucR COG2881: 62181490 RNA 2'-O-ribose methyltransferase COG2882: 62181491 hypothetical protein SC2921 COG2883: 62181492 DNA-binding transcriptional activator GcvA COG2884: 62181493 lipoprotein COG2885: 62181494 cysteine sulfinate desulfinase COG2886: 62181495 hypothetical protein SC2925 COG2887: 62181496 hypothetical protein SC2926 COG2888: 62181497 hypothetical protein SC2927 COG2889: 62181498 murein transglycosylase A


COG2890: 62181499 N-acetylmuramoyl-L-alanine amidase COG2891: 62181500 N-acetylglutamate synthase COG2892: 62181502 exonuclease V subunit beta COG2893: 62181504 exonuclease V subunit gamma COG2894: 62181505 hypothetical protein SC2935 COG2895: 62181506 hypothetical protein SC2936 COG2896: 62181507 hypothetical protein SC2937 COG2897: 62181508 hypothetical protein SC2938 COG2898: 62181510 COG2899: 62181511 prolipoprotein diacylglyceryl transferase COG2900: 62181512 fused phosphoenolpyruvate-protein phosphotransferase PtsP/GAF domain COG2901: 62181513 dinucleoside polyphosphate hydrolase COG2902: 62181514 DNA mismatch repair protein COG2903: 62181515 hypothetical protein SC2945 COG2904: 62181516 POT family peptide transport protein COG2905: 62181517 aldo-keto reductase COG2906: 62181518 lysophospholipid transporter LplT COG2907: 62181519 bifunctional acyl- COG2908: 62181520 DNA-binding transcriptional regulator GalR COG2909: 62181522 diaminopimelate decarboxylase COG2910: 62181523 LysR family transcriptional regulator COG2911: 62181524 racemase COG2912: 62181525 major facilitator superfamily L-arabinose: proton symporter COG2913: 62181529 LysR family transcriptional regulator COG2914: 62181548 metalloendopeptidase COG2915: 62181549 isopentenyl-diphosphate delta-isomerase COG2916: 62181550 lysyl-tRNA synthetase COG2917: 62181551 peptide chain release factor 2 COG2918: 62181553 thiol:disulfide interchange protein DsbC COG2919: 62181554 site-specific tyrosine recombinase XerD COG2920: 62181557 hypothetical protein SC2987 COG2921: 62181559 global regulator COG2922: 62181560 hypothetical protein SC2990 COG2923: 62181561 hypothetical protein SC2991 COG2924: 62181562 6-phospho-beta-glucosidase COG2925: 62181563 outer membrane protein COG2926: 62181565 glycine cleavage system protein H COG2927: 62181566 glycine cleavage system T COG2928: 62181567 hypothetical protein SC2997


COG2929: 62181568 2-octaprenyl-6-methoxyphenyl hydroxylase COG2930: 62181569 proline aminopeptidase P II COG2931: 62181571 Z-ring-associated protein COG2932: 62181573 D-3-phosphoglycerate dehydrogenase COG2933: 62181574 ribose-5-phosphate isomerase A COG2934: 62181575 chromosome replication initiation inhibitor protein COG2935: 62181577 hypothetical protein SC3007 COG2936: 62181578 arginine exporter protein COG2937: 62181579 mechanosensitive ion channel MscS COG2938: 62181581 COG2939: 62181582 erythrose 4-phosphate dehydrogenase COG2940: 62181583 hypothetical protein SC3013 COG2941: 62181584 hypothetical protein SC3014 COG2942: 62181586 cobalt ABC transporter ATPase COG2943: 62181587 cobalt ABC transporter ATPase COG2944: 62181589 Zn-dependent protease with chaperone function COG2945: 62181590 COG2946: 62181596 arginine decarboxylase COG2947: 62181597 hypothetical protein SC3027 COG2948: 62181600 S-adenosylmethionine synthetase COG2949: 62181601 major facilitator superfamily galactose:proton symporter COG2950: 62181602 hypothetical protein SC3032 COG2951: 62181603 DNA-specific endonuclease I COG2952: 62181605 glutathione synthetase COG2953: 62181607 Holliday junction resolvase-like protein COG2954: 62181608 transcriptional regulator COG2955: 62181609 protein transport COG2956: 62181610 hypothetical protein SC3040 COG2957: 62181611 hypothetical protein SC3041 COG2958: 62181613 deoxyribonucleotide triphosphate pyrophosphatase COG2959: 62181614 coproporphyrinogen III oxidase COG2960: 62181616 L-asparaginase II COG2961: 62181618 hypothetical protein SC3048 COG2962: 62181619 hypothetical protein SC3049 COG2963: 62181620 tRNA (guanine-N(7)-)-methyltransferase COG2964: 62181621 adenine DNA glycosylase COG2965: 62181622 hypothetical protein SC3052 COG2966: 62181623 murein transglycosylase C COG2967: 62181626 hypothetical protein SC3056


COG2968: 62181634 response regulator COG2969: 62181635 hypothetical protein SC3065 COG2970: 62181636 amino acid transporter COG2971: 62181638 oxidoreductase COG2972: 62181639 NAD-dependent dehydrogenase COG2973: 62181641 hypothetical protein SC3071 COG2974: 62181643 hypothetical protein SC3073 COG2975: 62181644 amidohydrolase COG2976: 62181645 permease COG2977: 62181646 mannonate dehydratase COG2978: 62181647 D-mannonate oxidoreductase COG2979: 62181648 glucuronate isomerase COG2980: 62181649 hypothetical protein SC3079 COG2981: 62181654 hydrogenase 2 accessory protein HypG COG2982: 62181655 hydrogenase nickel incorporation protein HybF COG2983: 62181656 hydrogenase 2-specific chaperone COG2984: 62181657 hydrogenase 2 maturation endopeptidase COG2985: 62181658 hydrogenase 2 large subunit COG2986: 62181659 hydrogenase 2 b cytochrome subunit COG2987: 62181660 hydrogenase 2 protein HybA COG2988: 62181661 hydrogenase 2 small subunit COG2989: 62181663 hypothetical protein SC3093 COG2990: 62181665 hypothetical protein SC3095 COG2991: 62181667 hypothetical protein SC3097 COG2992: 62181669 oxidoreductase COG2993: 62181670 biopolymer transport protein ExbD COG2994: 62181671 biopolymer transport protein ExbB COG2995: 62181675 cystathionine beta-lyase COG2996: 62181676 DedA family membrane protein COG2997: 62181677 AraC family transcriptional regulator COG2998: 62181678 alcohol dehydrogenase COG2999: 62181681 hypothetical protein SC3111 COG3000: 62181682 hypothetical protein SC3112 COG3001: 62181686 repressor protein for FtsI COG3002: 62181687 1-acyl-sn-glycerol-3-phosphate acyltransferase COG3003: 62181690 outer membrane protein COG3004: 62181691 DNA-binding transcriptional regulator QseB COG3005: 62181692 sensor protein QseC COG3006: 62181694 hypothetical protein SC3124


COG3007: 62181695 DNA topoisomerase IV subunit B COG3008: 62181696 esterase COG3009: 62181697 cyclic 3',5'-adenosine monophosphate phosphodiesterase COG3010: 62181698 hypothetical protein SC3128 COG3011: 62181699 ADP-ribose pyrophosphatase COG3012: 62181701 hypothetical protein SC3131 COG3013: 62181702 hypothetical protein SC3132 COG3014: 62181703 hypothetical protein SC3133 COG3015: 62181707 thiol-disulfide isomerase and thioredoxin COG3016: 62181708 disulfide oxidoreductase COG3017: 62181711 3,4-dihydroxy-2-butanone 4-phosphate synthase COG3018: 62181712 hypothetical protein SC3142 COG3019: 62181713 glycogen synthesis protein GlgS COG3020: 62181715 inner membrane protein COG3021: 62181716 hypothetical protein SC3146 COG3022: 62181717 bifunctional heptose 7-phosphate kinase/heptose 1-phosphate adenyltransferase COG3023: 62181718 bifunctional glutamine-synthetase adenylyltransferase/deadenyltransferase COG3024: 62181719 hypothetical protein SC3149 COG3025: 62181720 signal transduction protein COG3026: 62181721 multifunctional tRNA nucleotidyl transferase COG3027: 62181722 undecaprenyl pyrophosphate phosphatase COG3028: 62181723 bifunctional dihydroneopterin aldolase COG3029: 62181725 DNA-binding/iron metalloprotein/AP endonuclease COG3030: 62181726 30S ribosomal protein S21 COG3031: 62181727 DNA COG3032: 62181728 RNA polymerase sigma factor RpoD COG3033: 62181729 G/U mismatch-specific DNA glycosylase COG3034: 62181730 hypothetical protein SC3160 COG3035: 62181731 transcriptional regulator COG3036: 62181732 methyl-accepting chemotaxis protein COG3037: 62181733 aerotaxis sensor receptor COG3038: 62181737 hypothetical protein SC3167 COG3039: 62181738 hypothetical protein SC3168 COG3040: 62181739 integral membrane protein COG3041: 62181741 resistance protein COG3042: 62181743 DedA family membrane protein COG3043: 62181744 hypothetical protein SC3174 COG3044: 62181745 hypothetical protein SC3175


COG3045: 62181747 hypothetical protein SC3177 COG3046: 62181748 hypothetical protein SC3178 COG3047: 62181750 hypothetical protein SC3180 COG3048: 62181751 LysR family transcriptional regulator COG3049: 62181752 hypothetical protein SC3182 COG3050: 62181754 hypothetical protein SC3184 COG3051: 62181758 propionate/acetate kinase COG3052: 62181760 threonine dehydratase COG3053: 62181761 DNA-binding transcriptional activator TdcA COG3054: 62181764 glycerate kinase COG3055: 62181766 alpha-dehydro-beta-deoxy-D-glucarate aldolase COG3056: 62181767 tagatose-bisphosphate aldolase COG3057: 62181769 PTS system galactitol-specific transporter subunit IIA COG3058: 62181770 PTS system galactitol-specific transporter subunit IIB COG3059: 62181772 galactitol-1-phosphate dehydrogenase COG3060: 62181773 sugar metabolism transcriptional regulator COG3061: 62181774 hypothetical protein SC3204 COG3062: 62181775 transglycosylase COG3063: 62181777 chromosome replication initiator DnaA COG3064: 62181778 hypothetical protein SC3208 COG3065: 62181779 hypothetical protein SC3209 COG3066: 62181780 hypothetical protein SC3210 COG3067: 62181781 hypothetical protein SC3211 COG3068: 62181782 GIY-YIG superfamily protein COG3069: 62181783 ABC transporter membrane protein COG3070: 62181784 hypothetical protein SC3214 COG3071: 62181785 protease COG3072: 62181786 hypothetical protein SC3216 COG3073: 62181790 tryptophan permease COG3074: 62181792 lipoprotein NlpI COG3075: 62181793 polynucleotide phosphorylase COG3076: 62181794 30S ribosomal protein S15 COG3077: 62181795 tRNA pseudouridine synthase B COG3078: 62181796 ribosome-binding factor A COG3079: 62181797 translation initiation factor IF-2 COG3080: 62181798 transcription elongation factor NusA COG3081: 62181803 phosphoglucosamine mutase COG3082: 62181804 ATP-dependent metalloprotease COG3083: 62181805 23S rRNA methyltransferase


COG3084: 62181806 RNA-binding protein YhbY COG3085: 62181807 transcription elongation factor GreA COG3086: 62181808 D-alanyl-D-alanine carboxypeptidase COG3087: 62181811 50S ribosomal protein L27 COG3088: 62181813 octaprenyl diphosphate synthase COG3089: 62181814 DNA-binding transcriptional regulator Nlp COG3090: 62181815 UDP-N-acetylglucosamine 1-carboxyvinyltransferase COG3091: 62181816 BolA family transcriptional regulator COG3092: 62181817 STAS domain-containing protein COG3093: 62181818 ABC transporter ATP-binding protein COG3094: 62181819 ABC transporter substrate-binding protein COG3095: 62181820 ABC transporter membrane protein COG3096: 62181821 ABC transporter ATP-binding protein COG3097: 62181823 D-arabinose 5-phosphate isomerase COG3098: 62181824 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase COG3099: 62181825 hypothetical protein SC3255 COG3100: 62181826 lipopolysaccharide transport periplasmic protein LptA COG3101: 62181827 ABC transporter ATP-binding protein COG3102: 62181828 RNA polymerase factor sigma-54 COG3103: 62181829 sigma(54) modulation protein COG3104: 62181830 PTS system transporter subunit IIA-like nitrogen-regulatory protein PtsN COG3105: 62181831 hypothetical protein SC3261 COG3106: 62181832 phosphohistidinoprotein-hexose phosphotransferase COG3107: 62181833 hypothetical protein SC3263 COG3108: 62181834 monofunctional biosynthetic peptidoglycan transglycosylase COG3109: 62181835 isoprenoid biosynthesis protein with amidotransferase-like domain COG3110: 62181836 aerobic respiration control sensor protein ArcB COG3111: 62181837 hypothetical protein SC3267 COG3112: 62181838 glutamate synthase subunit alpha COG3113: 62181840 hypothetical protein SC3270 COG3114: 62181841 cytosine permease COG3115: 62181842 cytosine deaminase COG3116: 62181843 hypothetical protein SC3273 COG3117: 62181844 N-acetylmannosamine kinase COG3118: 62181846 sialic acid transporter COG3119: 62181847 N-acetylneuraminate lyase COG3120: 62181848 transcriptional regulator NanR COG3121: 62181849 ClpXP protease specificity-enhancing factor COG3122: 62181850 stringent starvation protein A


COG3123: 62181852 30S ribosomal protein S9 COG3124: 62181854 ATPase COG3125: 62181855 cytochrome d ubiquinol oxidase subunit III COG3126: 62181856 serine endoprotease COG3127: 62181858 inner membrane protein COG3128: 62181862 L(+)-tartrate dehydratase subunit beta COG3129: 62181863 tartrate dehydratase subunit alpha COG3130: 62181865 GntR family transcriptional regulator COG3131: 62181866 GntR family transcriptional regulator COG3132: 62181867 malate dehydrogenase COG3133: 62181868 arginine repressor ArgR COG3134: 62181869 outer membrane protein COG3135: 62181870 hypothetical protein SC3300 COG3136: 62181871 hypothetical protein SC3301 COG3137: 62181872 p-hydroxybenzoic acid efflux subunit AaeB COG3138: 62181873 p-hydroxybenzoic acid efflux subunit AaeA COG3139: 62181874 hypothetical protein SC3304 COG3140: 62181875 DNA-binding transcriptional regulator COG3141: 62181877 hypothetical protein SC3307 COG3142: 62181878 ribonuclease G COG3143: 62181879 Maf-like protein COG3144: 62181880 rod shape-determining protein MreD COG3145: 62181881 rod shape-determining protein MreC COG3146: 62181883 regulatory protein CsrD COG3147: 62181885 sulfite oxidase subunit YedY COG3148: 62181886 sulfite oxidase subunit YedZ COG3149: 62181888 acetyl-CoA carboxylase biotin carboxylase subunit COG3150: 62181889 hypothetical protein SC3319 COG3151: 62181890 sodium/pantothenate symporter COG3152: 62181891 50S ribosomal protein L11 methyltransferase COG3153: 62181892 tRNA-dihydrouridine synthase B COG3154: 62181893 Fis family transcriptional regulator COG3155: 62181895 hypothetical protein SC3325 COG3156: 62181900 outer membrane lipoprotein COG3157: 62181905 ferripyochelin binding protein COG3158: 62181906 hypothetical protein SC3336 COG3159: 62181907 shikimate 5-dehydrogenase COG3160: 62181908 ribosome maturation factor COG3161: 62181909 hypothetical protein SC3339


COG3162: 62181910 hypothetical protein SC3340 COG3163: 62181912 peptide deformylase COG3164: 62181914 16S rRNA methyltransferase GidB COG3165: 62181915 potassium transporter peripheral membrane protein COG3166: 62181916 large-conductance mechanosensitive channel COG3167: 62181918 hypothetical protein SC3348 COG3168: 62181919 50S ribosomal protein L17 COG3169: 62181920 DNA-directed RNA polymerase subunit alpha COG3170: 62181921 30S ribosomal protein S4 COG3171: 62181922 30S ribosomal protein S11 COG3172: 62181924 preprotein translocase subunit SecY COG3173: 62181925 50S ribosomal protein L15 COG3174: 62181926 50S ribosomal protein L30 COG3175: 62181927 30S ribosomal protein S5 COG3176: 62181928 50S ribosomal protein L18 COG3177: 62181929 50S ribosomal protein L6 COG3178: 62181930 30S ribosomal protein S8 COG3179: 62181931 30S ribosomal protein S14 COG3180: 62181932 50S ribosomal protein L5 COG3181: 62181933 50S ribosomal protein L24 COG3182: 62181934 50S ribosomal protein L14 COG3183: 62181935 30S ribosomal protein S17 COG3184: 62181936 50S ribosomal protein L29 COG3185: 62181937 50S ribosomal protein L16 COG3186: 62181938 30S ribosomal protein S3 COG3187: 62181940 30S ribosomal protein S19 COG3188: 62181941 50S ribosomal protein L2 COG3189: 62181942 50S ribosomal protein L23 COG3190: 62181943 50S ribosomal protein L4 COG3191: 62181944 50S ribosomal protein L3 COG3192: 62181945 30S ribosomal protein S10 COG3193: 62181946 leader peptidase HopD COG3194: 62181947 bacterioferritin COG3195: 62181948 bacterioferritin-associated ferredoxin COG3196: 62181950 elongation factor G COG3197: 62181951 30S ribosomal protein S7 COG3198: 62181953 sulfur transfer complex subunit TusB COG3199: 62181954 sulfur relay protein TusC COG3200: 62181955 sulfur transfer complex subunit TusD


COG3201: 62181957 FKBP-type peptidylprolyl isomerase COG3202: 62181958 hypothetical protein SC3388 COG3203: 62181959 FKBP-type peptidylprolyl isomerase COG3204: 62181960 hypothetical protein SC3390 COG3205: 62181962 glutathione-regulated potassium-efflux system ancillary protein KefG COG3206: 62181963 ABC transporter ATP-binding protein COG3207: 62181965 hypothetical protein SC3395 COG3208: 62181967 hypothetical protein SC3397 COG3209: 62181968 phosphoribulokinase COG3210: 62181969 hypothetical protein SC3399 COG3211: 62181970 cAMP-regulatory protein COG3212: 62181972 bifunctional N-succinyldiaminopimelate-aminotransferase COG3213: 62181973 para-aminobenzoate synthase component II COG3214: 62181974 cell filamentation protein Fic COG3215: 62181975 peptidyl-prolyl cis-trans isomerase A COG3216: 62181976 hypothetical protein SC3406 COG3217: 62181978 nitrite reductase, large subunit COG3218: 62181979 nitrite reductase small subunit COG3219: 62181980 nitrite transporter NirC COG3220: 62181981 siroheme synthase COG3221: 62181983 tryptophanyl-tRNA synthetase COG3222: 62181984 phosphoglycolate phosphatase COG3223: 62181985 ribulose-phosphate 3-epimerase COG3224: 62181986 DNA adenine methylase COG3225: 62181988 3-dehydroquinate synthase COG3226: 62181990 porin COG3227: 62181991 hypothetical protein SC3421 COG3228: 62181992 hypothetical protein SC3422 COG3229: 62181993 hypothetical protein SC3423 COG3230: 62181997 hypothetical protein SC3427 COG3231: 62181998 hydrolase COG3232: 62181999 ribosome-associated heat shock protein Hsp15 COG3233: 62182002 phosphoenolpyruvate carboxykinase COG3234: 62182003 osmolarity sensor protein COG3235: 62182004 osmolarity response regulator COG3236: 62182005 transcription elongation factor GreB COG3237: 62182007 ferrous iron transport protein A COG3238: 62182008 ferrous iron transport protein B COG3239: 62182009 hypothetical protein SC3439


COG3240: 62182010 hypothetical protein SC3440 COG3241: 62182011 BioH COG3242: 62182012 gluconate periplasmic binding protein COG3243: 62182013 DNA uptake protein COG3244: 62182014 GntP family, high-affinity gluconate permease in GNT I system COG3245: 62182015 4-alpha-glucanotransferase COG3246: 62182016 maltodextrin phosphorylase COG3247: 62182017 transcriptional regulator MalT COG3248: 62182024 DNA-binding transcriptional repressor GlpR COG3249: 62182025 intramembrane serine protease GlpG COG3250: 62182026 thiosulfate sulfurtransferase COG3251: 62182027 glycerol-3-phosphate dehydrogenase COG3252: 62182034 COG3253: 62182039 aspartate-semialdehyde dehydrogenase COG3254: 62182041 low affinity gluconate transporter COG3255: 62182043 LacI family transcriptional regulator COG3256: 62182044 hypothetical protein SC3474 COG3257: 62182046 acetyltransferase YhhY COG3258: 62182047 sugar metabolism transcriptional regulator COG3259: 62182048 hypothetical protein SC3478 COG3260: 62182049 inner membrane protein COG3261: 62182050 gamma-glutamyltranspeptidase COG3262: 62182051 hypothetical protein SC3481 COG3263: 62182053 glycerol-3-phosphate transporter ATP-binding subunit COG3264: 62182054 glycerol-3-phosphate transporter membrane protein COG3265: 62182055 glycerol-3-phosphate transporter permease COG3266: 62182056 glycerol-3-phosphate transporter periplasmic binding protein COG3267: 62182059 leucine/isoleucine/valine transporter ATP-binding subunit COG3268: 62182060 leucine/isoleucine/valine transporter ATP-binding subunit COG3269: 62182063 hypothetical protein SC3493 COG3270: 62182067 RNA polymerase factor sigma-32 COG3271: 62182068 cell division protein FtsX COG3272: 62182069 cell division protein FtsE COG3273: 62182070 cell division protein FtsY COG3274: 62182071 16S rRNA m(2)G966-methyltransferase COG3275: 62182072 hypothetical protein SC3502 COG3276: 62182073 hypothetical protein SC3503 COG3277: 62182074 hypothetical protein SC3504 COG3278: 62182076 methyl-accepting 
transmembrane citrate/phenol chemoreceptor


COG3279: 62182078 hypothetical protein SC3508 COG3280: 62182079 hypothetical protein SC3509 COG3281: 62182081 PerM family permease COG3282: 62182082 holo-(acyl carrier protein) synthase 2 COG3283: 62182083 nickel responsive regulator COG3284: 62182086 hypothetical protein SC3516 COG3285: 62182087 hypothetical protein SC3517 COG3286: 62182088 PiT family, low-affinity phosphate transporter COG3287: 62182089 universal stress protein UspB COG3288: 62182090 universal stress protein A COG3289: 62182091 inner membrane transporter YhiP COG3290: 62182093 oligopeptidase A COG3291: 62182095 hypothetical protein SC3525 COG3292: 62182096 glutathione reductase COG3293: 62182106 GntR family transcriptional regulator COG3294: 62182109 phage endolysin COG3295: 62182112 hypothetical protein SC3542 COG3296: 62182113 MFS family transporter COG3297: 62182116 ketodeoxygluconokinase COG3298: 62182117 Zn-dependent peptidase COG3299: 62182118 C4-dicarboxylate transporter DctA COG3300: 62182119 phosphodiesterase COG3301: 62182121 endo-1,4-D-glucanase, partial COG3302: 62182129 dipeptide transporter COG3303: 62182130 dipeptide transporter permease DppB COG3304: 62182133 PQQ repeat-containing protein COG3305: 62182143 hypothetical protein SC3573 COG3306: 62182144 3-methyladenine DNA glycosylase COG3307: 62182145 hypothetical protein SC3575 COG3308: 62182146 biotin sulfoxide reductase COG3309: 62182147 outer membrane lipoprotein COG3310: 62182148 2-hydroxyacid dehydrogenase COG3311: 62182149 hypothetical protein SC3579 COG3312: 62182151 cold-shock protein COG3313: 62182152 hypothetical protein SC3582 COG3314: 62182160 glycyl-tRNA synthetase subunit beta COG3315: 62182161 glycyl-tRNA synthetase subunit alpha COG3316: 62182162 outer membrane lipoprotein COG3317: 62182164 hypothetical protein SC3594


COG3318: 62182166 xylose isomerase COG3319: 62182167 xylose operon regulatory protein COG3320: 62182168 hypothetical protein SC3598 COG3321: 62182171 3-keto-L-gulonate-6-phosphate decarboxylase COG3322: 62182172 L-xylulose 5-phosphate 3-epimerase COG3323: 62182173 AraC family transcriptional regulator COG3324: 62182174 aldehyde dehydrogenase COG3325: 62182175 transcriptional regulator COG3326: 62182178 glutathione S-transferase COG3327: 62182179 PTS family, mannitol-specific enzyme IIABC components COG3328: 62182180 mannitol-1-phosphate 5-dehydrogenase COG3329: 62182182 hypothetical protein SC3612 COG3330: 62182183 hypothetical protein SC3613 COG3331: 62182184 inner membrane lipoprotein COG3332: 62182187 DNA-binding transcriptional repressor LldR COG3333: 62182188 L-lactate dehydrogenase COG3334: 62182189 tRNA/rRNA methyltransferase YibK COG3335: 62182190 transcriptional regulator COG3336: 62182191 mandelate racemase COG3337: 62182192 serine acetyltransferase COG3338: 62182193 NAD(P)H-dependent glycerol-3-phosphate dehydrogenase COG3339: 62182194 preprotein translocase subunit SecB COG3340: 62182195 glutaredoxin 3 COG3341: 62182196 rhodanese-related sulfurtransferase COG3342: 62182197 phosphoglyceromutase COG3343: 62182198 hypothetical protein SC3628 COG3344: 62182199 hypothetical protein SC3629 COG3345: 62182200 glycosyl transferase family protein COG3346: 62182201 L-threonine 3-dehydrogenase COG3347: 62182202 2-amino-3-ketobutyrate CoA ligase COG3348: 62182203 ADP-L-glycero-D-manno-heptose-6-epimerase COG3349: 62182204 ADP-heptose--LPS heptosyltransferase COG3350: 62182205 ADP-heptose--LPS heptosyltransferase COG3351: 62182206 hypothetical protein SC3636 COG3352: 62182207 hexose transferase, lipopolysaccharide core biosynthesis COG3353: 62182209 lipopolysaccharide core biosynthesis protein COG3354: 62182210 UDP-D-glucose:(galactosyl)lipopolysaccharide glucosyltransferase COG3355: 62182211 UDP-D-galactose:(glucosyl)lipopolysaccharide-alpha-1,3-D-galactosyltransferase COG3356: 62182212 UDP-D-galactose:(glucosyl)lipopolysaccharide-1,6-D-galactosyltransferase


COG3357: 62182214 lipopolysaccharide core biosynthesis; phosphorylation of core heptose COG3358: 62182215 glucosyltransferase I COG3359: 62182216 lipopolysaccharide core biosynthesis protein COG3360: 62182217 3-deoxy-D-manno-octulosonic-acid transferase COG3361: 62182218 phosphopantetheine adenylyltransferase COG3362: 62182219 formamidopyrimidine-DNA glycosylase COG3363: 62182220 50S ribosomal protein L33 COG3364: 62182221 50S ribosomal protein L28 COG3365: 62182222 DNA repair protein RadC COG3366: 62182225 nucleoid occlusion protein COG3367: 62182226 orotate phosphoribosyltransferase COG3368: 62182227 ribonuclease PH COG3369: 62182228 hypothetical protein SC3658 COG3370: 62182229 LysR family transcriptional regulator COG3371: 62182230 Zn-dependent hydrolase COG3372: 62182231 hypothetical protein SC3661 COG3373: 62182234 COG3374: 62182235 DNA-directed RNA polymerase subunit omega COG3375: 62182236 bifunctional (p)ppGpp synthetase II COG3376: 62182237 tRNA guanosine-2'-O-methyltransferase COG3377: 62182238 ATP-dependent DNA helicase RecG COG3378: 62182240 GltS family glutamate transport protein COG3379: 62182241 NCS2 family, purine/xanthine transport protein COG3380: 62182242 hypothetical protein SC3672 COG3382: 62182247 hypothetical protein SC3677 COG3383: 62182249 inner membrane protein COG3384: 62182252 inner membrane protein COG3385: 62182255 Mg2+ transport protein COG3386: 62182257 hypothetical protein SC3687 COG3387: 62182258 hypothetical protein SC3688 COG3388: 62182259 phosphotransferase system enzyme II COG3389: 62182260 phosphotransferase system enzyme IIC COG3390: 62182261 phosphotransferase system enzyme IIB COG3391: 62182262 phosphotransferase system enzyme IIA COG3392: 62182264 inner membrane protein COG3393: 62182265 glycosyl hydrolase family protein COG3394: 62182267 helix-turn-helix protein COG3395: 62182277 hypothetical protein SC3707 COG3396: 62182280 DNA-binding transcriptional activator UhpA


COG3397: 62182281 hypothetical protein SC3711 COG3398: 62182282 L-fucose permease COG3399: 62182283 ribokinase family sugar kinase COG3400: 62182284 DeoR family transcriptional regulator COG3401: 62182285 acetolactate synthase 1 regulatory subunit COG3402: 62182286 acetolactate synthase catalytic subunit COG3403: 62182290 multidrug resistance protein D COG3404: 62182292 DNA-binding transcriptional regulator DsdC COG3405: 62182293 permease DsdX COG3406: 62182294 D-serine dehydratase COG3407: 62182295 hypothetical protein SC3725 COG3408: 62182296 hypothetical protein SC3726 COG3409: 62182298 hypothetical protein SC3728 COG3410: 62182301 heat shock chaperone IbpB COG3411: 62182302 heat shock protein IbpA COG3412: 62182304 hypothetical protein SC3734 COG3413: 62182308 cytochrome c peroxidase COG3414: 62182309 chaperone protein TorD COG3415: 62182310 trimethylamine N-oxide reductase subunit COG3416: 62182323 DNA gyrase subunit B COG3417: 62182324 recombination protein F COG3418: 62182325 DNA polymerase III subunit beta COG3419: 62182326 chromosome replication initiator DnaA COG3420: 62182328 COG3421: 62182329 inner membrane protein translocase component YidC COG3422: 62182330 tRNA modification GTPase TrmE COG3423: 62182331 multidrug efflux system protein MdtL COG3424: 62182332 hypothetical protein SC3762 COG3425: 62182335 xanthine/uracil permease family protein COG3426: 62182336 6-phosphogluconate phosphatase COG3427: 62182337 transcriptional regulator PhoU COG3428: 62182338 phosphate transporter ATP-binding protein COG3429: 62182339 phosphate transporter permease subunit PtsA COG3430: 62182340 phosphate transporter permease subunit PstC COG3431: 62182341 phosphate ABC transporter substrate-binding protein COG3432: 62182343 dipeptide/oligopeptide/nickel ABC transporter substrate-binding protein COG3433: 62182344 glucosamine--fructose-6-phosphate aminotransferase COG3434: 62182345 bifunctional N-acetylglucosamine-1-phosphate uridyltransferase COG3435: 62182346 ATP 
synthase F0F1 subunit epsilon


COG3436: 62182347 ATP synthase F0F1 subunit beta COG3437: 62182348 ATP synthase F0F1 subunit gamma COG3438: 62182349 ATP synthase F0F1 subunit alpha COG3439: 62182350 ATP synthase F0F1 subunit delta COG3440: 62182351 ATP synthase F0F1 subunit B COG3441: 62182352 ATP synthase F0F1 subunit C COG3442: 62182353 ATP synthase F0F1 subunit A COG3443: 62182356 16S rRNA methyltransferase GidB COG3444: 62182357 tRNA uridine 5-carboxymethylaminomethyl modification protein GidA COG3445: 62182358 flavodoxin COG3446: 62182359 DNA-binding transcriptional regulator AsnC COG3447: 62182361 hypothetical protein SC3791 COG3448: 62182363 potassium transport protein Kup COG3449: 62182364 D-ribose pyranase COG3450: 62182367 D-ribose transporter subunit RbsB COG3451: 62182368 ribokinase COG3452: 62182371 GntR family transcriptional regulator COG3453: 62182374 transcriptional regulator HdfR COG3454: 62182375 hypothetical protein SC3805 COG3455: 62182377 acetolactate synthase 2 regulatory subunit COG3456: 62182378 branched-chain amino acid aminotransferase COG3457: 62182379 dihydroxy-acid dehydratase COG3458: 62182380 threonine dehydratase COG3459: 62182381 hypothetical protein SC3811 COG3460: 62182382 hypothetical protein SC3812 COG3461: 62182384 ketol-acid reductoisomerase COG3462: 62182385 peptidyl-prolyl cis-trans isomerase C COG3463: 62182388 ATP-dependent DNA helicase Rep COG3464: 62182389 guanosine pentaphosphate phosphohydrolase COG3465: 62182391 thioredoxin COG3466: 62182392 transcription termination factor Rho COG3467: 62182393 undecaprenyl-phosphate alpha-N-acetylglucosaminyl 1-phosphate transferase COG3468: 62182394 lipopolysaccharide biosynthesis protein WzzE COG3469: 62182395 UDP-N-acetylglucosamine 2-epimerase COG3470: 62182397 dTDP-glucose 4,6-dehydratase COG3471: 62182399 TDP-fucosamine acetyltransferase COG3472: 62182400 TDP-4-oxo-6-deoxy-D-glucose transaminase COG3473: 62182401 O-antigen translocase in LPS biosynthesis COG3474: 62182402 4-alpha-L-fucosyltransferase


COG3475: 62182403 common antigen polymerase COG3476: 62182404 UDP-N-acetyl-D-mannosaminuronic acid transferase COG3477: 62182405 transporter COG3478: 62182406 protoheme IX biogenesis protein COG3479: 62182408 uroporphyrinogen-III synthase COG3480: 62182409 porphobilinogen deaminase COG3481: 62182410 adenylate cyclase COG3482: 62182411 frataxin-like protein COG3483: 62182412 inner membrane protein COG3484: 62182415 outer membrane lipoprotein COG3485: 62182417 hypothetical protein SC3847 COG3486: 62182418 site-specific tyrosine recombinase XerC COG3487: 62182419 phosphatase COG3488: 62182420 DNA-dependent helicase II COG3489: 62182421 magnesium/nickel/cobalt transporter CorA COG3490: 62182424 resistance COG3491: 62182426 phospholipase A COG3492: 62182427 ATP-dependent DNA helicase RecQ COG3493: 62182428 threonine efflux system COG3494: 62182429 homoserine/homoserine lactone efflux protein COG3495: 62182430 lysophospholipase L2 COG3496: 62182431 sugar phosphatase COG3497: 62182433 metE/metH transcriptional regulator COG3498: 62182434 5-methyltetrahydropteroyltriglutamate/ S-methyltransferase COG3499: 62182435 dienelactone hydrolase family protein COG3500: 62182436 COG3501: 62182437 DNA recombination protein RmuC COG3502: 62182438 ubiquinone/menaquinone biosynthesis methyltransferase COG3503: 62182439 hypothetical protein SC3869 COG3504: 62182440 ubiquinone biosynthesis protein UbiB COG3505: 62182441 twin-arginine translocation protein TatA COG3506: 62182442 sec-independent translocase COG3507: 62182443 twin-arginine protein translocation system subunit TatC COG3508: 62182444 DNase TatD COG3509: 62182445 transcriptional activator RfaH COG3510: 62182446 3-octaprenyl-4-hydroxybenzoate carboxy-lyase COG3511: 62182447 FMN reductase COG3512: 62182451 proline dipeptidase COG3513: 62182453 protoporphyrinogen oxidase


COG3514: 62182457 molybdopterin-guanine dinucleotide biosynthesis protein MobA COG3515: 62182458 hypothetical protein SC3888 COG3516: 62182459 serine/threonine protein kinase COG3517: 62182460 protein disulfide isomerase I COG3518: 62182461 DNA polymerase I COG3519: 62182463 ribosome biogenesis GTP-binding protein YsxC COG3520: 62182466 coproporphyrinogen III oxidase COG3521: 62182468 nitrogen regulation protein NR(II) COG3522: 62182469 glutamine synthetase COG3523: 62182470 GTP-binding protein COG3524: 62182476 hypothetical protein SC3906 COG3525: 62182480 alpha-glucosidase COG3526: 62182483 oxidoreductase COG3527: 62182485 glycerol-3-phosphate regulon repressor COG3528: 62182486 phosphatase COG3529: 62182488 D-tyrosyl-tRNA(Tyr) deacylase COG3530: 62182489 acetyltransferase COG3531: 62182490 acetyl esterase COG3532: 62182494 formate dehydrogenase accessory protein FdhE COG3533: 62182495 formate dehydrogenase-O subunit gamma COG3534: 62182496 formate dehydrogenase-O, Fe-S subunit COG3535: 62182497 formate dehydrogenase accessory protein COG3536: 62182499 hypothetical protein SC3929 COG3537: 62182501 hypothetical protein SC3931 COG3538: 62182502 hypothetical protein SC3932 COG3539: 62182503 hypothetical protein SC3933 COG3540: 62182507 rhamnulokinase COG3541: 62182508 transcriptional activator RhaS COG3542: 62182510 rhamnose-proton symporter COG3543: 62182512 hypothetical protein SC3942 COG3544: 62182516 superoxide dismutase COG3545: 62182518 inner membrane protein COG3546: 62182519 two-component sensor protein COG3547: 62182520 DNA-binding transcriptional regulator CpxR COG3548: 62182521 repressor CpxP COG3549: 62182522 ferrous iron efflux protein F COG3550: 62182523 6-phosphofructokinase COG3551: 62182525 CDP-diacylglycerol pyrophosphatase COG3552: 62182528 ADP-ribosylglycohydrolase


COG3553: 62182529 GntR family transcriptional regulator COG3554: 62182533 transcriptional repressor COG3555: 62182534 sugar ABC transporter membrane subunit COG3556: 62182535 sugar ABC transporter membrane subunit COG3557: 62182536 sugar ABC transporter periplasmic subunit COG3558: 62182537 aldolase COG3559: 62182538 autoinducer-2 (AI-2) modifying protein LsrG COG3560: 62182539 epimerase COG3561: 62182540 triosephosphate isomerase COG3562: 62182541 hypothetical protein SC3971 COG3563: 62182543 ferredoxin-NADP reductase COG3564: 62182545 COG3565: 62182546 MIP channel protein COG3566: 62182547 hypothetical protein SC3977 COG3567: 62182548 ribonuclease activity regulator protein RraA COG3568: 62182549 1,4-dihydroxy-2-naphthoate octaprenyltransferase COG3569: 62182550 ATP-dependent protease ATP-binding subunit HslU COG3570: 62182551 ATP-dependent protease peptidase subunit COG3571: 62182554 primosome assembly protein PriA COG3572: 62182558 outer membrane lipoprotein COG3573: 62182560 transcriptional repressor protein MetJ COG3574: 62182561 cystathionine gamma-synthase COG3575: 62182562 bifunctional II/homoserine dehydrogenase II COG3576: 62182566 hypothetical protein SC3996 COG3577: 62182567 5,10-methylenetetrahydrofolate reductase COG3578: 62182571 fructose-6-phosphate aldolase COG3579: 62182577 paral regulator COG3580: 62182579 phosphoenolpyruvate carboxylase COG3581: 62182581 N-acetyl-gamma-glutamyl-phosphate reductase COG3582: 62182582 acetylglutamate kinase COG3583: 62182583 argininosuccinate lyase COG3584: 62182584 DNA-binding transcriptional regulator OxyR COG3585: 62182585 soluble pyridine nucleotide transhydrogenase COG3586: 62182586 DNA-binding transcriptional repressor FabR COG3587: 62182587 hypothetical protein SC4017 COG3588: 62182588 tRNA (uracil-5-)-methyltransferase COG3589: 62182590 glutamate racemase COG3590: 62182595 UDP-N-acetylenolpyruvoylglucosamine reductase COG3591: 62182596 biotin--protein ligase


COG3592: 62182597 COG3593: 62182601 preprotein translocase subunit SecE COG3594: 62182602 transcription antitermination protein NusG COG3595: 62182603 50S ribosomal protein L11 COG3596: 62182604 50S ribosomal protein L1 COG3597: 62182605 50S ribosomal protein L10 COG3598: 62182607 DNA-directed RNA polymerase subunit beta COG3599: 62182608 DNA-directed RNA polymerase subunit beta' COG3600: 62182609 inner membrane protein COG3601: 62182611 thiamine biosynthesis protein ThiH COG3602: 62182613 thiamine biosynthesis protein ThiF COG3603: 62182614 thiamine-phosphate pyrophosphorylase COG3604: 62182615 thiamine biosynthesis protein ThiC COG3605: 62182616 anti-RNA polymerase sigma 70 factor COG3606: 62182617 NADH pyrophosphatase COG3607: 62182618 uroporphyrinogen decarboxylase COG3608: 62182619 endonuclease V COG3609: 62182620 hypothetical protein SC4050 COG3610: 62182621 transcriptional regulator HU subunit alpha COG3611: 62182622 hypothetical protein SC4052 COG3612: 62182623 zinc resistance protein COG3613: 62182624 sensor protein ZraS COG3614: 62182625 transcriptional regulatory protein ZraR COG3615: 62182626 phosphoribosylamine--glycine ligase COG3616: 62182630 hypothetical protein SC4060 COG3617: 62182631 homoserine O-succinyltransferase COG3618: 62182632 malate synthase COG3619: 62182633 isocitrate lyase COG3620: 62182636 IclR family transcriptional regulator COG3621: 62182641 hypothetical protein SC4071 COG3622: 62182642 23S rRNA pseudouridine synthase F COG3623: 62182643 hypothetical protein SC4073 COG3624: 62182669 aspartate kinase COG3625: 62182670 glucose-6-phosphate isomerase COG3626: 62182671 hypothetical protein SC4101 COG3627: 62182672 outer membrane lipoprotein COG3628: 62182674 outer membrane lipoprotein COG3629: 62182675 phosphate-starvation-inducible protein PsiE COG3630: 62182676 maltose ABC transporter permease


COG3631: 62182677 maltose transporter membrane protein COG3632: 62182680 maltoporin COG3633: 62182681 maltose regulon periplasmic protein COG3634: 62182684 glycerol-3-phosphate acyltransferase COG3635: 62182685 COG3636: 62182686 LexA repressor COG3637: 62182687 DNA-damage-inducible SOS response protein COG3638: 62182689 stress-response protein COG3639: 62182690 zinc uptake transcriptional repressor COG3640: 62182691 hypothetical protein SC4121 COG3641: 62182692 tRNA-dihydrouridine synthase A COG3642: 62182694 quinone oxidoreductase COG3643: 62182695 replicative DNA helicase COG3644: 62182696 alanine racemase COG3645: 62182698 /phosphotransferase COG3646: 62182699 hypothetical protein SC4129 COG3647: 62182700 hypothetical protein SC4130 COG3648: 62182701 hypothetical protein SC4131 COG3649: 62182702 outer membrane lipoprotein COG3650: 62182703 excinuclease ABC subunit A COG3651: 62182704 single-stranded DNA-binding protein COG3652: 62182706 hypothetical protein SC4136 COG3653: 62182707 methyl-accepting chemotaxis protein COG3654: 62182708 ABC transporter outer membrane protein COG3655: 62182709 membrane permease, cation efflux pump COG3656: 62182711 bacteriocin/lantibiotic ABC transporter COG3657: 62182712 hypothetical protein SC4142 COG3658: 62182714 DNA-binding transcriptional regulator SoxS COG3659: 62182715 redox-sensing transcriptional activator SoxR COG3660: 62182716 glutathione S-transferase COG3661: 62182717 xanthine/uracil permease family protein COG3662: 62182719 LysR family transcriptional regulator COG3663: 62182720 hypothetical protein SC4150 COG3664: 62182721 hypothetical protein SC4151 COG3665: 62182722 acetate permease COG3666: 62182724 acetyl-CoA synthetase COG3667: 62182726 cytochrome c552 COG3668: 62182727 cytochrome c nitrite reductase pentaheme subunit COG3669: 62182728 nitrite reductase; formate-dependent, Fe-S centers


COG3670: 62182729 nitrate reductase, formate dependent COG3671: 62182730 formate-dependent nitrite reductase COG3672: 62182731 formate-dependent nitrite reductase complex subunit NrfG COG3673: 62182733 glutamate/aspartate:proton symporter COG3674: 62182734 hypothetical protein SC4164 COG3675: 62182735 dioxygenase for synthesis of lipid COG3676: 62182736 aminoalkylphosphonic acid N-acetyltransferase COG3677: 62182737 hypothetical protein SC4167 COG3678: 62182738 hypothetical protein SC4168 COG3679: 62182739 proline/glycine betaine transporter COG3680: 62182740 sensor protein BasS/PmrB COG3681: 62182741 DNA-binding transcriptional regulator BasR COG3682: 62182742 cell division protein COG3683: 62182744 transcriptional activator of AdiA COG3684: 62182745 arginine decarboxylase COG3685: 62182746 DNA-binding transcriptional regulator MelR COG3686: 62182747 alpha-galactosidase COG3687: 62182751 DNA-binding transcriptional activator DcuR COG3688: 62182755 anaerobic dimethyl sulfoxide reductase subunit C COG3689: 62182756 hypothetical protein SC4186 COG3690: 62182758 hypothetical protein SC4188 COG3691: 62182760 LuxR family transcriptional regulator COG3692: 62182761 DNA-binding domain-containing protein COG3693: 62182767 non-specific acid phosphatase COG3694: 62182770 transcriptional regulator COG3695: 62182771 thiol:disulfide interchange protein COG3696: 62182772 divalent-cation tolerance protein CutA COG3697: 62182773 anaerobic C4-dicarboxylate transporter COG3698: 62182774 aspartate ammonia-lyase COG3699: 62182776 FxsA protein COG3700: 62182779 molecular chaperone GroEL COG3701: 62182780 hypothetical protein SC4210 COG3702: 62182782 hypothetical protein SC4212 COG3703: 62182784 entericidin A COG3704: 62182786 LuxR family transcriptional regulator COG3705: 62182787 quaternary ammonium compound-resistance protein SugE COG3706: 62182788 outer membrane lipoprotein Blc COG3707: 62182789 fumarate reductase subunit D COG3708: 62182790 fumarate reductase subunit C


COG3709: 62182791 fumarate reductase iron-sulfur subunit COG3710: 62182793 lysyl-tRNA synthetase COG3711: 62182794 amino acid APC transporter COG3712: 62182797 phosphatidylserine decarboxylase COG3713: 62182799 oligoribonuclease COG3714: 62182800 arginine-binding periplasmic protein COG3715: 62182801 Fe-S protein COG3716: 62182803 ATPase COG3717: 62182804 N-acetylmuramoyl-L-alanine amidase COG3718: 62182805 DNA mismatch repair protein COG3719: 62182806 tRNA delta(2)-isopentenylpyrophosphate transferase COG3720: 62182807 RNA-binding protein Hfq COG3721: 62182808 GTPase HflX COG3722: 62182809 FtsH protease regulator HflK COG3723: 62182810 FtsH protease regulator HflC COG3724: 62182812 adenylosuccinate synthetase COG3725: 62182814 exoribonuclease R COG3726: 62182815 23S rRNA (guanosine-2'-O-)-methyltransferase COG3727: 62182816 hypothetical protein SC4246 COG3728: 62182818 hypothetical protein SC4248 COG3729: 62182820 inner membrane protein COG3730: 62182822 hypothetical protein SC4252 COG3731: 62182823 biofilm stress and motility protein A COG3732: 62182824 esterase COG3733: 62182825 transcriptional repressor UlaR COG3734: 62182828 PTS system L-ascorbate-specific transporter subunit IIB COG3735: 62182829 PTS system L-ascorbate-specific transporter subunit IIA COG3736: 62182832 L-ribulose-5-phosphate 4-epimerase COG3737: 62182835 30S ribosomal protein S6 COG3738: 62182836 primosomal replication protein N COG3739: 62182837 30S ribosomal protein S18 COG3740: 62182838 50S ribosomal protein L9 COG3741: 62182839 hypothetical protein SC4269 COG3742: 62182840 hypothetical protein SC4270 COG3743: 62182842 D-alanine/D-serine/glycine permease COG3744: 62182845 iron-sulfur cluster repair di-iron protein COG3745: 62182846 hypothetical protein SC4276 COG3746: 62182847 hypothetical protein SC4277 COG3747: 62182849 adenosine-3'(2'),5'-bisphosphate


COG3748: 62182850 hypothetical protein SC4280 COG3749: 62182851 hypothetical protein SC4281 COG3750: 62182852 hypothetical protein SC4282 COG3751: 62182853 methionine sulfoxide reductase A COG3752: 62182854 hypothetical protein SC4284 COG3753: 62182855 hypothetical protein SC4285 COG3754: 62182856 hypothetical protein SC4286 COG3755: 62182857 permease COG3756: 62182858 inorganic pyrophosphatase COG3757: 62182859 fructose-1,6-bisphosphatase COG3758: 62182861 hypothetical protein SC4291 COG3759: 62182863 cytochrome b(562) COG3760: 62182865 hypothetical protein SC4295 COG3761: 62182866 hypothetical protein SC4296 COG3762: 62182867 hypothetical protein SC4297 COG3763: 62182868 inner membrane protein COG3764: 62182869 inner membrane protein COG3765: 62182871 selenocysteine synthase COG3766: 62182873 phosphotransferase system mannitol/fructose-specific IIA component COG3767: 62182874 bifunctional antitoxin/transcriptional repressor RelB COG3768: 62182875 inner membrane protein COG3769: 62182876 anaerobic ribonucleotide reductase-activating protein COG3770: 62182877 anaerobic ribonucleoside triphosphate reductase COG3771: 62182881 trehalose repressor COG3772: 62182883 hypothetical protein SC4313 COG3773: 62182884 aspartate carbamoyltransferase COG3774: 62182885 aspartate carbamoyltransferase COG3775: 62182887 arginine repressor COG3776: 62182888 hypothetical protein SC4318 COG3777: 62182889 ornithine carbamoyltransferase COG3778: 62182890 carbamate kinase COG3779: 62182893 hypothetical protein SC4323 COG3780: 62182894 ornithine carbamoyltransferase subunit I COG3781: 62182895 hypothetical protein SC4325 COG3782: 62182896 hydroxylase for synthesis of 2-methylthio-cis-ribozeatin in tRNA COG3783: 62182899 acetyltransferase COG3784: 62182900 inner membrane protein COG3785: 62182901 valyl-tRNA synthetase COG3786: 62182902 DNA polymerase III subunit chi


COG3787: 62182903 leucyl aminopeptidase COG3788: 62182905 permease COG3789: 62182906 permease COG3790: 62182907 L-idonate regulator COG3791: 62182927 hypothetical protein SC4357 COG3792: 62182928 hypothetical protein SC4358 COG3793: 62182931 hypothetical protein SC4361 COG3794: 62182932 DNA-binding transcriptional repressor UxuR COG3795: 62182933 tryptophanyl-tRNA synthetase COG3796: 62182935 aspartate racemase COG3797: 62182938 hypothetical protein SC4368 COG3798: 62182939 hypothetical protein SC4369 COG3799: 62182940 hypothetical protein SC4370 COG3800: 62182941 hypothetical protein SC4371 COG3801: 62182953 hypothetical protein SC4383 COG3802: 62182963 phosphoglycerol transferase I COG3803: 62182964 hypothetical protein SC4394 COG3804: 62182965 DNA replication protein DnaC COG3805: 62182966 primosomal protein DnaI COG3806: 62182967 hypothetical protein SC4397 COG3807: 62182968 hypothetical protein SC4398 COG3808: 62182971 hypothetical protein SC4401 COG3809: 62182972 ferric hydroximate transport ferric iron reductase COG3810: 62182973 diguanylate cyclase/phosphodiesterase domain-containing protein COG3811: 62182974 hypothetical protein SC4404 COG3812: 62182975 16S ribosomal RNA m2G1207 methyltransferase COG3813: 62182977 nucleotidase COG3814: 62182978 peptide chain release factor 3 COG3815: 62182979 hypothetical protein SC4409 COG3816: 62182981 hypothetical protein SC4411 COG3817: 62182982 YjjV COG3818: 62182983 pyruvate formate lyase activating enzyme COG3819: 62182984 hypothetical protein SC4414 COG3820: 62182985 deoxyribose-phosphate aldolase COG3821: 62182987 phosphopentomutase COG3822: 62182988 purine nucleoside phosphorylase COG3823: 62182990 lipoate-protein ligase A COG3824: 62182991 hypothetical protein SC4421 COG3825: 62182993 phosphoserine phosphatase


COG3826: 62182994 DNA repair protein RadA COG3827: 62182995 nicotinamide-nucleotide adenylyltransferase COG3828: 62182996 ABC transporter ATP-binding protein COG3829: 62182998 Trp operon repressor COG3830: 62183002 hypothetical protein SC4432 COG3831: 62183003 DNA-binding response regulator CreB COG3832: 62183004 sensory histidine kinase CreC COG3833: 62183005 hypothetical protein SC4435 COG3834: 62183007 fimbrial subunit COG3835: 62183010 fimbrial chaperone protein COG3836: 62183011 inner membrane protein COG3837: 62183013 two-component response regulator COG3838: 62183015 RNA methyltransferase


Appendix B

Growth Characterizations of SEE1 and SEE2

Figure B-1: Growth Curves of SEE1 and SEE2.


Figure B-2: Cell Enumeration and Viability by OD600

Matthew R. Moreau Vita

Education
The Pennsylvania State University, 2013-2015, Ph.D. Pathobiology. Advisors: Dr. Subhashinie Kariyawasam and Dr. Bhushan Jayarao
The Pennsylvania State University, 2012-2013, M.S. Pathobiology. Advisor: Dr. Vivek Kapur
Bridgewater State University, 2004-2008, B.S. Biomed. and Mol. Biology (Hon). Advisors: Dr. Michael Carson, Dr. Patricia Mancini, Dr. Michelle LaBonte

Leadership, Academic Associations and Awards
2015 Harold F. Martin Graduate Assistant Outstanding Teaching Award
2015 Certificate for Teaching in College
2008 Departmental Honors in Biology from Bridgewater State University
2006 Induction into Tri-Beta National Honors Society of the Biological Sciences

Publications
Moreau, Matthew R., Megan L. Bailey, Sudharsan R. Gongati, Dona Saumya S. Wijetunge, Eranda M.K. Kurundu Hewage, Yury V. Ivanov, Laura L. Goodfield, Mary J. Kennett, Bhushan Jayarao, and Subhashinie Kariyawasam. Growth in Egg Yolk Enhances Salmonella Enteritidis Colonization and Virulence. (In Preparation for PLoS Pathogens).
Moreau, Matthew R., Yury V. Ivanov, Bhushan Jayarao and Subhashinie Kariyawasam. Comparative Genomics of Egg and Human Isolates Salmonella Enteritidis. (In Preparation for BMC Genomics).
Moreau, Matthew R., Yury V. Ivanov, Bhushan Jayarao and Subhashinie Kariyawasam. Pan-Genome Analyses Implicate Genes Involved in Host Specific Infection of Salmonella Enteritidis. (In Preparation for BMC Genomics).
Moreau, Matthew R., Indira T. Kudva, Robab Katani, Lingling Li, Rebecca Cote, Michael Mwangi, and Vivek Kapur. Analysis of Adherence Genes in Enterohemorrhagic Escherichia coli Isolate SS17 Reveal Divergence in Adhesion Strategies to Cattle and Human Colonic Cells. PLoS One.
Moreau, Matthew R., Michelle Q. Carter, Maria L. Brandl, Robab Katani, Lingling Li, Rebecca Cote, and Vivek Kapur. Analysis of Genes and SNPs Implicated in the Enhanced Biofilm Formation of EHEC Isolate SS17 Induced in Spinach Lysate. (In Preparation for Applied and Environmental Microbiology).
Cote, Rebecca, Robab Katani, Matthew R. Moreau, et al. (2014) Comparative Analysis of Super-Shedder Strains of Escherichia coli O157:H7 Reveals Distinctive Genomic Features and a Strongly Aggregative Adherent Phenotype on Bovine Rectoanal Junction Squamous Epithelial Cells. PLoS One.

Selected Professional Talks and Presentations
1. Matthew R. Moreau, Robab Katani, Rebecca Cote, Indira T. Kudva, Michelle Q. Carter, et al. (2014). “Molecular Insights Into the Unique Phenotypes Exhibited by Super Shedder Isolates of Escherichia coli O157:H7.” American Society of Microbiology General Meeting. Boston, MA 02210. Abstract and Poster.
2. Matthew Moreau and Kathleen Postle. (2012). “Bacterial Cell Death Mediated by Overexpression of TonB in E. coli.” Pittsburgh Bacterial Meeting. Pittsburgh, PA 15282. Abstract and Poster.
3. Matthew Moreau, David Hughes, Carol Dickerson, and Jeffery T. Sample. (2010). “Mechanism and Role of the Autoregulatory Function of the EBV Genome Maintenance Protein EBNA-1.” Penn State College of Medicine Genetics Seminar. Hershey, PA 17033. Oral Presentation.
4. Matthew Moreau and Michael J. Carson. (2008). “The Effects of Quorum Sensing on the ftsQAZ Region of Escherichia coli.” National Conference of Undergraduate Research (NCUR). Salisbury, MD 21801. Abstract and Poster.

Teaching Experience
2015 Completion of Teaching in College Certificate, Penn State University
2014 Graduate Teaching Assistant for MICRB421W Lab, Penn State University
2013-2014 Graduate Teaching Assistant for MICRB107 Lab, Penn State University