USE OF MICROARRAY TECHNOLOGY TO STUDY THE PHYSIOLOGY AND PATHOGENESIS OF MOUSE COLONISING STRAINS OF HELICOBACTER PYLORI

Lucinda Jenny Thompson

A thesis submitted for the degree of

Doctor of Philosophy

School of Biotechnology and Biomolecular Sciences, Microbiology and Immunology,

University of NSW

February, 2003

ACKNOWLEDGMENTS

The completion of this thesis would not have been possible without the endless support and understanding of many wonderful people:

I am in the fortunate position of being able to thank two sets of fantastic scientists from two institutions on opposite sides of the world: “The Helico lab” and “The Blue-Green Groove Machine” at University of NSW, Down Under, and the Falkow lab, aka “The Bugdungeon” at Stanford University, Sunny California.

The Helico lab: Many different people have made my stay in the Helico lab enjoyable over the past four years: some who became PhDs and inspired me to get there as well, particularly Bronwyn and Cora; others who have endured to keep the whole show running, Jani O’Rourke and John Wilson, whom I must thank for their support, guidance and amazing talents, especially in the animal house. To the leaders: a big thank you must go to Hazel Mitchell for taking over and becoming a great supervisor in the time of need. Finally to my other supervisor, Adrian Lee, whose infectious enthusiasm for microbiology, especially of the spiral variety, has been incredibly inspirational. Adrian’s amazing personality made the lab a great place to be, and drew people to him from all around the world. Thank you for believing enough in me to open a door I never believed possible and giving me the opportunity of a lifetime.

The Blue-Green Groove Machine: Thank you for letting me be part of your groove and for being great friends. To Tim and Michelle who remembered me when I wasn’t there; to Janine for the great coffee breaks and for saving me! Finally, to Brett Neilan my surrogate co-supervisor, whose conversations over a pint were always enlightening and for believing I could be a Stanfordite.

The Bugdungeon: Thank you for taking me in and looking after me so far from home, especially Sara the “lab mom” whose generosity was wonderful. To Nina, Karen and Corrie for taking me under your wings and teaching me the art of being a female scientist, it was quite inspirational. A big thank you to Scotty for helping, supporting and befriending me beyond the call of duty; without whom I would never have finished. Finally, to Stanley Falkow for taking a “risk” with me and being inspiring and supportive. I’ll always think of you as “God of Bacterial Pathogenesis”. I am also indebted to the generosity of Stan and Lucy for having me in their home, nudging me in the right direction and making my whole US experience enjoyable, if not a bit surreal. The experience has changed my life in many ways.

Aussie friends: There have been lots of people over the years that have made it all bearable and whose love kept me going. An especially big thank you and hug to my oldest friends Anna, Jess, Emily, Jamie and The Gang.

Seppo friends: You are some of the most generous, lovely people in the world. Thank you to all the Stanford swimmers, especially Wendy, for welcoming me into your “bubble”, making me love swimming again, and helping me enjoy this wonderful area. Now I know why people find it so hard to leave this place! Or why I keep coming back! Thank you also to Juliana, Laura, and Dave for your friendship and fun and for looking after me when I needed it.

Family: The people I TRULY owe my whole success to are my family. You have all been so supportive, patient and understanding. To Mum and Dad, thank you so much for everything, I could not have EVEN attempted this without you. Dad, you have been wonderfully supportive, have always had the best advice, have believed in me, have shown me the humour in things, reminded me “that this life is not just a dress rehearsal” and most importantly “that I am (supposed to be) enjoying myself!” Mum, you have always been so encouraging, have always led me in the right direction, believed in me, been there for me to lean on (a lot!), listened patiently to all my woes and made it all not seem so bad. Thank you both for the LONG, long distance phone calls that have kept me sane and your genuine interest in the science which motivated me to keep going. To Scot for always looking after me, for teasing me incessantly and most of all for being interested and supportive. To Catriona for rescuing me every time I needed it, for being a great friend, not just a big sister, and for all your support. To Lachlan, the newest member, for being a welcome and exciting distraction; looking forward to being buddies. Thank you all for supporting me on my journey across the world.

Extended Family: To Richard and Heather for putting up with the nerdy little sister! To Ian and Jenny Macintosh, and Jill Harris and Lawson Lobb for being incredibly encouraging with your kind words, I owe much of my determination to you.

Furry Friends: Helgies thanks for looking after Mum and Dad for me, remembering me every time I came home and being the most lovely, cutest creature ever. To my newest friends Aqua (aka “Sweet Potato”) and Jake (aka “Eggieplant”), thanks for looking after me, being excellent companions and a wonderful procrastination, you guys are beautiful!

Last but definitely not least, Peyman, thank you for making me love every day, for enjoying life, food, and exercise as much as I do and for showing me the beauty in the world. Thank you for your generosity, loving me through the hardest part and believing I would get there eventually.

I read recently a particularly pertinent comment from an Australian living in France: “It’s hard to love two countries” Sarah Turnbull.

“Do not give your heart to one mistress, nor your loyalty to a single place, for countless are mistresses, and extensive are lands and seas” Sa’adi.

PUBLICATIONS AND PRESENTATIONS

Publications

Sutton, P., Danon, S. J., Walker, M., Thompson, L. J., Wilson, J., Kosaka, T. and Lee, A. 2001. Post-immunisation gastritis and Helicobacter infection in the mouse: a long term study. Gut. 49: 467-73.

Lucinda J. Thompson, D. Scott Merrell, Brett A. Neilan, Hazel Mitchell, Adrian Lee, and Stanley Falkow. 2003. Expression Profiling of Helicobacter pylori Reveals a Growth-Phase-Dependent Switch in Virulence. Infect. Immun. 71(5), Accepted 15 Jan 2003.

Invited chapters

Thompson, L. J. & de Reuse, H. 2002. Genomics of Helicobacter pylori. Helicobacter. 7(Suppl 1): 1-7.

Lee A., Thompson L., O'Rourke JL. 2002. Priorities for future research: Microbiology. In: Hunt RH, Tytgat GNJ, ed. Helicobacter pylori: Basic mechanisms to clinical cure 2002. Dordrecht/Boston/London: Kluwer Academic Publishers, in press.

Proceedings

Thompson, L. J., Danon, S. J., Wilson, J., O'Rourke, J., Salama, N., Falkow, S., Mitchell, H. and Lee, A. 2001. Presence of the cag Pathogenicity Island does not affect the Level of Colonization or Inflammation in Mice infected with Helicobacter pylori. In European Helicobacter pylori Study Group XIVth International Workshop on Gastroduodenal Pathology and Helicobacter pylori, Vol. 49 suppl. II (Ed, Farthing, M. J. G.) BMJ Publishing Group, Strasbourg, France, pp. A14.

Thompson, L. J., Danon, S. J., Wilson, J., O'Rourke, J., Salama, N., Falkow, S., Mitchell, H. and Lee, A. 2001. Presence of the cag Pathogenicity Island does not affect the Level of Colonization or Inflammation in Mice infected with Helicobacter pylori. In 11th International Workshop on Campylobacter, Helicobacter and Related Organisms, Vol. 291 Suppl. No. 31 (Ed, Hacker, J.) Urban & Fischer Verlag GmbH & Co. KG, Freiburg, Germany, pp. 137.

Thompson, L. J., Guillemin, K., Falkow, S. and Lee, A. 2001. Global Gene Expression of Helicobacter pylori during the Growth Cycle detected on a DNA Microarray. In European Helicobacter pylori Study Group XIVth International Workshop on Gastroduodenal Pathology and Helicobacter pylori, Vol. 49 suppl. II (Ed, Farthing, M. J. G.) BMJ Publishing Group, Strasbourg, France, pp. A9.

Thompson, L. J., Guillemin, K., Falkow, S. and Lee, A. 2001. Global Gene Expression of Helicobacter pylori during the Growth Cycle detected on a DNA Microarray. In 11th International Workshop on Campylobacter, Helicobacter and Related Organisms, Vol. 291 Suppl. No. 31 (Ed, Hacker, J.) Urban & Fischer Verlag GmbH & Co. KG, Freiburg, Germany, pp. 118.

Abstracts

Thompson, L. J., Danon, S. J., Wilson, J., O'Rourke, J., Salama, N., Falkow, S., Mitchell, H. and Lee, A. 2002. Have we found a better mouse model of H. pylori infection? In 4th Western Pacific Helicobacter Congress. Perth, Australia, pp. 8.1.6.

Thompson, L. J., Guillemin, K., Falkow, S. and Lee, A. 2002. DNA microarray analysis of the global gene expression of Helicobacter pylori during the growth cycle. In 4th Western Pacific Helicobacter Congress. Perth, Australia, pp. 8.2.4.

ABBREVIATIONS

aa. amino allyl
cDNA. Complementary DNA
CFU. Colony Forming Units
CLV. Curvilinear Velocity
Cy. Cyanine fluorescent dye
DNA. Deoxyribonucleic Acid
DU. Duodenal Ulcer
GA. Gland Abscesses
GC. Gastric Cancer
gDNA. Genomic DNA
GSP. Gene Specific Primers
GU. Gastric Ulcer
IDA. Iron Deficiency Anaemia
LA. Lymphoid Aggregate
LCM. Laser Capture Microdissection
LPS. Lipopolysaccharide
MALT. Mucosa-Associated Lymphoid Tissue
MDCK. Madin-Darby Canine Kidney
mRNA. Messenger RNA
OD. Optical Density
OMP. Outer Membrane Protein
ORF. Open Reading Frame
PAI. Pathogenicity Island
PCA. Principal Component Analysis
PUD. Peptic Ulcer Disease
RNA. Ribonucleic Acid
RPA. RNase Protection Assays
RT. Room Temperature
SAM. Significance Analysis of Microarrays
SMD. Stanford Microarray Database
SOM. Self Organising Map
TC. Time Course
TZ. Transitional Zone

ABSTRACT

Helicobacter pylori is a unique bacterial pathogen which colonises the human stomach. Infection with H. pylori has been linked to several disease outcomes including gastric and duodenal ulcer, gastric cancer and MALT lymphoma. Considering the harsh environment in which it resides and the lack of competition from other bacteria, this host/pathogen relationship is particularly interesting. Microarray analysis is a new and powerful technique which can be used to investigate various aspects of these complex interactions. Expression profiling of bacteria using microarrays remains in its infancy and thus appropriate methods were developed herein for investigating the transcriptional responses of H. pylori to various environments in vitro. Studies showed the tight relationship between growth phase dependent expression of iron homeostasis, motility and virulence in H. pylori for the first time. Consequently, the late exponential phase of growth was implicated as the most virulent growth phase of this bacterium in vitro. In response to mammalian cell co-culture, induced expression of H. pylori metabolism/respiration genes, genes of unknown function and genes encoding the 2-component regulators, HP1021 and HP0166, were detected. These represent a set of genes likely to be important specifically in the context of infection. To investigate the host response to infection a new mouse colonising strain of H. pylori, the Sydney Strain 2000 (SS2000), was isolated for use in comparative studies with the established strain, Sydney Strain 1 (SS1). Both host and strain specific effects were studied in a 15 month colonisation experiment using C57BL/6 and BALB/c mice. Genomic typing was used to investigate dynamic changes that occurred in the mouse-adapted strains during colonisation. In these animals responses relating to the severity of inflammation and to the infecting H. pylori isolate were revealed by gene expression profiling. Previously unrealised cellular responses were uncovered. These included the significant down-regulation of both ferritin and haemoglobin expression. This perhaps suggests a mechanism for H. pylori induced iron deficiency anaemia. Physiological connections between colonisation, acid secretion and expression of the endocrine hormones were also implicated. These experiments have shown the utility of microarray analysis in the investigation of pathogenesis and have highlighted many directions for further investigation.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
PUBLICATIONS AND PRESENTATIONS
ABBREVIATIONS
ABSTRACT
TABLE OF CONTENTS

CHAPTER 1: INTRODUCTION
1.1. HELICOBACTER PYLORI INFECTION
1.1.1. PATTERNS OF COLONISATION AND INFLAMMATION
1.1.2. DISEASE PROGRESSION
1.1.2.1. Duodenal Ulcer
1.1.2.2. Gastric Ulcer
1.1.2.3. Gastric Cancer and MALT lymphoma
1.1.2.4. Treatment of H. pylori Infection
1.2. H. PYLORI PHYSIOLOGY AND GENETIC MAKEUP
1.2.1. GENERAL PHYSIOLOGY
1.2.2. GENOME CONTENT
1.2.3. VIRULENCE FACTORS
1.2.3.1. Factors Required for Colonisation
1.2.3.2. Factors Required for Persistence
1.2.3.3. Factors Required for Host Tissue Damage and Disease Induction
1.3. ANIMAL MODELS USED FOR STUDYING H. PYLORI INFECTION
1.3.1. THE MOUSE MODEL
1.3.2. HOST FACTORS
1.4. MICROARRAY TECHNOLOGY
1.4.1. BACTERIAL MICROARRAYS
1.4.1.1. Transcription Profiling
1.4.1.2. Genome Typing
1.4.2. HOST MICROARRAYS
1.4.2.1. Transcriptional Profiling
1.5. INVESTIGATIONS INTO HOST/PATHOGEN RELATIONSHIPS
1.6. HYPOTHESES AND AIMS
1.6.1. OVERALL GOAL OF THIS THESIS:
1.6.2. HYPOTHESES TO BE TESTED:
1.6.3. SPECIFIC AIMS

CHAPTER 2: MATERIALS AND METHODS
2.1. CULTURE MEDIA
2.1.1. HORSE BLOOD AGAR (HBA)
2.1.2. CAMPYLOBACTER SELECTIVE AGAR (CSA)
2.1.3. GLAX SELECTIVE SUPPLEMENT AGAR (GSSA)
2.1.4. BRAIN HEART INFUSION BROTH PLUS GLYCEROL (BHIG)
2.1.5. BRUCELLA BROTH WITH FETAL CALF SERUM (BBF)
2.1.6. TISSUE CULTURE MEDIUM (DMEM/FCS)
2.1.7. CO-CULTURE MEDIUM (DMEM/FCS/BB)
2.2.1. PHOSPHATE BUFFERED SALINE (PBS) 0.1 M
2.2.2. PHYSIOLOGICAL SALINE
2.3.1. HELICOBACTER PYLORI STRAINS
2.3.2. CRYOPRESERVATION
2.3.3. RESUSCITATION AND PLATE CULTURE OF BACTERIA
2.3.4. COLONY FORMING UNIT (CFU) ESTIMATION
2.3.5. LIQUID CULTURE FOR TIME COURSE EXPERIMENTS
2.3.6. BACTERIAL HARVESTING FROM IN VITRO GROWTH IN BROTH OR CO-CULTURE
2.4. TISSUE CULTURE
2.4.1. CELL CULTURE OF MADIN-DARBY CANINE KIDNEY (MDCK) CELLS
2.4.2. CO-CULTURE OF H. PYLORI AND MDCK CELLS
2.5.1. ANIMAL MAINTENANCE AND HOUSING
2.5.2. STOMACH COLLECTION
2.5.3. VIABLE PLATE COUNT FOR DETECTION OF H. PYLORI INFECTION
2.6.1. FIXATION, PROCESSING AND SECTIONING
2.7. ELECTRON MICROSCOPY
2.7.1. SCANNING ELECTRON MICROSCOPY (SEM) OF BACTERIAL CULTURES
2.8.1. EXTRACTION METHODS
2.8.1.1. Genomic DNA (gDNA) extraction from bacterial cultures
2.8.1.2. RNA Isolation from H. pylori samples
2.8.1.3. RNA isolation from mammalian stomach tissue samples
2.8.2. AGAROSE GEL ELECTROPHORESIS OF RNA SAMPLES
2.8.3. REVERSE TRANSCRIPTION
2.8.4. RT-PCR
2.9.1. MICROARRAYS
2.9.1.1. H. pylori microarray
2.9.1.2. Murine microarray
2.9.2. MICROARRAY POST-PROCESSING
2.9.3. H. PYLORI MICROARRAY PROBE PREPARATION AND HYBRIDISATION
2.9.3.1. Genomic DNA labelling
2.9.3.2. Standard total RNA labelling protocol: Indirect incorporation of Cy-dyes
2.9.3.3. Hybridisation of H. pylori arrays
2.9.4. MURINE MICROARRAY PROBE PREPARATION AND HYBRIDISATION USING TOTAL RNA
2.9.5. MICROARRAY STRINGENCY WASHES
2.9.6. DATA ANALYSIS
2.9.6.1. Stanford Microarray Database
2.9.6.2. Normalisation by SMD
2.9.6.3. Data retrieval from SMD
2.9.6.3.1. H. pylori array data
2.9.6.3.2. Murine array data
2.9.6.4. Computer programs used for data analysis
2.9.6.4.1. CLUSTER and TREEVIEW
2.9.6.4.2. Significance Analysis of Microarrays (SAM)
2.9.6.4.3. Genome typing analysis (GACK)
2.9.6.4.4. Data manipulation programs

CHAPTER 3: DEVELOPMENT OF PROCEDURES FOR H. PYLORI TRANSCRIPTIONAL PROFILING
3.1. BACKGROUND
3.2. PART 1: BACTERIAL HARVESTING AND RNA EXTRACTION TECHNIQUES
3.2.1. EXPERIMENTAL PROCEDURES (PART 1)
3.2.1.1. Bacterial harvesting
3.2.1.2. RNA extraction from H. pylori cells
3.2.1.3. Assessment of the quality of RNA extracted from broth grown H. pylori
3.2.2. RESULTS AND DISCUSSION (PART 1)
3.2.3. CONCLUSION (PART 1)
3.3. PART 2: THE DIRECT LABELLING PROCEDURE
3.3.1. EXPERIMENTAL PROCEDURES (PART 2)
3.3.1.1. Improvement of the direct labelling procedure
3.3.1.1.1. Original direct labelling procedure (based on TIGR protocol)
3.3.1.2. RNA samples extracted for use in method testing
3.3.1.3. Comparison of the use of RNA versus genomic DNA as a reference in array hybridisations
3.3.1.4. Comparison of the use of random hexamers (R6) and Gene Specific Primers (GSP) for reverse transcription
3.3.1.5. Investigation into the efficiency of direct labelling using Superscript (Fig. 3.3A)
3.3.1.6. Estimation of array hybridisation quality
3.3.2. RESULTS AND DISCUSSION (PART 2)
3.3.2.1. Issues relating to Direct incorporation of Cy-dyes into the first strand cDNA
3.3.3. CONCLUSION (PART 2)
3.4. PART 3: DEVELOPMENT OF THE INDIRECT LABELLING PROCEDURES
3.4.1. EXPERIMENTAL PROCEDURES (PART 3)
3.4.1.1. Investigation into the efficiency of direct labelling using Klenow (Fig. 3.3B)
3.4.1.1.1. Direct labelling using Klenow procedure
3.4.1.2. Investigation into the efficiency of the indirect labelling procedure (Fig. 3.3C)
3.4.1.2.1. Indirect labelling procedure (New standard protocol)
3.4.2. RESULTS AND DISCUSSION (PART 3)
3.4.2.1. Improvement of the Direct incorporation of Cy-dUTP using Klenow instead of Reverse Transcriptase
3.4.2.2. Development of the Indirect labelling procedure
3.4.3. CONCLUSION (PART 3)
3.5. PART 4: FINE TUNING THE INDIRECT LABELLING PROCEDURE
3.5.1. EXPERIMENTAL PROCEDURES (PART 4)
3.5.1.1. Titration of RNA, GSPs, and cDNA for indirect labelling procedure
3.5.1.2. Investigation of the results generated from array experiments in which the cDNA sample was split for the sample and reference
3.5.1.3. Effect of splitting cDNA on the amount of total RNA required for labelling
3.5.2. RESULTS AND DISCUSSION (PART 4)
3.5.3. CONCLUSION (PART 4)
3.6. GENERAL CONCLUSIONS

CHAPTER 4: TRANSCRIPTIONAL PROFILING OF H. PYLORI GROWTH IN BROTH
4.1. BACKGROUND
4.2. EXPERIMENTAL PROCEDURES
4.2.1. TIME COURSES
4.2.2. PREPARATION AND HYBRIDISATION OF CDNA PROBES
4.2.3. DATA ANALYSIS
4.2.4. VISUALISATION AND STATISTICAL ANALYSIS OF THE DATA
4.2.5. RNASE PROTECTION ASSAYS (RPAS)
4.2.6. MOTILITY MEASUREMENTS
4.2.7. SUPPLEMENTARY MATERIAL
4.3. RESULTS AND DISCUSSION
4.3.1. RELIABILITY OF ARRAY DATA
4.3.2. GENE EXPRESSION IS TEMPORALLY REGULATED DURING H. PYLORI GROWTH
4.3.3. A MAJOR SWITCH IN GENE EXPRESSION PROFILES OCCURS DURING THE LATE-LOG PHASE
4.3.4. SIGNIFICANCE ANALYSIS OF THE LOG-STAT SWITCH
4.3.5. VALIDATION OF MICROARRAY RESULTS
4.3.6. THE H. PYLORI REGULATORY GENES
4.3.7. OPERON STRUCTURE AND GENE REGULATION
4.3.8. EXPRESSION OF VIRULENCE FACTORS AND THE LOG-STAT SWITCH
4.3.9. IRON HOMEOSTASIS REGULATION
4.3.10. MOTILITY AND THE CORRESPONDING EXPRESSION OF THE FLAGELLA REGULON
4.4. CONCLUSION

CHAPTER 5: TRANSCRIPTIONAL ANALYSIS OF H. PYLORI IN CO-CULTURE WITH MDCK CELLS
5.1. BACKGROUND
5.2. EXPERIMENTAL PROCEDURES
5.2.1. CO-CULTURE AND MAINTENANCE OF H. PYLORI AND MDCK CELLS
5.2.2. HARVESTING OF H. PYLORI CELLS FROM CO-CULTURE WITH MDCK CELLS
5.2.3. INOCULATION OF MDCK CELLS FOR TIME COURSE EXPERIMENTS
5.2.3.1. G27-MDCK co-culture time courses
5.2.3.2. SS1-MDCK co-culture time courses
5.2.4. DATA ANALYSIS
5.2.4.1. G27-MDCK co-culture time course analysis
5.2.4.2. SS1-MDCK co-culture time course analysis
5.2.4.3. Comparison of genes induced in G27 and SS1 co-culture
5.2.5. SUPPLEMENTARY MATERIAL
5.3. RESULTS
5.3.1. MAINTENANCE OF H. PYLORI IN CO-CULTURE WITH MDCK CELLS
5.3.2. RNA EXTRACTION FROM H. PYLORI GROWN IN CO-CULTURE
5.3.3. G27 GROWTH IN CO-CULTURE
5.3.4. COMPARISON OF GENE EXPRESSION PROFILES OF H. PYLORI GROWN IN CO-CULTURE AND REGULAR BROTH CULTURE
5.3.5. SPECIFIC TRANSCRIPTIONAL RESPONSE OF H. PYLORI IN PRESENCE OF MDCK CELLS
5.3.6. SS1 GROWTH AND GENE EXPRESSION IN CO-CULTURE
5.3.7. GENES INDUCED IN BOTH G27 AND SS1 CO-CULTURE
5.4. DISCUSSION
5.4.1. ADVANTAGES OF H. PYLORI CO-CULTURE WITH MDCK CELLS
5.4.2. G27-MDCK TIME COURSES
5.4.3. RESPONSE TO GROWTH IN DIFFERENT MEDIUMS
5.4.4. GENES INDUCED SPECIFICALLY IN THE PRESENCE OF LIVE MAMMALIAN CELLS
5.4.5. COMPARISON OF GENES INDUCED IN G27 AND SS1 IN CO-CULTURE
5.4.6. FUTURE DIRECTIONS
5.5. CONCLUSION

CHAPTER 6: THE NEW MOUSE COLONISING STRAIN
6.1. BACKGROUND
6.2. EXPERIMENTAL PROCEDURES
6.2.1. BACTERIAL CULTURES
6.2.2. ANIMAL INFECTIONS
6.2.2.1. Isolation of new mouse colonising strain
6.2.2.1.1. First animal passage (Passage A)
6.2.2.1.2. Second animal passage (Passage B)
6.2.2.1.3. Third animal passage (Passage C)
6.2.2.1.4. Fourth animal passage (Passage D)
6.2.2.2. Colonisation of original clinical isolate and mouse-adapted SS2000
6.2.2.3. Long term infection with SS2000 in comparison to SS1
6.2.3. ASSESSMENT OF COLONISATION LEVEL AND DISTRIBUTION
6.2.4. ASSESSMENT OF HISTOPATHOLOGY
6.2.5. STATISTICAL ANALYSIS
6.2.6. RANDOMLY AMPLIFIED POLYMORPHIC DNA (RAPD) PROFILING
6.3. RESULTS
6.3.1. ASSESSMENT OF COLONISATION OF CLINICAL ISOLATES DURING PASSAGE A, B & C
6.3.2. IDENTIFICATION OF THE ORIGINAL CLINICAL ISOLATE OF SS2000
6.3.3. COMPARISON OF PRE-MOUSE AND “MOUSIFIED” STRAINS’ COLONISATION ABILITY
6.3.4. ASSESSMENT OF COLONISATION AND INFLAMMATION AFTER LONG TERM INFECTION
6.3.4.1. Colonisation
6.3.4.2. Inflammation
6.4. DISCUSSION
6.4.1. ISOLATION OF A NEW MOUSE COLONISING STRAIN
6.4.2. CHARACTERISATION OF THE PRE- AND POST-MOUSE COLONISING STRAINS
6.4.3. EFFECT OF HOST AND STRAIN SPECIFIC FACTORS ON CHRONIC COLONISATION
6.5. CONCLUSIONS

CHAPTER 7: GENOME-TYPING OF THE MOUSE COLONISING STRAINS
7.1. BACKGROUND
7.2. EXPERIMENTAL PROCEDURES
7.2.1. STRAINS USED TO STUDY “MOUSIFICATION”
7.2.1.1. Microarray analysis of the strains in the “mousification” study
7.2.2. STRAINS USED TO STUDY THE GENOMIC CHANGES DURING LONG TERM COLONISATION
7.2.2.1. Microarrays for long term study
7.2.3. ANALYSIS OF GENOME TYPING MICROARRAY RESULTS USING GACK
7.2.4. GENOMIC CHANGES DURING “MOUSIFICATION”
7.2.5. GENOMIC CHANGES DURING LONG TERM COLONISATION
7.2.6. SUPPLEMENTARY MATERIAL
7.3. RESULTS
7.3.1. MICROARRAY ANALYSIS OF THE GENOME CONTENT OF PRE- AND POST-MOUSE STRAINS
7.3.1.1. Changes in SS2000 from the original clinical isolate, PMSS2000
7.3.1.2. Changes in SS1 from the original clinical isolate, 10700
7.3.1.3. Genomic differences between SS1 and SS2000
7.3.2. MICROARRAY ANALYSIS OF THE GENOMIC CONTENT OF STRAINS ISOLATED FROM THE LONG TERM COLONISATION STUDY
7.4. DISCUSSION
7.4.1. GENOMIC CHANGES DURING “MOUSIFICATION”
7.4.2. GENOMIC DIFFERENCES BETWEEN SS1 AND SS2000
7.4.3. CHANGES IN THE GENOMIC CONTENT OF SS1 AND SS2000 AFTER LONG TERM COLONISATION
7.5. CONCLUSIONS

CHAPTER 8: HOST TRANSCRIPTIONAL RESPONSE TO H. PYLORI INFECTION
8.1. BACKGROUND
8.2. EXPERIMENTAL PROCEDURES
8.2.1. RNA LABELLING AND MICROARRAY HYBRIDISATION
8.2.2. DATA RETRIEVAL
8.2.2.1. Comparison of data from the two array sets
8.2.3. DATA ANALYSIS
8.2.4. SUPPLEMENTARY MATERIAL
8.3. RESULTS AND DISCUSSION
8.3.1. TRANSCRIPTIONAL RESPONSE OF C57BL/6 MICE TO INFECTION
8.3.1.1. Genes with significantly different expression in samples from uninfected versus infected C57BL/6 mice
8.3.1.2. Genes with significantly different expression in C57BL/6 samples with different levels of pathology
8.3.1.3. Genes with significantly different expression in samples from SS1 versus SS2000 infected C57BL/6 mice
8.3.2. TRANSCRIPTIONAL RESPONSE OF BALB/C MICE TO INFECTION
8.3.2.1. Genes with significantly different expression in samples from uninfected versus infected BALB/c mice
8.3.2.1.1. Genes induced in infected BALB/c mice
8.3.2.1.2. Genes repressed in infected BALB/c mice
8.3.2.2. Genes with significantly different expression in BALB/c samples with different levels of pathology
8.3.2.2.1. Total monocyte infiltration
8.3.2.2.2. Number of gland abscesses and lymphoid aggregates
8.3.2.3. Genes with significantly different expression in samples from SS1 versus SS2000 infected BALB/c mice
8.3.3. DIRECT COMPARISON OF THE TRANSCRIPTIONAL RESPONSES OF C57BL/6 AND BALB/C MICE TO INFECTION
8.4. CONCLUSION

CHAPTER 9: GENERAL DISCUSSION AND FUTURE DIRECTIONS
9.1. BACTERIAL TRANSCRIPT PROFILING METHODOLOGY
9.2. H. PYLORI TRANSCRIPTIONAL REGULATION IN VITRO
9.2.1. BROTH CULTURE
9.2.2. CO-CULTURE
9.3. THE NEW MOUSE MODEL
9.3.1. COLONISATION AND INFLAMMATION
9.3.2. GENOME TYPING OF MOUSE COLONISING STRAINS
9.3.3. TRANSCRIPTIONAL PROFILING OF THE HOST RESPONSE TO INFECTION
9.4. CONCLUDING REMARKS

APPENDIX
BIBLIOGRAPHY

Chapter 1

INTRODUCTION

1.1. Helicobacter pylori Infection

Helicobacter pylori was first isolated in 1982 from the stomach of symptomatic patients undergoing endoscopic examination in Perth, Western Australia. At this time Barry Marshall and Robin Warren postulated that this bacterium was the causal factor of antral gastritis and gastric ulceration in humans (194). Prior to this discovery it was generally believed that a combination of environmental factors, particularly stress and alcohol consumption, was the cause of these gastric ailments (80). Contributing to this hypothesis was the belief that the stomach was a sterile environment due to its high acidity, with pH reported to fall as low as 2 (355). Despite this belief, it is now known that this unique opportunistic pathogen colonises the stomach of over half the world’s population (81, 354). While the majority of those infected with H. pylori remain asymptomatic, a subset of individuals go on to develop a number of diverse outcomes including gastric and duodenal ulcer, gastric cancer and B cell MALT lymphoma (82, 263).

1.1.1. Patterns of Colonisation and Inflammation

Early investigations into H. pylori colonisation showed that this bacterium initially colonises the gastric antrum region of the stomach, after which colonisation spreads along the lesser curvature of the stomach to the body region (128, 222). Further studies have shown that all regions of the glandular stomach including the antrum, body and cardia regions are colonised by this bacterium (276, 277, 343). However, the level of colonisation is usually highest in the antrum and cardia, although there is less direct evidence for the latter (114).

The normal human stomach is generally devoid of inflammatory tissue and cells. Following H. pylori infection an acute inflammatory response develops that is characterised by enhanced IL-8 secretion by the gastric epithelia that leads to infiltration of polymorphonuclear leukocytes (neutrophils) into the gastric mucosa. Along with this, other epithelial changes occur including mucin depletion, cellular exfoliation and regenerative changes (80). There is also an accompanying hypochlorhydria that may last from weeks to months before acid levels return to normal. In some individuals acid levels appear to remain low (333). In the majority of individuals this initial immune response fails to clear infection. After a short time chronic inflammation develops where lymphocytes, plasma cells, macrophages and a small number of eosinophils appear in the lamina propria. Long-term colonisation by H. pylori results in the development of chronic active gastritis that is characterised by the presence of both acute and chronic inflammation (80). One of the chronic responses to infection is the recruitment of primed B cells into lymphoid follicles in which the plasma cells are committed to producing mucosally protective IgA antibodies. It is thought that follicles, also called mucosa-associated lymphoid tissue (MALT), are always associated with H. pylori infected human stomachs, although these are often missed due to the position from which biopsies are taken (113).

The location of gastritis in infected individuals has been shown to correlate somewhat with the distribution of colonisation (19, 24). Antral predominant gastritis is associated with high levels of bacteria in the antrum, while pangastritis, involving both the antrum and body mucosa, is associated with significant colonisation of the antrum and body regions. Some reports show that antral gastritis is often more severe than that in the body despite relatively similar levels of bacteria in these two regions (343). The majority of individuals in developed countries develop antral predominant gastritis, but in a small number of individuals corpus (body)-predominant gastritis is observed (80). The most severe gastritis has been reported in the transitional zone (TZ) between the antrum and body regions, where it has been hypothesised that the bacteria may behave differently and thus cause more pronounced tissue damage (373).

If left untreated, persistent colonisation by H. pylori and the associated chronic active gastritis can lead to the development of atrophic gastritis and subsequently in some individuals more serious disease such as gastric ulceration and gastric cancer (80).

1.1.2. Disease Progression

Peptic ulceration can occur either near the antro-duodenal transitional zone, as a duodenal ulcer (DU), or in the body region of the stomach, as a gastric ulcer (GU). H. pylori infection has been linked to 95% (35, 201) and 70% (199) of these ulcers, respectively. The remaining GUs have been associated with the usage of non-steroidal anti-inflammatory drugs (NSAIDs) (200), while a small percentage of DUs have been linked to a number of acid hypersecretory states such as Zollinger-Ellison syndrome and duodenal Crohn’s disease, as well as the consumption of NSAIDs (35, 201, 295).

The prevalence of H. pylori infection and disease progression varies both within and between countries (7, 117, 120, 145, 162, 215, 260, 266, 303). In general, developing countries have a higher prevalence of infection than developed countries. Interestingly, infected individuals from developing countries have a higher probability of developing GU disease, while in developed countries they are more likely to develop DU. It is thought that these differences occur due to a number of factors such as socioeconomic status, diet, and genetic predisposition (215).

1.1.2.1. Duodenal Ulcer

Antral predominant gastritis and high levels of colonisation in the antrum are part of the disease sequelae associated with DU. This antral predominance is explained by the high local acid concentration in the body region of the stomach. The body region contains the acid secreting glands of the stomach, characterised by the presence of parietal and chief cells. Although H. pylori has developed mechanisms to cope with the acidic stomach, the pH range over which the bacterium can survive is relatively narrow, ranging from pH 3.5 to 5.0 in the presence of urea. In this pH range the bacterium is able to maintain the necessary proton motive force (PMF) for the generation of ATP for solute export and import (310). The organism uses its cytoplasmic urease enzyme to metabolise urea to produce ammonia, which in turn neutralises excess protons in the periplasmic space to maintain a pH of 6.2 (310). In very high acid concentrations the bacterium is unable to keep up with the influx of protons and cannot maintain its PMF. Thus, since there is a gradient of acid levels from the body to the antrum, in individuals with high basal acid, H. pylori moves into the more neutral antrum region. Indeed, patients with DU have been shown to have higher basal acid secretion (119). A parallel consequence of high basal acid secretion is that unneutralised acid reaches the duodenum. A major consequence of this is the development of gastric metaplasia, characterised by the presence of gastric-type mucus-secreting cells in the surface epithelium of the duodenum. This gastric metaplasia is thought to develop as an adaptive response to the excess acid reaching the duodenum, as this epithelial tissue is better able to cope with acid injury (80).
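As a rough illustration of the scale of the acid gradient described above, the pH values quoted in the text can be converted to approximate proton concentrations using the standard definition of pH. The figures below are simple arithmetic on those pH values, treating proton activity as concentration; they are not measurements taken from the cited studies.

```latex
\begin{align*}
[\mathrm{H^+}] &= 10^{-\mathrm{pH}}\ \mathrm{M} \\
\mathrm{pH}\ 3.5 &:\quad 10^{-3.5} \approx 3.2 \times 10^{-4}\ \mathrm{M} \\
\mathrm{pH}\ 5.0 &:\quad 10^{-5.0} = 1.0 \times 10^{-5}\ \mathrm{M} \\
\mathrm{pH}\ 6.2 &:\quad 10^{-6.2} \approx 6.3 \times 10^{-7}\ \mathrm{M} \\
\frac{[\mathrm{H^+}]_{\mathrm{pH}\,3.5}}{[\mathrm{H^+}]_{\mathrm{pH}\,6.2}} &= 10^{6.2-3.5} = 10^{2.7} \approx 500
\end{align*}
```

On this simplified view, holding the periplasm at pH 6.2 in a medium of pH 3.5 corresponds to a proton concentration difference of roughly 500-fold.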

The sequence of events proposed for the development of DU involves the acid induced development of gastric metaplasia, followed by spread of H. pylori colonisation from the stomach into the gastric-type epithelium, causing the development of acute and chronic inflammation. The resulting active chronic duodenitis then may develop into frank duodenal ulceration (80, 391).

1.1.2.2. Gastric Ulcer

In individuals who develop a corpus-predominant gastritis or pangastritis, development of a GU is more likely. This pangastritis occurs because the bacterium is more readily able to colonise the corpus region due to the lower levels of local acid. In the TZ between the antral and corpus regions, there is a gradient of pH. In this narrow region there are often increased colonisation levels and, as discussed earlier, often the most pronounced inflammation. One study by Oi et al. (244) showed that 95% of gastric ulcers were found in close proximity to the TZ, while another study by Stadelmann et al. (339) found that increasing atrophy was necessary for GU to develop. The increased colonisation by H. pylori in this region leads to epithelial degeneration and increased turnover of the epithelial cells. Levels of mucin are decreased and complement is activated. These factors, coupled with the increased atrophy and intestinal metaplasia distal to the TZ, cause this region to be more susceptible to acid-peptic attack and therefore peptic ulceration (80).

1.1.2.3. Gastric Cancer and MALT lymphoma

Since the discovery of H. pylori in 1982 an enormous number of epidemiological studies have examined the association between H. pylori and gastric cancer (54, 58, 106, 112, 162, 178, 232, 258, 325-327). On the basis of these epidemiological studies the International Agency for Research on Cancer (IARC), part of the World Health Organisation (WHO), identified H. pylori as a “group 1 (definite) carcinogen” in 1994 (13). The fact that gastric cancer is the second most common malignancy in the world makes this recognition a particularly pertinent one.

Correa was the first to describe the necessary events leading to a precursor lesion in individuals with gastric cancer (56, 57). This explanation described the initial lesion as superficial gastritis, which was followed by cell regeneration leading to hyperplasia of some glandular components of the stomach. Further destruction of the structure of the gastric glands leads to the replacement of normal glandular gastric epithelium by intestinal type epithelium. Following these intestinal metaplastic changes, further atypical cell changes and tumour development occur within the metaplastic epithelium (57). At this time Correa proposed that a chronic agent may be responsible for the development of gastric cancer. Following the isolation of H. pylori and the discovery that this organism was the major cause of superficial gastritis, H. pylori was implicated (57, 58).

The association between H. pylori and gastric cancer (GC) has been studied in numerous geographic locations, and in general those areas with high levels of GC also have a high prevalence of H. pylori (192). However, this is not always the case, a finding that argues for the importance of other factors, such as diet, environment and host genetics, in the development of GC. Because atrophy, the major precursor of GC, leads to hypochlorhydria, individuals with GC are often found to be H. pylori negative at the time of endoscopy (106, 232, 258). This is related to the neutrality of the stomach being hostile to the bacterium, because it is unable to maintain its PMF, and also to other epithelial changes that cause reduced bacterial adhesion. Thus, most evidence for the association between H. pylori and GC has arisen from seroprevalence data that are indicative of past H. pylori infection (106, 232, 258).

MALT lymphoma (non-Hodgkin’s gastric lymphoma) represents a small percentage (3%) of malignant gastric disease (259). This B cell lymphoma is thought to arise from lymphoid follicles that develop due to H. pylori infection in the lamina propria of the stomach. A study by Parsonnet et al. in 1994 showed that patients with gastric lymphoma were significantly more likely to have had previous H. pylori infection as compared with matched controls (259). Further studies in which eradication of the organism resulted in complete regression of low-grade gastric lymphomas in a large proportion of the patients have strengthened the association between H. pylori and MALT lymphoma development (25, 229, 390). It is now well accepted that H. pylori infection causes nearly 100% of low grade gastric MALT lymphoma development (175).

1.1.2.4. Treatment of H. pylori Infection

It is now well established that eradication of H. pylori infection can improve or cure several gastroduodenal diseases (205). Thus effective treatment and/or prevention of H. pylori infection is of enormous benefit. Soon after the discovery and culture of this organism it was established that H. pylori was susceptible to a number of antibiotics in vitro, including amoxicillin, clarithromycin, metronidazole, tetracycline and azithromycin (193). Thus, early treatments consisted of a monotherapy using one of these antibiotics. In the majority of cases, however, monotherapy was found to be unsuccessful, and dual therapies using a combination of a proton pump inhibitor (PPI), such as omeprazole, and an antibiotic were introduced. Some of these treatments claimed up to 80% eradication; however, eradication rates were variable and in many cases infection remained (115, 116).

Triple therapies were then introduced that consisted of a PPI together with two antibiotics, originally metronidazole and amoxicillin and later clarithromycin and amoxicillin (268). Initially such therapies resulted in eradication rates of up to 95% (179); however, over recent years H. pylori has shown increasing resistance to metronidazole and clarithromycin (52, 250, 368), which has caused a substantial fall in the rate of eradication using these therapies. Currently new therapies and a vaccine are being sought that can successfully cure this infection, particularly in developing countries where both infection rates and disease incidence are high (225). Genes essential for colonisation and/or virulence are considered the best targets for vaccine and drug development (169).

1.2. H. pylori Physiology and Genetic Makeup

1.2.1. General Physiology

H. pylori is a microaerophilic, Gram-negative spiral organism with a tuft of polar flagella (Fig. 1.1C & D). The ability of this organism to colonise the harsh ecological niche of the stomach makes its physiology unique. Apart from a number of other gastric Helicobacters such as H. felis (Fig. 1.1A & B), no other organisms colonise this niche and H. pylori has not been reliably isolated from any other environment. This suggests that H. pylori inhabits a relatively constant environment where it is maintained in a semi-continuous culture system (129). Many aspects of H. pylori biochemistry and physiology reflect this situation. These are particularly evident through investigation of the organism’s genome.

Figure 1.1: Collage of critical point dried scanning electron micrographs of two plate grown gastric Helicobacters, Helicobacter felis (panels A & B) and Helicobacter pylori (panels C & D). A) H. felis with a large tuft of flagella (16,000× magnification); B) Two H. felis cells showing a tight corkscrew-like spiral morphology and double spiral periplasmic fibrils (16,000× magnification); C) H. pylori with an S-shaped morphology and a small tuft of polar flagella. Terminal bulbs of the flagella are also evident (19,000× magnification); and D) Magnified view of the pole of an H. pylori cell from which the flagella protrude (81,000× magnification).

1.2.2. Genome Content

H. pylori was the first organism for which two individual strains were sequenced. The first of these was the 26695 strain, isolated from a gastritis patient in 1987. This was completed by The Institute for Genomic Research (TIGR) in 1997 (361). The second strain was the J99 strain, isolated from a patient with DU in 1994. This was sequenced by Genome Therapeutics Corp., licensed to the Astra Research Center Boston (ARCB), in 1999 (6). One of the motivations for sequencing a second strain of this organism was the belief that the genomic content of H. pylori strains varied enormously. This diversity had been illustrated by a number of methods including RAPD (random amplified polymorphic DNA), PFGE (pulsed-field gel electrophoresis) and RFLP (restriction fragment length polymorphism) analyses. Interestingly, Alm et al. showed that the sizes of the genomes of these two strains were similar (that of J99 was only 24 kb smaller than that of 26695), and that the overall genomic organisation, gene order and predicted proteomes were also alike (6). Many of the differences in sequence between these two genomes occurred in the third base of codons, which are rarely translated into amino acid differences (6). However, these nucleotide differences can cause enormous changes in the patterns obtained by the above mapping methods, and consequently any particular H. pylori strain could be distinguished from all other strains using these maps (2, 3, 153, 304, 349, 374).
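The point that third-base differences rarely alter the protein yet can change restriction-based fingerprints can be illustrated with a minimal Python sketch. The two 12-base sequences are invented for illustration and are not taken from the 26695 or J99 genomes; the EcoRI recognition site (GAATTC) is used simply as a familiar example of a restriction site, and the function names are hypothetical.

```python
# Illustrative sketch only: the sequences below are invented, not real
# H. pylori genes. The standard genetic code is built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ECORI_SITE = "GAATTC"  # recognition sequence of the restriction enzyme EcoRI


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def count_sites(dna, site=ECORI_SITE):
    """Count occurrences of a restriction site in a DNA string."""
    return sum(dna.startswith(site, i) for i in range(len(dna)))


# Hypothetical alleles differing only at the third base of two codons.
strain_a = "ATGGAATTCGCT"  # codons ATG GAA TTC GCT
strain_b = "ATGGAGTTCGCC"  # GAA->GAG and GCT->GCC, both synonymous

print(translate(strain_a), count_sites(strain_a))
print(translate(strain_b), count_sites(strain_b))
```

In this toy case both alleles encode the same peptide, but only the first retains the restriction site, so a restriction digest would give different fragment patterns for the two strains even though the predicted proteins are identical.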

The size of the H. pylori genome is about 1.6 Mb, smaller than many of the other fully sequenced Gram-negative bacterial pathogens such as Escherichia coli (4.7-5.3 Mb) (30), Salmonella typhimurium (4.8 Mb) (385) and Neisseria gonorrhoeae (2.2 Mb) (77), while it is of a similar size to Haemophilus influenzae (1.8 Mb) (176). This small genome is hypothesised to reflect the specific niche in which H. pylori resides. Many other pathogens are found not only colonising their host but also in environmental reservoirs, which requires these bacteria to have more diverse metabolic, respiratory and stress mechanisms to survive. H. pylori appears to be missing a number of components necessary for some metabolic functions. For example, it has a limited capability to acquire and catabolise sugars from the environment (190), preferring instead amino acids and perhaps fatty acids as the carbon and energy source (284).

The microaerophilic nature of H. pylori may also reflect adaptation to the gastric environment. The organism cannot grow anaerobically, but can survive in a range of oxygen concentrations, some strains being able to survive in levels nearing those in the atmosphere (127). The organism also requires carbon dioxide for growth (150). In the gastric mucus, oxygen and carbon dioxide concentrations may vary depending on the distance from the epithelial surface, acidity and contents of the stomach (150). H. pylori also possesses various anaerobic metabolic systems contributing to the organism’s microaerophilicity. It is able to use fumarate as an electron acceptor for respiration (206), although the reason for this respiration system is unknown, especially considering the enhanced production of ATP that can be achieved through aerobic respiration (150). The presence of enzymes that are oxygen-labile, including pyruvate:flavodoxin oxidoreductase (Por), may provide an explanation for the microaerobic nature of H. pylori (149). However, whether the organism switches from one type of respiration to the other at certain times is not known.

Contributing to the small size of the genome is the relative paucity of transcriptional regulators in H. pylori as compared with other pathogens. For example, there appear to be no homologues of the sigma factors for heat shock (σ32) and stationary phase transition (σS), suggesting a limited need for the regulation of these stress responses (6, 361). In addition, the organism possesses few obvious transcription factors characterised by helix-turn-helix motifs and has only about one third of the two-component regulatory systems present in E. coli (361). Finally, H. pylori has very limited operon structure, further compounding its lack of regulators. For example, the flagellar regulon is not organised in operons, and as a result it lacks some of the important negative feedback mechanisms present in other bacterial systems (247). Mutants unable to produce the flagellar hook (FlgE) still synthesise the flagellin subunits, although they are not exported (246). On the other hand, some of the regulatory factors of H. pylori have been found to have much broader applications in this organism than in other bacteria. For example, the Fur regulator has been found to affect the transcription of a wide variety of genes including those for urease (371) and riboflavin synthesis (190). These factors indicate the possibility that some genes in H. pylori have multiple functions.

A number of other regulatory mechanisms for transcription have been suggested for this organism. Each H. pylori strain appears to contain a unique complement of restriction/modification enzymes along with nine type II methyltransferases, indicating that gene expression may be regulated by methylation (6). In addition, H. pylori may use a mechanism of ‘slipped-strand repair’ to modulate gene expression. This was predicted due to the identification of a number of homopolymeric tracts and dinucleotide repeats, particularly in genes encoding outer membrane proteins (OMPs) (6).

Interestingly, the two genome sequences identified a number of strain specific genes: 89 genes in J99 were absent from 26695, and 95 genes in 26695 were not present in J99 (6). Most of these genes were located in two ‘plasticity zones’ that have a lower G+C percentage (35%) as compared with the rest of the genome (39%). Another area of variation occurring between many strains of H. pylori is a pathogenicity island of ~40 kb (35% G+C) that encodes one of the major virulence factors in this organism. These differences in G+C percentage indicate that the genes in these regions may have been acquired by horizontal gene transfer (6). The presence of strain specific genes, along with the possibilities for genomic variation such as ‘slipped strand repair’, indicates mechanisms by which different strains could cause varied pathology.
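The use of G+C percentage to flag regions of possible foreign origin can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the method used in the cited genome analyses: the toy "genome" is artificial, built from blocks of exactly 39% and 35% G+C to mirror the values quoted above, and the window size, step and threshold are arbitrary choices.

```python
def gc_percent(seq):
    """Return the G+C content of a DNA string as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def low_gc_windows(seq, window, step, threshold):
    """Yield (start, G+C %) for each window whose G+C is below threshold."""
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_percent(seq[start:start + window])
        if gc < threshold:
            yield start, gc


# Artificial 100-base blocks (not real sequence): 39% and 35% G+C.
background = "G" * 20 + "C" * 19 + "A" * 31 + "T" * 30
island = "G" * 18 + "C" * 17 + "A" * 33 + "T" * 32

# Toy genome: a low G+C "island" embedded in a 39% G+C background.
genome = background * 3 + island * 2 + background * 3

for start, gc in low_gc_windows(genome, window=100, step=100, threshold=37.0):
    print(f"window at {start}: {gc:.1f}% G+C")
```

With real sequence data the windows would be much larger and the G+C signal far noisier, so a scan of this kind would only be a first indication of candidate regions.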

1.2.3. Virulence Factors

The virulence factors of H. pylori, as in other pathogens, can be divided into categories: those required for entry into and initial colonisation of the host; those involved in persistence through avoidance and exploitation of the host immune responses; and finally those used to manipulate and damage host tissue and for transmission to a new host (99). Some of these virulence factors are required for more than one of these stages of infection. The specialised niche of H. pylori causes these virulence factors to be somewhat unique.

1.2.3.1. Factors Required for Colonisation

Much is known about the mechanisms used by H. pylori to colonise and persist in its host, while transmission and entry into the host are still controversial. Whether this bacterium is spread by food, water, or by gastric-oral or faecal-oral routes, it is known that the bacterium is ingested (215). In order for this organism to colonise, it must be able to withstand, at least transiently, an extremely high level of gastric acid. The main factor involved in acid resistance is the potent urease enzyme that is used to hydrolyse urea into ammonia and carbon dioxide. The enzyme is encoded in an operon containing seven open reading frames (ORFs), ureABIEFGH. The two enzyme subunits are encoded by ureAB, and the proteins encoded by ureE, ureF, ureG and ureH are required for the addition of the cofactor nickel to the enzyme, which is required for activity (110, 231). Urea is transported into the cytoplasm, where the majority of the enzyme resides, by a transporter encoded by the ureI gene (382). The ammonia produced by the enzyme diffuses back across the cell membrane, creating a cloud of neutral pH in the immediate vicinity of the organism. The action of this enzyme enables the organism to maintain a PMF across the cytoplasmic membrane. This is necessary to enable influx of H+ ions for the generation of ATP (310). Apart from the acid neutralising properties of urease, it is also necessary for colonisation in achlorhydric animals, suggesting further functions for this enzyme (90). These include nitrogen assimilation and providing energy for flagellar rotation (358).
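For reference, the net chemistry behind urease-mediated acid resistance can be written out as follows. These are the standard textbook reactions (net hydrolysis of urea, followed by protonation of ammonia) and are included as an aid to the reader rather than as a statement drawn from the cited studies.

```latex
\begin{align*}
\mathrm{CO(NH_2)_2 + H_2O} &\longrightarrow \mathrm{2\,NH_3 + CO_2} \\
\mathrm{NH_3 + H^+} &\rightleftharpoons \mathrm{NH_4^+}
\end{align*}
```

Each molecule of urea hydrolysed therefore yields two molecules of ammonia, each of which can consume one proton.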

The urease enzyme thus protects H. pylori from gastric acid; however, the optimum pH for this organism is thought to be between pH 5 and 6 (310). Thus, the organism possesses extremely effective mechanisms to swim from the stomach lumen, through the mucus covering the epithelium, to the more neutral pH environment close to the surface of the epithelial cells. The spiral shape and polar tuft of flagella with terminal bulbs are adapted to movement in viscous media such as the mucus layer (Fig. 1.1) (128). The flagella also have a unique sheath, HpaA, covering the filaments that is thought to protect these proteins from damage by acid. The importance of both urease (87) and flagella (91) for colonisation has been shown by investigations of the colonisation ability of mutants in both these factors in animal models. In both cases these mutants failed to colonise (87, 91).

As in other bacterial pathogens, chemotaxis is an important mechanism for successful colonisation and persistence. Chemotaxis enables organisms to sense and recognise environmental factors and to initiate movement to compensate for these (337). The chemotactic mechanisms of H. pylori are thought to be necessary to direct the organism to the mucus layer where it can avoid clearance from the stomach (337). H. pylori possesses four methyl-accepting chemotaxis proteins (MCPs), including TlpA and TlpB, that are presumably responsible for sensing the chemoattractants or -repellents (6, 361). It also has three homologues of components of the chemotaxis pathway present in E. coli, CheW, CheA and CheY, which are responsible for transducing the signal from the MCPs to the flagellar motor (337). Experimental evidence has shown that H. pylori has chemotactic activity towards a number of compounds, such as some amino acids, mucin, urea, sodium bicarbonate and sodium chloride (216, 337).

Although about 90% of bacterial cells appear to reside in the mucus layer covering the epithelium, H. pylori has developed the ability to attach to epithelial cells and to mucin, the major component of gastric mucus (358). Since transmission of this bacterium is probably a rare event, it is essential that H. pylori be able to colonise immediately in order to avoid being washed out of the stomach by peristalsis and by epithelial and mucus turnover. Thus adhesion is an important facet of the early establishment of colonisation. A number of specific adhesins have been identified including the Lewis b-binding adhesin BabA, which binds to Lewis b antigens (34), and the Lewis b antigen-independent adhesins AlpA and AlpB, whose host cell receptor is at present unknown (239). H. pylori also possesses a large number of OMPs of unknown function that may include further adhesins such as HopZ (5, 262). LPS (271, 272) and the heat shock proteins Hsp60 and (134, 135) have also been implicated in adhesion. The large number of potential adhesins, together with the considerable variation between strains and in the expression of these adhesins under different environmental conditions, suggests that H. pylori adhesion is probably multifactorial (358). Thus, few in vivo studies of this process have yet been attempted. Adherence of H. pylori appears to occur more frequently at cell-to-cell junctions (128), where it can disrupt the junction, leading to increased permeability of the epithelial layer (122, 348). Adhesion is also thought to be necessary for a number of the inflammatory responses induced by H. pylori infection (see below) (358).

1.2.3.2. Factors Required for Persistence

The combination of acid resistance, motility, chemotaxis and adhesion allows the organism to penetrate the mucus layer and to survive the vigorous gastric motility, emptying and epithelial regeneration that keep all other bacteria from colonising this niche. In order to persist in this environment, H. pylori also has several mechanisms to avoid the host defences designed to clear the infection. First, the organism has an extensive ability to scavenge and store iron (22). This is particularly important as the host uses various molecules such as lactoferrin and transferrin to bind available iron in its tissues, starving potential pathogens of this essential nutrient (12, 22). The importance of this process is evident in the fact that mutants in the gene encoding the iron storage protein ferritin (pfr) are unable to colonise mice (375).

Second, H. pylori has very effective mechanisms for avoiding damage from reactive oxygen species that are released by leukocytes. The combination of superoxide dismutase (SodB), which catalyses the conversion of the superoxide anion to hydrogen peroxide, and catalase (KatA), which in turn breaks down hydrogen peroxide to oxygen and water, allows H. pylori to neutralise the threat from oxygen radicals (27, 219). Third, H. pylori possesses anti-phagocytic mechanisms that allow it to avoid clearance by phagocytic cells recruited by the host. A number of factors may be involved in this ability including haemagglutination and the presence of the cytotoxin-associated gene pathogenicity island (cag PAI) (358). Last, the organism’s mechanisms of phenotypic variation via recombination and phase variation induced by slipped-strand repair may enable it to avoid recognition by the host immune system (361).
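The two-step detoxification described above corresponds to the following standard reactions, given here in their textbook form as a reader aid; the enzyme assignments follow the text (SodB and KatA).

```latex
\begin{align*}
\mathrm{2\,O_2^{\bullet -} + 2\,H^+} &\longrightarrow \mathrm{H_2O_2 + O_2} && \text{(superoxide dismutase, SodB)} \\
\mathrm{2\,H_2O_2} &\longrightarrow \mathrm{2\,H_2O + O_2} && \text{(catalase, KatA)}
\end{align*}
```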

1.2.3.3. Factors Required for Host Tissue Damage and Disease Induction

The mechanisms by which H. pylori causes its numerous disease sequelae are less well understood than the colonisation and persistence factors. The ability of H. pylori to induce extensive inflammation, epithelial rearrangement and ultimately more serious gastric diseases is due, in part, to a number of different factors encoded by the organism. The bacterium possesses a vacuolating cytotoxin (VacA) that induces cytoplasmic vacuolation in epithelial cells; alters the intracellular trafficking of proteins and cytoskeleton-dependent functions; and increases permeability of the epithelial cells (64, 197). The extent of vacuolation induced by this protein in vivo is largely unknown, although it does appear to enhance colonisation (18, 299). Oral administration of VacA also leads to erosion of the gastric epithelium in mice (369). In addition, vacuolisation of cells in human biopsy samples has been observed (43, 95). The true biological implication of VacA in this organism is largely unknown. One proposition is that permeabilisation of the epithelial cells may induce release of nutrients from the host, required by the bacterium, including iron (256).

H. pylori induced gastritis is characterised by infiltration of neutrophils and monocytes into the gastric mucosa (24, 194). The HP-NAP protein has been shown to induce neutrophil adhesion to endothelial cells, directing these cells to the gastric mucosa, and to stimulate NADPH-oxidase that in turn induces the release of oxygen radicals (98). The ability of H. pylori to recruit and then activate neutrophils to release oxygen radicals results in tissue damage causing the release of nutrients, which promotes H. pylori survival (104, 219). H. pylori can also protect itself from the toxic effects of the released oxygen radicals using its catalase and superoxide dismutase proteins (as discussed above).

One of the most important factors contributing to the understanding of H. pylori is the cag PAI. This pathogenicity island is composed of at least 27 genes. H. pylori strains carrying the PAI are far more likely to be associated with serious manifestations of H. pylori infection including peptic ulcer disease (PUD) and gastric cancer (61). It should be noted, however, that strains lacking the cag PAI can also be associated with chronic gastritis and more serious disease (63). Recently, it has been established that the cag PAI encodes a type IV secretion apparatus that injects the CagA protein, along with other unknown proteins, into epithelial cells, resulting in a number of signal cascades that lead to reorganisation of host cell actin and induction of IL-8 secretion from these cells (21, 45, 237). Adhesion to epithelial cells has been shown to be necessary for these cellular events (313). It has been established that following CagA translocation from the bacterial to the host cell, it is tyrosine phosphorylated by Src-like protein tyrosine kinases (316, 340, 342). Confocal microscopy analysis has revealed that the CagA protein is inserted into the plasma membrane of host cells (311). These events cause morphological changes in the infected cell known as the “hummingbird phenotype”, which is characterised by marked elongation and spreading of cells (311). Further, Higashi et al. (130) showed that CagA binds to the Src homology 2 domain (SH2)-containing tyrosine phosphatase (SHP-2) protein in a tyrosine-phosphorylation dependent manner and that SHP-2 is required for cell elongation (130, 394). The cellular processes leading to IL-8 secretion from cells are CagA independent (105) and involve both the NF-κB pathway and the ERK/MAP kinase cascades (341). Interestingly, there appears to be genetic variation in the cag PAI of many strains, rendering the island non-functional in some cases (67, 298).

More detailed mechanistic studies are required in vivo in order to fully understand the action of H. pylori virulence determinants as well as the synergistic relationships between them.

1.3. Animal Models Used for Studying H. pylori Infection

The ability to use human patients to study the intricate mechanisms of H. pylori colonisation and pathogenesis is limited. This is in part due to the small size and limited number of gastric biopsies that can be collected from each individual during endoscopy. In addition, the fact that most people are infected before the age of five (215) means that only the chronic form of infection can be investigated, as studies in children present even greater difficulties than those in adults. Furthermore, very few studies have shown the effects of acute inflammation in humans. Two volunteer ingestion studies performed by Marshall et al. in Australia (195) and Morris et al. in New Zealand (222) have been used to establish factors involved in early stages of infection. Although informative, extensive studies of this type are impractical. To advance our understanding of the complex nature of interactions between bacterium and host, animal models of H. pylori infection have been employed. These allow researchers to manipulate factors of both the host and bacterium and observe the effects. The types of animals used for these models have been varied. Both large and small models have been used and the choice depends largely on the question being answered.

Larger animal models such as non-human primates (83, 84), gnotobiotic piglets (87, 91), and beagle dogs (289) have been successfully colonised with Helicobacter species. These animals have a distinct advantage in that their gastric physiology closely resembles that of humans. In the case of non-human primates such as monkeys, it is also possible to endoscope at all stages of disease, thus allowing unprecedented studies into the time line of inflammation development (236). However, there are limitations to the use of these types of models, as they are not practical for most research institutions due to the size and expense of such animals (123, 188). In addition, the clinical presentations in these animals are usually limited to lymphocytic gastritis that does not progress to more severe disease such as ulceration and cancer. Further drawbacks are the presence of natural helicobacters, such as “H. heilmannii” in primates (282) and “H. suis” in pigs (71), and limitations for studying long term infections.

The use of small animal models is cheaper and more accessible for most research studies. For example, guinea pigs (324), Mongolian gerbils (136, 241), and suckling (123) and adult mice (103, 174) have been used for H. pylori research. The use of Mongolian gerbils is quickly gaining status as one of the best models of H. pylori induced pathology as both ulceration and cancer have been reported in these animals (380). However, there are limitations to the usefulness of this model, such as access to these animals and difficulty in reproducibility between laboratories across the world (A. Lee, personal communication). Rodents, especially adult mice, have been the most extensively utilised model. Studies in mice have concentrated mainly on colonisation and distribution studies, manipulating various parameters such as local acid production (70) and bacterial factors (37, 88, 160, 191, 231, 248). In addition, the mouse model has proved extremely useful for the testing of antigens and adjuvants for the development of vaccines (69, 345-347).

1.3.1. The Mouse Model

The original mouse model utilised Helicobacter felis, a close relative of H. pylori, for infection studies as no H. pylori strains had been found to successfully colonise the mouse persistently (40, 172). The H. felis model has been used extensively to study gastritis and vaccine models (69, 296). In 1996, a set of parameters was set down as guidelines for a suitable model using H. pylori infection. These included measurements for: 1) the grade of colonisation and pathology in the antrum and body, 2) the number of H. pylori per gram tissue, 3) the presence or absence of adhesion, 4) the longest period of continuous colonisation achieved, and 5) the number of passages in vitro over which the bacterium could retain its colonising ability (210). In 1997 the Sydney strain of H. pylori, SS1, was introduced. This strain was found to satisfy all of the above criteria, and through its dissemination a standardised animal model became widely available (174).

The SS1 strain was found to colonise at high levels and persist for up to two years. It was also shown to contain the virulence-related genes vacA and cagA (174). This model has been particularly useful for studying colonisation levels and distribution and for testing bacterial factors required for colonisation such as motility and urease (88, 231, 248), as well as many other gene products including arginase (203) and superoxide dismutase (319). One of the major criticisms of the SS1 model is that, despite high levels of colonisation, the inflammation caused by SS1 is only mild to moderate, depending on the strain of mouse infected (103, 174, 188, 297). A scattering of other strains that colonise mice for varied periods have been reported, but these have invariably been used for short-term studies (< 3 months) (370).

1.3.2. Host Factors

Many epidemiological studies have shown that both the rates of H. pylori infection and the type of disease progression vary considerably between populations and individuals. These results suggest that host factors are important determinants of the outcome of H. pylori infection. Several studies using animal models of Helicobacter infection have also shown that host factors are extremely important for the outcome of infection (218, 296, 297). Two mouse strains that show a clear difference in host response are the C57BL/6 and BALB/c strains. Studies using the H. felis mouse model have shown that some strains of mice, including C57BL/6 and C3H/He, develop moderate to severe atrophic gastritis within 6 months of infection and are thus referred to as responder strains, while BALB/c and CBA strains produce little or no inflammatory response after 6 months of infection and are thus of the non-responder type (296). However, after longer periods of infection (> 22 months) BALB/c mice have been found to develop a gastric mucosa-associated lymphoid tissue (MALT) that does not develop in C57BL/6 mice (96, 97).

The atrophic gastritis that develops in C57BL/6 mice due to either H. felis or H. pylori infection is characterised by a chronic active gastritis mainly in the body region of the stomach, with only mild gastritis in the antrum. This is accompanied by functional atrophy of the body, where parietal and chief cells are replaced with proliferating zone cells or mucus-secreting cells or both; this atrophy is much more apparent in H. felis than in H. pylori infection (296, 297). Additionally, the development of atrophy in H. felis infected responder mice was inversely correlated with colonisation levels. This decline in H. felis numbers is hypothesised to occur because conditions in the stomach become inhospitable to the bacterium, namely a more alkaline pH due to the loss of parietal cells, or possibly because of competition from other bacteria colonising the hypochlorhydric stomach (296).

In the non-responder strain BALB/c, the early stages of infection are characterised by negligible gastritis consisting of a mild cellular infiltration in the antrum only by 6 months of infection. There is also little apparent atrophy in these animals at any stage. Very long-term infection with H. felis induces a MALT-like pathology, while H. pylori does not induce this type of pathology even after 28 months (175).

Taken together, these animal experiments suggest that host factors are extremely important in defining the type of pathology that will develop in infection with Helicobacter. The factors that make these animals responsive to infection and inflammation have not yet been definitively identified. However, several factors have been implicated, including both genetic and physical differences. Studies with T- and B-cell deficient mice have implicated the T-cell response as a critical factor in Helicobacter infection (291). In particular, the T-helper response in mice is critical. A dominant pro-inflammatory Th1 response is present in some strains, including C57BL/6, while in other strains such as BALB/c there is a dominant anti-inflammatory Th2 response. Still other strains show a balanced Th1/Th2 response and have both types of pathologies seen in the above two strains (102). Various studies have implicated several Th1 cytokines in the induction of atrophic gastritis. These include IFN-γ, TNF-α, IL-7 and interferon regulatory factor-1 (IRF-1) (233, 243, 334). In addition, mice lacking IL-10, a key anti-inflammatory Th2 cytokine, developed severe hyperplastic gastritis due to Helicobacter infection (29). Finally, mice infected with both H. felis and the helminth Heligmosomoides polygyrus had reduced severity of gastritis, probably because the helminth elicits a polarised Th2 response (107).

Other specific genetic differences may be involved in the differing inflammatory response in various mouse strains. For example, the gene encoding the secretory phospholipase A2, group II type, has been shown to be disrupted in C57BL/6 mice but intact in BALB/c mice (151). The secretory phospholipases are thought to be important in maintenance of the gastric mucosal barrier (249). Thus, it has been suggested that the lack of this phospholipase in C57BL/6 mice may be linked to increased apoptosis and altered cellular differentiation (151). Physical differences between these two strains of mice may include differences in the levels of physical activity as well as differences in the level of basal gastric acid output (A. Lee, personal communication).

In human subjects the situation is more complicated because, along with possible genetic and physical differences between populations, diet and other environmental factors also contribute to the different responses to H. pylori infection. Some of the host factors that have been linked with H. pylori related clinical outcomes are differences in the basal level of acid secretion, decreased production of prostaglandin, and the nature of inflammation induced cytokines (376). In addition, particular polymorphisms in some genes, for example IL-1β and p53, have been shown to be related to an increased risk of gastric cancer (111, 166, 228, 377, 378).

In addition, the differences in the severity of the pathology produced by H. felis as compared with H. pylori in the mouse model suggest that bacterial factors are also important. In fact, many studies have focussed on bacterial factors in an effort to determine the reason for differences in disease progression in different individuals. Factors that have been studied include cagA status, vacA status and adhesin alleles (babA) (187, 211, 290, 292, 360, 369, 392). However, in human studies it is very difficult to separate host from bacterial effects. Many studies have attempted to control for host differences by studying particular ethnic populations; however, individual variations further complicate these studies. Few studies have utilised animal models to determine the effects of both host and H. pylori strain differences on the pathology produced, primarily because the range of animals in which multiple strains of H. pylori will colonise for an extended time is very small.

1.4. Microarray Technology

The quest to understand H. pylori pathogenesis, particularly with respect to the wide range of disease presentations associated with infection, has prompted researchers to seek a global view of this pathogen’s virulence mechanisms. To date, many different reporter systems such as lacZ (74), the ureB promoter (143) and xylE (146) have been used to investigate a handful of genes. In addition, the development of techniques such as differential fluorescence induction (DFI), signature tagged mutagenesis (STM), in vivo expression technology (IVET), recombination based in vivo expression technology (RIVET) and differential display has provided tools for the elucidation of transcriptional changes in many different bacterial species in response to different environmental conditions or in vivo colonisation (62, 125). Although these techniques have been somewhat enlightening, their usefulness has been limited because H. pylori is difficult to manipulate genetically and because they provide only a narrow view of the complicated responses of the organism. The advent of the whole genomic sequence of both H. pylori and a number of hosts, including the human and mouse, and the development of microarray technology have provided the ultimate research tools for understanding the relationships between pathogen and host through whole genome profiling. Microarrays can be used to study a variety of aspects of pathogens, including expression profiling, comparative genomics, the bacterial response to interaction with host cells and, finally, the host transcriptional response to infection.

1.4.1. Bacterial Microarrays

1.4.1.1. Transcription Profiling

The apparent lack of regulatory factors in H. pylori suggests that this bacterium may not exhibit extensive or dynamic regulation of gene expression. Transcriptional profiling using microarrays is being used to test this idea. To date, three studies have assessed the global transcriptional response of H. pylori to acidic conditions using microarrays (4, 11, 209). Two of these utilised microarrays to monitor the expression profile of the bacterium after incubation in neutral versus acidic conditions at one time point (4, 11). The conditions used by the two groups varied considerably: one used RNA extracted from H. pylori grown on pH adjusted agar plates (pH 5.5 and pH 7.2) for 48 h (11), while the other exposed H. pylori cells from an overnight culture to either a pH 4 citrate buffer or phosphate-buffered saline (pH 7.0) for 30 min and then extracted RNA for analysis (4). Comparison of the results of these two studies showed that both the numbers of genes found to be differentially regulated in acid versus neutral conditions and the identity of these genes varied considerably. The general lack of agreement between these two studies emphasises both the complexity of H. pylori’s response to acid and the difficulty in comparing single time point experiments for the assessment of global transcription. This is particularly problematic as the growth conditions being compared are likely to cause the bacterium to grow at different rates and thus be at different stages in its growth cycle.

In contrast, a more recent study by Merrell et al. (209) used a time course to examine the response of H. pylori to acidic versus neutral conditions. This study contributes a more comprehensive view of the H. pylori response to acidic conditions and uncovered over 180 genes that were differentially regulated by acid. The repressed genes included many iron related genes, including fur. Induced genes included the urease gene cluster, as expected, as well as the flagellar genes. Interestingly, the bacteria exposed to acid displayed an increased speed of movement as compared with controls exposed to neutral conditions (209). The induction of motility in response to sensing an acidic environment may provide a mechanism by which the bacterium can reposition itself in an area of the stomach in which the pH is agreeable (209).

Another group has used a combination of computational and microarray analyses to identify the previously unannotated anti-σ28 factor, FlgM, in H. pylori and to identify new target operons of this factor and its partner FliA, the σ28 factor (142). The ORF HP1122 was designated flgM by sequence similarity and by a number of functional analyses, including the investigation of the expression profiles of H. pylori strains with the flgM or fliA genes disrupted, using an oligonucleotide microarray (142). The results of this comparison confirmed the regulation of flaA expression by a σ28 promoter and detected two novel operons under the control of FliA/FlgM. The first was a single gene operon encoding OMP11 and the second was an operon consisting of two genes, HP1052 and HP1051, the first of which is the lipid A biosynthesis gene lpxC/envA and the second a gene of unknown function. This comparison of specific mutants is an interesting and useful approach to studying the effect of specific regulators in H. pylori using microarray analysis.

1.4.1.2. Genome Typing

Another aspect of pathogenesis in this organism is the plasticity of the H. pylori genome and how this relates to the evolution of the bacterium within the host and to different disease progressions. A seminal paper, in which microarrays were used to probe for the presence of genes in 15 clinical isolates of H. pylori, was the first to study in depth the distribution of proposed strain specific genes (298). Salama et al. (298) identified 1281 genes common to all strains tested, which were enriched in homologues of proteins involved in biosynthesis and intermediary metabolism. A further 362 genes were found to be absent or highly divergent in at least one strain. These strain specific genes varied in their presence among isolates and consisted mainly of genes with no ascribable function and restriction/modification genes (298).
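The distinction drawn above between genes common to all strains and strain specific genes can be illustrated with simple set operations. The following is a minimal sketch only: the strain names, gene names and presence calls are hypothetical, and it is not the analysis pipeline used by Salama et al. (298), which also had to decide when a hybridisation signal counts as "absent or highly divergent".

```python
# Hypothetical presence calls: strain -> set of genes detected on the array.
# These names are illustrative only and do not come from any real data set.
presence_calls = {
    "strain_A": {"geneA", "geneB", "geneC", "geneD"},
    "strain_B": {"geneA", "geneB", "geneD"},
    "strain_C": {"geneA", "geneB", "geneC"},
}


def partition_genes(calls):
    """Split genes into those present in every strain and those missing from at least one."""
    gene_sets = list(calls.values())
    if not gene_sets:
        raise ValueError("at least one strain is required")
    all_genes = set().union(*gene_sets)      # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    variable = all_genes - core              # genes absent from at least one strain
    return core, variable


core_genes, variable_genes = partition_genes(presence_calls)
print("core:", sorted(core_genes))
print("strain specific:", sorted(variable_genes))
```

In this toy example the core set contains geneA and geneB, while geneC and geneD fall into the strain specific set because each is missing from one strain.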

Two papers have subsequently applied differences in the genetic content of H. pylori strains, determined by microarray analysis, to variations in the pathogenesis these strains caused in animal models (32, 136). Björkholm et al. (32) identified two subclones of an H. pylori strain from the same patient, one that contained the cag PAI, and the other that had excised the island. The fitness of these two clones was tested by infection of mice. There was no difference in the colonisation level or pathology produced by these two strains after 3 months of infection. However, the cag negative strain could not colonise conventional mice, only germ-free mice of the same strain. There was no detectable genomic evolution of these strains after 3 and 10 months colonisation assessed by microarray detection of the genes of the cag PAI and sequencing of three genetic loci, which suggests that these were stable subclones. This finding indicates that humans can be infected by multiple stable subclones of H. pylori.

In a study by Israel et al. (136) the genetic makeup of two H. pylori clinical isolates, one from a GU patient and the other from a DU patient, was analysed using an H. pylori microarray. It was found that the DU strain lacked a run of cag PAI genes while the GU strain had an intact island. A number of phenotypic differences between the strains were observed. The GU strain produced severe gastritis, some gastric ulceration and atrophy in infected gerbils, while the DU strain induced only mild gastritis that was localised mainly in the corpus region. Mutation of the cag PAI in the GU strain caused this modified strain to produce a milder inflammation, more like that of the DU strain. In a commentary in the journal Gut, Atherton (17) suggested that although this work shows to some extent the importance of the cag PAI, it does not help us understand the difference between the strains causing GU and DU. Limitations of this study were that it is only one example and that most DU strains have been shown to contain the cag PAI. Nevertheless, the ability to detect genetic differences that can be attributed to differences in pathology is an important step forward for H. pylori research.

A second set of studies looked at sequential changes in isolates of H. pylori within the same host in order to observe the evolution of the organism. One paper by Israel et al. (137) identified changes within the entire genome of sequential isolates of the sequenced strain J99, obtained from the same patient 6 years apart. Although the RAPD profiles of the sequential strains appeared to be very similar, there were multiple deletions and acquisitions of genes particularly in the J99 plasticity zone detected by microarray analysis. This study supports the notion that there is microevolution of H. pylori strains within the host. The authors suggest that H. pylori is in a continuous state of genetic flux that may allow it to rapidly adapt to changing conditions or it may be poised to colonise a new host.

1.4.2. Host Microarrays

1.4.2.1. Transcriptional Profiling

Understanding bacterial virulence factors, genetic content, response to environmental stimuli and regulatory cascades provides only one side of the story. In order to unravel the events leading to disease it is important to understand the effect the bacteria have on their host. This can be studied in clinical samples, modelled in animals, or investigated with cultured cells in vitro. Human and mouse cDNA microarrays have been used by a number of groups to analyse the host cell transcriptional response to H. pylori infection. Several of these groups have used cultured human gastric epithelial cells derived from adenocarcinomas to investigate the effect of H. pylori infection, particularly with respect to the presence of a functional cag PAI (20, 49, 66, 122, 185, 318). In most of these studies a relatively small number of genes appear to have been differentially expressed during infection, indicating that the response to H. pylori infection is narrow and specific. Of these changes in gene expression, some may be attributed to a generalised response to bacterial contact rather than being specific to H. pylori infection (66).

Two groups, Maeda et al. (185) and Bach et al. (20), used microarrays containing relatively small numbers of cDNA elements, 2304 and 588 respectively, representing different genes, to detect the transcriptional response of cells in culture to infection with isogenic H. pylori strains. Maeda et al. (185) compared the response of cells to a strain with a functional cag PAI and to an isogenic mutant in which the cagE gene had been disrupted, rendering this strain unable to induce IL-8 secretion in epithelial cells. In contrast, in the study by Bach et al. the isogenic strains consisted of one that contained the entire cag PAI and an isogenic mutant in which the cagA gene was disrupted. In this case both strains were able to induce equivalent IL-8 secretion from epithelial cells (20). Both studies detected changes in genes encoding cytokines, signal transduction molecules and cytoskeletal elements.

In contrast to these two studies, Cox et al. (66) used a combination of four different cDNA nylon arrays with a remarkable 57,800 cDNA elements to detect the expression levels of genes in Kato3 cells infected with non-isogenic strains of H. pylori. One of these strains contained an active cag PAI and the other was cag PAI negative. RNA was extracted from the cells over a time course from 45 min to 24 h after infection, enabling the detection of the early and late transcriptional response of the cells to infection (66). Despite the large number of cDNAs present in this array set, only 134 genes were found to be induced, and 108 genes had reduced expression levels, in response to either H. pylori strain. These genes belong to a variety of functional categories including growth factors, cytokines/chemokines and their receptors, apoptosis proteins and transcription factors. A further 116 genes were shown to be differentially expressed between the cag positive and negative strain infections, indicating a marked difference in the response of the cells to these strains. This differential response cannot be attributed solely to the presence or absence of the cag PAI, as non-isogenic strains were used. The differential expression of three selected genes was confirmed by semiquantitative RT-PCR of RNA extracted from gastric biopsies of patients with H. pylori infection (65). An earlier study by Chiou et al. (49) also confirmed, in human gastric mucosal biopsies, the expression of a selection of genes found by microarray analysis to be differentially expressed in an in vitro cell culture infection study.

By far the most comprehensive microarray analysis to date examining cag PAI dependent effects on gene expression in cell culture was conducted using human cDNA arrays with 24,000 elements and H. pylori wildtype and isogenic mutants in cagA, cagN, virB4, and the entire cag PAI (122). The virB4 mutant and the mutant missing the entire PAI were unable to deliver CagA to infected AGS cells, while the cagN mutant was able to deliver CagA. The CagN molecule is thought to be an additional secreted effector of the PAI (208). A time course infection of AGS cells with each of these strains was performed and the transcriptional response of these cells monitored using the microarray. The responses induced by the wildtype and the cagN mutant were strikingly similar, while the responses induced by the virB4 and cag PAI mutants were also similar to each other but different from the expression pattern induced by the former pair of strains. The cagA mutant exhibited a response intermediate between the above two phenotypes. Among the genes showing striking induction in the cagN mutant infected cells, but not in the cells infected with any of the strains able to deliver CagA, were those encoding the cell junction proteins claudin 1 and 4 (122). These proteins are likely to be involved in barrier functions of the epithelium. These studies indicate that, at least in cell culture systems, different responses to strains and specific genetic elements can be detected with expression profiling.

Despite the confirmation of results in each of these papers using semiquantitative RT-PCR or Northern blotting, there appears to be little overlap in the genes found to be differentially regulated in these analyses. This may stem from the divergent conditions and experimental protocols used by each group, in particular the multiplicity of infection (the number of bacteria per cell), the cell line used and the length of infection. All these factors could result in important differences in the transcriptional response of the infected cells. These types of in vitro cell culture array experiments are primarily useful as a screen for genes involved in the infection response rather than as a complete picture of the pathogen-induced changes in host gene expression. Thus, the in vivo confirmation used by Cox et al. (66) and Chiou et al. (49), where patient gastric biopsies were tested for the induction or repression of a selection of genes suspected to be involved in the host response to H. pylori, is of particular merit.

Although host cell transcriptional profiling can be informative, it would be desirable to capture the host cell response in an infected animal, as this may be more clinically relevant. Using whole tissue from an infected animal, however, introduces the extra complication of a mixture of cell types. In an attempt to address this issue, Mills et al. (214) infected mice with H. pylori for 2 or 8 weeks and then used lectin panning to isolate the parietal cell population in the stomachs of these mice. The transcriptional profile of the parietal cell (PC) population was then compared with that of the non-PC population in infected versus uninfected mice (214). This approach enables the complexity of gastric samples to be reduced while providing an in vivo profile of the host response to infection. Remarkably, gene expression in the PC population was quite constant between the infected and uninfected groups of animals, while the non-PC population showed up-regulation of a number of genes. The genes with induced expression included those involved in cell motility/migration, extracellular matrix interactions, and IFN responses.

A recent study by Mueller et al. (224) has also investigated the transcriptional response of mice to infection with Helicobacter. In this study, total RNA from whole stomach samples collected at 6, 12, 18, 20, 22 and 24 months post infection, of “H. heilmannii” infected BALB/c animals was used. Analysis of these samples revealed many specific differences in expression pattern between infected and uninfected controls. This study also identified various expression signatures relating to the severity of the inflammation in these animals. This result indicates that, despite the complexity of whole tissue samples, considerable differences between individuals can be detected and linked with clinical diagnosis (224).

1.5. Investigations into Host/Pathogen Relationships

The genome era has provided the necessary basis for the development of microarrays, which are proving to be invaluable for the in-depth investigation of host/pathogen relationships. While in the past only a handful of genes could be studied at one time, it is now possible to capture the transcriptional responses of every gene in an organism in a single experiment. Linking the responses of both host and pathogen will eventually provide some understanding of the mechanisms by which this interaction can be interfered with, contributing to opportunities for drug and vaccine development. Considering the unique genetic make-up and ecological niche of H. pylori, along with the profound medical consequences of infection with this bacterium, this host/pathogen interaction provides a particularly interesting opportunity for in-depth study.

1.6. Hypotheses and Aims

1.6.1. Overall Goal of this Thesis:

The goal of the present study was to utilise microarray technology to investigate specific aspects of the intricate relationship between host and pathogen in H. pylori infection of the mouse model.

1.6.2. Hypotheses to be Tested:

1. That microarray analysis can provide a global view of the transcriptional regulation in H. pylori in response to environmental stimuli.
2. That the development of a new mouse model in which the effect of infection with two different H. pylori isolates, in two different mouse strains can provide insight into both strain and host specific effects on colonisation and pathogenesis.
3. That microarray analysis of both bacterial genomic changes and host transcriptional response to infection will help distinguish the specific contributions of host and strain specific differences.

1.6.3. Specific Aims

1. To investigate H. pylori transcriptional regulation in vitro in broth culture.
2. To examine the effect of the presence of mammalian cells on the transcriptional regulation in H. pylori.
3. To search for a new mouse colonising strain of H. pylori that is able to persist at equivalent levels to the Sydney Strain (SS1) in mice for use in comparative studies.
4. To utilise two mouse colonising strains of H. pylori of differing genetic makeup for comparison of colonisation, as well as the strain specific host pathological and transcriptional response to these bacteria.

Chapter 2

MATERIALS AND METHODS

2.1. Culture Media

Unless stated otherwise all basic chemicals and stains were obtained from either BDH Chemicals (Kilsyth, Vic, Australia) or Ajax Chemicals (Auburn, N.S.W., Australia).

2.1.1. Horse Blood Agar (HBA)

Columbia Agar Base (Oxoid, Basingstoke, UK) 4% (w/v)
Sterile defibrinated horse blood (HemoStat Labs, Dixon, CA) 5% (v/v)
β-cyclodextrin (Sigma Chemical Company, St. Louis, MO) 0.2% (w/v)
Vancomycin (Sigma) 10 mg/L
Cefsulodin (Sigma) 5 mg/L
Polymyxin B (Sigma) 2.5 g/L
Cycloheximide (Sigma) 50 mg/L
Trimethoprim (Sigma) 5 mg/L
Amphotericin B (Sigma) 8 mg/L

2.1.2. Campylobacter Selective Agar (CSA)

Blood Agar Base no. 2 (Oxoid, UK) 36 g
Distilled water 1 L
Sterile defibrinated horse blood (Oxoid, West Heidelberg, Vic, Australia) 50 mL

The following selective supplement was added (Skirrow’s selective supplement):

Polymyxin B (Sigma) 2.5 mg/L
Vancomycin (Eli Lilly & Co., Australia) 10 mg/L
Trimethoprim (Sigma) 5 mg/L
Amphotericin B (Fungizone®, E.R Squibb & Sons, Princeton, NJ) 2.5 mg/L

2.1.3. Glax Selective Supplement Agar (GSSA)

Blood Agar Base no. 2 (Oxoid) 36 g
Distilled water 1 L
Sterile defibrinated horse blood (Oxoid, Australia) 50 mL

The following selective supplement was added:

Vancomycin (Eli Lilly & Co) 10 mg/L
Polymyxin B (Sigma) 0.33 mg/L
Bacitracin 20 mg/L
Nalidixic Acid 1.07 mg/L
Amphotericin B (E. R. Squibb & Sons) 5 mg/L

2.1.4. Brain Heart Infusion broth plus Glycerol (BHIG)

BHI broth (Oxoid) 3.7 g
Distilled water 100 mL
Glycerol 31 g

The brain heart infusion broth and glycerol were autoclaved separately. Once cooled the BHI broth was aseptically added to the glycerol, mixed and stored at 4°C.

2.1.5. Brucella Broth with Fetal Calf Serum (BBF)

Brucella Broth (Difco laboratories, Detroit, MI) 43 g
Distilled water 1 L
Fetal Calf Serum (GIBCO-Invitrogen, Carlsbad, CA) 500 mL

2.1.6. Tissue Culture Medium (DMEM/FCS)

Dulbecco’s Modified Eagle’s Medium (Gibco) 90% (v/v)
Fetal Calf Serum (Gibco) 10% (v/v)
Vancomycin (Sigma) 10 mg/L

2.1.7. Co-culture Medium (DMEM/FCS/BB)

Dulbecco’s Modified Eagle’s Medium (Gibco) 80% (v/v)
Fetal Calf Serum (Gibco) 10% (v/v)
Brucella Broth medium (Difco laboratories) 10% (v/v)
Vancomycin (Sigma) 10 mg/L

2.2. Buffers and Solutions

2.2.1. Phosphate Buffered Saline (PBS) 0.1 M

Sodium dihydrogen orthophosphate (NaH2PO4.2H2O) 4.37 g/L
Di-sodium hydrogen orthophosphate, anhydrous (Na2HPO4) 10.22 g/L
Sodium chloride (NaCl) 8.5 g/L

Dissolve in 90% volume distilled H2O, adjust pH to 7.2, then make up to full volume, autoclave 121°C 15 min. Store at room temperature (RT).
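As a cross-check on the PBS recipe above, the masses per litre can be converted to molar concentrations. The sketch below is illustrative only: the molar masses are standard values supplied here (they are not given in the recipe), and the function and variable names are hypothetical. Under these assumptions the two phosphate salts together come to approximately 0.1 M, which is consistent with the "0.1 M" in the heading.

```python
# Approximate molar masses in g/mol (standard values, not stated in the recipe).
MOLAR_MASS = {
    "NaH2PO4.2H2O": 156.01,
    "Na2HPO4": 141.96,
    "NaCl": 58.44,
}

# Masses per litre taken from the PBS recipe.
RECIPE_G_PER_L = {
    "NaH2PO4.2H2O": 4.37,
    "Na2HPO4": 10.22,
    "NaCl": 8.5,
}


def molarity(component):
    """Molar concentration (mol/L) of one component of the recipe."""
    return RECIPE_G_PER_L[component] / MOLAR_MASS[component]


def grams_needed(component, volume_l):
    """Mass of a component (g) required to prepare the given final volume."""
    return RECIPE_G_PER_L[component] * volume_l


total_phosphate = molarity("NaH2PO4.2H2O") + molarity("Na2HPO4")
print(f"total phosphate: {total_phosphate:.3f} M")
print(f"NaCl: {molarity('NaCl'):.3f} M")
print(f"Na2HPO4 for 500 mL: {grams_needed('Na2HPO4', 0.5):.2f} g")
```

The same helper can be used to scale the recipe to any batch volume, for example half quantities for 500 mL.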

2.2.2. Physiological Saline

NaCl 8.5 g/L

Dissolve in distilled H2O, autoclave 121°C 15 min. Store at RT.

2.3. Bacterial Strains and Culturing

2.3.1. Helicobacter pylori Strains

The bacterial strains used in this study are listed in Table 2.1.

2.3.2. Cryopreservation

Stock cultures of reference strains or gastric homogenates from infected animals were frozen in BHIG either at –70°C or in liquid nitrogen.

2.3.3. Resuscitation and Plate Culture of Bacteria

H. pylori strains were revived from frozen stocks and grown on HBA plates under microaerophilic conditions at 37°C for 48-72 h. Microaerophilic conditions were obtained by using a CampyGen sachet (Oxoid) and an anaerobe jar.

2.3.4. Colony Forming Unit (CFU) Estimation

For CFU estimation of H. pylori from broth or tissue co-culture, a series of 1 in 10 serial dilutions was prepared in BB and multiple 10 µL drops of four appropriate dilutions were deposited on HBA plates. The drops were allowed to dry before incubation of the plates under microaerophilic conditions at 37°C for 72-96 h in an anaerobe jar with a CampyGen sachet (Oxoid).
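The arithmetic behind this drop-plate estimate can be sketched as follows. This is a minimal illustration, not the exact calculation used in this study: the colony counts are invented, the function name is hypothetical, and it assumes that the replicate drops at a single countable dilution are averaged before scaling by the dilution factor and the 10 µL drop volume.

```python
def cfu_per_ml(colony_counts, dilution_factor, drop_volume_ul=10.0):
    """Estimate CFU/mL from colony counts of replicate drops at one dilution.

    colony_counts   -- colonies counted in each replicate drop
    dilution_factor -- total fold dilution of the plated sample (e.g. 10**5)
    drop_volume_ul  -- volume of each drop in microlitres
    """
    if not colony_counts:
        raise ValueError("at least one drop count is required")
    mean_count = sum(colony_counts) / len(colony_counts)
    drops_per_ml = 1000.0 / drop_volume_ul   # a 10 uL drop is 1/100 of a mL
    return mean_count * dilution_factor * drops_per_ml


# Hypothetical counts from three 10 uL drops of the 10^-5 dilution.
estimate = cfu_per_ml([12, 15, 9], dilution_factor=10**5)
print(f"{estimate:.2e} CFU/mL")
```

With a mean of 12 colonies per drop at the 10^-5 dilution, the estimate is 1.2 x 10^8 CFU/mL.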

2.3.5. Liquid Culture for Time Course Experiments

Plate or broth grown H. pylori were used to inoculate BBF liquid media, and grown in microaerophilic conditions with shaking at 37°C for 24 h.

2.3.6. Bacterial Harvesting from in vitro Growth in Broth or Co-culture

The bacterial culture was passed under vacuum through a 0.45 µm mixed cellulose acetate and nitrate filter (MF-Millipore membrane filter, Bedford, MA) held in an all-glass filter holder (Millipore), with a maximum of 2.5 OD600nm loaded per filter, in order to collect the bacteria on the filter. The filter was immediately placed into a 50 mL Falcon tube (Falcon), frozen in liquid N2, and stored at -80°C.
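The loading limit per filter can be turned into a culture volume once the optical density of the culture is known. The sketch below rests on an assumption that is not stated in the text: that "2.5 OD600nm per filter" means 2.5 OD units, where one OD unit is 1 mL of culture at an OD600 of 1.0. The function names and example values are hypothetical.

```python
import math


def max_volume_per_filter_ml(od600, max_od_units=2.5):
    """Largest culture volume (mL) that stays within the assumed OD-unit limit."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return max_od_units / od600


def filters_needed(od600, culture_volume_ml, max_od_units=2.5):
    """Number of filters required to harvest the whole culture volume."""
    per_filter = max_volume_per_filter_ml(od600, max_od_units)
    return math.ceil(culture_volume_ml / per_filter)


# Hypothetical example: 12 mL of culture at an OD600 of 0.5.
print(max_volume_per_filter_ml(0.5))   # volume per filter in mL
print(filters_needed(0.5, 12))         # filters required
```

Under this assumption a culture at OD600 0.5 allows 5 mL per filter, so 12 mL would be split across three filters.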

2.4. Tissue Culture

2.4.1. Cell Culture of Madin-Darby Canine Kidney (MDCK) Cells

Plastic tissue culture dishes (9 mm) were seeded with ~10^6 MDCK cells (Dr. W. James Nelson, Stanford University) in 10 mL DMEM/FCS and incubated at 37°C in 5% CO2 atmosphere until confluent.

To passage the MDCK cells, a confluent dish was washed twice in 10 mL warm PBS (Gibco). A further 10 mL warm PBS was added and the plate incubated at 37°C for 10 min. The PBS was then replaced with 1 mL Trypsin/EDTA (Gibco) and the plate incubated at 37°C for a further 10 min. Nine mL DMEM/FCS was then added to the plate and the cells separated by continual pipetting. Cell numbers were estimated using a haemocytometer and new tissue culture dishes were seeded with ~10^6 cells (as above).

2.4.2. Co-culture of H. pylori and MDCK cells

A starter culture of the appropriate H. pylori strain was grown in BBF media for 24 h. A dish of confluent MDCK cells was washed once with DMEM before inoculation with 10 mL DMEM/FCS/BB media containing ~10^8-10^9 bacteria mL^-1 from the starter culture. For the first 2-3 days post inoculation, until the H. pylori infection was well established, 5 mL of the growth medium was removed and 5 mL of new DMEM/FCS/BB media was added to refresh the media. Subsequently, every 24 h the growth medium was completely removed from the co-culture, after which the cell monolayer was washed once in DMEM to remove the majority of H. pylori cells, and 10 mL of new DMEM/FCS/BB media was added. The co-culture was maintained in this way for 2-8 weeks.

2.5. Animal Experimentation

2.5.1. Animal Maintenance and Housing

Female BALB/c mice, 6-8 weeks old, were obtained from the Specific Pathogen Free (SPF) Breeding Facility, at the University of New South Wales, Sydney, N.S.W. or the Animal Resources Centre (ARC), Canning Vale, W.A. All animals were maintained under clean conditions, fed a diet of autoclaved rat and mouse ration (Gordon’s Speciality Stock Feed Pty Ltd, Yanderra, N.S.W.) and given water ad libitum.

All animal experimentation protocols were approved by the Animal Care and Ethics committee of the University of New South Wales.

2.5.2. Stomach Collection

To assess the colonisation and pathology in the mice, the animals were euthanased by CO2 asphyxiation and then cervical dislocation; their stomachs were removed, opened along the lesser curvature and rinsed in saline to remove the stomach contents. Half of the stomach was homogenised for viable CFU counts, and the remaining half stomach was placed in 10% buffered formalin and embedded in paraffin for histology (see Light Microscopy below). Four µm sections were cut and stained with modified Steiner silver stain to assess bacterial colonisation and with Haematoxylin and Eosin stain to assess histopathological changes. The grading criteria for these microscopic assessments are described in the Experimental Procedures in Chapter 6.

2.5.3. Viable Plate Count for Detection of H. pylori Infection

For assessment of colonisation by H. pylori, half stomachs were weighed and homogenised with an Ultra Turrax homogeniser (John Morris Scientific Ltd). One in 10 serial dilutions were prepared in BHI broth and 200 µL aliquots spread over GSSA selective agar plates. After five days of incubation under humidified microaerophilic conditions, colonies were counted. Colony forming units per gram of stomach tissue were calculated.

2.6. Light Microscopy

2.6.1. Fixation, Processing and Sectioning

Lillie's Neutral Buffered Formalin
  40% Formaldehyde      100 mL
  NaH2PO4.2H2O          4.52 g
  Na2HPO4               6.5 g
  Distilled water       900 mL

Tissue samples were fixed in buffered formalin overnight and then transferred to 70% ethanol. Stomach samples were cut into longitudinal strips so that all regions of the stomach could be examined by histopathology. All samples were then placed into tissue processing cassettes (Miles Scientific Inc) for embedding. Tissue samples were dehydrated and embedded in paraffin using standard techniques. Four micron sections were cut and stained as appropriate. All embedding, sectioning and staining was carried out by the Histology Unit, School of Pathology, University of New South Wales.

2.7. Electron Microscopy

2.7.1. Scanning Electron Microscopy (SEM) of Bacterial Cultures

Critical point dried preparations of pure bacterial cultures were viewed with a field emission Scanning Electron Microscope (Hitachi S900, Tokyo, Japan).

2.8. General Molecular Techniques

2.8.1. Extraction Methods

2.8.1.1. Genomic DNA (gDNA) extraction from bacterial cultures

gDNA was extracted using the Puregene DNA Purification Kit (Gentra Systems, Minneapolis, MN) according to the manufacturer's instructions and stored at -20°C until required. gDNA samples used in microarray analysis were further purified by a standard phenol/chloroform procedure (300). The gDNA was then resuspended in 10 mM Tris-HCl.

2.8.1.2. RNA Isolation from H. pylori samples

Samples harvested from cultures on filters (as described above) were stored in 50 mL conical tubes at -80°C. Each of these conical tubes was thawed on ice and an appropriate volume of Trizol (Gibco) added directly to the membrane (1 mL per 1-5 x 10^7 CFU). The tube was vortexed vigorously to ensure complete coverage of the membrane and lysis of the H. pylori cells. Alternatively, if the bacterial samples were pelleted by centrifugation instead of filtered from the media, the Trizol was added directly to the pellet and the tube vortexed to lyse cells.

Total RNA was purified by using a modified protocol which combines the Trizol RNA extraction procedure (Gibco) and the RNeasy mini-kit clean-up protocol (Qiagen, Chatsworth, CA). The manufacturer’s instructions for Trizol RNA extraction from bacterial cells were followed, except that the aqueous layer was removed and used directly in the RNeasy clean-up procedure. Ethanol was added to the aqueous layer to a concentration of 36% (v/v), and the remaining protocol was followed according to the manufacturer’s instructions (Qiagen). An on-column DNase digestion (RNase-free DNase Set; Qiagen) was also performed according to the manufacturer’s protocol, except that the incubation time was extended to 40 min. RNA was eluted in RNase free water and quantified by optical density at 260 nm. RNA samples were stored at -80°C.

2.8.1.3. RNA isolation from mammalian stomach tissue samples

Frozen stomach tissue samples (stored at -80°C) were homogenised using a Tissue Tearor (Biospec Products Inc., Bartlesville, OK) in 1 mL Trizol reagent (Gibco) per 50 mg tissue sample. The manufacturer’s instructions were followed for total RNA extraction except that the aqueous layer was removed and used directly in the RNeasy Midi-kit clean-up procedure (Qiagen). Ethanol was added to the aqueous layer to a concentration of 36% (v/v), and the remaining protocol was followed according to the manufacturer’s instructions (Qiagen). An on-column DNase digestion (Qiagen) was also performed according to the manufacturer’s protocol, except that the incubation time was extended to 40 min. RNA was eluted in 500 µL RNase free water and quantified by optical density at 260 nm. RNA samples were stored at -80°C.

2.8.2. Agarose Gel Electrophoresis of RNA Samples

For assessment of RNA quality, 2-10 µL of extracted RNA was combined with 6 X gel loading dye (Ambion, Austin, TX) and 30 µg/mL ethidium bromide. Samples were separated on a 1% agarose gel in TAE buffer (40 mM Tris-acetate, 1 mM EDTA) using 70 V and subsequently stained in 20 µg/mL ethidium bromide. Gels were then visualised and photographed on a Gel Doc system (BioRad Laboratories, Hercules, CA).

2.8.3. Reverse Transcription

The synthesis of the first strand of cDNA was performed with Superscript II RNase H- (Promega, Madison, WI, USA) reverse transcriptase and either random hexamers (R6) (Qiagen) or Panorama™ H. pylori cDNA labelling primers (SigmaGenosys, The Woodlands, TX), named Gene Specific Primers (GSPs). The reverse transcription reaction involved an initial RNA denaturation and primer annealing step where 0.5-2 µg total RNA was combined with 1 µg (unless otherwise stated) of the appropriate primer mix (R6 or GSPs), the volume made up to a total of 10 µL, and heat denatured at 65°C for 10 min, then cooled on ice for 2 min. The reaction mixture, consisting of 1 X First strand buffer (250 mM Tris-HCl, pH 8.3 at 25°C; 375 mM KCl, 15 mM MgCl2), 0.01 M DTT, 0.1-0.2 mM dNTPs (Pharmacia) and 400 Units of Superscript II enzyme (Promega), was added to the RNA/primer mix and the reaction incubated at 42°C for 110 min. The RNA strand was then hydrolysed by adding 1 µL 1 M NaOH and incubating at 65°C for 10 min, and the solution was then neutralised with the addition of 1 µL 1 N HCl. The single stranded cDNA (ss-cDNA) was used in either a PCR reaction or Klenow reaction, where appropriate.

2.8.4. RT-PCR

Following standard reverse transcription, amplification of the required DNA fragment from cDNA was performed using the polymerase chain reaction (PCR). The PCR reaction in a total of 25 µL contained: 1 x PCR reaction buffer (67 mM Tris-HCl, 16 mM (NH4)2SO4, 0.45% Triton X-100, 0.2% gelatin), 200 µM of each deoxyribonucleotide (dNTP) (Pharmacia), 5 pmol forward and reverse primers, 1 U Taq DNA polymerase (Invitrogen) and 10-100 ng cDNA. The PCR reaction was carried out in a Robocycler 96 (Stratagene, Cedar Creek, TX) with an initial denaturation at 94°C for 5 min, then 25 cycles of 94°C for 50 sec (melting), 48-52°C for 70 sec (annealing) and 72°C for 2 min 20 sec (extension), followed by a final extension step of 72°C for 7 min. All RT-PCR reactions included a sample where no reverse transcriptase had been added to the reaction for assessment of genomic DNA contamination of the RNA samples, and a negative PCR control consisting of water added to the reaction instead of cDNA to assess contamination of reagents.

2.9. Microarray Procedures and Techniques

2.9.1. Microarrays

2.9.1.1. H. pylori microarray

The H. pylori microarrays used in this thesis are described in detail by Salama et al. (2000) (298). The microarray contains elements/spots which are DNA fragments generated by PCR using gene-specific primers (GSPs). The GSPs were designed such that the amplified fragments corresponded to unique segments of each individual open reading frame (ORF). The elements on the array represent 98.9% of all ORFs (1660/1681) present in the two published sequences of H. pylori strains, 26695 (361) and J99 (6). The PCR products were each printed twice on opposing sections of the array. The details of the array printing are described by Eisen & Brown (1999) (93). Microarrays were stored at RT after printing. The naming convention used for all H. pylori microarrays was: HP1111a/b, where the first number represents the print-run of the array set, the last three numbers represent the array number within the set between 1 and 137, and the letter represents the position of the array (on slides with two arrays printed): a is the array furthest from the label and b is closest to the label.
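As an illustration of the naming convention, the sketch below splits an array name into its parts. It assumes a single-digit print-run followed by a three-digit array number, as in the template HP1111a/b; the function name, the returned field names and the example name HP2045b are hypothetical.

```python
import re

# ASSUMPTION: one digit for the print-run, three digits for the array number,
# then a or b for the position, following the template "HP1111a/b".
ARRAY_NAME = re.compile(r"HP(\d)(\d{3})([ab])")

POSITIONS = {"a": "furthest from label", "b": "closest to label"}


def parse_array_name(name):
    """Split an H. pylori array name into print-run, array number and position."""
    match = ARRAY_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a valid array name: {name!r}")
    print_run, number, letter = match.groups()
    array_number = int(number)
    if not 1 <= array_number <= 137:
        raise ValueError(f"array number out of range 1-137: {array_number}")
    return {
        "print_run": int(print_run),
        "array_number": array_number,
        "position": POSITIONS[letter],
    }


print(parse_array_name("HP2045b"))
```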

2.9.1.2. Murine microarray

The mouse stomach RNA samples were hybridised to two separate sets of mouse microarrays: Stanford style cDNA murine microarrays containing 23 000 spotted elements derived from the RIKEN (213) mouse clone sets (named SMK and SML arrays); and a murine cDNA microarray set supplied by the Stanford Functional Genomics Facility containing 38 000 elements derived from the RIKEN (213) and NIA (353) mouse clone sets (named MMM arrays).

2.9.2. Microarray Post-Processing

Before use in hybridisations, microarrays were post-processed to remove salt and to block background. Slides were re-hydrated by incubation in plastic hybridisation chambers over H2O for 15-30 min. The slides were then snap-dried at ~80°C for 15 sec before being UV crosslinked in a Stratalinker (Stratagene) at 60 mJ. The slides were then plunged into fresh blocking solution containing: 96% (v/v) 1, 2 methyl pyrrolidine (Sigma); 1.7% (w/v) succinate anhydride (Aldrich); and 4% (v/v) 1 M sodium borate (Sigma), and incubated with gentle agitation for 15 min. The slides were then immersed in ~99°C H2O for 2 min, before a final immersion in 95% ethanol. The slides were then dried in a clinical centrifuge (Beckman) for 5 min at 500 rpm.

2.9.3. H. pylori Microarray Probe Preparation and Hybridisation

2.9.3.1. Genomic DNA labelling

0.5-2 µg of gDNA was suspended in 41 µL H2O and denatured for 5 min at 99°C. Five microliters of 10 X labelling buffer (400 µg/mL random octamers, 0.5 M Tris-HCl, 100 mM MgSO4, 10 mM DTT), 5 µL dNTP/dUTP mix (0.5 mM dGTP, dATP, dCTP, 0.2 mM aminoallyl dUTP/0.3 mM dTTP) and 2 µL Klenow exo- (10 U/µL) (New England Biolabs, Beverly, MA) were added and the reaction incubated for at least 1 h at 37°C. Free amines were removed by adding 450 µL of double distilled H2O plus the reaction (50 µL) to a Microcon YM 30 (Millipore) column. The samples were concentrated by centrifugation for 8 min at 11,750 x g in a microcentrifuge. The eluate was discarded and the wash step repeated twice. The samples were collected after the final wash and dried in a Speed Vac Plus (Savant). The probe was resuspended in 4.5 µL H2O and labelled by the addition of 4.5 µL of 0.1 M sodium bicarbonate, pH 9.0, containing 1/16 of one reaction vial of FluoroLink™ Cy5 (test sample) or Cy3 (reference consisting of equal amounts of 26695 and J99 gDNA) monofunctional dye (Amersham), and was incubated for 1 h at room temperature in the dark. The reaction was quenched by addition of 4.5 µL of 4 M hydroxylamine and incubated for 15 min at room temperature in the dark. The Cy3 and Cy5 reactions were combined and unincorporated dye removed using a Qia-Quick PCR purification column according to the manufacturer’s instructions (Qiagen). The eluate from the columns was dried in a Speed Vac and resuspended in 11 µL TE and 25 µg yeast tRNA added. The 12 µL reaction was used in a standard hybridisation to an H. pylori microarray.

2.9.3.2. Standard total RNA labelling protocol: Indirect incorporation of Cy-dyes

cDNA was synthesized from 0.5-2 µg of total RNA in a standard reverse transcriptase reaction using Superscript II (-) (Promega) with 1 µL of Panorama™ H. pylori cDNA labelling primers (GSPs) (SigmaGenosys). The cDNA was purified using a Qia-Quick PCR purification column according to the manufacturer’s instructions (Qiagen) and eluted in 40 µL of elution buffer (provided). The eluate was heat denatured at 99°C for 5 min and then added to 5 µL of 10 X labelling buffer, 5 µL dNTP/dUTP mix and 2 µL Klenow exo- (10 U/µL) (as above for gDNA labelling), and the reaction incubated for 16 h at 37°C. Free amines were removed by adding 450 µL of double distilled H2O plus the reaction (50 µL) to a Microcon YM 30 (Millipore) column. The samples were concentrated by centrifugation for 8 min at 11,750 x g in a microcentrifuge. The eluate was discarded and the wash step repeated twice. The samples were collected after the final wash and dried in a Speed Vac (Savant). The probe was resuspended in 4.5 µL H2O and labelled by the addition of 4.5 µL of 0.1 M sodium bicarbonate, pH 9.0, containing 1/16 of one reaction vial of FluoroLink™ Cy5 or Cy3 monofunctional dye (Amersham), and was incubated for 1 h at room temperature in the dark. The reaction was quenched by addition of 4.5 µL of 4 M hydroxylamine and incubated for 15 min at room temperature in the dark. The Cy3 and Cy5 reactions were combined and unincorporated dye removed using a Qia-Quick PCR purification column according to the manufacturer’s instructions (Qiagen). The eluate from the columns was dried in a Speed Vac and resuspended in 11 µL TE and 25 µg yeast tRNA added. The 12 µL reaction was used in a standard hybridisation to an H. pylori microarray.

2.9.3.3. Hybridisation of H. pylori arrays

For hybridisation, 2.55 µL of 20 X SSC and 0.45 µL of 10% SDS were added to the labelled probe, which was then heat denatured for 2 min at 99°C. The probe was cooled briefly and then applied to the H. pylori microarray for hybridisation under a 22 mm square cover slip in chambers for 16-24 h at 55°C.
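The final buffer composition implied by these volumes can be checked with a short dilution calculation. This sketch assumes the labelled probe volume is the 12 µL reaction described in the labelling sections; the helper name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume_ul / total_volume_ul


probe_ul = 12.0   # ASSUMPTION: the 12 uL labelled probe from the labelling step
ssc_ul = 2.55     # volume of 20 X SSC added
sds_ul = 0.45     # volume of 10% SDS added

total_ul = probe_ul + ssc_ul + sds_ul
ssc_final = final_concentration(20.0, ssc_ul, total_ul)   # in X SSC
sds_final = final_concentration(10.0, sds_ul, total_ul)   # in % SDS

print(f"total volume: {total_ul:.2f} uL")
print(f"final SSC: {ssc_final:.2f} X, final SDS: {sds_final:.2f} %")
```

Under that assumption the mixture comes to 15 µL at about 3.4 X SSC and 0.3% SDS, close to the 3.5 X SSC and 0.3% SDS hybridisation buffer described for the murine arrays.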

2.9.4. Murine Microarray Probe Preparation and Hybridisation using Total RNA

Five µg of anchored oligo-dT[(dT)20-VN] (Operon, HPLC purified) was mixed with 20-40 µg total murine RNA, denatured at 70°C for 10 min and cooled on ice for 5 min. The reaction mixture, containing 1X RT buffer (Gibco) (250 mM Tris-HCl, pH 8.3 at 25°C; 375 mM KCl, 15 mM MgCl2), 1X dNTP mixture [500 µM dATP, dGTP, dCTP each (Pharmacia); 400 µM aminoallyl-dUTP (Sigma); and 100 µM dTTP], 0.01 M DTT (Gibco), and 400 U Superscript II RT (Promega), was added to the RNA/primer mixture and incubated at 42°C for 2 h. RNA was then hydrolysed by addition of 0.05 N NaOH and incubation at 70°C for 10 min. The reaction was then neutralised by addition of 0.05 N HCl. Free amines were removed from the reaction by either Microcon-30 filters (Amicon) (described in the H. pylori labelling procedures) or by Qiaquick PCR purification column using the manufacturer’s instructions (Qiagen), except that the cDNA was eluted using neutral pH H2O instead of the elution buffer provided. The eluate was dried by speed vac centrifugation and resuspended in 4.5 µL H2O. An equal volume of Cy5 (for the experimental sample) or Cy3 (for the reference) (Amersham) suspended in 0.1 M sodium bicarbonate was then added and incubated in the dark at RT for 1 h. The reaction was then quenched with 4.5 µL 4 M hydroxylamine (Sigma) and incubated in the dark at RT for 15 min. The appropriate Cy3 and Cy5 reactions were then combined and the unincorporated dyes were removed by Qiaquick PCR purification using the manufacturer’s instructions (Qiagen) and eluting the labelled cDNA in 60 µL elution buffer (provided). Twenty µg of mouse Cot-1 DNA (Gibco), 25 µg yeast tRNA and 10 µg polyA RNA (Sigma) were added to the labelled cDNA before concentration using a Microcon-30 filter (Amicon) according to the manufacturer’s instructions. The hybridisation buffer of 3.5 X SSC and 0.3% SDS was added to the probe in either a 35 µL (SMK and SML arrays) or a 45 µL (MMM arrays) volume, and the probe was then denatured at 99°C for 2 min and cooled to RT. The probe was then applied to the centre of a murine microarray, an appropriately sized cover slip was lowered onto it, and the slide was placed in a hybridisation chamber for hybridisation in a 65°C water bath for 24-48 h.

2.9.5. Microarray Stringency Washes

After hybridisation, the slides were submerged in wash solution 1 (2 x SSC, 0.1% SDS) until the cover slip dropped off, and then transferred to wash solution 2 (1 x SSC) for 2 min with rocking. Finally, each slide was transferred to wash solution 3 (0.2 x SSC) and rocked for a further 2 min, and was then spun dry in a clinical centrifuge for 5 min at 500 rpm (93). The hybridised slides were scanned and analysed using a Gene Pix Scanner 4000A and the GENEPIX 3.0.6.86 software (Axon Instruments, Redwood City, CA).

2.9.6. Data Analysis

2.9.6.1. Stanford Microarray Database

All microarray data generated in this thesis was collated and stored using the Stanford University Microarray Database (320).

2.9.6.2. Normalisation by SMD

Once uploaded into SMD, the raw data for each experiment were normalised to correct for differences in overall channel intensities in each hybridisation. The procedure used for all data analysis in this thesis, unless otherwise specified, was the default computed normalisation provided by SMD (http://genome-www5.stanford.edu/MicroArray/help/results_normalization.shtml). Briefly, SMD first selected spots that met two criteria: the spot must not have been flagged (user specified in the GenePix image analysis program), and the proportion of its pixels with intensities greater than one standard deviation above the background pixel intensity must have been above a threshold value in both channels. The threshold was between 0.55 and 0.65, depending on the value at which > 10% of spots on the array passed. The normalisation value was then calculated by taking the mean of the natural log of the red/green channel intensity ratios for the selected spots and exponentiating it (e raised to the power of that mean). All the raw values in the red channel (635 nm) were then divided by the normalisation value.
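The calculation described above can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not SMD's own code: the dictionary keys are hypothetical, the pixel proportions are taken as already computed by the image analysis software, and a single fixed threshold of 0.6 is used rather than SMD's adaptive choice between 0.55 and 0.65.

```python
import math


def normalisation_factor(spots, threshold=0.6):
    """Exponentiated mean of ln(red/green) over well-measured, unflagged spots.

    ASSUMPTION: each spot is a dict with the hypothetical keys 'red', 'green',
    'flagged', 'frac_red_above_bg' and 'frac_green_above_bg'.
    """
    selected = [
        s for s in spots
        if not s["flagged"]
        and s["frac_red_above_bg"] > threshold
        and s["frac_green_above_bg"] > threshold
    ]
    if not selected:
        raise ValueError("no spots passed the selection criteria")
    mean_ln_ratio = sum(math.log(s["red"] / s["green"]) for s in selected) / len(selected)
    return math.exp(mean_ln_ratio)


def normalise_red(spots, factor):
    """Divide every raw red-channel value by the normalisation factor."""
    return [s["red"] / factor for s in spots]


# Hypothetical spots: two good spots with red twice as bright as green,
# plus one flagged spot that is ignored when computing the factor.
example_spots = [
    {"red": 200.0, "green": 100.0, "flagged": False,
     "frac_red_above_bg": 0.9, "frac_green_above_bg": 0.8},
    {"red": 800.0, "green": 400.0, "flagged": False,
     "frac_red_above_bg": 0.7, "frac_green_above_bg": 0.9},
    {"red": 50.0, "green": 500.0, "flagged": True,
     "frac_red_above_bg": 0.9, "frac_green_above_bg": 0.9},
]
factor = normalisation_factor(example_spots)
normalised = normalise_red(example_spots, factor)
print(factor, normalised)
```

Exponentiating the mean log ratio gives the geometric mean of the red/green ratios, so after division the selected spots have a mean log ratio of zero.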

2.9.6.3. Data retrieval from SMD

2.9.6.3.1. H. pylori array data

Unless otherwise stated, data were retrieved from SMD using the custom filtering criteria described here. Spots were excluded from analysis due to: obvious spot abnormalities; low signal (if the sum of the median intensities for the two channels was ≤ 500); or uneven distribution of pixel intensities in the spot (the standard deviation of pixel intensity ratios > 3.5). The data obtained for the net pixel intensity in each channel of each microarray were normalised by using the default-computed normalisation described in this section. The ratio of the Red (time point sample)/Green (reference) channels for each spot was expressed as log2 (R/G).
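A spot-level filter of this kind might look like the sketch below. The field names are hypothetical, and the low-signal rule is read here as "sum of channel medians at or below 500", which is an assumption about the intended inequality; the cutoffs are parameters so that other values can be substituted.

```python
import math


def log2_ratio_or_none(spot, min_total_signal=500.0, max_ratio_sd=3.5):
    """Return log2(R/G) for a spot, or None if the spot fails a filter.

    ASSUMPTION: 'flagged', 'red_median', 'green_median', 'ratio_sd',
    'red_norm' and 'green_net' are hypothetical field names.
    """
    if spot["flagged"]:                       # obvious spot abnormality
        return None
    if spot["red_median"] + spot["green_median"] <= min_total_signal:
        return None                           # low signal
    if spot["ratio_sd"] > max_ratio_sd:       # uneven pixel intensities
        return None
    return math.log2(spot["red_norm"] / spot["green_net"])


# Hypothetical spot: normalised red signal four times the green signal.
good_spot = {"flagged": False, "red_median": 900.0, "green_median": 300.0,
             "ratio_sd": 1.2, "red_norm": 400.0, "green_net": 100.0}
print(log2_ratio_or_none(good_spot))
```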

2.9.6.3.2. Murine array data

For the murine arrays the normalised log2 (R/G) ratios were retrieved from SMD using filtering criteria where spots were excluded due to: obvious spot abnormality; poor spot quality (a regression correlation of < 0.6; a standard deviation of pixel intensity ratios of > 2.5; or percent saturated pixels > 30% in either Channel 1 (green) or Channel 2 (red)); or low signal (sum of the median intensities from both channels ≤ 350). Finally, only those genes whose log2 (R/G) ratios were more than 1.5 (for the SMK and SML arrays) or 2 (for the MMM arrays) standard deviations away from the mean in at least 2 arrays were retrieved for analysis, and those genes and arrays with < 80% good data across all the arrays of the same type were excluded.

2.9.6.4. Computer programs used for data analysis

2.9.6.4.1. CLUSTER and TREEVIEW

The CLUSTER program (version 1.50.1.1) was used for clustering of the data using various algorithms: hierarchical clustering, principal component analysis and self organising maps. The TREEVIEW program (version 2.11.01) was used to visualise the clustered data. Both are available from http://www.microarrays.org/software.html (94).

2.9.6.4.2. Significance Analysis of Microarrays (SAM)

The SAM program was used to assess significant differences in gene transcript levels in arrays from two assigned groups. For all SAM analyses, normalised log transformed data were used and the missing data points were first estimated with a K-Nearest-Neighbour imputation, where K equalled 10 (364). The SAM program calculated the list of significant genes and produced a false discovery rate (FDR), which is an estimate of the percentage of false positives called in the analysis. In the analyses of the H. pylori array data a calculated FDR of < 1% was used to assign significance, while for the murine array data an FDR of < 5% was used. In all cases a minimum 2 fold change in gene transcript level between the two groups of arrays being tested was employed. The program generates an ri value that is the relative level of change in gene expression in log space and a Score (d) value, which represents the level of significance for each gene. The program also reports a relative fold change for the gene transcript level between the two groups of arrays tested (364, 78). SAM is available from http://www-stat-class.stanford.edu/SAM/servlet/SAMServlet.
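For orientation, the per-gene quantities reported by SAM for a two-group comparison can be sketched as below. This is a simplified illustration of the published relative difference statistic, not the SAM program itself: the small constant s0 is passed in by hand here (SAM chooses it from the data), and the permutation step that yields the FDR is omitted. The function name and example values are hypothetical.

```python
import math


def sam_relative_difference(group_a, group_b, s0):
    """Return (r, d) for one gene from two groups of log-ratios.

    r is the difference in group means (group_b minus group_a);
    d = r / (s + s0), where s is the pooled standard error of r.
    ASSUMPTION: s0 is supplied by the caller rather than estimated.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a + n_b < 3 or n_a == 0 or n_b == 0:
        raise ValueError("need values in both groups and at least 3 in total")
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    sum_sq = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    s = math.sqrt((1.0 / n_a + 1.0 / n_b) * sum_sq / (n_a + n_b - 2))
    r = mean_b - mean_a
    return r, r / (s + s0)


# Hypothetical log2 ratios for one gene in two groups of three arrays.
r_value, d_score = sam_relative_difference([1.0, 1.2, 0.8], [3.0, 3.2, 2.8], s0=0.1)
print(r_value, d_score)
```

Adding s0 to the denominator damps the scores of genes with very small variance, which would otherwise receive large d values from tiny differences.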

2.9.6.4.3. Genome typing analysis (GACK)

In order to determine changes in the genomic content of H. pylori strains, the data obtained from microarrays were analysed using the microarray genome analysis program GACK (157). This program generates a dynamic cutoff for assigning genes as present or divergent (absent or with significant sequence divergence from the gene used to generate the microarray) in each separate array hybridisation, instead of using an empirically determined constant cutoff for this purpose. The algorithm functions independently of any normalisation process, which can be influenced by differences in strain composition and hybridisation quality. The program assigns an Estimated Probability of Presence (EPP) according to the distribution of the log2 (R/G) ratios for each array and thus gives an estimate of how likely a gene is to be present. The EPP ranges from 0%, assigned as divergent, to 100%, assigned as present. Those genes falling in the transition region between 0% and 100% EPP are classified as slightly divergent. Three options are provided for the data output: binary (0 = divergent, 1 = present), trinary (-1 = divergent, 0 = slightly divergent, 1 = present), and graded (continuous range from -0.5 (divergent) to 0.5 (present)). The GACK program is available from the Falkow Lab Website (http://falkow.stanford.edu).
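The relationship between an EPP value and the trinary and graded outputs can be illustrated as follows. This sketch starts from an EPP that has already been computed and does not reproduce how GACK derives it; the function names are hypothetical, and the linear mapping used for the graded output is an assumption chosen only because it spans the stated range of -0.5 to 0.5.

```python
def trinary_call(epp_percent):
    """Map an EPP (0-100%) to -1 (divergent), 0 (slightly divergent) or 1 (present)."""
    if not 0.0 <= epp_percent <= 100.0:
        raise ValueError("EPP must lie between 0 and 100")
    if epp_percent == 0.0:
        return -1
    if epp_percent == 100.0:
        return 1
    return 0  # transition region between 0% and 100% EPP


def graded_value(epp_percent):
    """Map an EPP (0-100%) onto the continuous range -0.5 to 0.5.

    ASSUMPTION: a simple linear rescaling; the program's own mapping may differ.
    """
    if not 0.0 <= epp_percent <= 100.0:
        raise ValueError("EPP must lie between 0 and 100")
    return epp_percent / 100.0 - 0.5


for epp in (0.0, 42.5, 100.0):
    print(epp, trinary_call(epp), graded_value(epp))
```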

2.9.6.4.4. Data manipulation programs

Various custom programs written in Perl for microarray data manipulation were also used (156). Name averaging (NACK) was used to average duplicate data for each unique ORF in the H. pylori microarray. Filter/Retrieve IDs (FRICK) was used either to obtain or to filter out data for a list of unique ORFs from a data file. File Linker (FLICK) was used to link two data files together for analysis. SAM to cluster (SAMSTER) was used to obtain the log2 (R/G) ratios of a list of significant genes retrieved by a SAM analysis for use in the clustering algorithms. Histogram Maker (HIMACK) was used to produce histograms for multiple data files in order to assess the quality of RNA and DNA hybridisations. These programs are available from the Falkow Lab Website (http://falkow.stanford.edu).
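The idea behind averaging duplicate spots per ORF can be sketched as below. This is written in Python for illustration and is not the Perl program named above; the function name, the input layout and the choice to skip missing values (and to drop ORFs with no usable value) are all assumptions.

```python
from collections import defaultdict


def average_duplicates(spots):
    """Average log2 ratios over duplicate spots that share an ORF name.

    spots: iterable of (orf_name, log2_ratio) pairs, with None for a
    missing or filtered value. ORFs with no usable value are omitted.
    """
    values = defaultdict(list)
    for orf, ratio in spots:
        if ratio is not None:
            values[orf].append(ratio)
    return {orf: sum(v) / len(v) for orf, v in values.items()}


# Hypothetical data: each ORF is printed twice on the array.
example = [("HP0001", 1.0), ("HP0001", 3.0),
           ("HP0002", None), ("HP0002", -2.0),
           ("HP0003", None), ("HP0003", None)]
print(average_duplicates(example))
```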

Table 2.1: H. pylori Strains used in this study

Strain          Relevant Characteristics            Source
10700           Clinical isolate                    UNSW (174)
SS1 (SS1)       Murine passaged isolate             UNSW (174)
SS1 (SS1-SF)    In vitro passaged murine isolate    Stanford University, originally obtained from A. Covacci
2.1             Clinical isolate                    UNSW (this study)
SS2000          Murine passaged isolate             UNSW (this study)
G27             Clinical isolate                    A. Covacci (392)
10319           Clinical isolate                    UNSW (174)
10217           Clinical isolate                    UNSW (174)
26695           Sequenced clinical isolate          Stanford (361)
J99             Sequenced clinical isolate          Stanford (6)

Table 2.2: Primer sequences used for PCR reactions

Primer Name             Sequence 5'→3'                   Annealing temperature used    Source
Cag1                    GAA TTT TCA CAA GTT GGG TGT      52°C                          Nina Salama (personal communication)
Cag2                    AAT CCC CAT TAC CAA ACT CAG T    52°C                          Nina Salama (personal communication)
RAPD primer 3 (1281)    AAC GCG CAA C                    40°C and 36°C                 (2)
RAPD primer 4 (1290)    GTG GAT GCG A                    40°C and 36°C                 (2)

Chapter 3

DEVELOPMENT OF PROCEDURES FOR H. PYLORI TRANSCRIPTIONAL PROFILING

3.1. Background

Transcriptional profiling of bacteria using microarrays is an emerging field. The importance of developing highly efficient and reliable methods for harvesting bacterial samples, extraction of RNA, and subsequent labelling of RNA for arrays, is becoming increasingly apparent in transcript profiling experiments (183). Although there are existing methods for harvesting bacterial cells from growth medium and extraction of bacterial total RNA, many of these procedures are not sufficiently rapid to ensure that messenger RNA from the bacterial cells does not degrade or change in transcription level. Bacterial mRNA is particularly unstable and thus the levels of individual mRNA species can be significantly affected by degradation of RNA during extraction and labelling (183, 302). In addition, bacteria can modify expression levels rapidly in response to stresses such as those that may occur during the harvesting procedure (42). These changes profoundly alter the results of microarray experiments as the expression of every gene is detected at the same time.

Most existing bacterial harvesting procedures involve pelleting bacterial cells from the growth medium by centrifugation at 4°C (383, 384), or at 20°C (4, 395) for 2-5 min. In some protocols the pellet is also washed prior to RNA fixation (11, 242, 396). Considering that the half-life of mRNA molecules in E. coli can be as short as 30 s (42, 281), this type of harvesting provides sufficient time for significant changes in mRNA levels through degradation, as well as the induction of the cold shock and/or general stress response genes (183). To address this, a number of methods have been used to stabilize mRNA, including the addition of chaotropic agents such as guanidinium isothiocyanate (51, 240), the use of products such as RNALater (Ambion) (23) and phenol/EtOH stop solution (285), as well as rapid snap-freezing of bacterial cells (183). The harvesting method developed here involves membrane filtration to extract the bacterial cells from the growth medium. In addition to rapid RNA extraction, it is essential that high quality RNA is obtained. RNA purity is particularly important as the presence of cellular protein, lipid and carbohydrate may inhibit enzyme activity during the labelling process and can cause significant non-specific binding of fluorescently labelled cDNAs to slide surfaces of microarrays (85).

Once extracted the RNA sample is labelled with cyanine dyes for hybridisation to the microarray. For each microarray experiment two samples are labelled with two separate fluorescent dyes and these are mixed for hybridisation to the array (Fig. 3.1). Since bacterial mRNA is generally not polyadenylated, total RNA must be used of which only ~3% is mRNA. This means that a large amount of total RNA is required and/or the labelling procedure must be very efficient (42). To date there have been few advances in methods for labelling bacterial mRNA for microarrays. The typical method uses a direct labelling procedure where cyanine dye-conjugated nucleotide analogues (eg. Cy-dUTP) are incorporated into complementary DNA (cDNA) during reverse transcription (155, 384). This method is relatively inefficient as the fluorescent nucleotides are not the normal substrate of reverse transcriptase, and bulkiness of the fluorescent moieties results in lower rates of incorporation (72, 183). This inefficiency results in both a compromise in the number of cyanine labels incorporated into each cDNA molecule as well as differential incorporation of dyes as the reverse transcriptase incorporates Cy5-dUTP at a lower rate than Cy3-dUTP (183). Additionally, this method is expensive and a large amount of RNA (8-50 µg, for glass microarrays) is required for efficient hybridisation (384, 395, 396). The incorporation bias of the reverse transcriptase can be controlled by performing dye swap experiments (383), which require double the number of arrays, or by using an indirect labelling procedure (72, 183). The indirect labelling methods involve incorporating a nucleotide analogue featuring a chemically reactive amine group to which the fluorescent dye can then be attached (68). 
These indirect labelling procedures increase the relative incorporation of fluors and decrease labelling bias, thus increasing sensitivity and efficiency of labelling and allowing the use of smaller amounts of RNA for efficient hybridisation.

Regardless of the nucleotides used, the first step in any of these labelling procedures is a reverse transcription reaction to synthesise cDNA from total RNA. The absence of polyadenylation in prokaryotes means that instead of using a poly (dT) primer in the reverse transcription reaction, as is done for labelling eukaryotic mRNA, a different approach is required. The primers used are typically one of three types: i) random oligonucleotide primers (384); ii) a mixture of reverse-strand oligonucleotides specific for each ORF, termed Gene Specific Primers (GSPs) (14); or iii) a minimally complex mixture of oligomers sufficient to hybridise to the 3’ end of every ORF, termed Genome Directed Primers (GDPs) (352). Random oligomers are the most commonly used primer set as they are relatively cheap, they do not need to be specifically designed, and the resulting cDNA population provides good coverage of the RNA sample (68). However, because the random oligomers can prime from ribosomal RNA (rRNA) as well as mRNA, some non-specific cDNA synthesis occurs that can result in high background hybridisation (352). In contrast, the use of GSPs or GDPs results in higher signal-to-noise ratios by preferentially synthesising cDNA from coding regions, causing less background hybridisation from mislabelled rRNA (68).

In DNA microarray experiments one RNA sample is labelled with Cy5, typically the sample or test RNA, and the other with Cy3, the reference RNA (Fig. 3.1). The level of a particular mRNA species is estimated by comparison to the level of the same mRNA species in the reference sample. Thus, after hybridisation the array is scanned with a laser scanner which records the intensity of the Cy5 and Cy3 fluorescent signal from each feature (Fig. 3.1). The ratio of the Cy5/Cy3 intensities for each spot is obtained to assess the relative level of that particular mRNA in the sample tested compared with the level in the reference. The reference sample used is an important consideration when designing array experiments. Two array designs may be employed: I) each sample RNA is directly compared to a biological control RNA sample (Type I); or II) each sample RNA is compared to a universal reference RNA or DNA sample (Type II). In many cases a Type II experiment is chosen because this allows for direct comparison of the results obtained from multiple arrays where each separate sample RNA is hybridised with the universal reference RNA or DNA. An optimal universal reference would contain the necessary mRNA or DNA fragments that will bind to every spot on the array so that each spot emits a signal from the Cy3 channel to which the signal from the Cy5 channel can be compared. If no fragment exists in the reference to hybridise to a particular spot, the resulting Cy5/Cy3 intensity ratio may be calculated as an inaccurately large number because it has been divided by an intensity value close to zero. It has been suggested that genomic DNA would serve as an appropriate universal reference for bacterial arrays (183), as this reference should result in a fluorescence signal for each spot on the array to which the sample can be compared. In comparison, if an RNA sample is used for the reference, there may not be representative cDNA molecules for every spot in that sample. This problem can be at least partially avoided if a pool of RNA samples is used for the reference as this improves the chances of there being cDNA present for each ORF represented on the microarray.

[Figure 3.1 flow diagram. Sample: extract total RNA, label with Cy5. Reference: extract total RNA or genomic DNA, label with Cy3. Both are hybridised to the H. pylori microarray and scanned; spots appear as higher in reference, equal in both, or higher in sample.]

Figure 3.1 Diagram representing the procedure used for labelling RNA or DNA samples for microarray hybridisation. The sample RNA is labelled with Cy5 (red) and the reference RNA or DNA sample is labelled with Cy3 (green). The labelled sample and reference are mixed together and hybridised to the microarray. The red (635nm) and green (532nm) signal emitted from each spot is then detected with a laser scanner. A green spot indicates increased expression of the represented ORF in the reference, a red spot shows increased expression in the sample and a yellow spot would indicate equal expression in both sample and reference. Cy refers to cyanine fluorescent dyes.
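The Cy5/Cy3 ratio calculation, and the way a near-zero reference signal inflates it, can be illustrated with a minimal Python sketch. The function name `log2_ratio`, the optional `floor` parameter and the intensity values are illustrative assumptions; they are not part of the analysis software used in this work.

```python
import math

def log2_ratio(cy5, cy3, floor=None):
    """Return log2(Cy5/Cy3) for one spot.

    If `floor` is given, a spot whose reference (Cy3) intensity is below
    it returns None instead of an unreliable, inflated ratio.
    """
    if floor is not None and cy3 < floor:
        return None
    return math.log2(cy5 / cy3)

# Hypothetical background-subtracted intensities, for illustration only.
print(log2_ratio(2000.0, 1000.0))           # two-fold higher in the sample
print(log2_ratio(2000.0, 2.0))              # inflated by a near-zero reference
print(log2_ratio(2000.0, 2.0, floor=50.0))  # spot flagged rather than reported
```

A reference that gives a signal at every spot, such as genomic DNA or a pool of RNA samples, reduces the number of spots that would need to be flagged in this way.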

Given the lack of adequate protocols for bacterial harvesting, RNA extraction and RNA labelling procedures, appropriate methods were developed for use with the H. pylori microarray. In this chapter the development of these new procedures is described in four parts. Part 1 describes the bacterial harvesting and RNA extraction protocol. In Part 2 the original labelling procedure, as well as the choice of primers and reference samples for array experiments, are discussed. Part 3 compares the efficiency of the original labelling method with the two labelling methods developed in this study. Part 4 describes the fine tuning of the final labelling method chosen for use in further microarray experiments.

3.2. Part 1: Bacterial Harvesting and RNA Extraction Techniques

3.2.1. Experimental Procedures (Part 1)

3.2.1.1. Bacterial harvesting

A new method of bacterial harvesting using membrane filtration was developed to improve both the speed and efficiency of collection in order to avoid message changes in bacterial samples. Membrane filtration has been used in the past for bacterial harvesting for various purposes such as bacterial counts and estimation of species diversity in water samples (109, 177). However, the use of filtration for the extraction of nucleic acids from bacteria is novel. The bacterial culture (a maximum of 2.5 OD 600nm per filter) was passed through a 0.45 µm mixed cellulose acetate and nitrate filter (MF-Millipore membrane filter) under vacuum to extract the bacterial cells from the growth medium. The filter was immediately placed into a 50 ml conical tube (Falcon), frozen in liquid N2, and stored at -80°C.

3.2.1.2. RNA extraction from H. pylori cells

The total RNA from the harvested bacteria was extracted using a newly developed RNA extraction procedure (detailed in Chapter 2) which combines the use of the Trizol reagent (Gibco) for the initial separation of RNA from the rest of the cell components, and the clean-up protocol of RNeasy mini columns (Qiagen). Trizol was used first to efficiently lyse the H. pylori cells on the filter without the need for enzymatic digestion (such as lysozyme) and to immediately inactivate any RNases so that the RNA was stabilised (51). The RNeasy clean-up with an on-column DNase treatment was employed to remove any contaminating DNA, protein, phenol and the abundant small RNA molecules (tRNAs and 5S rRNA) (73). Once the aqueous solution containing the RNA sample was obtained from the Trizol preparation, this was used directly in the RNeasy clean-up procedure instead of first precipitating the RNA, which is the method recommended by the manufacturer of Trizol (Gibco). The components in the aqueous layer of Trizol (guanidine isothiocyanate) are equivalent to the first buffer required for the RNeasy clean-up procedure (RLT buffer, which contains guanidine isothiocyanate) and thus the required amount of ethanol (recommended in the Qiagen RNeasy clean-up protocol) was added directly to the aqueous layer for loading on the RNeasy mini-column.

RNA extracted from the broth grown cultures described in section 3.2.1.3 was used to assess the success of these new harvesting and RNA extraction procedures and to provide RNA samples for use in the development of the cDNA labelling procedure described in Parts 2-4 of this chapter.

3.2.1.3. Assessment of the quality of RNA extracted from broth grown H. pylori

H. pylori cells grown in broth culture, as described in Chapter 2, were harvested using the new filtration method described above. Total RNA was extracted from the bacteria collected on filters using the new extraction method detailed above. The samples used in this chapter are listed in Table 3.1.

The quantity and quality of the extracted total RNA were assessed by visualisation on 1% agarose gels, and the OD260 nm/OD280 nm ratios of the RNA in H2O were determined using a UV spectrophotometer (Gene Spec III, Mirai Bio Inc., Alameda, CA). Counts of the colony forming units (CFU) were also performed and used to assess the efficiency of the RNA extraction method. In addition, an RT-PCR using primers for a portion of the cagA gene was performed on the RNA to check that the mRNA was not significantly degraded and that there was no DNA contamination. Each RT-PCR included negative controls to which no reverse transcriptase was added (RT-); no PCR product should be formed from these reactions unless there was genomic DNA contamination of the RNA sample.

To optimize the conditions for on-column DNase treatment, two different concentrations of the RNase-free DNase (Qiagen), 10 µl or 20 µl per column, were used and the length of the DNase digestion extended from 15 to 40 min.

3.2.2. Results and Discussion (Part 1)

The new method of bacterial harvesting described here involved the passage of the bacterial culture through a filter membrane followed by snap-freezing of the membrane to inhibit RNA degradation. Using this method, up to a total of 2.5 OD600 nm of H. pylori growth could be harvested in less than 1 min. This method is unlikely to induce a stress response in the bacteria or allow time for any other mRNA level changes to occur.

The RNA extraction procedure developed was a combination of the Trizol extraction procedure and the Qiagen RNeasy clean-up procedure. The Trizol reagent lyses the H. pylori cells and stabilises the RNA, while the RNeasy clean-up procedure ensures the final RNA sample is free from contaminating proteins, phenol and DNA. By loading the aqueous layer extracted by Trizol directly onto the RNeasy column after the appropriate amount of ethanol was added, precipitation of the RNA sample was avoided. Precipitated RNA is often difficult to resuspend, probably due to contaminating protein in the sample, thus some of the RNA sample can be lost. In addition, this modified method is much quicker than the two procedures separately. The final total RNA sample extracted was consistently of high quality with an OD260 nm/OD280 nm measured in H2O of between 1.8-2.1 and was never degraded as assessed by agarose gel electrophoresis (Fig. 3.2.1). In addition, using the harvesting/extraction procedure an average of 0.6 µg/10^7 CFU total RNA was obtained from a starting culture of between 3 x 10^7 CFU/ml and 9 x 10^8 CFU/ml. This was comparable to the average RNA yield predicted for the RNeasy mini column extraction for E. coli, which is 0.5 µg/10^7 CFU.
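As a worked illustration of how a yield per 10^7 CFU and a purity check of this kind can be computed, a minimal Python sketch follows. The function names and the example quantities (30 µg of RNA from 5 ml of culture at 1 x 10^8 CFU/ml) are hypothetical and were chosen only to make the arithmetic easy to follow; the 1.8-2.1 window is the range reported above.

```python
def rna_yield_per_1e7_cfu(total_rna_ug, cfu_per_ml, culture_volume_ml):
    """Total RNA yield expressed as micrograms per 10^7 CFU harvested."""
    total_cfu = cfu_per_ml * culture_volume_ml
    return total_rna_ug / (total_cfu / 1e7)

def purity_ok(od260, od280, low=1.8, high=2.1):
    """True if the OD260/OD280 ratio lies within the accepted window."""
    return low <= od260 / od280 <= high

# Hypothetical example values, for illustration only.
print(round(rna_yield_per_1e7_cfu(30.0, 1e8, 5.0), 3))  # 0.6
print(purity_ok(1.0, 0.5))                              # True (ratio 2.0)
```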

The RNA quality was further confirmed using an RT-PCR for a portion of the cagA gene (section 3.2.1). In a number of the initial experiments where the new RNA extraction procedure was used, there was some DNA contamination observed using the RT-PCR technique. For example, in Fig. 3.2.2 a band of the expected size (~1 kb) is present in the positive control in lane 7 and in the reactions where RT enzyme was added (RT+) (lanes 1, 3 and 5). However, the same size band is seen in the reactions where no RT enzyme was added (RT-) (lanes 2, 4 and 6), while the PCR negative control (H2O added instead of cDNA) contains no visible band, indicating that the DNase treatment during the RNA extraction did not successfully remove all contaminating genomic DNA. Thus, the DNase treatment used in the extraction procedure was reviewed. The manufacturer’s instructions for RNase-free DNase (Qiagen) suggest the use of 10 µl of DNase per column, with an incubation time of 10-15 min at room temperature when RNA from a maximum of 10^9 CFU has been loaded onto the column. In an attempt to eradicate all contaminating genomic DNA, the use of both 10 and 20 µl volumes of DNase per column for a digestion time of 40 min was tested (for method see section 3.2.1.3). Both combinations tested were successful at removing contaminating genomic DNA from the RNA preparation and so in subsequent RNA extractions a volume of 10 µl DNase/column with a digestion time of 40 min at room temperature was used. In addition, to improve the efficiency of removing genomic DNA from the sample, care was taken to use an appropriate volume of Trizol (1 ml per 1-5 x 10^7 CFU) and to ensure the mini RNeasy column was not overloaded. Using this procedure RNA samples with undetectable DNA contamination were routinely achieved (an example is shown in Fig. 3.2.3).

Figure 3.2.1: Five separate samples (lanes 2-6) of total RNA extracted from H. pylori run on a 1% agarose gel. Lane 1 contains a 1 kb DNA ladder (Gibco), arrows indicate 1 kb and 0.5 kb size bands respectively.

Figure 3.2.2: A 1% agarose gel showing products obtained from RT-PCR reactions using cagA primers 1 & 2 (see Table 2.2). Lanes 1, 3, & 5 contain the products obtained from three RT-PCR reactions on separate RNA samples, while lanes 2, 4, & 6 show the products obtained from parallel reactions where no reverse transcriptase was added (the bands seen in these lanes indicate DNA contamination of the corresponding RNA sample). Lanes 7 & 8 are the positive (SS1 genomic DNA) and negative (H2O) PCR controls respectively. Lane 9 contains a 1kb DNA ladder and the arrows indicate 1 kb and 0.5 kb size bands respectively.

Figure 3.2.3: A 1% agarose gel showing products obtained from RT-PCR reactions on three separate RNA samples using cagA primers 1 & 2. Lane 1 contains a 1kb DNA ladder and the arrows indicate 1 kb and 0.5 kb size bands respectively. Lane 2 contains the PCR negative control (H2O). Lanes 3, 5, & 7 contain the RT-PCR reactions, while lanes 4, 6, & 8 show the products obtained from parallel RT-PCR reactions where no reverse transcriptase was added.
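The logic used to read the RT-PCR controls described above can be summarised in a short Python sketch. The function name `interpret_rt_pcr` and the wording of the returned messages are illustrative assumptions; only the decision rules (RT+, RT-, genomic DNA positive control and H2O negative control) are taken from the text.

```python
def interpret_rt_pcr(rt_plus, rt_minus, pcr_positive, pcr_negative):
    """Interpret band presence (True/False) for one RNA sample and its controls."""
    if pcr_negative:
        # A band in the H2O control means the PCR reagents are contaminated.
        return "invalid: band in H2O negative control"
    if not pcr_positive:
        # No band from genomic DNA means the PCR itself did not work.
        return "invalid: no band in genomic DNA positive control"
    if rt_minus:
        # Product without reverse transcriptase can only come from DNA.
        return "genomic DNA contamination of the RNA sample"
    if rt_plus:
        return "cagA transcript detected, no detectable DNA contamination"
    return "no product: RNA degraded or reverse transcription failed"

# Pattern seen in Fig. 3.2.2: bands in both RT+ and RT- lanes.
print(interpret_rt_pcr(True, True, True, False))
# Pattern seen in Fig. 3.2.3: band in RT+ only.
print(interpret_rt_pcr(True, False, True, False))
```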

3.2.3. Conclusion (Part 1)

The new method of bacterial harvesting has improved the speed and efficiency of bacterial cell collection while the RNA extraction procedure was more reliable and efficient than the previous methods. Total RNA extracted from H. pylori cells using these modified procedures was used for all the time course experiments presented in Chapters 4 and 5. Based on these findings this complete RNA extraction method is now also being implemented by other researchers working with different bacterial species.

3.3. Part 2: The Direct Labelling Procedure

3.3.1. Experimental Procedures (Part 2)

3.3.1.1. Improvement of the direct labelling procedure

3.3.1.1.1. Original direct labelling procedure (based on TIGR protocol)

The first strand cDNA labelled with Cy-dye was synthesised as follows: an initial RNA denaturation and primer annealing step was performed where 0.5-2 µg total RNA was combined with 1 µg of GSPs (SigmaGenosys), the volume made up to a total of 10 µl, and heat denatured at 65°C for 10 min, then cooled on ice for 2 min. A reaction mixture consisting of 1 X First strand buffer (250 mM Tris-HCl, pH 8.3 at 25°C; 375 mM KCl; 15 mM MgCl2); 0.01 M DTT; 1 µl dNTP mix (0.5 mM dATP, dCTP & dGTP, and 0.05 mM dTTP); 0.05 mM Cy3 or Cy5-dUTP (Amersham Pharmacia Biotech UK Limited, England); and 400 Units of Reverse Transcriptase was added to the RNA/primer mix and the reverse transcription reaction performed at 42°C for 110 min. The RNA strand was then hydrolysed by adding 0.05 µM NaOH and incubating at 65°C for 10 min and then the solution was neutralised with the addition of 0.05 µM HCl. Equal amounts of the appropriate Cy5 and Cy3 reactions were combined, 450 µl TE and 25 µg yeast tRNA were added, and the mix was applied to a Microcon YM30 membrane filtration unit (Millipore). The samples were concentrated by centrifugation for 8 min at 11,750 x g in a microcentrifuge. The eluate was discarded and the wash step repeated twice. The samples were finally concentrated after the last wash to a volume less than 12 µl and collected by inverting the column in a fresh collection tube and centrifuging at maximum speed for 1 min. The volume was adjusted to 12 µl with TE and used in a standard hybridisation to an H. pylori microarray.

The existing labelling procedure for bacterial RNA, the direct labelling method (above), was considered by members of the Falkow laboratory to be quite inefficient as the intensity of the spots was mostly not greater than 1.5 fold above the background. The basic procedure for direct labelling is shown in Fig. 3.3A. Various parameters of this procedure were explored in order to improve the efficiency of labelling. First, the use of genomic DNA was compared with the use of an RNA sample as the reference (section 3.3.1.3). Most of the H. pylori microarray experiments performed in this thesis utilized Type II hybridisation procedures and thus a reference sample which was labelled with Cy3 was needed for each hybridisation. The choice of reference influences the labelling procedure, the effectiveness of normalisation calculations and data analysis. Second, the use of GSP compared to random hexamer primer sets in the reverse transcription reaction required for labelling was assessed (section 3.3.1.4). This was carried out due to claims by Arfin et al. (14) that GSP primer sets may not be as efficient as using random priming and that they may introduce bias into the results.

In all of the experiments described in Parts 2-4 the reference sample was labelled with Cy3 fluorescent dye while the test sample was labelled with Cy5 fluorescent dye. Many of the arrays used for the development of the labelling techniques were hybridised with the same RNA sample labelled in the Cy5 and the Cy3 channels and these are referred to as “self to self” experiments. These self to self experiments are useful for comparison of the labelling techniques, in particular the assessment of dye incorporation bias and reproducibility of the labelling procedures.

3.3.1.2. RNA samples extracted for use in method testing

Many of the initial experiments assessing the effectiveness of the various labelling techniques (described in Parts 2 and 3 of this chapter) utilised test RNA samples extracted from bacteria grown under a number of different conditions. These are described below and listed in Table 3.1.

The SS1 strain of H. pylori was grown in 6 ml broth cultures of Brucella Broth plus 5% fetal calf serum for 25 h (1) or 35 h (2). The bacteria were collected by centrifugation at room temperature, the pellet snap-frozen in liquid N2 and stored at -80°C for RNA extraction as described in Chapter 2. These RNA samples are referred to as BR1-RNA and BR2-RNA respectively.

The SS1 strain of H. pylori was grown on HBA plates (Chapter 2) for 24 h. One ml of PBS was then added to the plate and the cells harvested. The cells were then centrifuged for 3 min at 10,000 rpm and the pellet resuspended in 1 ml Trizol. The RNA was extracted and is referred to as PG-RNA.

In order to test the RNA labelling procedures it was essential to obtain RNA samples which were expected to have significant variations in RNA content. To achieve this, the SS1 strain of H. pylori was exposed to neutral or acid conditions by first growing it in an overnight brucella broth culture (described in Chapter 2) to an OD600 nm of 0.2 and then using this to inoculate three different brucella broths to a final OD600 nm of 0.1. The three broths used were: i) pH7 (control), ii) pH4, and iii) pH4 plus 5 mM urea. The pH of each broth was adjusted with 5 M HCl. The three flasks were incubated with shaking in microaerophilic conditions at 37°C for 100 min. The resultant growth from each flask was harvested and the RNA extracted using the methods described in Part 1 of this chapter. The extracted total RNA samples were named pH7-RNA, pH4-RNA and pH4+U-RNA, respectively. The same experiment was also performed with the SS2000 strain of H. pylori except that RNA samples from only the pH7 broth (SS2-pH7) and the pH4 broth (SS2-pH4) were extracted. For the arrays where the samples SS2-pH7 or SS2-pH4 were used, RNA extracted from the original overnight broth culture of SS2000 (SS2-O/N) was used as the reference.

Table 3.1: H. pylori arrays used for technique development

                      Sample labelled in each channel
Array name  Method †  Cy5              Cy3               Reference type  Section 3.2.4
HP6095a     Direct    PG-RNA           SS1 gDNA          gDNA            Exp. 2
HP6095b     Direct    BR1-RNA          SS1 gDNA          gDNA            Exp. 2
HP6096a     Direct    BR2-RNA          SS1 gDNA          gDNA            Exp. 2
HP6099b     Direct    pH7-RNA          SS1 gDNA          gDNA            Exp. 2
HP6102b *   Direct    R6 / PG-RNA      GSP / PG-RNA      RNA             Exp. 1B
HP6103a     Direct    pH4-RNA          SS1 gDNA          gDNA            Exp. 2
HP6103b     Direct    pH4+U-RNA        SS1 gDNA          gDNA            Exp. 2
HP6104a     Direct    pH4-RNA          pH7-RNA           RNA             Exp. 1A
HP6104b     Direct    pH4+U-RNA        pH7-RNA           RNA             Exp. 1A
HP6108a *   Direct    PG-RNA           PG-RNA            RNA             Exp. 3A
HP6109a *   Klenow    PG-RNA           PG-RNA            RNA             Exp. 3A
HP6110a *   Indirect  PG-RNA           PG-RNA            RNA             Exp. 4A
HP6115a     Klenow    SS2-pH7          SS2-O/N           RNA             Exp. 3B
HP6115b     Klenow    SS2-pH4          SS2-O/N           RNA             Exp. 3B
HP6116a     Indirect  SS2-pH7          SS2-O/N           RNA             Exp. 4B
HP6116b     Indirect  SS2-O/N          SS2-pH7           RNA             Exp. 4B
HP7004b     Indirect  TC1 T6 h         TC1 T0 h          RNA             Exp. 4C
HP7010b     Indirect  TC1 T42 h        TC1 T0 h          RNA             Exp. 4C
HP7015a     Indirect  TC2 T6 h         TC2 T0 h          RNA             Exp. 4C
HP7017a     Indirect  TC2 T28 h        TC2 T0 h          RNA             Exp. 4C
HP7052b *   Indirect  2 µg PG-RNA      2 µg PG-RNA       cDNA            Exp. 5A
HP7053a *   Indirect  2 µg TP-RNA      2 µg TP-RNA       cDNA            Exp. 5B
HP7053b *   Indirect  1 µg TP-RNA      1 µg TP-RNA       cDNA            Exp. 5B
HP7054a *   Indirect  0.5 µg TP-RNA    0.5 µg TP-RNA     cDNA            Exp. 5B
HP7061a     Indirect  TC3 T12 h        Mix of TP’s TC3   cDNA            Exp. 4C

* Indicates arrays which were “self to self” experiments. † Indicates the labelling protocol used as defined in Figure 3.3.

3.3.1.3. Comparison of the use of RNA versus genomic DNA as a reference in array hybridisations

The use of genomic DNA compared to an RNA sample for the reference in H. pylori array hybridisations was investigated. In this set of experiments six array hybridisations were performed where an RNA sample was labelled with Cy5 using direct incorporation by Superscript II (Fig. 3.3A) while SS1 genomic DNA (gDNA) was labelled with Cy3 by direct incorporation using the Klenow enzyme (Fig. 3.3B). These arrays are: HP6095a, HP6095b, HP6096a, HP6099b, HP6103a & HP6103b (details in Table 3.1). In order to assess the quality of these six RNA/DNA array hybridisations, the results were compared to another set of three arrays in which an RNA sample was labelled in both the Cy5 and Cy3 channels (i.e. RNA/RNA hybridisations) using direct incorporation of Cy-dUTPs with Superscript (Fig. 3.3A): HP6104a, HP6104b, & HP6108a (Table 3.1).

3.3.1.4. Comparison of the use of random hexamers (R6) and Gene Specific Primers (GSP) for reverse transcription

In this experiment 2 µg of the RNA sample, PG-RNA (section 3.3.1.2), was labelled with Cy5 using the direct labelling method described above, except that 2 µl of R6 (Qiagen, 1.8 µg/µl) were used for the reverse transcription reaction instead of GSPs, while another 2 µg of PG-RNA was labelled with Cy3 using GSPs in the reverse transcription reaction for the direct labelling method (Fig. 3.3A). These labelled samples were hybridised to array HP6102b (Table 3.1) and the quality of the hybridisation of the Cy3 labelled sample and the Cy5 labelled sample were compared.

3.3.1.5. Investigation into the efficiency of direct labelling using Superscript (Fig. 3.3A)

The following array hybridisations were performed using the direct labelling procedure with Superscript, shown in Fig. 3.3A, for comparison to the two other labelling procedures illustrated in Parts B and C of Fig. 3.3. Two micrograms of each of the sample RNAs, pH4-RNA and pH4+U-RNA (section 3.3.1.2), were labelled with Cy5, while 4 µg RNA extracted from the pH7-RNA sample was labelled with Cy3 for the reference in both arrays (reaction split in half for the two arrays). The Cy5 labelled pH4-RNA and Cy3 labelled reference RNA were hybridised to array HP6104a, while the pH4+U-RNA and Cy3 labelled reference were hybridised to array HP6104b (Table 3.1).

3.3.1.6. Estimation of array hybridisation quality

In order to estimate the effectiveness of the RNA labelling procedures described in Parts 2-4 of this chapter, the quality of the resulting array hybridisations was investigated. The quality of the arrays was first assessed with Genepix. An Axon Array Quality Control Report (in the Genepix program) was run on the results of each of the arrays described. The median signal-to-noise ratio in each channel, 635 nm (Cy5) and 532 nm (Cy3), was assessed. An array with a ratio below 5 in either channel was considered to be of bad quality, indicating inadequate labelling.
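The quality criterion described above can be expressed as a small Python sketch. It assumes that a list of per-spot signal-to-noise values is already available for each channel (in this work these came from the Genepix report, whose internal calculation is not reproduced here); the function name and example values are hypothetical.

```python
from statistics import median

SNR_THRESHOLD = 5  # cut-off stated in the text

def array_passes_qc(snr_635, snr_532, threshold=SNR_THRESHOLD):
    """Return (passes, median_635, median_532) for one array.

    An array fails if the median signal-to-noise ratio is below the
    threshold in either the 635 nm (Cy5) or the 532 nm (Cy3) channel.
    """
    m635 = median(snr_635)
    m532 = median(snr_532)
    return (m635 >= threshold and m532 >= threshold), m635, m532

# Hypothetical per-spot values, for illustration only.
print(array_passes_qc([6, 7, 8], [4, 4.5, 9]))  # fails on the 532 nm channel
```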

In addition, histograms of the normalised log2 (R/G) values for a selected set of the arrays listed in Table 3.1 were plotted using the HIMACK program (developed by C. Kim) to assess the spread of log values from each array and the success of the normalisation. If the arrays were of good quality, a small spread of log values was expected for arrays where the same RNA sample was labelled in both the channels (self to self). In contrast, a larger spread of values was expected from arrays where a different RNA sample was labelled with Cy3 compared to Cy5. If the normalisation was successful the peak of the histogram should lie close to 0 and the spread of the curve should be symmetrical around 0.
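The checks described above (centre close to 0, symmetrical spread) can be sketched in Python as follows. This is not the HIMACK program; the function names, the bin width and the use of the moment coefficient of skewness as a measure of symmetry are assumptions made for illustration, and the intensity values are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, median, pstdev

def log2_ratios(red, green):
    """log2(R/G) for paired Cy5 (red) and Cy3 (green) intensities."""
    return [math.log2(r / g) for r, g in zip(red, green)]

def histogram(values, bin_width=0.5):
    """Count values per bin; each key is the lower edge of its bin."""
    return Counter(math.floor(v / bin_width) * bin_width for v in values)

def skewness(values):
    """Population moment coefficient of skewness (0 for a symmetric spread)."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

# Hypothetical intensities giving a symmetric spread centred on 0.
ratios = log2_ratios([100, 200, 400, 800, 1600], [400] * 5)
print(median(ratios), skewness(ratios))
print(sorted(histogram(ratios).items()))
```

A clearly negative skewness would correspond to a left-skewed histogram, and a median far from 0 would suggest that the normalisation had not centred the data.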

In experiments where self to self samples were investigated, the correlation coefficient between the intensity values in the two channels was calculated and a scatter plot of the 532 nm channel (Ch1) versus the 635 nm channel (Ch2) was constructed for estimation of the linear regression.
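For a self to self comparison of this kind, the correlation coefficient and the least-squares regression line can be computed as in the following Python sketch. The function name and the example values are hypothetical; x represents the 532 nm channel (Ch1) and y the 635 nm channel (Ch2).

```python
from statistics import mean

def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for the least-squares line y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / (sxx * syy) ** 0.5   # Pearson correlation coefficient
    slope = sxy / sxx
    intercept = my - slope * mx
    return r, slope, intercept

# Hypothetical values lying exactly on y = 2x + 1, for illustration only.
r, slope, intercept = pearson_and_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(r, slope, intercept)
```

For an ideal self to self hybridisation the two channels would be expected to give r close to 1, a slope close to 1 and an intercept close to 0.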

Finally a hierarchical cluster of those arrays comparing self to self samples, where both the arrays and the genes were clustered was completed using the Cluster program. This cluster was utilized to compare the similarity in arrays labelled using the 3 different methods described in Parts 2 & 3.

3.3.2. Results and Discussion (Part 2)

3.3.2.1. Issues relating to direct incorporation of Cy-dyes into the first strand cDNA

The original method used to label RNA samples from bacteria for microarrays was the direct labelling method using Superscript II enzyme. This method involves the incorporation of Cy-dUTP into the first strand of cDNA by the reverse transcriptase (Superscript II) (Fig. 3.3A). As discussed in section 3.1 there are problems inherent to this method of incorporation as it is both inefficient and can introduce bias due to the reverse transcriptase incorporating different modified dUTPs at different rates. The use of this direct labelling procedure for H. pylori microarrays was investigated. It was found that the quality of microarrays resulting from this direct labelling method was generally poor. The 4 arrays HP6104a, HP6104b, HP6102b & HP6108a show low median signal-to-noise ratios in both the channels (635 nm, Cy5; 532 nm, Cy3) (Fig. 3.4A). In an attempt to improve these results, two features of the direct labelling method were investigated. First, the use of genomic DNA as a reference was compared with the use of an RNA sample as the reference. Second, the choice of primers used in the reverse transcription reaction was investigated.

[Figure 3.3 flow diagram, three panels.
A: Direct labelling with Superscript (Direct): Total RNA -> RT with Cy-dUTP -> RNA/cDNA-Cy -> hydrolyse RNA strand -> ss-cDNA-Cy -> clean-up -> hybridise to array.
B: Direct labelling with Klenow (Klenow): Total RNA -> RT -> RNA/cDNA -> hydrolyse RNA strand -> ss-cDNA -> clean-up -> Klenow & Cy-dUTP -> ds-cDNA-Cy -> clean-up -> hybridise to array.
C: Indirect labelling (Indirect): Total RNA -> RT -> RNA/cDNA -> hydrolyse RNA strand -> ss-cDNA -> clean-up -> Klenow & aa-dUTP -> ds-cDNA-aa -> NHS ester-Cy -> ds-cDNA-aa-Cy -> clean-up -> hybridise to array.]

Figure 3.3: Flow diagram showing the major steps in the three RNA labelling protocols (A, B and C) investigated for use for transcriptional profiling with H. pylori microarrays. Protocol A will be referred to as Direct, B as Klenow and C as Indirect. The step at which the cyanine dyes are incorporated into the nucleotide strand in each of the protocols is indicated by an open arrow. Clean-up refers to purification of the nucleotide strand using a Qiagen PCR purification kit; array refers to the H. pylori microarray; RT is reverse transcription; ss is single stranded; ds is double stranded; -cy is cyanine dye coupled or labelled; aa is amino-allyl; and NHS is succinimidyl.

The use of genomic DNA compared to an RNA sample as the reference was investigated in a number of H. pylori microarray experiments. In each of the arrays (HP6103a, HP6103b, HP6099b, HP6095a, HP6095b, & HP6096a) shown in Fig. 3.5.1 an RNA sample (see Table 3.1 for details) was labelled using the direct incorporation of Cy5-dUTP by Superscript II into the first strand cDNA (Fig. 3.3A), while SS1 genomic DNA was labelled by the incorporation of Cy3-dUTP by Klenow. The necessity for using two different methods of dye incorporation immediately introduces bias into the array experiment, particularly because Klenow enzyme incorporates modified dNTPs more efficiently than reverse transcriptase. It was observed that the Cy3 signal for these arrays was consistently stronger than the Cy5 signal. This inconsistency caused problems with the normalisation procedure which can be observed in Fig. 3.5.1. This figure shows a histogram for each array indicating the distribution of the log2 (R/G) ratios obtained. A normal distribution for the intensities from each of the channels is assumed (for the normalisation procedure described in Chapter 2), however it can be seen that for the arrays where DNA is used as a reference the distribution is left-skewed. This result may occur for multiple reasons: i) the different labelling procedures result in bias; ii) the relative number of transcripts for each ORF can vary considerably in RNA samples, but is fairly uniform in genomic DNA and thus there is a more uniform Cy3 signal from each spot (the DNA signal) than the Cy5 signal from the RNA sample; & iii) the hybridisation strength of DNA/DNA hybrids (DNA of the spot and DNA of the reference) is weaker than that of DNA/RNA hybrids, which could also result in a bias of the signal.

In comparison, the log2 (R/G) ratios for the 3 arrays where RNA samples were labelled using the direct incorporation of Cy-dUTP by Superscript in both channels show a normal distribution (Fig. 3.5.2). Thus the need for normalisation of the signals from Cy3 and Cy5 necessitates the use of an RNA sample as the reference in these arrays. Hence, in all further experiments an RNA sample was used as the reference for the H. pylori microarrays. It is possible that the use of the indirect labelling procedure (described below) would allow the use of genomic DNA labelled in the same way as a reference, but this was not investigated.

[Figure 3.4. Panel A: bar chart of the median signal-to-noise ratio (scale 0 to 30) in the 635 nm and 532 nm channels for each array. Panel B: scans of HP6104a, HP6104b, HP6102b and HP6108a. Panel C: scans of HP6115a, HP6115b, HP6116a and HP6116b.]

Figure 3.4: A) Median signal-to-noise ratio for arrays hybridised with RNA samples labelled with the Direct method: direct incorporation of Cy-dUTP during reverse transcription (HP6104a, HP6104b, HP6102b, & HP6108a); the Klenow method: direct incorporation of Cy-dUTP using Klenow (HP6115a & HP6115b); and the Indirect method (HP6116a & HP6116b). The 635 nm values refer to the wavelength of the fluorescent signal emitted from the Cy5 labelled sample, while the 532 nm values refer to the wavelength of fluorescent signal emitted from the Cy3 labelled sample. B & C) Laser scans of a representative block (block 13) of each of the arrays indicated, which were hybridised with RNA labelled using the Direct (B), Klenow (C) or Indirect methods (C), are shown. The white arrows show examples of “comets” which are more severe in the direct labelled arrays (B) than the arrays labelled using the Klenow method and the Indirect method (C).

[Figure 3.5.1 histogram: x-axis Log2(R/G), y-axis Frequency; arrays HP6103a, HP6103b, HP6099b, HP6095a, HP6095b and HP6096a.]

Figure 3.5.1: Histogram showing the left-skewed distribution of the normalised log2 (R/G) ratios (x-axis) against the frequency (y-axis) for the arrays indicated in the legend on the right side. These arrays were hybridised with H. pylori genomic DNA labelled with Cy3 in the reference channel and an RNA sample labelled with Cy5 (see Table 3.1 for details).

[Figure 3.5.2 histogram: x-axis Log2(R/G), y-axis Frequency; arrays HP6104a, HP6104b and HP6108a.]

Figure 3.5.2: Histogram showing the normal distribution of the normalised log2 (R/G) ratios (x-axis) against the frequency (y-axis) for the arrays indicated in the legend on the right side. These arrays were hybridised with RNA samples labelled in both the Cy3 and Cy5 channels (see Table 3.1 for details).

The reverse transcription step in the labelling procedures requires the use of a primer set to target the entire population of mRNA molecules present. In bacteria poly (dT) primers cannot be used because of the lack of polyadenylation and so in most cases a mixture of random oligomer primers is used. There are limitations to the use of these random primers as these can prime sites in rRNA and cause non-specific labelling of these regions, which in turn can cause a high background hybridisation of the arrays. In contrast, the use of GSPs or GDPs reduces the mis-priming of non-coding sequences and thus increases the signal-to-noise ratio of array hybridisations. A set of H. pylori GSPs was used in transcriptional profiling with the H. pylori array. These consisted of a pool of the 3’ primers used in the production of the microarray.

The use of GSPs for transcriptional profiling had not been thoroughly investigated at the time and so the efficiency of the H. pylori pool of GSPs for labelling cDNA was briefly assessed in comparison to the use of traditional random hexamers. An array hybridisation was performed using the PG-RNA sample from H. pylori (section 3.3.1.4) labelled with Cy5 using R6 primers and hybridised against the same RNA sample labelled with Cy3 using the H. pylori GSPs (Genosys), both using a direct labelling reaction with Superscript II (HP6102b, Table 3.1). The similarity between the patterns of spot intensities produced by the two labelling experiments was assessed by calculation of the correlation coefficient and the R2 value of the linear regression between the two channels. These values, 0.6 and 0.36 respectively, indicate a low similarity between the two sets of channel intensities, which are plotted in Fig. 3.6A. This result indicates that the two primer sets result in considerably different cDNA synthesis, but it does not indicate which primer set produces the most representative cDNA synthesis. Thus, the number of spots with detectable signals in each of the channels (intensity > 500) after normalisation was measured. For the channel labelled using the GSPs (Cy3) 61% of the spots had good signals, while a smaller number (54%) of the spots had good signals in the channel labelled with the random hexamers (Cy5). In addition, the median signal-to-noise ratio was above 5 for the channel labelled with GSPs, while the random hexamer labelled channel had a value below 5, indicating poor signal quality (Fig. 3.4A). Although this result is not conclusive, it does appear that using GSPs rather than random hexamers for incorporating Cy-dUTP during reverse transcription of H. pylori total RNA may be better.
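Two of the quantities used above can be illustrated with a short Python sketch: the fraction of spots with a detectable signal (intensity > 500), and the relationship between the correlation coefficient and the R2 value, which for a simple linear regression is the square of the correlation coefficient (0.60 squared is 0.36, consistent with the values reported). The function name and the example intensities are hypothetical.

```python
def fraction_detectable(intensities, threshold=500):
    """Fraction of spots whose normalised intensity exceeds the threshold."""
    return sum(1 for v in intensities if v > threshold) / len(intensities)

# Hypothetical normalised intensities for one channel, for illustration only.
channel = [120, 480, 501, 900, 2500, 300, 760, 450, 1500, 650]
print(fraction_detectable(channel))  # 0.6

# For a simple linear regression, R2 equals the squared correlation coefficient.
r = 0.60
print(round(r ** 2, 2))  # 0.36
```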

Recently the use of GSPs for reverse transcription in E. coli has been challenged (14). The authors claim that when GSPs were used for cDNA synthesis there was a significant under-representation of about 30% of mRNAs from the hybridised arrays when compared to arrays where R6 primers had been used in reverse transcription (14). The reason for this under-representation was attributed to the possibility that the GSPs simply did not hybridise to about 1/3 of the mRNAs either because of the hybridisation conditions or because the primers hybridise to themselves or to one another. In addition the fact that E. coli mRNA has widely differing degradation rates may be a further cause for this problem as the 3’ sites for oligo priming may be differentially degraded (14).

The apparent disparity between the results published by Arfin et al. (14) and the use of GSPs for H. pylori transcription profiling described above may be explained by a number of differences in experimental conditions between the two studies. First, the primer sets were designed by different groups, raising the possibility that different parameters were used for their design, although the primer production specifications would be similar as both the E. coli and H. pylori GSP sets were provided by the same company, SigmaGenosys. Second, the H. pylori pool of GSPs contains only 1635 unique primers compared to 4260 unique primers in the E. coli set, suggesting that the H. pylori GSP mixture is significantly less complex, which could result in better primer annealing and less primer-dimer production. Third, the hybridisation efficiencies for the H. pylori primers may be more uniform as H. pylori has a lower GC content than E. coli (48% compared to 54%) and the average melting temperature of the GSPs in H. pylori was calculated to be 51°C (not reported for the E. coli GSPs). Finally, the comparison between the use of R6 primers and the GSP set by Arfin et al. is not easily interpreted as the labelling conditions used were not equivalent for the two primer sets: the amount of total E. coli RNA labelled with the R6 primers (20 µg) was 10 times the amount used for the GSP labelling (2 µg), and the temperature used for the denaturing and annealing steps was much higher for the GSPs (90°C) than for the R6 primer reaction (70°C). These different conditions could cause different labelling efficiencies and, considering that RNA is known to degrade at temperatures above 65°C, the higher denaturation temperature used for the GSP reaction could result in more degradation of the RNA and thus less efficient priming. Both the study using E. coli GSPs and our work with H. pylori GSPs utilised the direct labelling procedure, which is an inefficient and variable method of labelling, and thus the problems encountered with the GSPs may be reduced when non-modified substrates are used for reverse transcription, such as in an indirect labelling procedure.

Figure 3.6: A-D) Scattergrams representing the relationship between the normalised intensity values from the 635 nm channel (Cy5) on the y-axis and the 532 nm channel (Cy3) on the x-axis for four arrays of the “self to self” experiments (name of array and protocol used for labelling indicated under each scattergram) (see Table 3.1 for details). The R2 value and the equation of the linear regression are shown on each plot and the correlation coefficient between the values for the two channels is reported under each scattergram.
A) HP6102b, Direct: y = 0.5678x + 404.43; R2 = 0.3598; correlation coefficient = 0.60
B) HP6108a, Direct: y = 0.9249x + 16.037; R2 = 0.8736; correlation coefficient = 0.93
C) HP6109a, Klenow: y = 1.1118x - 240.05; R2 = 0.9628; correlation coefficient = 0.98
D) HP6110a, Indirect: y = 1.1121x - 131.12; R2 = 0.981; correlation coefficient = 0.99

When the same primer set is used for both sample and reference reverse transcription reactions, the results are uniform and thus comparable. This can be seen in the histograms in Fig. 3.7, which show the spread of log2 (R/G) values. For self to self labelling experiments the peak should be around zero with very little spread of values. The result for the RNA labelled with R6 primers versus GSPs (HP6102b) has a large spread compared to the other array labelled with the direct labelling method (HP6108a), in which GSPs were used in both channels. In addition, the correlation coefficient (0.934) and R2 (0.87) values for HP6108a (Fig. 3.6B) are much improved compared with HP6102b, suggesting at least that the priming by GSPs is consistent. Given that the H. pylori GSP set appeared to be as efficient as R6 primers, or more so, it was decided to continue using GSPs for reverse transcription in the H. pylori array experiments.
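The spread of log2 (R/G) values in a self to self experiment can be summarised numerically as well as by a histogram. The sketch below is an assumption-laden illustration: the function names, the bin width and the intensity values are hypothetical, and the standard deviation is used here as one possible measure of spread rather than the measure used in the thesis.

```python
import math
import statistics


def log_ratio_spread(cy5, cy3):
    """Return the log2(R/G) values of an array with their median and spread."""
    log_ratios = [math.log2(r / g) for r, g in zip(cy5, cy3) if r > 0 and g > 0]
    return {
        "log_ratios": log_ratios,
        "median": statistics.median(log_ratios),   # expected near 0 for self to self
        "stdev": statistics.pstdev(log_ratios),    # smaller means a tighter histogram
    }


def histogram(values, bin_width=0.5):
    """Count values per bin; keys are the left edges of the bins."""
    counts = {}
    for v in values:
        left = math.floor(v / bin_width) * bin_width
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical intensities: every spot in the green channel reads 1000.
cy3 = [1000.0, 1000.0, 1000.0, 1000.0]
cy5 = [2000.0, 500.0, 4000.0, 1000.0]
spread = log_ratio_spread(cy5, cy3)
bins = histogram(spread["log_ratios"], bin_width=1.0)
print(spread["median"], spread["stdev"], bins)
```

An array with consistent labelling in both channels would give a median close to zero and a small standard deviation, which corresponds to the narrow peak expected for a self to self hybridisation.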

3.3.3. Conclusion (Part 2)

These experiments suggest that the use of an RNA reference instead of genomic DNA for H. pylori microarray experiments is preferable and that the GSPs used for the reverse transcription step were efficient and reliable.

3.4. Part 3: Development of the Indirect Labelling Procedures

3.4.1. Experimental Procedures (Part 3)

A number of experiments (described below) were performed in order to assess the quality of array hybridisations using the direct labelling procedure described in Part 2 and two new indirect labelling procedures. These methods differ in the stage at which cy-dye was incorporated into the representative cDNA population, and each method has a different efficiency of labelling and may incorporate some bias. The first is the direct labelling using Superscript, which involves the incorporation of cy-dUTP nucleotides directly into the first strand cDNA during reverse transcription (illustrated in Fig. 3.3A, Part 2). The second method uses the Klenow enzyme to incorporate cy-dUTP nucleotides into the second strand of the cDNA instead of during the RT step (illustrated in Fig. 3.3B). The third method is an indirect labelling procedure in which a reactive amine derivative of dUTP, 5-(3-aminoallyl)-2’-deoxyuridine 5’-triphosphate (aa-dUTP), is incorporated into the second strand cDNA using Klenow enzyme. The cyanine labels are then incorporated by the linkage of the reactive amine to an N-hydroxysuccinimidyl ester group attached to the cyanine dyes (illustrated in Fig. 3.3C).

3.4.1.1. Investigation into the efficiency of direct labelling using Klenow (Fig. 3.3B)

3.4.1.1.1. Direct labelling using Klenow procedure

cDNA was synthesized from 2 µg of total RNA in a standard reverse transcriptase reaction using Superscript II (-) (Promega) and 1 µl of Panorama™ H. pylori cDNA labelling primers (GSPs) (SigmaGenosys) (see Part 2 Experimental Procedures). The cDNA was purified using a Qia-Quick PCR purification column according to the manufacturer’s instructions (Qiagen) and eluted in 30 µl of elution buffer (provided). The eluate was heat denatured at 99°C for 5 min and then added to: 5 µl of 10X Buffer (400 µg/ml random octamers/0.5 M Tris•HCl/100 mM MgSO4/10 mM DTT); 5 µl dNTP mix (0.5 mM dGTP, dATP, dCTP); 2 µl Cy-dUTP (Amersham); and 2 µl Klenow (New England Biolabs, Beverly, MA), and the reaction was incubated for 1-2 h at 37°C. The Cy5 and Cy3 reactions for each array were combined and purified by adding the labelled mixture and 450 µl of TE, plus 25 µg yeast tRNA, to a Microcon YM 30 (Millipore) column, and were concentrated to a volume of less than 12 µl in the same manner as in section 1.2.3.1. The volume was adjusted to 12 µl with TE and used in a standard hybridisation to an H. pylori microarray.

Two separate experiments were performed to assess the array hybridisation quality resulting from RNA labelled with the Direct labelling method using Klenow enzyme (Fig. 3.3B) and these arrays were compared to the results obtained with the original direct labelling method using Superscript described in Part 2 (Fig. 3.3A).

In the first experiment, two self to self array hybridisations were performed in parallel. For the first array, the samples in both channels (Cy5 and Cy3) were labelled using the direct incorporation method with Superscript II (Fig. 3.3A) and were hybridised to array HP6108a. The second array once again consisted of 2 µg of PG-RNA labelled in each of the channels, but this time the samples were labelled using the Klenow direct incorporation method (Fig. 3.3B) and were hybridised to array HP6109a.

A second experiment was performed where different RNA samples were used in the two channels for hybridisation to assess whether the success of the direct Klenow incorporation method remained apparent when the two labelled samples were different (instead of self to self). Two array hybridisations were performed where 2 µg of the following RNA samples, SS2-pH4 and SS2-pH7 (section 3.3.1.2), were labelled with Cy5, while 2 µg of the reference RNA sample SS2-O/N was labelled with Cy3 for each of the arrays, using the Klenow direct incorporation method (Fig. 3.3B). The Cy5 labelled SS2-pH4 sample was hybridised together with half the Cy3 labelled SS2-O/N sample to array HP6115a, while the Cy5 labelled SS2-pH7 sample and the other half of the Cy3 labelled SS2-O/N sample were hybridised to array HP6115b (Table 3.1).

3.4.1.2. Investigation into the efficiency of the indirect labelling procedure (Fig. 3.3C)

3.4.1.2.1. Indirect labelling procedure (New standard protocol)

cDNA was synthesized from 0.5-2 µg of total RNA in a standard reverse transcriptase reaction using Superscript II (-) (Promega) with 1 µl of Panorama™ H. pylori cDNA labelling primers (GSPs) (SigmaGenosys). The cDNA was purified using a Qia-Quick PCR purification column according to the manufacturer’s instructions (Qiagen) and eluted in 40 µl of elution buffer (provided). The eluate was heat denatured at 99°C for 5 min and then added to: 5 µl of 10X Buffer (400 µg/ml random octamers/0.5 M Tris•HCl/100 mM MgSO4/10 mM DTT); 5 µl dNTP/dUTP mix {0.5 mM dGTP, dATP, dCTP, 0.2 mM aminoallyl dUTP/0.3 mM dTTP (Gibco)}; and 2 µl Klenow exo- (10 U/µl) (New England Biolabs, Beverly, MA), and the reaction was incubated for 16 h at 37°C. Free amines were removed by adding 450 µl of double distilled H2O plus the reaction (50 µl) to a Microcon YM 30 (Millipore) column. The samples were concentrated by centrifugation for 8 min at 11,750 x g in a microcentrifuge. The eluate was discarded and the wash step repeated 2 times. The samples were collected after the final wash and dried in a Speed Vac (Savant). The probe was resuspended in 4.5 µl H2O and labelled by the addition of 4.5 µl of 0.1 M sodium bicarbonate, pH 9.0, containing 1/16 of one reaction vial of FluoroLink™ Cy5 or Cy3 monofunctional dye (Amersham), and was incubated for 1 h at room temperature in the dark. The reaction was quenched by addition of 4.5 µl of 4 M hydroxylamine and incubated for 15 min at room temperature in the dark. The Cy3 and Cy5 reactions were combined and unincorporated dye removed using a Qia-Quick PCR purification column according to the manufacturer’s instructions (Qiagen). The eluate from the columns was dried in a Speed Vac and resuspended in 11 µl TE, and 25 µg yeast tRNA was added. The 12 µl reaction was used in a standard hybridisation to an H. pylori microarray.
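As a worked check on the Klenow reaction mix above, the final concentration of each component can be estimated from the stock concentration and the volume added. The sketch below assumes that the concentrations given in brackets are those of the stock solutions and that the total reaction volume is the 50 µl stated in the procedure; the function and variable names are hypothetical.

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Dilution: final = stock * (volume added / total volume)."""
    return stock * volume_added_ul / total_volume_ul


TOTAL_UL = 50.0  # reaction volume stated in the procedure

# 5 µl of 10X buffer; stock values taken from the protocol text.
buffer_stock = {
    "random octamers (ug/ml)": 400.0,
    "Tris-HCl (mM)": 500.0,   # 0.5 M
    "MgSO4 (mM)": 100.0,
    "DTT (mM)": 10.0,
}
# 5 µl of nucleotide mix; ASSUMPTION: the bracketed values are stock concentrations.
nucleotide_stock = {
    "dGTP (mM)": 0.5,
    "dATP (mM)": 0.5,
    "dCTP (mM)": 0.5,
    "aminoallyl dUTP (mM)": 0.2,
    "dTTP (mM)": 0.3,
}

final = {}
for name, stock in {**buffer_stock, **nucleotide_stock}.items():
    final[name] = final_concentration(stock, 5.0, TOTAL_UL)

klenow_units = 2.0 * 10.0  # 2 µl at 10 U/µl

for name, value in final.items():
    print(f"{name}: {value:g}")
print(f"Klenow exo-: {klenow_units:g} U per reaction")
```

Under these assumptions the buffer is at 1X in the reaction, and aminoallyl dUTP and dTTP are present at a 2:3 ratio.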

Further development of the RNA labelling procedure was conducted by testing whether the indirect labelling method illustrated in Fig. 3.3C resulted in better array hybridisation quality than the two previously described direct labelling procedures. Thus, two separate experiments were performed for this assessment as described below.

An initial test of the method was done by performing a self to self experiment where 2 µg of the PG-RNA sample was labelled in both the Cy5 and Cy3 channels using the indirect labelling method described in Chapter 2, and hybridised to array HP6110a.

Secondly, a dye swap experiment was performed using the indirect labelling method (Chapter 2) where 2 µg of the RNA sample SS2-pH7 was labelled with Cy5, while 2 µg of the reference sample SS2-O/N was labelled with Cy3 and hybridised to array HP6116a. At the same time the labelling was reversed so that the SS2-O/N sample was labelled with Cy5 and the SS2-pH7 sample was labelled with Cy3 and these were hybridised to HP6116b (Table 3.1).
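The thesis does not describe here how the two arrays of the dye swap were analysed. One common approach, shown below as a hedged sketch with hypothetical function names and intensities, is to express both arrays as sample over reference and average the two log ratios, so that any dye-specific bias cancels out.

```python
import math


def dye_swap_log_ratio(fwd_cy5, fwd_cy3, rev_cy5, rev_cy3):
    """Combine one spot from a dye swap pair.

    Forward array: sample in Cy5, reference in Cy3.
    Reverse array: reference in Cy5, sample in Cy3.
    Returns (combined log2 sample/reference, estimated dye bias).
    """
    forward = math.log2(fwd_cy5 / fwd_cy3)   # sample / reference
    reverse = math.log2(rev_cy3 / rev_cy5)   # sample / reference after the swap
    combined = (forward + reverse) / 2       # dye bias cancels in the average
    dye_bias = (forward - reverse) / 2       # what is left is attributed to the dye
    return combined, dye_bias


# Hypothetical intensities for a single spot on the two arrays.
combined, dye_bias = dye_swap_log_ratio(4000.0, 1000.0, 500.0, 1000.0)
print(combined, dye_bias)
```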

3.4.2. Results and Discussion (Part 3)

3.4.2.1. Improvement of the Direct incorporation of Cy-dUTP using Klenow instead of Reverse Transcriptase

The use of the Klenow enzyme to incorporate Cy-labelled dUTP into the second strand of the cDNA after a standard reverse transcription was investigated. The two arrays, HP6115a and HP6115b, were hybridised with RNA samples (see Table 3.1 for details) labelled using the Klenow direct incorporation method (Part 3 of this Chapter) in both channels. Fig. 3.4A shows that the median signal-to-noise ratios for both channels in the arrays HP6115a and HP6115b were higher than in the arrays where direct labelling using reverse transcriptase was employed. In addition, in the arrays labelled with the direct reverse transcription incorporation method many “comets” were observed (where the signal from the spot leaves a trail spreading away from the spot, see Fig. 3.4B), which can increase the background signal significantly, whereas with the Klenow incorporation method these “comets” were significantly reduced (Fig. 3.4C). It is unclear whether this observation is in fact related to the labelling method or to the batch of arrays used. The histograms showing the results from the self to self arrays labelled by direct incorporation during reverse transcription (HP6108a) compared to the Klenow direct method (HP6109a) are shown in Fig. 3.7. It can be seen in this figure that the histogram for the Klenow labelled array has much less spread than that of the direct labelled array, suggesting that bias was introduced in the reverse transcription incorporation, but not in the Klenow incorporation of the dUTP moieties. In addition, the correlation coefficient between the values for the two channels was very high (0.981) for HP6109a, as was the R2 (0.963) (Fig. 3.6C). Therefore, both the sensitivity and accuracy of the labelling are improved by using the Klenow enzyme to incorporate cyanine labelled dUTP moieties compared with using the reverse transcriptase. This result also further supports the consistency of the priming achieved with GSPs and indicates that the use of a standard reverse transcription reaction prior to incorporation of label improves the final result.

3.4.2.2. Development of the Indirect labelling procedure

An indirect labelling procedure was also investigated because, although the Klenow direct labelling procedure improved the hybridisation results, it did not substantially improve the sensitivity of labelling (measured by the median signal-to-noise ratio for the arrays) and it still required the expensive cyanine coupled dUTP moieties. The indirect labelling procedure involved the incorporation of an amino-allyl modified nucleotide into the cDNA, to which the Cy-dye was coupled using a reactive succinimidyl ester. These reagents are significantly cheaper than the cy-dUTP moieties. The labelling of DNA using a succinimidyl ester of Cy3 was first described by Randolph et al. (280) and has been used in a variety of techniques such as DNA sequencing (36). The indirect labelling procedure results in amplification of the signal as more cyanine molecules are incorporated into each cDNA species and thus less RNA is required for each microarray, and, like the Klenow direct method, it avoids enzymatic incorporation bias (72). It should be noted that since the aa-dUTP moiety can be incorporated by reverse transcriptase, it could have been incorporated into the first strand cDNA during reverse transcription; however, this method was not used as a large amount of total RNA (25-50 µg) is required (242).

Figure 3.7: Histograms showing the differing spread of distribution of the log2 (R/G) ratios for the arrays indicated in the legend on the right side. All of these arrays were hybridised with the same RNA sample labelled in both the Cy5 and Cy3 channels (“self to self” experiments), using the 3 different labelling methods (Direct, Klenow, or Indirect) (see Table 3.1 for details). [x-axis: Log2 (R/G); y-axis: Frequency; legend: HP6102b Direct, HP6108a Direct, HP6109a Klenow, HP6110a Indirect]

Figure 3.8: A) Median signal-to-noise ratio for arrays hybridised with RNA samples (indicated on the x-axis) labelled with the Indirect labelling method. The 635 nm values refer to the wavelength of the fluorescent signal emitted from the Cy5 labelled sample, while the 532 nm values refer to the wavelength of fluorescent signal emitted from the Cy3 labelled sample. B) Laser scans of a representative block (block 13) of each of the arrays indicated which were hybridised with RNA labelled using the Indirect method are shown. The blue arrows indicate spots that have “bled” into one another. [Arrays shown in both panels: HP7004b, HP7010b, HP7015a, HP7017a and HP7061a, all labelled with the Indirect method; panel A y-axis: median signal-to-noise ratio; series: 635 nm and 532 nm]

The results for two of the first arrays where an RNA sample was labelled in both channels with the new indirect labelling procedure are shown in Fig. 3.4A (HP6116a and HP6116b). This figure shows that the median signal-to-noise ratio for these two arrays is comparable to the results obtained for the Klenow direct method and is better than that of the direct reverse transcription incorporation method. The indirect procedure developed was a more involved method than both of the direct incorporation techniques and thus was more difficult to perfect. The more experiments performed using this method, the more reliable the results became, as can be seen in Fig. 3.8A, which shows the signal-to-noise ratios for some of the arrays from the time course experiments described in Chapters 4 and 5. These ratios are significantly higher than for all of the arrays in Fig. 3.4A, indicating the high quality of hybridisation achieved with the indirect labelling procedure. One reason that the time course arrays produced a better signal-to-noise ratio than the initial arrays using the indirect labelling method is that these hybridisations were performed on arrays from a different print run (HP7000’s), which had improved spot size compared to the earlier arrays (HP6000’s). Further evidence for the quality of these hybridisations can be seen in Fig. 3.7, where the self to self hybridisation performed with the indirect method (HP6110a) shows a very small spread of the log values, similar to the Klenow direct array, and the correlation coefficient (0.990) and R2 (0.98) values for this array (Fig. 3.6D) are also very high. Therefore the indirect labelling procedure increases the fluorescence intensity significantly, decreases background, as shown by the lack of comets in these arrays (Fig. 3.8B), and removes bias in labelling.
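The parameters used to estimate array quality are defined in section 3.2.5, which is not reproduced here. Purely as an illustration, the sketch below assumes one widely used per-spot definition of signal-to-noise (foreground minus local background, divided by the standard deviation of the background) and takes the median over all spots of a channel. The field names and values are hypothetical.

```python
import statistics


def median_snr(spots):
    """Median signal-to-noise ratio over the spots of one channel.

    ASSUMPTION: per-spot SNR = (foreground - background) / background SD.
    Spots with a zero background SD are skipped to avoid division by zero.
    """
    ratios = [
        (s["fg"] - s["bg"]) / s["bg_sd"]
        for s in spots
        if s["bg_sd"] > 0
    ]
    return statistics.median(ratios)


# Hypothetical spot measurements for one channel of one array.
channel_635 = [
    {"fg": 1200.0, "bg": 200.0, "bg_sd": 100.0},
    {"fg": 700.0, "bg": 200.0, "bg_sd": 100.0},
    {"fg": 2300.0, "bg": 300.0, "bg_sd": 100.0},
]
snr_635 = median_snr(channel_635)
print(snr_635)
```

A value computed in this way could then be compared against the level of 5 that is used earlier in this chapter to separate good from poor signal quality.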

A hierarchical cluster of the log ratios from the 4 self to self experiments which have been discussed for the comparison of the 3 labelling methods is shown in Fig. 3.9. The Cluster program was used to cluster both the arrays and the genes in order of similarity. Interestingly, the two arrays labelled with the direct incorporation during reverse transcription (HP6102b and HP6108a) cluster together, while the arrays for the Klenow direct method and the indirect method also cluster together (HP6109a and HP6110a). This suggests that the incorporation bias of reverse transcriptase for incorporating the cy-dUTP moieties is driving this cluster and thus causes more differences in the results than the use of direct versus indirect labelling per se. In addition, most of the genes would be expected to have values close to zero (because they are all self to self experiments), which is indicated in this cluster as a black colour, while those genes where the log values are greater (red) or smaller (green) than zero are coloured. It can be seen that there are far more genes represented by red and green colours in the two arrays labelled by direct incorporation during reverse transcription than in the other two arrays, which are mostly black.

Figure 3.9: A hierarchical cluster of the log2 (R/G) values from the “self to self” arrays indicated at the top of the figure which were hybridised with the samples labelled with the three different methods (Direct, Klenow, or Indirect). The dendrogram at the top of the figure indicates the relationship between each of the arrays, where the length of the arms is proportional to the extent of similarity between the data from each array. The scale indicates the colour scale of the log2 (R/G) ratios used. [Scale: >2.8 fold repression, 0, >2.8 fold induction]
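The grouping of arrays described above can be illustrated with a small agglomerative clustering sketch. This is not the algorithm of the Cluster program used in the thesis; it is a minimal average-linkage example that uses 1 minus the Pearson correlation as the distance, and the array labels and log ratio values are invented for the illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cluster_arrays(profiles):
    """Average-linkage clustering of arrays; returns the merges in order."""
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def linkage(c1, c2):
        # mean pairwise distance between the members of two clusters
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Invented log2(R/G) values for five genes on four self to self arrays.
profiles = {
    "direct_1": [1.2, -0.9, 0.8, -1.1, 0.3],
    "direct_2": [1.0, -1.0, 0.6, -0.8, 0.4],
    "klenow": [0.10, 0.05, -0.10, 0.00, -0.05],
    "indirect": [0.08, 0.06, -0.12, 0.02, -0.04],
}
merges = cluster_arrays(profiles)
for left, right, distance in merges:
    print(left, right, round(distance, 3))
```

With these invented values the two direct arrays join first and the Klenow and indirect arrays join next, mirroring the pattern reported for Fig. 3.9.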

3.4.3. Conclusion (Part 3)

The indirect labelling protocol developed in this section was more efficient and reliable than the direct labelling protocol and the protocol using Klenow to incorporate cy-dUTP.

3.5. Part 4: Fine Tuning the Indirect Labelling Procedure

3.5.1. Experimental Procedures (Part 4)

3.5.1.1. Titration of RNA, GSPs, and cDNA for indirect labelling procedure

Due to the success of the indirect labelling procedure, the amounts of RNA, GSPs and cDNA required for good quality array hybridisation using this method were assessed. The amount of total RNA required for high quality hybridisations is of concern as in many situations the quantity of RNA that can be obtained is limited by the experimental conditions. For example, in the early time points of the time course experiments described in Chapters 4 and 5, only 1-4 µg could be extracted from the harvested cells. Thus in the following experiments the hybridisation quality arising from the labelling of 0.5 µg, 1 µg and 2 µg RNA per channel was assessed. The amount of GSPs used in each reverse transcription reaction was also assessed to optimise the relationship between primer amount and RNA amount in the RT reaction. In addition, the amount of cDNA required in each channel for sufficient hybridisation quality (see section 3.2.5 for the parameters used to estimate array hybridisation quality) was investigated by splitting the cDNA present in each sample, after Klenow incorporation of aa-dUTP nucleotides into the second strand, into two parts for labelling in both channels. If successful, this would mean that a separate reference RNA sample would not be needed for experiments comparing multiple RNA samples, as the cDNA produced from each of the sample RNAs could be split in half. One half would be labelled with Cy5, and the other halves of all the sample cDNAs could be pooled and labelled with Cy3 to serve as the reference in each of the arrays. Two separate experiments exploring these possibilities are described below.

3.5.1.2. Investigation of the results generated from array experiments in which the cDNA sample was split for the sample and reference

The possibility of splitting one cDNA sample into two parts for labelling by the indirect method in both channels was investigated in array HP7052b, where the cDNA sample from a standard RT reaction using 2 µg was split into two equal parts. Each of the parts was adjusted to the regular volume (20 µl) with double distilled H2O and the first part labelled with Cy5, while the other part was labelled with Cy3 in the regular manner (Part 3 Experimental Procedures).

3.5.1.3. Effect of splitting cDNA on the amount of total RNA required for labelling

Further investigation into the effects of RNA amount and the splitting of a cDNA reaction for labelling, following the incorporation of aa-dUTP, was conducted. This experiment used the RNA sample from the T39 h time point of a time course (TC) in varying amounts (2 µg, 1 µg or 0.5 µg) and the GSPs in two volumes (1 or 0.5 µl) (Table 3.1). After the incorporation of aa-dUTP into the second strand of the cDNA by Klenow, each sample was split in two and half labelled with Cy5, and the other half with Cy3, in the regular manner. The two labelled halves of each sample were then hybridised to the same array: HP7053a, HP7053b, HP7054a, and HP7054b (Table 3.1).

3.5.2. Results and Discussion (Part 4)

The improved efficiency provided by the indirect labelling method suggested that the use of less starting total RNA and/or less cDNA might be feasible. To investigate this possibility, an experiment was performed where a standard reverse transcription reaction was done with the regular amount of RNA (2 µg) and then the single stranded cDNA product was split in half for labelling with both Cy3 and Cy5, and these reactions were hybridised together on array HP7052b (described in section 3.5.1.2). The array quality was very good, as can be discerned from the signal-to-noise ratio in Fig. 3.10, the linear regression and correlation coefficient in Fig. 3.11, and the histogram in Fig. 3.12. Therefore the array quality is not compromised by using half the amount of cDNA for labelling in each channel.

Further titration of the amount of starting RNA was done using 2 µg, 1 µg or 0.5 µg in a standard reverse transcription reaction and then splitting the ss-cDNA for labelling in both channels. The results for these arrays (HP7053a, HP7053b, HP7054a) are also shown in Fig. 3.10-3.12. All of these arrays were of excellent quality and thus it was concluded that as little as 0.5 µg of total RNA could be used in a labelling reaction where only half of the resulting cDNA need be labelled in each channel for the hybridisation. This is very useful when varying amounts of RNA can be extracted from samples, for example in time course experiments where the earlier time points have fewer bacteria and thus produce a small amount of RNA. A mixture of the cDNAs resulting from the individual reverse transcription reactions can be used as a reference instead of using a mixture of the RNA samples as a reference, which requires twice the number of reverse transcription reactions and twice the amount of RNA sample.
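The saving from the split cDNA reference can be made concrete with a small calculation. The sketch below is an illustration only: it assumes that, in the RNA reference design, every sample contributes the same amount of RNA to the reference as is used for its own labelling reaction, which is how the "twice the number of reactions and twice the amount of RNA" statement is read here. The function name and the example numbers are hypothetical.

```python
def reference_design_cost(n_samples, rna_per_rt_ug):
    """Compare RT reactions and RNA needed per sample for two reference designs."""
    return {
        # reference made by pooling RNA: one extra RT reaction per sample
        "rna_reference": {
            "rt_reactions": 2 * n_samples,
            "rna_per_sample_ug": 2 * rna_per_rt_ug,
        },
        # reference made by pooling half of each sample's cDNA
        "split_cdna_reference": {
            "rt_reactions": n_samples,
            "rna_per_sample_ug": rna_per_rt_ug,
        },
    }


# Hypothetical example: nine time points with 0.5 µg of RNA per RT reaction.
cost = reference_design_cost(n_samples=9, rna_per_rt_ug=0.5)
print(cost)
```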

3.5.3. Conclusion (Part 4)

These experiments show that as little as 0.5 µg of total RNA labelled with the indirect protocol is required for good hybridisation quality using the H. pylori array. It is likely that even less total RNA may be used, considering that half the cDNA produced from an RT reaction of 0.5 µg RNA can produce a reliable signal on the array.

3.6. General Conclusions

In conclusion, the indirect labelling procedure developed in this chapter is highly sensitive and produces high quality array hybridisation results, with little incorporation bias. This work has contributed to the development of a complete procedure for transcriptional profiling of H. pylori using DNA microarrays. Other researchers in the Falkow lab have now extended this procedure for transcriptional profiling of other bacterial species including Campylobacter jejuni, Salmonella typhimurium, and Streptococcus pneumoniae.

Figure 3.10: Median signal-to-noise ratio for arrays hybridised with RNA samples where the amount of RNA used was titrated (2 µg, 1 µg, or 0.5 µg) and labelled with the Indirect protocol. In these arrays the cDNA synthesised in the reverse transcriptase reaction was split in half and each half was labelled in Cy3 or Cy5. The 635 nm values refer to the wavelength of the fluorescent signal emitted from the Cy5 labelled sample, while the 532 nm values refer to the wavelength of fluorescent signal emitted from the Cy3 labelled sample. [y-axis: median signal-to-noise ratio; x-axis: HP7052b (2 µg PG-RNA), HP7053a (2 µg TP-RNA), HP7053b (1 µg TP-RNA), HP7054a (0.5 µg TP-RNA); series: 635 nm and 532 nm]

Figure 3.11: A-D) Scattergrams representing the relationship between the normalised intensity values from the 635 nm channel (Cy5) on the y-axis and the 532 nm channel (Cy3) on the x-axis of four arrays from the RNA titration experiments (see Figure 3.10) using the indirect labelling protocol. The R2 value and the equation of the linear regression are shown on each plot and the correlation coefficient between the values for the two channels is reported under each scattergram.
A) HP7052b, Indirect: y = 0.9975x - 36.827; R2 = 0.9804; correlation coefficient = 0.990
B) HP7053a, Indirect: y = 1.0664x - 166.06; R2 = 0.9916; correlation coefficient = 0.996
C) HP7053b, Indirect: y = 1.0071x - 13.405; R2 = 0.9876; correlation coefficient = 0.994
D) HP7054a, Indirect: y = 1.0668x - 193.74; R2 = 0.9915; correlation coefficient = 0.996

Figure 3.12: Histograms showing the spread of normalised log2 (R/G) ratios for the arrays indicated in the legend on the right side. These arrays were hybridised with differing amounts of the same RNA sample (2 µg, 1 µg or 0.5 µg) labelled with the indirect protocol. In these arrays the cDNA synthesised in the reverse transcriptase reaction was split in half and each was labelled in Cy3 or Cy5 (see Table 3.1 for details). [x-axis: Log2 (R/G); y-axis: Frequency; legend: HP7052b 2 µg, HP7053a 2 µg, HP7053b 1 µg, HP7054a 0.5 µg]

Chapter 4

TRANSCRIPTIONAL PROFILING OF H. PYLORI GROWTH IN BROTH

4.1. Background

H. pylori inhabits the harsh environmental niche of the stomach. The ability of H. pylori to live in this acidic environment makes its physiology unique, and much research has focused on understanding the factors that enable this bacterium to survive in the stomach. A number of factors known to be involved in virulence, such as the cag pathogenicity island (cag PAI), motility and the urease enzyme, have been extensively studied and significant advances regarding the regulation of expression of these factors have been made (1, 143). However, little is known about the global mechanisms of gene expression regulation in H. pylori and how this expression is modified to cope with changes in the environment or to facilitate chronic infection within the stomach.

Transcriptional regulation in H. pylori is unique compared to other pathogens, as it possesses relatively few genes encoding transcriptional regulators. This may be due, in part, to the relatively small size of the H. pylori genome, which has only ~1500 predicted open reading frames (ORFs), compared to the ~1740 ORFs predicted for Haemophilus influenzae and ~4290 ORFs in Escherichia coli. Only four genes in H. pylori code for proteins with helix-turn-helix motifs compared to 34 such proteins in H. influenzae and 148 in E. coli (361). In addition only 1/3 of the number of two-component regulatory systems of E. coli are present in H. pylori (361). This apparent lack of regulation may reflect the fact that H. pylori is exposed to few different environments, the stomachs of humans and primates being the only known reservoirs of the bacterium (190). There is, however, evidence that H. pylori uses other mechanisms of regulation. These include slipped-strand mispairing within genes (140) and in putative promoter regions (6), and DNA methylation (190, 361). To date, little is known regarding post-transcriptional or translational control in H. pylori, but evidence from 2D-gel electrophoresis analysis suggests that these exist (144). Finally, the H. pylori genome does not have extensive operon structure. For example, the flagellar regulon is not contained in operons in this organism, which further confounds the apparent lack of regulation (361).

Whole genome expression profiling can be performed using DNA microarrays. This technique has recently been used to profile the global gene expression of numerous model microbial organisms, such as E. coli (286), Caulobacter crescentus (170), Bacillus subtilis (105) and Streptomyces coelicolor (133). However, with the exception of two studies in Streptococcus pneumoniae investigating competence development (288) and the response to an autoinducer peptide (73), few comprehensive expression profiling experiments of pathogenic micro-organisms have been performed. Those studies that have been conducted have concentrated on bimodal gene expression such as iron limitation in Pasteurella multocida (261) and low oxygen tension in Mycobacterium tuberculosis (321). A number of studies have used microarrays to investigate H. pylori transcriptional responses, mainly investigating acid resistance mechanisms (details in Chapter 1). No extensive time-course gene expression profiling of fastidious pathogenic organisms has been previously reported.

The aims of the current study were to perform comprehensive gene expression profiling of H. pylori grown in vitro over a given time period and to investigate the relationship between the biological measure of motility and the gene expression of the flagellar regulon. Motility was selected for this purpose as it represents an easily measured function to which gene expression can be linked. The methods developed in Chapter 3 were used for these expression profiling experiments.

4.2. Experimental Procedures

4.2.1. Time Courses

Two individual time course (TC) experiments of H. pylori growth in broth culture were performed. These were conducted on different days using individual cultures for each time point. Plate grown (HBA) H. pylori were used to inoculate Brucella Broth liquid media, supplemented with 10% (v/v) fetal calf serum (GIBCO-Invitrogen) (BBF). These were then incubated in microaerobic conditions with shaking at 37°C for 24 h (starter culture). For the first TC, BBF media was inoculated with the starter culture to an optical density at 600 nm (OD600) of 0.05 and 5 ml aliquots were distributed into 9 x 50 ml tubes, one tube for each time point tested (6, 12, 18, 24, 30, 36, 42, 48, and 60 h). For the second TC, 20 ml aliquots were distributed into 8 x 125 ml conical flasks, one for each time point tested (6, 12, 18, 22, 28, 35, 42, and 50 h). RNA was extracted from the remaining time 0 h (T0 h) culture (as described below). For each time point an aliquot was removed from the sample for: OD600 measurement; colony forming unit (CFU) counts; and microscopic visualisation of the culture for assessment of motility and morphology. The remaining culture was passed through a 0.45 µm cellulose acetate filter by vacuum (Millipore) to remove the bacteria from the media. The filter was immediately placed into a 50 ml conical tube, frozen in liquid N2, and stored at -80°C. Total RNA was then extracted from these filters using the method described in Chapter 2.

4.2.2. Preparation and Hybridisation of cDNA probes

Each time point sample was labelled and hybridised on separate H. pylori microarrays (described in Chapter 2) together with reference RNA. The reference for both experiments was the T0 h RNA. The labelling procedure used was the indirect labelling protocol that is described in Chapter 2.

4.2.3. Data Analysis

Data were collated using the Stanford University Microarray Database (SMD) (320). Spots were excluded from analysis for the following reasons: obvious spot abnormalities; low signal (if the sum of the median intensities for the two channels was ≤ 500); or uneven distribution of pixel intensities in the spot (the standard deviation of pixel intensity ratios > 3.5). The data obtained for the net pixel intensity in each channel of each microarray were normalised using the default-computed normalisation from SMD (described in Chapter 2). The ratio of the Red (time point sample)/Green (reference) channels for each spot was expressed as log2 (R/G). The data within each time course were normalised by mathematical transformation such that the abundance of each gene’s transcript represented by a given spot was relative to the level of that transcript at the time point at the end of the lag phase (the 6 h time point). Only spots which contained data for > 80% of the arrays were used, and duplicate spots for each ORF on the microarray were averaged for analysis. There were 1590 values representing unique genes used for analysis (the complete data set is available in the Supplementary Material, Table S4.1).
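The transformation steps described in this section can be illustrated with a minimal Python sketch. It is not the SMD implementation: the function names, the use of None for excluded spots, and the handling of a missing 6 h value are assumptions made only for illustration.

```python
import math

MIN_FRACTION_PRESENT = 0.8  # spots need data on more than 80% of the arrays


def log2_ratio(red, green):
    """Return log2(R/G) for one spot, or None if a channel is missing or non-positive."""
    if red is None or green is None or red <= 0 or green <= 0:
        return None
    return math.log2(red / green)


def relative_to_time_point(profile, index):
    """Re-express a log2 profile relative to the array at `index` (e.g. the 6 h array).

    Subtracting in log space is equivalent to dividing the ratios.
    Assumption: if the baseline value is missing, the whole profile is dropped.
    """
    baseline = profile[index]
    if baseline is None:
        return [None] * len(profile)
    return [None if value is None else value - baseline for value in profile]


def passes_presence_filter(profile, min_fraction=MIN_FRACTION_PRESENT):
    """True if the spot has data on more than `min_fraction` of the arrays."""
    present = sum(value is not None for value in profile)
    return present / len(profile) > min_fraction


def average_duplicates(profiles):
    """Average duplicate-spot profiles array by array, ignoring missing values."""
    averaged = []
    for values in zip(*profiles):
        present = [value for value in values if value is not None]
        averaged.append(sum(present) / len(present) if present else None)
    return averaged
```

For example, a spot with R/G intensities of 800/400, 1600/400 and 400/400 on three consecutive arrays has log2 ratios of 1, 2 and 0; re-expressed relative to the first array these become 0, 1 and -1.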

4.2.4. Visualisation and Statistical Analysis of the Data

The log transformed data were analysed with the CLUSTER program by performing self-organizing map (SOM) analysis and the results displayed using the TREEVIEW program (94) (see details of programs in Chapter 2). Genes whose expression level varied by ≥ 2 fold over the course of both time courses (TCs) were extracted for visualisation. Genes whose average net intensity (above background) across each entire TC was ≥ 500 and whose expression level changed by less than 1.3 fold in both TCs were deemed constitutively expressed. The statistical significance of the major changes observed using the clustering analysis was assessed using the SIGNIFICANCE ANALYSIS OF MICROARRAYS (SAM) program (described in Chapter 2). In brief, SAM performs iterative t-tests between the data for two groups of arrays (assigned by the user), and reports genes whose levels are significantly different between them. For these analyses, the missing data points were first estimated using a K-Nearest-Neighbour imputation, using 10 neighbours (364). Two sets of unpaired two-class SAM analyses were performed on the imputed data, where two time points prior to the major changes in gene expression in each time course were assigned to the first group, while two time points after the switch were assigned to the second group. In each case the data for the first TC were analysed separately from the second TC and only the genes found to be significant in both data sets are reported. For the first analyses, to assess changes between mid-log and stationary phases, the time points used were: first TC, mid-log = T18 & T24 h and stationary phase = T42 & T60 h; second TC, mid-log = T12 & T18 h and stationary phase = T42 & T50 h.
For the second analyses, to assess dramatic changes during the transition from log to stationary phases, the time points used were: first TC, mid-late log = T30 & T36 h and early stationary phase = T42 & T60 h; second TC, mid-late log = T18 & T22 h and early stationary phase = T28 & T35 h. The SAM program calculated a list of genes whose transcript levels were significantly increased or decreased between the two groups and produced a false discovery rate (FDR), which is an estimate of the percentage of false positives called. In both analyses a calculated FDR of < 1% was used to assign significance and a two fold cutoff in the change in expression level was imposed. The relative level of significance calculated by the program is also reported (Score; values are correlated with significance) (78, 364).
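The two-class comparison described above can be sketched in Python as follows. This is only an illustration of the kind of statistic SAM reports, not the SAM program itself: it computes the relative difference (difference in group means divided by a pooled standard deviation plus a small constant s0) and the fold change for log2 data, then keeps genes that change at least two fold in the same direction in both time courses. The permutation-based FDR estimate is omitted, and the value of s0, the function names and the gene labels are assumptions.

```python
import math


def mean(values):
    return sum(values) / len(values)


def relative_difference(before, after, s0=0.1):
    """SAM-style relative difference between two groups of log2 values.

    s0 is a small positive constant that stabilises the statistic for genes
    with very low variance; 0.1 here is an arbitrary illustrative choice.
    """
    n1, n2 = len(before), len(after)
    m1, m2 = mean(before), mean(after)
    sum_sq = sum((x - m1) ** 2 for x in before) + sum((x - m2) ** 2 for x in after)
    pooled_sd = math.sqrt((1 / n1 + 1 / n2) * sum_sq / (n1 + n2 - 2))
    return (m2 - m1) / (pooled_sd + s0)


def fold_change(before, after):
    """Unsigned fold change between group means of log2 data."""
    return 2 ** abs(mean(after) - mean(before))


def changed_in_both(tc1, tc2, min_fold=2.0):
    """Genes changing >= min_fold in the same direction in both time courses.

    tc1 and tc2 map a gene label to (values before switch, values after switch).
    """
    hits = []
    for gene in sorted(set(tc1) & set(tc2)):
        d1 = relative_difference(*tc1[gene])
        d2 = relative_difference(*tc2[gene])
        same_direction = d1 * d2 > 0
        large_enough = all(fold_change(*tc[gene]) >= min_fold for tc in (tc1, tc2))
        if same_direction and large_enough:
            hits.append((gene, round(d1, 1), round(d2, 1)))
    return hits
```

Requiring agreement between the two independently analysed time courses mirrors the reporting rule used in this chapter; in practice the significance threshold would come from SAM's FDR estimate rather than from the fold change alone.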

The expression patterns of genes of interest were plotted over time using the Microsoft EXCEL program. Patterns are representative of both TCs.
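The fold-variation filter and the constitutive-expression criterion described in this section can be sketched as follows. The thesis does not state exactly how the fold variation over a time course was computed, so this sketch assumes it is the range (maximum minus minimum) of the log2 profile; the function names and the category labels are illustrative only.

```python
def fold_variation(profile):
    """Fold variation across a log2 profile, taken as 2**(max - min).

    Missing values (None) are ignored; the profile must hold at least one value.
    """
    present = [value for value in profile if value is not None]
    return 2 ** (max(present) - min(present))


def classify_gene(profiles, mean_intensities,
                  variable_fold=2.0, constitutive_fold=1.3, min_intensity=500):
    """Classify one gene from its log2 profiles in each time course.

    profiles: one log2 profile per time course.
    mean_intensities: average net intensity of the gene in each time course.
    """
    folds = [fold_variation(profile) for profile in profiles]
    if all(fold >= variable_fold for fold in folds):
        return "variable"
    expressed = all(intensity >= min_intensity for intensity in mean_intensities)
    if expressed and all(fold < constitutive_fold for fold in folds):
        return "constitutive"
    return "other"
```

A gene is labelled "variable" only when the threshold is met in both time courses, which matches the requirement that changes be reproducible between experiments.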

4.2.5. RNase Protection Assays (RPAs)

RPAs were conducted as previously described (207) in order to validate the ability of the H. pylori microarray to detect significant changes in gene expression. For each gene, 1 µg of total RNA from the 18, 22, 42 and 50 h time points from the second TC was hybridised to its respective antisense riboprobe. Riboprobe templates were generated using the primer pairs listed in Table 4.1 and these produced the following sized templates: 302 bp for flaA, 330 bp for pfr, 293 bp for fecA, 359 bp for frpB, and 261 bp for amiE. In each case the templates were generated by PCR using Taq polymerase and the amplification products were ligated to pGemT (Promega). Proper orientation was confirmed, and riboprobes were synthesised using the Maxiscript kit (Ambion), the appropriate RNA polymerase and 50 µCi of [32P]UTP (NEN, Boston, MA), as previously described (207). The products of RNase protection were separated on 5% denaturing polyacrylamide gels and exposed to phosphor-screens (Kodak, Rochester, NY). Quantification and peak analysis of bands was conducted using a PhosphorImager and the ImageQuant program (Molecular Dynamics, Sunnyvale, CA).

4.2.6. Motility Measurements

A third TC of H. pylori growth in batch culture was performed in order to quantitatively measure changes in motility concurrent with gene expression changes in the flagellar regulon. This TC was performed as described for the first two TCs except that a single 70 ml BBF culture was inoculated with an overnight starter culture of H. pylori to an OD600 of 0.06. Samples were taken (3-20 ml depending on culture density) at 0, 6, 12, 18, 24, 30 and 36 h time points for OD measurement, RNA extraction and motility tracings.

Motility of 1 µl samples was monitored by live phase contrast microscopy using glass slides and coverslips pre-warmed to 37°C. A Hamamatsu C2400 video CCD camera was used to record movement in the field of view via an Argus-20 image processor (using the TRACE function) onto S-VHS video. Movement was traced over a 5 sec period. Two sets of selected video frames for each time point were digitized for the generation of time-lapse films with the NIH ObjectImage program. The percentage of motile bacteria at each time was estimated using these films. In addition, the lengths of 5-10 individual motility traces were measured for each time point and the curvilinear velocity (CLV) of each of these bacteria was calculated in µm/s. The average percent motility and CLV for each time point were plotted over time using Microsoft Excel.
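The motility calculations described above amount to simple arithmetic, sketched below in Python. The sketch assumes that each trace is available as a list of (x, y) positions in µm sampled along the path and that CLV is the traced path length divided by the 5 s tracing period; the function names are illustrative and do not correspond to any function of the imaging software.

```python
import math


def path_length(points):
    """Total length (µm) of a trace given as consecutive (x, y) positions in µm."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


def curvilinear_velocity(points, duration_s=5.0):
    """Curvilinear velocity (µm/s): path length divided by the tracing period."""
    return path_length(points) / duration_s


def percent_motile(motile_count, total_count):
    """Percentage of bacteria in the field of view scored as motile."""
    return 100.0 * motile_count / total_count


def mean(values):
    """Average of the per-bacterium values for one time point."""
    return sum(values) / len(values)
```

For instance, a trace passing through (0, 0), (3, 4) and (6, 8) µm is 10 µm long, which over a 5 s period corresponds to a CLV of 2 µm/s.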

The gene expression profiles from this TC were assessed by microarray as described for the first two TCs. The data for all of the genes involved in flagella structure, biosynthesis, regulation and function were extracted and these data were visualised using the CLUSTER and TREEVIEW programs. The Excel plot containing the motility data was then compared with the transcriptional profile of the flagella regulon for this TC.

Table 4.1: Primers used in this study for RPAs.

Primer        DNA sequence (5’-3’)
amiE-RPA-F    GGTTTGCCTGGGTTGGAT
amiE-RPA-R    GATTTTGCGGTATTTTTG
fecA-RPA-F    AAGCGCCAATCAGAGCAT
fecA-RPA-R    TCACACCGCCAAAAACAT
flaA-RPA-F    CTGACATCGTTCGTTTGA
flaA-RPA-R    AATCCCTGTGCCTGCTGA
frpB-RPA-F    CTAACCCTGATGTGAATG
frpB-RPA-R    ATGCGCGTTTTGATAAGC
pfr-RPA-F     GCGGCTGAAGAATACGAG
pfr-RPA-R     CTGATCAGCCAAATACAA

4.2.7. Supplementary Material

The following material is available in the supplementary material (see Appendix):

Table S4.1: Normalised data set for the first and second TC.
Table S4.2: Full list of the genes from the Induced Set indicated in Fig. 4.1B that vary by at least two fold over time in both TCs.
Table S4.3: Full list of the genes from the Repressed Set indicated in Fig. 4.1B that vary by at least two fold over time in both TCs.


4.3. Results and Discussion

4.3.1. Reliability of Array Data

To assess whether gene expression of H. pylori varied according to the phase of growth and if this corresponded with coordinated gene expression regulation, two independent time course experiments were performed. RNA samples were collected at time points covering the entire growth cycle. The level of RNA transcript at each time point {time point RNA, labelled with Cy5 (red)} was compared to the level of transcript at the 0 h time point {reference, labelled with Cy3 (green)} using an H. pylori DNA microarray (298). The normalised log ratios of the red/green intensities {log2 (R/G)} for each spot were collated and expressed as the level relative to the time point at the end of lag phase, the 6 h time point.

The H. pylori microarray used in this study contained duplicate spots representing each ORF designated in the two sequenced strains 26695 & J99 (298). Duplicate spots provided an internal estimation of array quality and ensured a greater coverage of represented ORFs. Using this approach reliable data for 96% (1590/1660) of the represented ORFs on the array were obtained. A pooled estimate of variance calculation showed that the median variance between the log2 (R/G) values for duplicate spots was 0.02 (0.089, 95th percentile), indicating a high correlation in the values for each duplicate measurement. This result also indicated that the quality of data obtained from the H. pylori arrays varied little across each array and thus the duplicate measurements for each gene were averaged for further analysis.
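A summary of duplicate-spot agreement of this kind can be sketched in Python. The exact calculation used for the pooled estimate is not given in the text, so this sketch makes two assumptions: the variance for each pair of duplicate log2 (R/G) values is the sample variance of the two values, and the 95th percentile is taken by the nearest-rank method. The function names are illustrative.

```python
import math
import statistics


def duplicate_variance(a, b):
    """Sample variance of two duplicate measurements, which reduces to (a - b)**2 / 2."""
    return (a - b) ** 2 / 2


def percentile_nearest_rank(values, percent):
    """Nearest-rank percentile of a non-empty list of values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent / 100 * len(ordered)))
    return ordered[rank - 1]


def summarise_duplicates(pairs):
    """Return (median variance, 95th percentile variance) over duplicate pairs."""
    variances = [duplicate_variance(a, b) for a, b in pairs]
    return statistics.median(variances), percentile_nearest_rank(variances, 95)
```

A small median with a modest 95th percentile, as reported above, indicates that most duplicate pairs agree closely and that averaging them loses little information.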

The gene expression patterns obtained for the two independent TCs were very similar. To ascertain reproducible growth phase dependent changes, the data obtained from each TC were assessed separately and only the genes which showed consistent expression patterns between experiments were reported.

4.3.2. Gene Expression is Temporally Regulated During H. pylori Growth

SOM analysis was used to order genes such that those with similar patterns of expression were grouped together and the resultant order of these groups approximated the time of first induction or repression during the time courses (53). Similar, coordinated gene expression patterns were detected in both TCs. The SOM analysis (Fig. 4.1A) showed that gene expression patterns varied in a time and growth phase dependent manner. Four major expression patterns were observed and are indicated in Fig. 4.1A. In this SOM analysis all genes which passed the filtering criteria (described in the Methods) were included (1590 genes). It is evident that the expression level of many genes did not vary significantly over time (80% of spots varied by < 2 fold). This suggests that a large number of genes were either constitutively expressed or not expressed at all during batch culture. A gene was considered to be expressed only if the net intensity value in the red channel was ≥ 500. Using this criterion it was found that the average number of genes expressed at any one time point in these TCs was ~40% (data not shown). The genes which were expressed and had the least variance in expression over time in both TCs were considered constitutively expressed genes (Table 4.2). This set of 15 genes includes those that are likely to be involved in homeostasis during culture, such as the central intermediary metabolism genes (hypE and ppk) and the transport and binding genes (narK and proWX). Others are involved in the maintenance of cell structure (dgkA and neuB). Interestingly, a number of genes of unknown function were also constitutively expressed, suggesting an important as yet unidentified role for these gene products.

Those genes whose expression level was increased or decreased by at least 2 fold over both TCs are depicted in Fig. 4.1B (325/1590, 20%). Only 40 of these genes changed by ≥ 4 fold over time. The subtle expression patterns observed in the first SOM in Fig. 4.1A were reduced to two prominent patterns of expression in Fig. 4.1B: genes whose level of expression was reduced (Repressed set); and genes whose expression was increased (Induced set) during transition into stationary phase. More genes comprised the Repressed set (64%) than the Induced set (36%). Interestingly, this is a similar result to the time-course analysis of S. coelicolor growth, where 80% of the genes analysed did not change substantially over time and the remaining 20% were equally divided into up- and down-regulated genes (133). The full set of genes which change by at least 2 fold is listed in the Supplementary Material in Tables S4.2 (Induced Set) and S4.3 (Repressed Set).

Figure 4.1: Self organizing maps showing the temporal dependence of gene expression patterns in a time course of H. pylori growth: A) all 1590 genes that passed the filtering criteria and B) genes that were induced or repressed by at least two fold in both time course experiments (325 genes). The major classes of expression patterns are indicated in A and these are reduced to Induced and Repressed genes shown in B. The progression of time in the time course is shown from 6-50 h (blue triangles). The arrow shows the position of the Log-Stat switch. The scale indicates the relative level of expression of each gene, where red indicates induced expression and green repression.
[Figure 4.1. Columns in both panels: 6, 12, 18, 22, 28, 35, 42 and 50 h. Classes marked in panel A: A, induced from mid-log to stationary phase; B, induced in stationary phase; C, repressed from mid-log to stationary phase; D, repressed in stationary phase. Sets marked in panel B: Induced; Repressed. Scale: >2 fold repression, 0, >2 fold induction.]

Table 4.2: Constitutively expressed genes during broth culture.

UNIQID    Symbol     NAME                                                                     Category
HP0178    neuB       spore coat polysaccharide biosynthesis protein E, sialic acid synthase  Cell envelope
HP0289    putative   toxin-like outer membrane protein, putative (VacA paralog)              Cell envelope
HP0047    hypE       hydrogenase expression/formation protein                                 Central intermediary metabolism
HP1010*   ppk        polyphosphate kinase                                                     Central intermediary metabolism
HP0821    uvrC       excinuclease ABC subunit C                                               DNA metabolism
HP0700    dgkA       diacylglycerol kinase                                                    Fatty acid and phospholipid metabolism
HP0313*   narK       nitrite extrusion protein, putative                                      Transport and binding
HP0818*   proWX      osmoprotection binding protein                                           Transport and binding
HP0639    conserved                                                                           Conserved hypothetical
HP0810    conserved                                                                           Conserved hypothetical
HP0952*   conserved  integral membrane protein                                                Conserved hypothetical
HP1066    conserved  putative outer membrane protein                                          Conserved hypothetical
HP1221*   conserved                                                                           Conserved hypothetical
HP0287    unknown                                                                             H. pylori specific
HP0608    unknown                                                                             H. pylori specific

* Genes with the highest expression level in this group (based on an average net intensity in the red channel of >1000).

The Repressed set of genes was composed primarily of ribosomal genes, genes involved in DNA synthesis, transcription and translation, genes encoding transport and binding proteins, and genes involved in energy metabolism. The gene showing the greatest reduction in expression over the time course was that for aliphatic amidase (amiE), which was reduced by ≥ 5 fold in both TCs.

The Induced set of genes was more heterogeneous in nature, but included many of the genes known to be involved in virulence. These include the cytotoxin associated gene (cagA), the neutrophil activating protein (napA), a large number of genes coding for outer membrane proteins (OMPs), some regulatory protein genes, and many of the genes encoding stress related proteins, such as clpB and dnaK. The gene with the highest induction was the non-heme iron-containing ferritin (pfr), which was induced by ≥ 10 fold in both TCs.

4.3.3. A Major Switch in Gene Expression Profiles Occurs During the Late-Log Phase

The gene expression pattern observed in Fig. 4.1B indicated that there was a switch in gene expression during the growth curve (indicated by the arrow), which corresponds with the transition from late log phase to stationary phase. We termed this dramatic shift in gene expression the “Log-Stat switch”. Prior to this point the expression levels of the affected genes change little, while after the switch the levels of many genes begin to increase or decrease dramatically. This switch also directly followed a change from maximum motility and spiral-shape morphology to a decline in these characteristics (by microscopic observation at each time point).

The majority of previous studies have assessed the mRNA level of a given gene at just one time point in the growth cycle, comparing different growth conditions or mutants (28, 31). Two recent microarray studies investigated the transcriptional response of H. pylori to acid. These studies used a single time point and showed no overlap between the genes identified (4, 11). This emphasises the difficulty in comparing only a single time point for analysis of global transcriptional changes, particularly when the conditions being compared may cause the bacterium to grow at different rates.

The data presented in the present study show that gene expression patterns in H. pylori can vary dramatically within short periods of time, particularly at the transition between log and stationary phase. This highlights the importance of examining a number of time points during the growth cycle to investigate the kinetic response of transcription to environmental changes or mutations. This study represents the first use of time course experiments for microarray analysis of global transcriptional coordination in H. pylori.

4.3.4. Significance Analysis of the Log-Stat Switch

To assess the significance of the observed changes in gene expression levels during the Log-Stat switch, we performed Significance Analysis of Microarrays (SAM). The SAM program identifies genes which have significant changes in expression between the assigned groups of arrays using a series of iterative t-tests. A two-class, unpaired SAM analysis was performed on each TC to determine genes showing significant changes between mid-log and stationary phase. A group of 75 genes were found to be significantly changed in this analysis and these included genes representative of the trends observed in Fig. 4.1B. There were 23 induced genes that included genes involved in virulence: the cag PAI genes cagA and cag1, the neutrophil activating protein napA, the major flagellin flaA, and a number of outer membrane protein genes, omp5, omp29, omp11 and hopA (Table 4.3A). Included in the repressed set of 52 genes were a large number of transcription and translation genes, as well as the urease structural subunit gene ureA and the regulatory genes gppA and spoT (Table 4.3B). Perhaps not surprisingly, it appears that the transition from log to stationary phase is characterised by the repression of many of the genes required for bacterial growth and replication. In contrast, apparently non-growth related genes were up-regulated at this time. This suggests that other cellular processes are important in stationary phase. The up-regulation of many key virulence genes may indicate an increase in the organism’s ability to cause disease in this growth phase.

Since the expression levels of some genes changed precisely at the Log-Stat switch, a second two-class, unpaired SAM analysis was performed using the two time points immediately prior to the switch and the two time points directly following the switch. A small number of genes (14 genes) were found to change significantly at this time (Fig. 4.2). Four of these genes were significantly increased in level, while the levels of 10 genes were significantly decreased. All of these genes were also found to change significantly in the first SAM analysis (indicated by * in Table 4.3A&B) except for one gene, prfA (indicated by ** in Table 4.3B). These genes also represent those whose expression changes the most over the entire TC.

The significantly repressed set includes three genes involved in biosynthesis and translation, some key transport and binding protein genes (including two iron uptake genes), and the regulatory gene, gppA. Those genes that had significantly increased expression levels included the iron storage protein gene pfr, and two genes involved in energy metabolism (hydB and hydC, as marked in Fig. 4.2 and Table 4.3A). These results suggest that the Log-Stat switch comprises a specific change in physiology that, considering the inclusion of iron homeostasis genes, may be driven by changing levels of iron in the cytoplasm.

4.3.5. Validation of Microarray Results

To validate the ability of this H. pylori microarray to determine significant changes in gene expression, RNase protection assays (RPAs) were performed. Five representative genes from those found to change significantly between mid-log and stationary phases of growth were chosen for further analysis (Table 4.3A&B). As shown in Fig. 4.3, transcript levels of flaA and pfr were greatly induced in stationary phase, while transcript levels of amiE, fecA (HP0686) and frpB (HP0876) were repressed. Thus, RPA analysis confirms the ability of the H. pylori microarray to detect changes in gene expression.

Figure 4.2: A hierarchical cluster of the genes found to be significantly induced or repressed at the Log-Stat switch in both time courses by SAM analysis. The data for one time course are shown at the time points used for this analysis, comparing expression at T18 and T22 h (shown in orange at the top) with expression at T28 and T35 h (black). The gene names are indicated on the right side (details shown in Table 4.3A&B). The scale indicates the relative level of expression of each gene, where red indicates induced expression and green repression. Cons. hyp. is conserved hypothetical protein.
[Figure 4.2. Columns: 18, 22, 28 and 35 h. Rows, top to bottom: hydC, HP0023, hydB, pfr, pdxJ, frpB, fecA, amiE, aspA, HP0318 cons. hyp., prfA, gppA, glnH, HP1173. Scale: >4 fold repression, 0, >4 fold induction.]

Table 4.3A: Genes whose expression was significantly induced between mid-log and stationary phases as assessed by a two-class unpaired SAM analysis.

                                                                                  First TC              Second TC
TIGR no.(a) Symbol      Putative function                                         Score(b)   Fold Δ(c)  Score(b)   Fold Δ(c)
Cell envelope
HP0227      omp5        outer membrane protein                                    1.7        2.2        2.5        5.0
HP0229      hopA        outer membrane protein/porin                              1.4        2.5        2.3        3.9
HP0472      omp11       outer membrane protein                                    1.9        2.5        1.7        3.1
HP1342      omp29       outer membrane protein                                    1.6        2.1        2.5        5.1
Cellular processes
HP0109      dnaK        chaperone and heat shock protein 70kDa                    2.7        4.2        2.9        6.8
HP0110      grpE        co-chaperone and heat shock protein, 24kDa                1.7        3.5        3.8        6.2
HP0111      hrcA        heat shock regulator                                      1.3        2.9        3.2        4.7
HP0243      napA        neutrophil activating protein                             2.3        4.5        4.3        9.3
HP0520      cag1        cag pathogenicity island protein                          1.         2.1        1.9        3.9
HP0547      cagA        cytotoxicity associated gene A                            2.3        2.6        2.3        6.2
HP1006      traG        conjugal transfer protein                                 2.9        3.2        1.6        4.8
Degradation of proteins
HP0264      clpB        ATP-dependent protease binding subunit, heat shock        2.1        2.6        2.2        3.3
Energy metabolism
HP0631      hydA        Quinone-reactive Ni/Fe hydrogenase, small subunit         3.8        4.1        4.4        8.6
HP0632*     hydB        Quinone-reactive Ni/Fe hydrogenase, large subunit         3.0        3.3        3.8        5.9
HP0633*     hydC        Quinone-reactive Ni/Fe hydrogenase, cytochrome b subunit  2.6        2.7        3.8        6.0
HP1458      thioredoxin thioredoxin, putative                                     1.7        4.0        2.8        8.8
Motility
HP0601      flaA        flagellin A                                               2.2        2.7        2.3        5.4
Transport and binding
HP0653*     pfr         nonheme iron-containing ferritin                          6.7        13.7       6.5        22.2
Unknown function
HP0023*                                                                           3.5        3.5        2.3        3.0
HP0964                                                                            2.4        2.9        2.3        3.9
HP0965                                                                            2.5        2.8        2.2        3.3
HP0966                                                                            1.8        2.1        1.7        2.6
HP1588                                                                            1.5        2.4        3.1        4.9

Boxed genes are part of putative operons. (a) Gene number assigned by TIGR (54). (b) Relative level of significance assigned by SAM. (c) Fold change between mid-log and stationary phase. * Genes significantly changed in level precisely at the Log-Stat switch and between mid-log and stationary phase. ** Level significantly changed only at the Log-Stat switch.

Table 4.3B: Genes whose expression was significantly repressed between mid-log and stationary phases as assessed by a two-class unpaired SAM analysis.

                                                                                  First TC              Second TC
TIGR no.(a) Symbol      Putative function                                         Score(b)   Fold Δ(c)  Score(b)   Fold Δ(c)
Amino acid biosynthesis
HP0649*     aspA        aspartate ammonia-lyase                                   -2.2       2.1        -1.5       2.4
HP0672      aspB        aspartate aminotransferase                                -3.2       3.3        -1.4       2.1
Biosynthesis of cofactors
HP1582*     pdxJ        pyridoxal phosphate biosynthetic protein J                -2.3       2.9        -2.6       4.3
HP1583      pdxA        pyridoxal phosphate biosynthetic protein A                -2.3       2.5        -2.1       3.2
Cell envelope
HP1373      mreB        rod shape-determining protein                             -1.7       3.1        -1.2       2.2
HP1429      kpsF        polysialic acid capsule expression protein                -2.4       2.8        -1.4       2.2
Cellular processes
HP0630      mda66       modulator of drug activity                                -1.6       2.5        -1.6       3.0
Central intermediary metabolism
HP0004      icfA        carbonic anhydrase                                        -1.9       2.3        -1.3       2.1
HP0073      ureA        urease alpha subunit (urea amidohydrolase)                -1.3       2.1        -1.3       2.2
HP1532      glmS        glucosamine fructose-6-phosphate aminotransferase         -1.6       2.3        -1.6       2.1
DNA metabolism
HP0213      gidA        glucose inhibited division protein                        -1.6       2.0        -2.0       3.3
HP0259      xseA        exonuclease VII, large subunit                            -2.1       2.4        -1.2       2.0
Energy metabolism
HP0294      amiE        aliphatic amidase                                         -6.8       16.8       -5.1       14.4
HP1133      atpG        ATP synthase F1, subunit gamma                            -1.1       2.3        -1.3       2.6
Fatty acid and phospholipid metabolism
HP0201      plsX        fatty acid/phospholipid synthesis protein                 -1.6       2.8        -1.8       2.7
HP0202      fabH        beta-ketoacyl-acyl carrier protein synthase III           -1.8       2.6        -1.8       2.2
Regulatory functions
HP0278*     gppA        guanosine pentaphosphate phosphohydrolase                 -1.6       2.8        -1.8       2.4
HP0775      spoT        penta-phosphate guanosine-3'-pyrophosphohydrolase         -1.6       2.0        -1.4       2.2
Transcription and translation
HP0077**    prfA        peptide chain release factor RF-1                         -2.3       2.1        -4.7       2.2
HP0297      rpl27       ribosomal protein L27, 50S                                -1.6       2.6        -1.3       2.9
HP1195      fusA        translation elongation factor EF-G                        -1.7       4.2        -2.0       5.5
HP1196      rps7        ribosomal protein S7, 30S                                 -1.5       2.7        -1.5       3.5
HP1198      rpoB        DNA-directed RNA polymerase, beta subunit                 -1.5       2.9        -2.1       3.6
HP1199      rpl7/l12    ribosomal protein L7/L12, 50S                             -1.4       2.2        -2.0       3.5
HP1200      rpl10       ribosomal protein L10, 50S                                -1.9       2.3        -2.5       4.3
HP1292      rpl17       ribosomal protein L17, 50S                                -1.8       3.1        -2.1       3.5
HP1293      rpoA        DNA-directed RNA polymerase, alpha subunit                -1.8       4.2        -2.5       5.5
HP1294      rps4        ribosomal protein S4, 30S                                 -1.6       3.6        -1.7       4.0
HP1295      rps11       ribosomal protein S11, 30S                                -1.3       3.1        -1.3       2.8
HP1296      rps13       ribosomal protein S13, 30S                                -1.6       2.5        -1.5       2.7
HP1299      map         methionine amino peptidase                                -2.1       2.9        -2.3       4.1
HP1300      secY        preprotein translocase subunit                            -2.1       4.0        -3.1       6.4
HP1302      rps5        ribosomal protein S5, 30S                                 -1.7       2.5        -2.3       4.0
HP1304      rpl6        ribosomal protein L6, 50S                                 -1.3       2.0        -1.7       3.6
HP1307      rpl5        ribosomal protein L5, 50S                                 -1.4       2.0        -2.0       4.0
HP1309      rpl14       ribosomal protein L14, 50S                                -1.8       2.8        -2.2       4.3
HP1312      rpl16       ribosomal protein L16, 50S                                -1.6       2.8        -2.9       4.8
HP1313      rps3        ribosomal protein S3, 30S                                 -1.9       3.0        -2.6       7.1
HP1319      rpl3        ribosomal protein L3, 50S                                 -2.2       2.5        -2.2       4.4
HP1555      tfs         translation elongation factor EF-Ts                       -1.1       2.0        -1.5       2.6
Transport and binding
HP0140      lctP        L-lactate permease                                        -1.6       2.1        -2.1       2.9
HP0686*     fecA        iron(III) dicitrate transport protein                     -3.9       4.7        -3.3       5.3
HP0876*     frpB        iron-regulated outer membrane protein                     -4.6       6.5        -4.7       9.6
HP1170      glnP        glutamine ABC transporter, permease protein               -1.8       2.3        -1.4       2.0
HP1172*     glnH        glutamine ABC transporter, periplasmic protein            -2.4       2.9        -2.1       3.9
HP1400      fecA        iron(III) dicitrate transport protein                     -1.2       2.6        -2.0       4.2
Unknown function
HP0184                                                                            -1.1       2.2        -1.1       2.3
HP0719                                                                            -1.1       2.4        -1.9       3.2
HP1124                                                                            -1.3       2.6        -1.2       2.7
HP1173*                                                                           -2.0       3.6        -1.3       2.3
HP0318*                                                                           -2.1       2.7        -1.0       2.1
HP1490                                                                            -1.5       2.6        -1.2       2.1

Boxed genes are part of putative operons. (a) Gene number assigned by TIGR (54). (b) Relative level of significance assigned by SAM. (c) Fold change between mid-log and stationary phase. * Genes significantly changed in level precisely at the Log-Stat switch and between mid-log and stationary phase. ** Level significantly changed only at the Log-Stat switch.

4.3.6. The H. pylori Regulatory Genes

The apparent coordinated temporal regulation of gene expression found in these time courses suggests that transcriptional regulators may be involved. Little is known of the mechanism of action of the predicted regulatory proteins of H. pylori, with the exception of the HspR, HrcA and Fur repressor proteins, which are well described (76, 131, 336). In the present study SAM analysis revealed that hrcA expression was significantly induced, while expression of gppA and spoT, encoding known regulatory proteins, was significantly reduced during the transition from log to stationary phase.

The spoT gene of H. pylori has a high degree of similarity to other spoT genes involved in the production of guanosine-3’-diphosphate-5’-diphosphate (ppGpp), while the gppA gene controls the synthesis of pppGpp. These nucleotides are known to mediate the stringent response in other bacterial species (44, 46). In these systems the stringent response has been shown to be involved in diverse cellular processes, such as sporulation and virulence. These functions include both positive and negative regulation of various factors involved in adaptation to metabolic signals (46). It was previously believed that H. pylori exhibits a relaxed phenotype, indicating no classical stringent response (309). However, the presence of both the spoT and gppA genes, and their significant regulation during the growth curve, suggests that H. pylori is at least able to produce the ppGpp and pppGpp nucleotides and thus that they are likely to be involved in some kind of metabolic regulatory response. Future investigation of the growth phase regulation of the levels of ppGpp and pppGpp in H. pylori may help elucidate the function of these nucleotides and should reveal whether H. pylori is in fact able to undergo a stringent response.

4.3.7. Operon Structure and Gene Regulation

In most bacteria the expression of genes encoded in multicistronic units is co-ordinately regulated. There are relatively few of these operonic structures in H. pylori in comparison with other bacteria, which further confounds the relative scarcity of regulatory proteins in H. pylori (361). Those genes found to be significantly regulated in this study that are likely to be contained in operons are boxed in Table 4.3A & B. Among these are the stress related genes dnaK, grpE and hrcA (DnaK operon), and a set of H. pylori specific genes of unknown function, HP0963-HP0966.

Figure 4.3: Independent validation of microarray results by RNase protection assay. Relative levels of transcript for each of the indicated genes were assessed using antisense riboprobes as described in the Experimental Procedures. Clear patterns of expression for Log phase (T18 and T22 h) and Stationary phase (T42 and T50 h) are evident and support data obtained via H. pylori microarray.
[Figure 4.3. Lanes: 18, 22, 42 and 50 h. Panels: flaA, pfr, amiE, fecA, frpB.]

The expression patterns of the genes in the DnaK operon and in another stress related operon, HspR, consisting of the genes cbpA, hspR, and orf (HP1026), are shown in Fig. 4.4A. As expected, this analysis shows that genes within each of these operons have highly related expression profiles. The DnaK operon showed a biphasic pattern of expression where levels increased following the Log-Stat switch, subsided briefly and then increased again in stationary phase. In contrast, the expression levels of the three genes in the HspR operon showed only the spike in expression level after the Log-Stat switch.

The transcriptional profiles of the two operons are supported by their proposed transcriptional regulation. Both the HspR and HrcA proteins are transcriptional repressors that act on σ80-dependent promoters (131). The promoter of the HspR operon is negatively autoregulated by the HspR protein (131). In contrast, the promoter of the DnaK operon is negatively regulated by both the HrcA and the HspR proteins (131). This difference in promoter activity may explain the monophasic (HspR operon) versus biphasic (DnaK operon) expression profiles of these two operons (Fig. 4.4A). Thus, from these examples it can be seen that the transcriptional profile data from the present study may provide some insight into the control of operon structures in relation to growth phase.

Interestingly, three of the four genes in the putative operon HP0963-HP0966 are significantly induced in stationary phase; the expression profiles of all the genes in the operon are shown in Fig. 4.4B. All of these genes are induced 2-4 fold in stationary phase, which suggests a growth phase specific function for these gene products. Since this operon appears to be co-regulated with known virulence genes, it is possible that these gene products may also be important in virulence.

Figure 4.4: Line graphs showing the change in expression level {log2 (R/G)} of selected genes over time (h). A) The heat shock operons: DnaK (dnaK, grpE and hrcA) and HspR {cbpA, hspR, and orf (HP1026)}. B) An operon containing H. pylori specific genes of unknown function (HP0963, HP0964, HP0965 and HP0966). The legend on the right in each case indicates the names of the genes plotted.
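The relationship between the log2 (R/G) values plotted in these graphs and a statement such as "induced 2-4 fold" can be illustrated with a short calculation. This is a minimal sketch, assuming that R is the sample channel and G the reference channel (the channel assignment is not stated in this section); the function names and intensity values are hypothetical.

```python
import math


def log2_ratio(red: float, green: float) -> float:
    """Return log2(R/G) for one array spot; both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)


def fold_change(log_ratio_before: float, log_ratio_after: float) -> float:
    """Fold change between two time points, given their log2(R/G) values."""
    return 2.0 ** (log_ratio_after - log_ratio_before)


# Hypothetical spot intensities for one gene (not data from this study).
log_phase = log2_ratio(red=500.0, green=1000.0)          # log2(0.5) = -1.0
stationary_phase = log2_ratio(red=2000.0, green=1000.0)  # log2(2.0) = 1.0

# A rise of 2 units on the log2 scale corresponds to a 4-fold induction.
print(fold_change(log_phase, stationary_phase))
```

On this scale a difference of 1 unit between two time points corresponds to a 2-fold change and a difference of 2 units to a 4-fold change.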

4.3.8. Expression of Virulence Factors and the Log-Stat Switch

The microarray data in this study revealed that many of the known virulence factors of H. pylori such as napA, cagA, flaA, and pfr exhibit peak expression levels in the late log or stationary phase of growth. To date, little is known about the coordinated transcriptional expression of these virulence factors and how this may relate to infection and pathogenicity.

Two of these, the neutrophil activating protein gene napA and the cytotoxin associated gene cagA, were both significantly induced over time in these time course studies. Expression of the cagA gene began to increase at the Log-Stat switch and continued to increase over the entire period of growth sampled (Fig. 4.5A). The CagA protein is a major effector molecule of the cag PAI, which encodes a type IV secretion apparatus, and this set of genes is considered one of the most important elements in H. pylori virulence (21, 45). Despite this, little is known about the functions of the individual proteins encoded by the cag PAI or the transcriptional control of these genes. In the present study, the majority of the genes in the cag PAI did not change significantly during the growth curve. Only one other cag PAI gene, cag1, was found to be significantly regulated during the transition from log to stationary phase (Table 4.3A). The cag1 gene has been shown to be unnecessary for the function of the type IV secretion apparatus (105). Interestingly, another gene, traG, thought to encode part of a different type IV secretion apparatus (308), is also significantly induced after the Log-Stat switch, possibly suggesting a redundant function for this gene product in CagA secretion (Table 4.3A).

It was also observed that the expression levels of two genes previously unrelated to the cag PAI, omp5 and omp29, closely followed the expression profile of the cagA gene (Fig. 4.5A; correlation 0.97). The omp5 and omp29 genes are duplicated genes encoding outer membrane proteins (also known as hopM/N) that are predicted to be porins (5). The close co-expression of the cagA gene with the omp5/29 genes may suggest that these outer membrane proteins are involved in CagA secretion and/or activation.

Figure 4.5: Line graphs showing the change in expression level {log2 (R/G)} of selected genes over time (h). A) The co-expression of the cagA and omp5/29 genes; B) Some key iron homeostasis genes (pfr, frpB, amiE and fecA); and C) Selected iron co-factored genes (hydA, hydB, hydC and napA). The legend on the right in each case indicates the names of the genes plotted.
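The degree of co-expression between two genes, such as the correlation quoted for cagA and omp5/29, can be computed from their log2 (R/G) profiles over the shared time points. This is a minimal sketch, assuming a Pearson coefficient (the text does not state which coefficient was used); the profile values below are hypothetical and are not data from this study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two expression profiles of equal length."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2(R/G) profiles for two genes sampled at the same time points.
gene_a = [0.0, 0.2, 0.9, 1.8, 2.6]
gene_b = [0.1, 0.3, 1.0, 1.7, 2.8]

print(round(pearson(gene_a, gene_b), 2))
```

A value close to 1 indicates profiles that rise and fall together, whereas a value close to -1 indicates opposite patterns of expression.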

The expression of napA begins to increase after the Log-Stat switch and then levels out in late stationary phase (Fig. 4.5C). Dundon et al. (30) have previously demonstrated that the HP-NAP protein accumulates in stationary phase in normal growth conditions. HP-NAP is important for pathogenesis as H. pylori induced gastritis is characterised by infiltration of neutrophils and monocytes into the gastric mucosa (194). The HP-NAP protein induces neutrophil adhesion to endothelial cells, directing these cells to the gastric mucosa, and stimulates NADPH-oxidase, which in turn induces the release of oxygen radicals (98). This results in tissue damage causing the release of nutrients, which promotes H. pylori survival (104). H. pylori can protect itself from the toxic effects of the released oxygen radicals by producing superoxide dismutase and catalase enzymes (27). Interestingly, in the current study both the sodB gene and the catalase-like gene HP0485 were induced at the same time as, or directly following, the induction of napA (data not shown). As discussed earlier, the stress related DnaK operon was also found to be up-regulated at this time and may also be necessary for protection of the organism from oxidative stress.

Some pathogenic bacteria such as Salmonella serotype typhimurium have been shown to be most virulent in the late log phase of growth (184). In S. typhimurium this has been attributed to the peak in expression of the type III secretion apparatus, one of the major virulence determinants in Salmonella sp. (184). Based on the results of the current study we would predict that H. pylori may also be most virulent in the late log phase of growth. Following on from these findings, it has been established that H. pylori in the late log phase of growth are most efficient in delivering CagA protein and inducing cell elongation in AGS cells (M. Amieva, personal communication). This is considered to be one measure of virulence in this organism (312). These observations may suggest that the Log-Stat switch does indeed correspond with an increase in virulence attributes.

4.3.9. Iron Homeostasis Regulation

The regulation of iron homeostasis is very important in bacterial pathogens as the host sequesters available iron from the tissues as a defence mechanism (265). H. pylori has an extensive ability to scavenge iron that may contribute significantly to its virulence, as infection has been linked with iron deficiency anaemia (22). A large proportion of the genes whose expression changed dramatically at the Log-Stat switch have previously been shown to be involved in iron uptake and storage (98, 389), or encode proteins that are iron co-factored (86, 189). During log phase the putative iron-uptake genes fecA (HP0686), an iron (III) dicitrate transport protein, and frpB (HP0876), an iron-regulated outer membrane protein, were expressed at a maximal level (Fig. 4.5B). The expression of these genes was significantly repressed during the Log-Stat switch (Fig. 4.3 & Table 4.3B). Interestingly, another gene previously unrelated to iron uptake, amiE (HP0294), encoding an aliphatic amidase (328), had a very similar pattern of expression to these two iron-uptake genes (correlation coefficient of 0.95) (Fig. 4.5B). In contrast, the non-heme iron-containing ferritin gene, pfr, which codes for the major iron-storage protein, had the opposite pattern of expression at the Log-Stat switch (Fig. 4.5B). This ferritin has been shown previously to accumulate in stationary phase during normal growth in H. pylori, and our expression data would support this finding (86). A number of the known iron co-factored protein genes also had an increased level of expression at this time, such as the quinone-reactive Ni-Fe hydrogenase subunit genes hydA-C and napA (which has been shown to bind iron) (86) (Fig. 4.5C). Interestingly, the expression level of the major iron-dependent regulator gene, fur, did not change significantly over time (data not shown).

The net result of this switch appears to be the cessation of iron uptake and the storage of excess iron in the cytoplasm in order to prevent iron toxicity, as well as the expression of proteins which require iron as a co-factor. This tight relationship between the expression levels of these particular iron uptake genes and the pfr gene during the growth cycle has not been previously reported and indicates the utility of microarray expression studies in discovering new relationships between genes.

4.3.10. Motility and the Corresponding Expression of the Flagella Regulon

Since it was observed that motility appeared to be regulated in relation to the Log-Stat switch in the first two TCs, a third TC was conducted and used to assess gene expression and to quantitate motility. As was observed for the first two TCs, the percentage of motile bacteria in the culture peaked prior to the transition from log to stationary phase growth (Fig. 4.6A). In contrast, the average curvilinear velocity (CLV, µm/s) of the bacteria was found to drop during log phase (T12 h), then to peak in early to mid-stationary phase and finally to drop dramatically in late stationary phase (Fig. 4.6A). These data are in agreement with a similar study where motility was measured over the growth curve (387). That study showed that during log phase, when the bacteria were dividing, the CLV was minimal, but by stationary phase, when all division was complete, the bacteria reached peak speeds (387).
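The two motility measures discussed here can be illustrated from tracked cell positions. This is a minimal sketch, assuming that each bacterium is tracked as a series of (x, y) positions in µm at a fixed frame interval, and that a cell counts as motile above a chosen velocity threshold; the tracking format, threshold and function names are assumptions for illustration and do not describe the method used in this study.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position in micrometres


def curvilinear_velocity(track: Sequence[Point], frame_interval_s: float) -> float:
    """Total path length along the track divided by elapsed time (µm/s)."""
    if len(track) < 2 or frame_interval_s <= 0:
        raise ValueError("need at least two positions and a positive interval")
    path_length = sum(
        math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(track, track[1:])
    )
    elapsed = frame_interval_s * (len(track) - 1)
    return path_length / elapsed


def percent_motile(velocities: Sequence[float], threshold: float) -> float:
    """Percentage of tracked cells whose velocity exceeds the threshold."""
    if not velocities:
        raise ValueError("no tracks supplied")
    motile = sum(1 for v in velocities if v > threshold)
    return 100.0 * motile / len(velocities)


# Hypothetical track: two 5 µm steps, one frame every 0.5 s.
track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
print(curvilinear_velocity(track, frame_interval_s=0.5))  # 10 µm in 1 s
```

Because the percentage of motile cells and the mean CLV of the motile cells are computed separately, the two measures can peak at different times, as was observed over the growth curve.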

To understand the role of individual genes in the observed motility, we correlated the expression of the genes of the flagella regulon with motility. The hierarchical cluster of all of these genes (Fig. 4.6B) indicates that there are temporal changes in flagellar gene expression and that these are clustered into groups of genes which roughly approximate their proposed flagellar gene “class”. In S. typhimurium the transcriptional regulation of bacterial flagella has been shown to be composed of an intricate network of temporally regulated genes, which are organized into three classes of genes. These are class 1, the “early” gene complex, class 2 and class 3, the “late” flagellar gene complex (142). A similar network of flagellar gene transcription appears to be present in H. pylori. However, since the flagellar regulon in this organism is not organized in an operon structure, many of the components of this system are yet to be elucidated (142). The predicted class for some of the flagellar genes is indicated in Fig. 4.6B (142, 338). The predicted class 1 genes (blue) show quite diverse patterns of expression in this study, suggesting that there may be more than one means of regulation for these genes. In contrast, the class 2 genes (orange), including the minor flagellin, flaB, are all down-regulated after transition into stationary phase. The class 3 genes (purple), including the major structural component of the flagella, flaA, are expressed at a high level from log into stationary phase. The expression of this class of genes appears to be correlated with the highest CLV for this TC (Fig. 4.6A). Finally, two genes which are predicted to belong to both class 2 and 3, flgM and a flagellin gene (flaG), were expressed at an intermediate level between these classes.

Figure 4.6: The changes in H. pylori motility and flagella gene expression over time. A) Plot showing the changes in the percentage of motile bacteria in the culture and the changes in curvilinear velocity (CLV, µm/s) of the motile bacteria over time (h). B) A hierarchical cluster showing the expression of the flagellar regulon over the same time points indicated in A. The gene names are shown on the right side along with colour coding showing the predicted class of each gene (blue, class 1; orange, class 2; purple, class 3; orange and purple, class 2 and 3). R indicates genes which are predicted to be regulators. The three clusters containing the majority of the genes in each class (1, 2, 3 and 2&3) are shown on the far right.

Using these temporal data it may be possible to predict the class of other genes in the flagellar regulon which have not yet been assigned. For example, the expression of flaA and the hpaA gene were closely correlated (Fig. 4.6B). The H. pylori flagellum is covered by a flagellar sheath, encoded by the hpaA gene, which is thought to be necessary to protect it from gastric acid (190). This close correlation of expression between the flaA and the hpaA genes detected in the current study has not been previously reported. The promoter region of hpaA has been shown to contain a putative σ70 sequence but no apparent σ28 sequence (138). Thus regulation of the expression of these two genes may not be directly related, especially considering that in flaA/flaB knockout mutants, the flagellar sheath is still produced (141). Another example is the expression of the flagella hook homolog flgE’ (HP0908), which appears to be regulated in a similar fashion to the class 3 genes (Fig. 4.6B).

4.4. Conclusion

This global expression profiling experiment has highlighted the particular advantage of time-course analysis for illuminating previously unknown programmed physiological processes in H. pylori. Through the investigation of coordinated expression profiles the importance of a number of genes of unknown function has been inferred, including a number of constitutively expressed genes. In addition, we have shown that the transition from log phase to stationary phase growth in H. pylori is particularly important in the regulation of iron homeostasis, motility and virulence gene expression. Although a somewhat simplistic view, these data may suggest that the late log phase corresponds with the most virulent phase of growth and thus may be intimately related to pathogenesis. They also suggest that the ability of H. pylori to withstand conditions of stress, such as iron limitation, may vary depending on growth phase. This possibility should be further investigated.

Chapter 5

TRANSCRIPTIONAL ANALYSIS OF H. PYLORI IN CO-CULTURE WITH MDCK CELLS

5.1. Background

Due to gradients of cell types, mucus and acid, the environment of the stomach is extremely complex. It is hypothesised that H. pylori is exposed to the acidic lumen of the stomach only temporarily and that the majority of colonising bacteria reside in the relatively neutral mucus covering the epithelial cells and in the crypts (307). There is also a portion of H. pylori cells that adhere to the epithelial cells (124). This suggests that there may be a number of different microclimates in which H. pylori may be found, which relate to the region of the stomach and the relative distance from the epithelial cells. The physiology of bacteria in each of these environments is likely to differ from one environment to another and from that of bacteria in in vitro environments. In addition, H. pylori causes chronic infection, a factor that would suggest that a steady-state physiology of both epithelial cells and bacteria may develop to maintain this long term infection.

The previous chapter outlined the use of microarrays in conjunction with in vitro broth culture time courses to investigate gene expression profiles in H. pylori. These experiments demonstrated that the growth phase of the bacterium dramatically affects its physiology in terms of motility, virulence gene expression, transcription and translation. This is an excellent approach for both the investigation of gene regulation and co-ordination of expression in bacterial cultures and for predicting function for previously unknown genes, particularly through the comparison of growth under different environmental conditions. However, broth culture is an artificial environment which may bear no relation to the in vivo situation. To gain a thorough understanding of the host-pathogen interaction, in vivo studies are essential. The ability to detect H. pylori gene expression during infection could provide insight into the disease process. To date however, technical difficulties have hindered such investigations. For example in an animal model of infection the amount of H. pylori RNA in an infected stomach sample is extremely small as compared with the amount of mammalian RNA derived from the tissue. Further, bacterial RNA cannot be sufficiently purified from mammalian RNA, which clearly affects the ability to specifically detect mRNA from H. pylori in a mixed sample using microarray analysis.

Cell culture techniques have been used extensively in H. pylori research as a method of mimicking the in vivo environment. These in vitro techniques have been utilised to gain an insight into the response of the pathogen to host cells and vice versa. Cell lines derived from gastric and colon adenocarcinomas such as AGS and Caco-2 cells have been used to study the pathogenic effect of H. pylori on host cells in relation to a number of factors including a particular strain’s ability to induce vacuolation, IL-8 secretion and cellular elongation (21, 89, 312, 317). Recently more extensive investigations into the host cells’ response to infection have been performed using human cDNA microarrays to analyse the global gene expression of these cells (20, 49, 66, 122, 185, 214). Although the use of cell lines has revealed much about the effects of some of H. pylori’s virulence factors, various problems exist in the interpretation of these results in light of in vivo colonisation and pathogenic mechanisms. This is partly due to the complex nature of the gastric environment as compared with cell culture of a single cell type and because the exact environmental conditions present in the stomach are impossible to mimic in vitro. This is exacerbated by the fact that the use of different cell lines in infection studies can produce very different results. For example it has been shown that induction of vacuolation by different vacA genotypes is strongly cell line dependent (211). Additionally the use of cancer cell lines may provide an artificial view of host response with little relation to the in vivo situation. Although few studies to date have used primary gastric epithelial cell culture, such cultures are likely to provide a better perspective of the in vivo host response to infection (245).

To date, no studies have utilised cell culture techniques to detect the opposite side of the story, which is the global gene expression response of H. pylori to infection of host cells. H. pylori infection of the AGS cell line is the best studied system for investigating host-pathogen interactions. These cells induce adherence of H. pylori. Although the physiology of these adherent bacteria is of interest, this system has been found to be unsuitable for investigation of H. pylori gene expression profiles due to difficulties in extracting pure bacterial RNA in sufficient quantities for downstream detection methods such as microarray studies (306). In addition, most previous cell culture studies have been performed over a time frame of up to 24 h only, because AGS cells die after this length of infection (20, 49). Thus the availability of a cell line to which H. pylori does not adhere in large numbers and on which H. pylori can be cultured for long periods would offer specific advantages. The Madin-Darby Canine Kidney (MDCK) cell line is one such line. Growth of H. pylori on MDCK cells results in adherence of <1% of the bacteria. In addition, co-culture of H. pylori with these cells allows the study of longer term infections. Thus, this cell line provides an opportunity to investigate the response of non adherent H. pylori to the presence of mammalian epithelial cells.

Although the in vivo environment cannot be entirely mimicked with this type of co-culture condition, it does allow investigation of whether the presence of mammalian cells affects the physiology of the bacteria. It has been hypothesised that the metabolism of H. pylori cells in co-culture may be altered due to competition for nutrients such as essential amino acids and iron. In addition H. pylori respiration may change due to the close proximity of the bacterium to the respiring epithelial cells, which may result in an increase in the concentration of CO2, oxygen radicals and other metabolites in the media. It is also possible that signalling between the bacterium and the MDCK cells may affect chemotaxis and motility as well as virulence gene expression. Finally co-culture may induce the expression of certain regulatory proteins, such as the two-component systems, which have not been found to be induced under other in vitro conditions.

Thus the aim of this chapter was to detect specific gene expression responses of H. pylori grown in co-culture with MDCK cells over time.

5.2. Experimental Procedures

Previous investigations have shown that a single culture of the H. pylori G27 strain could be maintained in co-culture with MDCK cells for an extended period of time (up to 4 months) (M. Amieva, personal communication). H. pylori grown in this way was shown not to require microaerophilic conditions for growth, but for long term survival to require that the tissue culture medium be supplemented with Brucella Broth (BB). The co-culture also required regular replenishment of the growth media.

As part of the current study, preliminary experiments were performed to investigate the ability to maintain the G27 and SS1 H. pylori strains in co-culture with MDCK cells and to determine the ideal procedure for using these co-culture grown bacteria in time course experiments (section 5.2.1). In addition a reliable method for harvesting the bacterial cells from co-culture for RNA extraction was developed (section 5.2.2).

Given that the G27 strain had been characterised extensively in the co-culture model in terms of its growth pattern and ability to deliver CagA when used in other cell culture systems, this strain was initially used in the time course experiments comparing gene expression in co-culture to control cultures (section 5.2.3.1). Over the time period of the experiment the genes considered to be specifically induced in co-culture were determined.

In addition to determining the effect of co-culture on the gene expression of the G27 strain, the effect of co-culture of the SS1 strain on the same cell line (MDCK) was investigated. The genes considered to be specifically induced in co-culture were determined (section 5.2.3.2). A schematic representing the procedure for the data analysis of these experiments is shown in Fig. 5.1.

5.2.1. Co-culture and Maintenance of H. pylori and MDCK cells

Plastic tissue culture dishes (9 mm) were seeded with ~10⁶ MDCK cells (Dr. W. James Nelson, Stanford University) in 10 ml DMEM/FCS (see Chapter 2) and incubated at 37°C in a 5% CO2 atmosphere until confluent. A starter culture of the appropriate H. pylori strain was grown in Brucella Broth with 10% FCS (BBF) media for 24 h. The confluent MDCK cells were washed once with DMEM before inoculation with 10 ml DMEM/FCS media supplemented with 10% BB (DMEM/FCS/BB) containing ~10⁸-10⁹ bacteria ml⁻¹ from the starter culture. For the first 2-3 days post inoculation, until the H. pylori infection was well established, 5 ml of the growth medium was removed and 5 ml of new DMEM/FCS/BB media was added to refresh the media. Subsequently every 24 h the growth medium was completely removed from the co-culture (this H. pylori culture was either discarded or used to seed time course experiments described below), after which the cell monolayer was washed once in DMEM to remove the majority of H. pylori cells, and 10 ml of new DMEM/FCS/BB media was added. The co-culture was maintained in this way for 2-8 weeks.

Figure 5.1: Schematic representing the procedure used to analyse the expression profiles obtained from the microarray results for the G27 and SS1 co-cultures and control time courses. [Schematic labels: G27 strain grown in co-culture, in BBF alone, in DMEM/BB/FCS alone, and in DMEM/BB/FCS with dead MDCK cells; SAM 1: compare gene expression patterns; SAM 2: compare gene expression patterns to extract genes specifically induced in co-culture; effect of media differences (56%); effect of MDCK cells (44%); SS1 strain grown in co-culture (x2); SAM 3: genes significantly induced or repressed during log phase; genes induced in both G27 and SS1 co-culture.]

5.2.2. Harvesting of H. pylori Cells from Co-culture with MDCK Cells

Growth medium from the H. pylori co-culture with MDCK cells was collected in the following manner: the H. pylori cells settled on the MDCK cells were first dislodged by pipetting up and down and then the growth medium (containing the majority of the H. pylori cells, but no MDCK cells) was removed from the culture dish using a pipette, leaving behind the MDCK monolayer. This growth medium was immediately applied to the filtration apparatus (described in Chapter 2, which was used for all RNA extractions from broth culture) to collect the H. pylori cells on a cellulose filter, which was then snap-frozen. The RNA was extracted from these filters for use in microarray analysis using the established procedure described in Chapter 2.

5.2.3. Inoculation of MDCK Cells for Time Course Experiments

5.2.3.1. G27-MDCK co-culture time courses

A starter co-culture of 9 x 10⁷ CFU/ml of G27 grown with MDCK cells for 4 wks was used to inoculate individual dishes of MDCK cells for the time course study. Twenty four hours prior to use, the growth media in the starter co-culture had been replenished (see section 5.2.1 for description of co-culture). In parallel with this co-culture time course, three control time courses inoculated with 6-9 x 10⁷ CFU/ml of the same starter culture were grown in different media for comparison with the co-culture TC. The first control consisted of broth culture media alone (BB with 10% FCS, BBF), the second control was co-culture media alone (DMEM/FCS/BB) and the third control was growth in co-culture media (DMEM/FCS/BB) on a monolayer of killed MDCK cells. These cells were killed using formaldehyde fixation. For fixation, five confluent 20 mm dishes of MDCK cells were washed once in DMEM and then 5 ml 2% formaldehyde in 100 mM phosphate buffer pH 7.4 was added to each dish and incubated at room temperature for 15 min. The fixative was then removed, and the cells were washed six times in PBS and once in DMEM. All four TCs were performed in 20 mm culture dishes with an individual dish for each time point. All dishes were incubated without shaking in a CO2 incubator set at 37°C and 5% CO2. Twenty ml samples were collected at 5, 11, 20, 23, 30, 35 and 50 h, with the exception of the TC grown on fixed MDCK cells where the T35 and T50 h time points were not harvested. At each time point colony counts were conducted and RNA extracted for microarray analysis. A pool of all the RNA samples from all four TCs was used for the reference RNA for these microarray hybridisations.

5.2.3.2. SS1-MDCK co-culture time courses

For the preliminary investigations of the transcriptional response of the SS1 strain in co-culture, two independent time course experiments of SS1 growth on MDCK cells were performed and were named SS1-A and SS1-B. A starter culture of the SS1 strain was maintained in co-culture with MDCK cells for 2 weeks. Twenty four hours prior to use this culture was washed and the media replenished (see section 5.2.1 for description of co-culture). For SS1-A, seven 9 mm dishes of confluent MDCK cells were each inoculated with 2 x 10⁷ CFU/ml of this SS1 starter culture in DMEM/FCS/BB. These dishes were incubated at 37°C in 5% CO2 and the H. pylori growth from each plate was harvested as described in 5.2.2 for RNA extraction at 6, 12, 18, 24, 30, 39, & 47 h post inoculation. For SS1-B the same procedure was followed using a similar SS1 starter culture, except that 20 mm dishes of confluent MDCK cells were used to increase the volume of each sample from 10 ml to 20 ml and the time points examined were 6, 12, 18, 24, 30, 36, 41 and 51 h. At each of these time points the bacterial cells were harvested and the RNA extracted as detailed in section 5.2.2.

These RNA samples were then used in microarray analysis using a pool of the cDNAs produced from every time point within the relevant TC as the reference for the array hybridisations.

5.2.4. Data Analysis

5.2.4.1. G27-MDCK co-culture time course analysis

For the four G27 TCs (section 5.2.3.1), the data from the microarrays were collated as described in Chapter 4 except that in these time courses no mathematical transformation was performed because all the arrays were hybridised to the same reference sample and thus were directly comparable. The expression profiles during log phase in all four TCs were compared by a two-class unpaired SAM analysis using a False Discovery Rate (FDR) <1% (described in Chapter 4) and a cutoff of at least 1.5 fold change in expression level between groups. The first analysis compared the co-culture TC to the broth grown control TC (SAM1), and the second compared the co-culture TC to the two control time courses grown in tissue culture media alone or in tissue culture media with fixed MDCK cells (SAM2) (Fig. 5.1). Those genes significantly induced or repressed in co-culture compared to these two controls were deemed to represent a specific response of the H. pylori cells to the presence of live mammalian cells.

5.2.4.2. SS1-MDCK co-culture time course analysis

The data obtained from the microarrays for SS1-A and SS1-B were collated as described in Chapter 4 and in both cases normalised by mathematical transformation to the 6 h time point since different reference samples were used for these two TCs. The growth curves and cluster analysis (using self-organising maps, SOM) for SS1-A and SS1-B were compared to the results obtained from the two regular broth culture time courses described in Chapter 4 and to the G27 co-culture TC.
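The transformation itself is described in Chapter 4 and is not reproduced here. A minimal sketch of one way such a normalisation can be done, assuming the data are log2 (sample/reference) ratios, is to subtract the 6 h value from every time point: log2(S_t/R) - log2(S_6/R) = log2(S_t/S_6), so the reference cancels and each profile is expressed relative to its own 6 h sample. The function name and the example values are hypothetical.

```python
def normalise_to_first_timepoint(profile):
    """Re-express a log2 ratio profile relative to its first time point.

    profile is a list of log2 (sample/reference) ratios ordered by time,
    with None marking missing spots. Subtracting the first value removes
    the dependence on the reference sample.
    """
    baseline = profile[0]
    if baseline is None:
        # Without a 6 h value the gene cannot be normalised
        return [None] * len(profile)
    return [None if value is None else value - baseline for value in profile]


if __name__ == "__main__":
    # Hypothetical log2 ratios for one gene at 6, 12, 18 and 24 h
    print(normalise_to_first_timepoint([1.0, 2.0, None, 0.5]))
```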

5.2.4.3. Comparison of genes induced in G27 and SS1 co-culture

A one-class SAM analysis using an FDR <1% of the arrays from SS1-A and SS1-B was performed (SAM3, Fig. 5.1) to find the set of genes significantly induced or repressed in these cultures over the time points falling approximately in log phase: 12, 18, 24 and 30 h. The results of this analysis were compared to the gene set found to be specifically induced or repressed in the G27 strain in live co-culture during log phase as compared with the control cultures (Fig. 5.1).
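For a one-class analysis the question is whether the mean log2 ratio of a gene differs from zero across the selected arrays. The sketch below shows a SAM-style one-class statistic for a single gene under that assumption; as before, the permutation-based FDR estimate is omitted, and the function name, the value of s0 and the example data are hypothetical.

```python
import math
from statistics import mean, stdev


def sam_one_class_d(values, s0=0.1):
    """SAM-style one-class statistic: mean / (standard error + s0).

    values are log2 ratios for one gene across the arrays of interest
    (for example the 12, 18, 24 and 30 h arrays of both time courses).
    s0 is a small positive constant with a hypothetical default.
    """
    standard_error = stdev(values) / math.sqrt(len(values))
    return mean(values) / (standard_error + s0)


if __name__ == "__main__":
    # Hypothetical gene that is consistently above its 6 h level
    print(sam_one_class_d([0.8, 1.1, 0.9, 1.2, 0.7, 1.0, 1.1, 0.9]))
```

A positive value indicates a gene that is consistently above the baseline and a negative value one that is consistently below it.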

5.2.5. Supplementary Material

The following material is available in the supplementary material (see Appendix):

Movie S5.1: Movie showing the movement of H. pylori cells infecting MDCK cells maintained for two weeks in co-culture.
Movie S5.2: Movie showing the movement of H. pylori cells infecting MDCK cells maintained for 8 weeks in co-culture.
Table S5.1: Results from the SAM1 analysis comparing expression in co-culture to static broth culture.
Table S5.2: Results from the SAM2 analysis comparing expression in co-culture to growth in media alone and in the presence of fixed MDCK cells.
Table S5.3A: Results from the SAM3 analysis showing genes which were significantly induced in log phase in the co-cultures SS1-A and SS1-B.
Table S5.3B: Results from the SAM3 analysis showing genes which were significantly repressed in log phase in the co-cultures SS1-A and SS1-B.
Table S5.4A: Normalised averaged data for the G27 time courses used in this analysis. TC6 is the live co-culture, TC7 is the culture in media alone, TC8 is culture in static BB, and TC9 is culture with fixed MDCK cells.
Table S5.4B: Normalised averaged data for the SS1-A (TC4) and SS1-B (TC5) time courses used in this analysis.

5.3. Results

5.3.1. Maintenance of H. pylori in Co-culture with MDCK Cells

Two strains of H. pylori, G27 and SS1, were grown in co-culture with MDCK cells for use in time course transcriptional analysis. Both strains of H. pylori required 2-3 days to fully adapt to continual culture on the MDCK cells. During this time, changes such as the appearance of the cell/cell junctions in the MDCK cells were observed. The cell junctions of a two week old G27 co-culture, 12 h after media replenishment, can be seen in the video capture in Fig. 5.2A. After an extended time (2-4 wks) in co-culture the MDCK cells became irregularly shaped and elongated (Fig. 5.2B). In addition, numerous cell vacuoles became apparent. Due to the static nature of these cultures, H. pylori cells were found both motile in the media and in clumps near the monolayer surface. This is shown in Movie S5.1 in the Supplementary Material for Chapter 5. It is unclear whether the clumped bacteria were dead or simply non-motile, but some of these were coccoid forms and a small proportion (< 1%) were seen to adhere to the MDCK cell surface.

After 1-2 wks of continuous co-culture, both G27 and SS1 cultures were found to grow in a reproducible manner and reach a peak in CFU/ml every 24 h after media replenishment. The CFU/ml measurements taken before and after media replenishment in a typical G27 co-culture for 8 days are shown in Fig. 5.3 (courtesy of M. Goodrich). The co-cultures could be maintained in this way for up to four months. G27 cultures grown in this way were found to adapt to growth on a fresh dish of MDCK cells very quickly if inoculated at about 10^8 CFU/ml and thus could be used to inoculate parallel dishes for time course experiments. SS1 cultures also adapted to inoculation of new MDCK cells, although this process was less reproducible.

Figure 5.2: A) One frame from a video capture showing a two-week old co-culture of H. pylori with MDCK cells. The regular shape of the MDCK cells in the monolayer is evident (open arrow indicates a cell/cell junction). The black shapes covering the cells are H. pylori (closed arrow shows one typical H. pylori cell) (100 x magnification). A movie showing the movement of the H. pylori cells in this co-culture is in the Supplementary Material (Movie S5.1). B) One frame from a separate video capture showing a 2 month old co-culture with irregular shaped MDCK cells (open arrow shows a cell/cell junction) which contain numerous vacuoles (striped arrow). The black shapes covering the cells are H. pylori cells (Movie S5.2). (400 x magnification) (Movies courtesy of M. Amieva, Stanford University).

Figure 5.3: Line graph showing the level of CFU/ml (log scale) of H. pylori grown in co-culture with MDCK cells during each 24 h period for 8 days (0-192 h), just prior to (black data points) and immediately after (white data points) washing the monolayer and replenishing the co-culture media. (Data courtesy of M. Goodrich, Stanford University)

5.3.2. RNA Extraction from H. pylori Grown in Co-culture

As part of this thesis a novel method was developed to harvest H. pylori cultures from MDCK co-culture. Using this technique, only MDCK cells which had sloughed off the plate surface were harvested along with the bacterial culture. In the time course experiments these contaminating MDCK cells were few in number as the MDCK cells were infected for only 6-50 h. Three typical RNA samples extracted from MDCK co-culture are shown in Fig. 5.4. As can be seen from this figure, no mammalian ribosomal RNA bands were present, indicating the success of this technique in obtaining pure bacterial RNA. Bacterial RNA extracted in this way was used to detect gene expression profiles in microarray analyses.

Figure 5.4: Three separate samples (lanes 2-4) of total RNA extracted from H. pylori cells grown in co-culture with MDCK cells run on a 1% agarose gel. Lane 1 contains a 1 kb DNA ladder; the 1 kb and 0.5 kb size bands are indicated. The black arrows indicate the 23S and 16S ribosomal bands respectively.

5.3.3. G27 Growth in Co-culture

A typical growth curve for G27 co-culture is shown in Fig. 5.5. The G27 cells in this co-culture showed a fairly typical bacterial growth curve pattern with distinct lag and exponential phases of growth. However, instead of a stationary phase, there appeared to be a dramatic reduction in live bacteria, indicating that under these conditions the death phase may directly follow the exponential phase. Overall, G27 was found to have reproducible patterns of growth in co-culture with MDCK cells (data not shown).

5.3.4. Comparison of Gene Expression Profiles of H. pylori Grown in Co-culture and Regular Broth Culture

The gene expression profile of the G27 co-culture TC (Fig. 5.5) was analysed and compared to the profiles obtained for the regular broth culture time courses TC1 and TC2 described in Chapter 4. The profile of gene expression in the co-culture TC appeared to be less time dependent than in the broth TCs, as witnessed by the fact that far fewer genes appeared to increase or decrease steadily over time. The variance in gene expression levels over time in the co-culture was less than in the broth TCs (Table 5.1). Only 4.6% of the genes in co-culture varied in expression level by 2.8 fold over the 50 h TC compared to 14% in the two broth cultures, which were 50 and 60 h long, respectively.
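The kind of tally reported in Table 5.1 can be sketched as follows, assuming that each gene has a profile of log2 ratios over the time course and that the fold change is taken between the maximum and minimum of that profile. The function name, the gene labels and the values are hypothetical.

```python
import math


def fraction_varying(profiles, fold=2.8):
    """Count genes whose expression range over a time course reaches `fold`.

    profiles maps a gene label to a list of log2 ratios ordered by time,
    with None marking missing data. A range of log2(fold) between the
    maximum and minimum value corresponds to a fold-change of `fold`.
    Returns (number of genes, fraction of all genes).
    """
    threshold = math.log2(fold)
    n_varying = 0
    for values in profiles.values():
        observed = [v for v in values if v is not None]
        if observed and max(observed) - min(observed) >= threshold:
            n_varying += 1
    return n_varying, n_varying / len(profiles)


if __name__ == "__main__":
    example = {
        "geneA": [0.0, 1.0, 2.0],    # range 2.0 (4 fold): counted
        "geneB": [0.0, 0.5, 0.2],    # range 0.5: not counted
        "geneC": [-1.0, None, 0.6],  # range 1.6 (about 3 fold): counted
        "geneD": [0.1, 0.1, 0.1],    # flat: not counted
    }
    print(fraction_varying(example))
```

Running the same function with several thresholds (4, 2.8, 2 and 1.4 fold) would give one row of counts per threshold for each time course.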

Inspection of the genes which changed by 2.8 fold in these three TCs revealed that 24 genes were common to all three TCs (Fig. 5.6A). Sixteen of these genes appeared to have fairly similar patterns of expression in both co-culture and broth culture, including 7 genes involved in translation. The remaining eight genes (indicated by a bracket in Fig. 5.6A) increased in expression during the entire broth culture, while in co-culture the expression level of these genes peaked at around 11-20 h, corresponding to mid-log phase, and then decreased. Interestingly, this set of genes included the flaA and cagA genes.

In addition, there were 35 genes whose expression level changed by 2.8 fold over time in the co-culture, but not in the broth cultures. These are shown in Fig. 5.6B. Interestingly, this group included 15 genes of unknown function, 3 genes involved in DNA metabolism, as well as the transport and binding proteins feoB, hpn and copP.

There were many differences in the growth conditions used for regular broth culture and those used in co-culture. In regular broth culture, H. pylori was grown in BBF media in a microaerophilic atmosphere with constant shaking. In contrast, in co-culture H. pylori was grown in DMEM/FCS/BB in a 5% CO2 incubator at atmospheric levels of O2 with no shaking. Thus, a series of parallel time course experiments of G27 co-culture with control cultures grown in the same conditions were performed to investigate the specific transcriptional response of H. pylori to the presence of MDCK cells.

Table 5.1: Variance in gene expression levels over time in broth versus MDCK co-culture growth.

Fold change (maximum - minimum) | TC1 (broth): No. genes | TC2 (broth): No. genes (% total-1646) | G27 co-culture: No. genes (% total-1648)
4 fold | 86 (5.4%) | 97 (5.8%) | 10 (1%)
2.8 fold | 228 (14%) | 225 (14%) | 75 (4.6%)
2 fold | 710 (45%) | 539 (33%) | 481 (29%)
1.4 fold | 1375 (87%) | 1369 (83%) | 1375 (83%)
1 fold | 1580 (100%) | 1646 (100%) | 1647 (100%)

The data used for comparison between the indicated TCs is shown in bold.

Figure 5.5: Line graph showing log CFU/ml over time (0-60 h) of a G27 co-culture time course grown on MDCK cells.

Figure 5.6: Hierarchical clusters showing the expression patterns of the genes whose level changes by ≥ 2.8 fold over time in: A) all three TCs: regular broth {1st TC (blue triangle), 2nd TC (pink triangle)} and G27 co-culture (green triangle), and B) in the G27 co-culture TC only. The triangles represent progression of time from the beginning to end of each TC. The purple bracket in A indicates genes whose expression patterns in the two broth TCs were different from that in the co-culture TC. Rest. is restriction enzyme.
Genes shown (panels A and B): tsaA, HP0080, HP0310, hpn, cagA, gtp1, flgM, flhF, flaA, flaB, dnaK, copP, HP0920, HP0245, msrA, HP0968, amiE, HP0938, pdxJ, HP0015, frpB, rpl9, ureA, fabI, htrA, def, fusA, HP1449, rps7, HP0273, secG, HP0639, HP0721, HP1142, HP0720, tnpB, moaA, tlpB, HP0745, rps2, xerD, efp, HP1520, rpl3, cag4, tufA, dpnA, hopC, HP1080, HP1579, feoB, virB4, ackA, MBOIIR, topA_3, nusA, HP0081, HP0986, type III rest.

5.3.5. Specific Transcriptional Response of H. pylori in Presence of MDCK Cells

The G27 co-culture growth curve (named Co-culture), as compared with those for the three control cultures, is shown in Fig. 5.7. The curves obtained for the Co-culture and the control TCs grown in tissue culture medium (Media alone) and with killed MDCK cells (Fixed MDCKs) were very similar to each other, with an average generation time of ~4.5 h. The curve for growth in broth culture media alone (Broth media), however, differed substantially, with a generation time of 13.7 h. In addition, the broth grown culture only reached a maximum of 2.4 x 10^8 CFU/ml while the other three time courses reached a maximum of 6-10 x 10^9 CFU/ml. This difference is likely to have resulted from the different media used for the broth TC (BBF) as compared with the DMEM/FCS/BB media used in the other three TCs, given that all other parameters were identical.
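Generation times such as those quoted for the G27 growth curves can be estimated from two CFU/ml counts taken during exponential growth: the number of generations is log2(N_end/N_start) and the generation time is the elapsed time divided by that number. The sketch below assumes this standard calculation; the source does not state how the generation times were derived, and the function name and counts are hypothetical.

```python
import math


def generation_time(cfu_start, cfu_end, hours):
    """Estimate the generation (doubling) time in hours.

    cfu_start and cfu_end are CFU/ml counts at the start and end of an
    interval of exponential growth lasting `hours`.
    """
    if cfu_end <= cfu_start:
        raise ValueError("counts must increase over the interval")
    generations = math.log2(cfu_end / cfu_start)
    return hours / generations


if __name__ == "__main__":
    # Hypothetical counts: an 8 fold increase (three doublings) in 13.5 h
    print(generation_time(1e7, 8e7, 13.5))
```

The estimate is only meaningful when both counts fall within the exponential phase, which is why the choice of time points matters for cultures with a short or poorly defined log phase.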

The gene expression levels and profiles of this H. pylori co-culture were compared to those in the controls. The microarrays for all time points in all four G27 TCs were hybridised against the same reference sample, allowing the data from these arrays to be directly compared. The cluster in Fig. 5.8 shows that the gene expression profiles are relatively time dependent. This figure also indicates that although the overall expression profiles of these TCs appear similar, there are also clear differences. To detect these differences SAM analyses were performed.

The three log phase time points (T20, T23 and T30 h) from each of the four time courses were used in the data analysis for the comparison of gene expression levels. Since the growth curve observed for the culture grown in broth media differed from that in the other three time courses, the gene expression levels in the co-culture were first compared directly to this broth culture TC using SAM analysis (SAM1 in Fig. 5.1). This analysis was conducted to assess the dependence of the gene expression profiles on the media type used for these two TCs. In this analysis, 188 genes were significantly induced in the co-culture conditions versus broth culture while 37 genes were significantly induced in broth culture. Comparison of these 225 genes to those identified by the SAM2 analysis, used to identify the genes specifically induced or repressed in co-culture (discussed below), showed that 99 genes (indicated in blue text in Fig. 5.9) were identified in both SAM analyses (Fig. 5.1). This suggests that just under half (44%) of the differences in gene expression between co-culture and broth culture may be due to the presence of the MDCK cells. The other 126 genes (56%) are probably differentially expressed due to the growth medium used and the different growth rates for the two TCs and so will not be discussed (see Supplementary Material Table S5.1 for the full set of genes).
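The proportions quoted above follow from the gene counts: 99 of the 225 genes (188 + 37) is 44%, leaving 126 genes (56%). A comparison of this kind between two gene lists can be sketched with set operations; the function name and the gene labels below are hypothetical and stand in for the SAM1 and SAM2 result lists.

```python
def overlap_summary(first_list, second_list):
    """Compare two gene lists.

    Returns the genes found in both lists, the genes found only in the
    first list, and the fraction of the first list that is shared.
    """
    first, second = set(first_list), set(second_list)
    shared = first & second
    only_first = first - second
    return shared, only_first, len(shared) / len(first)


if __name__ == "__main__":
    sam1_genes = ["gene1", "gene2", "gene3", "gene4"]   # hypothetical labels
    sam2_genes = ["gene3", "gene4", "gene5"]
    shared, only_first, fraction = overlap_summary(sam1_genes, sam2_genes)
    print(sorted(shared), sorted(only_first), fraction)
```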

Figure 5.7: Growth curves of four G27 cultures grown in parallel without shaking in a 5% CO2 incubator: co-culture with MDCK cells (Co-culture); co-culture media alone (Media alone); BBF media alone (Broth Media); and co-culture media in the presence of killed MDCK cells (Fixed MDCKs). Curves are plotted as log CFU/ml counts over time (0-60 h).

Figure 5.8: A hierarchical cluster of the self-organising map showing the temporal dependence of gene expression changes in the four parallel G27 time courses for all genes that passed the filtering criteria, with duplicates averaged. The progression of time in each time course is shown, from 5-50 h in the co-culture TC (green triangle), the TC of growth in DMEM/FCS/BB alone (pink triangle), the TC of growth in BBF media alone (yellow triangle), and from 5-30 h in the TC of growth in DMEM/FCS/BB with fixed MDCK cells (light blue triangle). Profiles indicate the extent of gene expression relative to the common reference. The log2 (R/G) values are represented according to the colour scale shown at the bottom. Red shades represent an increase and green shades represent a decrease in the level of hybridising cDNA in each sample. Black indicates no detectable changes and grey represents missing data. The time points indicated in blue are those used to compare gene expression levels in these time courses using SAM.
Columns: Co-culture (5, 11, 20, 23, 30, 35, 50 h); Fixed MDCK (5, 11, 20, 23, 30 h); Media alone (5, 11, 20, 23, 30, 35, 50 h); Broth media (5, 11, 20, 23, 30, 35, 50 h). Scale: >2 fold repression, 0, >2 fold induction.

In order to eliminate the complication of different growth media and growth rates, the second SAM analysis compared the gene expression levels in co-culture directly with those in the culture grown in the same tissue culture media alone and in the culture grown in the presence of fixed MDCK cells (SAM2 in Fig. 5.1). In this analysis only four genes were found to be reduced in level in the co-culture compared to the two control time courses, while 122 genes were induced specifically in co-culture (Fig. 5.9). The putative function of a selection of these is shown in Table 5.2 and the pathways in which some may function are indicated in Fig. 5.10, 5.11 & 5.12. The range of the fold change in expression level of the significantly different genes between co-culture and the two controls was small (1.3 to 2.4 fold) with an average of 1.6 fold. Only those genes which had at least a 1.5 fold change in level between the two groups are reported in Table 5.2 and Fig. 5.9.

The set of genes induced specifically in the presence of live MDCK cells included genes involved in a variety of functions. A large number of these were genes involved in translation (26), energy metabolism (15), amino acid transport and biosynthesis (5), or were genes of unknown function (29) (Table 5.2 and Supplementary Material Table S5.2 for the full set of genes). Induced genes of particular interest were those involved in energy metabolism, such as genes encoding some of the enzymes of the citric acid cycle, porB, acnB, and frdA (Fig. 5.10), and the respiratory chain (Fig. 5.11), such as: atpA, atpC, and atpG, which function in ATP-proton motive force interconversion; nuoC, nuoG, and nuoN, parts of the proton-translocating NADH-quinone oxidoreductase complex (NDH-1); petB, part of the ubiquinol:cytochrome c oxidoreductase (bc1 complex); and fixP, part of the terminal oxidase (cbb3-type oxidase). The frdA, bisC and porB genes may also be involved in anaerobic metabolism (indicated with a circle in Fig. 5.10 & 5.11). Also included were genes involved in transcriptional regulation (the response regulators HP1021 and HP0166, and gppA), in iron uptake {fecA, frpB, exbB, and tonB (Fig. 5.12)}, a number of genes encoding cell envelope proteins (omp5/29, omp32, babA, lpp20, lpxD, and flhF), and one gene possibly involved in immune evasion, gcpE.

Only four genes were found to be significantly repressed in co-culture compared to the controls (Fig. 5.9A). Two of these genes were H. pylori specific and of unknown function. One was a putative outer membrane protein and the other was the tRNA synthetase gene, valS.

Figure 5.9: A hierarchical cluster showing the gene expression levels at the time points 20, 23 and 30 h in the three parallel G27 time courses, co-culture (LIVE, green box), growth in DMEM/FCS/BB alone (DMEM, pink box) and growth in DMEM/FCS/BB on fixed MDCK cells (FIXED, light blue box), of the genes found in the SAM analysis to be significantly repressed (A) or induced (B-G) in co-culture compared to the controls grown in DMEM and with fixed MDCKs. Gene expression levels are displayed relative to the common reference. The log2 (R/G) values are represented according to the colour scale shown at the bottom; rows correspond to individual genes whose names are indicated on the right of each cluster (function recorded in Table 5.2) and columns correspond to successive time points. Red shades represent an increase and green shades represent a decrease in the level of hybridising cDNA in each sample. Black indicates no detectable changes and grey represents missing data. Gene names in blue indicate those which were also found to be significantly different in co-culture compared to the parallel broth culture during the same time points. Con. is conserved hypothetical protein; reg. is regulator; put. is putative; sec. is secreted protein.
Columns in each panel: LIVE, FIXED, DMEM, each at 20, 23 and 30 h. Scale: >2 fold repression, 0, >2 fold induction.
Genes shown (panels A-G): HP0710, ompR-like, valS, lysA, HP0318 con., hemL, HP0460, glyA, HP0880, HP0642 put., bisC, nuoC, HP1440, secD, HP0838, frpB, HP0307, lpp20, nuoN, dnaB, HP1457, mpr, rpl36, rpoB, HP1455, map, exbB, rpl14, flhF, ureI, nifU-like, putP, fabG, rpl16, HP1454, babA, omp32, gppA, rps12, frpB, atpG, omp29, eno, nusG, pyrC, HP1173, HP0035 con., HP0721, cpdB, tlpA, reg. HP1021, HP1479, rps7, HP0232 sec., HP0233 con., fusA, HP1570 con., atpC, gmd, HP0773, HP1466 con., rpoA, HP0655 OMP, infC, rps6, fliM, HP0269 con., porB, htrA, omp5, rps5, rps1, rps4, gcpE, HP1507 con., lpxD, rpl5, rpl7/l12, rps3, putA, secY, dapD, folD, fixP, HP0137, hdhA, fecA, HP0138 con., JHP0585 put., HP1430 con., secA, icfA, petB, rps13, flgG, nrdA, frr, fecA, clpX, tonB, tlpB, nuoC, fabB, rpl1, atpA, thrB, efp, rps2, frpB, prfA, hemB, HP0118, frdA, nifS, acnB, tfs, ureA, rpl11, accC, ureB, purA, fabH, HP0080, plsX, mreB, ferredoxin.

Table 5.2: A selection of the genes identified by the SAM2 analysis as being differentially expressed in live co-culture compared to the two same media control TCs.

Genes repressed in co-culture compared to controls
Gene name | TIGR no. (a) | Full name | Putative function (b) | Score (c) | Fold change (d)
OMP | HP0710 | putative outer membrane protein | Cell envelope | 3.3 | 1.7
valS | HP1153 | valyl-tRNA synthetase | Translation | 4.1 | 1.5
unknown | HP0460 | | | 3.9 | 1.7
unknown | HP0880 | | | 3.5 | 1.9

Genes induced in co-culture compared to controls
Gene name | TIGR no. (a) | Full name | Putative function (b) | Score (c) | Fold change (d)
glyA | HP0183 | serine hydroxymethyltransferase | Amino acid biosynthesis | -3.0 | 1.8
dapD | HP0626 | tetrahydrodipicolinate N-succinyltransferase | Amino acid biosynthesis | -2.6 | 1.6
thrB | HP1050 | homoserine kinase | Amino acid biosynthesis | -2.6 | 1.5
nifS | HP0220 | synthesis of [Fe-S] cluster, putative aminotransferase | Amino acid biosynthesis OR cofactor synthesis | -2.5 | 1.6
lysA | HP0290 | diaminopimelate decarboxylase (dap decarboxylase) | Amino acid biosynthesis | -2.1 | 1.5
folD | HP0577 | methylene-tetrahydrofolate dehydrogenase | Biosynthesis of cofactors | -2.6 | 1.5
bisC | HP0407 | biotin sulfoxide reductase OR N- or S-oxide oxidoreductase | Biosynthesis of cofactors OR anaerobic respiration | -2.6 | 1.6
hemL | HP0306 | glutamate-1-semialdehyde 2,1-aminomutase | Biosynthesis of cofactors | -2.4 | 1.7
hemB | HP0163 | delta-aminolevulinic acid dehydratase | Biosynthesis of cofactors | -2.3 | 1.5
icfA | HP0004 | carbonic anhydrase | Central intermediary metabolism | -2.5 | 1.5
hdhA | HP1014 | 7-alpha-hydroxysteroid dehydrogenase | Central intermediary metabolism | -2.4 | 1.5
ureI | HP0071 | urease accessory protein, urea transporter | Central intermediary metabolism | -2.1 | 1.8
ureB | HP0072 | urease beta subunit (urea amidohydrolase) | Central intermediary metabolism | -2.1 | 1.9
ureA | HP0073 | urease alpha subunit (urea amidohydrolase) | Central intermediary metabolism | -2.1 | 1.9
lpxD | HP0196 | UDP-3-0-(3-hydroxymyristoyl) glucosamine N-acyltransferase (lipid A biosynthesis) | Cell envelope | -3.1 | 1.7
babA | HP1243 | outer membrane protein | Cell envelope | -2.9 | 1.5
flhF | HP1035 | flagellar biosynthesis protein | Cell envelope | -2.8 | 1.5
flgG | HP1585 | flagellar basal-body rod protein | Cell envelope | -2.8 | 1.7
lpp20 | HP1456 | membrane-associated lipoprotein | Cell envelope | -2.7 | 1.7
omp5 | HP0227 | outer membrane protein | Cell envelope | -2.6 | 1.7
mreB | HP1373 | rod shape-determining protein | Cell envelope | -2.6 | 1.7
omp32 | HP1501 | outer membrane protein | Cell envelope | -2.5 | 1.8
gmd | HP0044 | GDP-D-mannose dehydratase | Cell envelope | -2.4 | 1.5
omp29 | HP1342 | outer membrane protein | Cell envelope | -2.3 | 1.5
fliM | HP1031 | flagellar motor switch protein | Cell envelope | -2.1 | 1.6
tlpB | HP0103 | methyl-accepting chemotaxis protein | Cellular processes | -2.7 | 1.9
tlpA | HP0099 | methyl-accepting chemotaxis protein | Cellular processes | -2.2 | 1.5
dnaB | HP1362 | replicative DNA helicase | DNA metabolism | -2.8 | 1.6
nrdA | HP0680 | ribonucleoside-diphosphate reductase 1 alpha subunit | DNA metabolism | -2.5 | 1.7
nuoC | HP1262 | NADH-ubiquinone oxidoreductase, NQO5 | Energy metabolism | -3.1 | 1.5
nuoN | HP1273 | NADH-ubiquinone oxidoreductase, NQO14 | Energy metabolism | -3.0 | 1.5
fixP | HP0147 | cytochrome c oxidase, diheme subunit, membrane-bound | Energy metabolism | -2.8 | 1.5
petB | HP1539 | ubiquinol cytochrome c oxidoreductase, cytochrome b subunit | Energy metabolism | -2.7 | 1.7
atpA | HP1134 | ATP synthase F1, subunit alpha | Energy metabolism | -2.5 | 1.8
porB | HP1111 | pyruvate ferredoxin oxidoreductase, beta subunit | Energy metabolism | -2.4 | 1.8
ferredoxin | HP0277 | ferredoxin | Energy metabolism | -2.4 | 1.5
atpC | HP1131 | ATP synthase F1, subunit epsilon | Energy metabolism | -2.4 | 1.6
atpG | HP1133 | ATP synthase F1, subunit gamma | Energy metabolism | -2.3 | 1.6
nuoG | HP1266 | NADH-ubiquinone oxidoreductase, NQO3 subunit | Energy metabolism | -2.2 | 1.5
acnB | HP0779 | aconitase B | Energy metabolism | -2.1 | 1.6
frdA | HP0192 | fumarate reductase, flavoprotein subunit | Energy metabolism | -2.1 | 1.6
putA | HP0056 | delta-1-pyrroline-5-carboxylate dehydrogenase | Energy metabolism | -2.1 | 1.5
gcpE | HP0625 | protein E, 1-hydroxy-2-methyl 1-2-(E)-butenyl 4-diphosphate synthase | Isoprene biosynthesis | -2.7 | 1.5
secA | HP0786 | preprotein translocase subunit | Protein and peptide secretion | -2.3 | 1.5
secD | HP1550 | protein-export membrane protein | Protein and peptide secretion | -2.2 | 1.5
gppA | HP0278 | guanosine pentaphosphate phosphohydrolase | Regulation | -2.7 | 1.5
 | HP1021 | response regulator | Regulation | -2.5 | 1.6
 | HP0166 | response regulator | Regulation | -2.1 | 1.6
infC | HP0124 | translation initiation factor IF-3 | Translation | -3.3 | 1.8
prfA | HP0077 | peptide chain release factor RF-1 | Translation | -3.2 | 1.7
fusA | HP1195 | translation elongation factor EF-G | Translation | -3.1 | 2.4
clpX | HP1374 | ATP-dependent protease ATPase subunit | Translation | -2.4 | 1.6
tfs | HP1555 | translation elongation factor EF-Ts | Translation | -2.1 | 1.5
efp | HP0177 | translation elongation factor EF-P | Translation | -2.1 | 1.6
exbB | HP1339 | biopolymer transport protein | Transport and binding | -3.1 | 1.6
fecA_3 | HP1400 | iron(III) dicitrate transport protein | Transport and binding | -3.0 | 1.7
tonB | HP1341 | siderophore-mediated iron transport protein | Transport and binding | -2.8 | 1.7
fecA_1 | HP0686 | iron(III) dicitrate transport protein | Transport and binding | -2.7 | 1.5
frpB_1 | HP0876 | iron-regulated outer membrane protein | Transport and binding | -2.6 | 1.7
frpB_4 | HP1512 | iron-regulated outer membrane protein | Transport and binding | -2.5 | 1.6
frpB_2/3 | HP0916 | iron-regulated outer membrane protein | Transport and binding | -2.1 | 1.5

(a) Refers to the ORF number assigned by Tomb et al. (361); (b) Putative function assigned by Tomb et al. and by comparison with the literature; (c) The score refers to the level of significance assigned by the SAM program; (d) The fold change refers to the level of change in expression level between the two groups compared in the SAM analysis (assigned by the SAM program).

5.3.6. SS1 Growth and Gene Expression in Co-culture

A preliminary analysis of SS1 growth and gene expression in co-culture was performed for comparison with the G27 co-culture results. The growth curves for the two time courses of SS1 growth on MDCK cells are shown in Fig. 5.13. The curves do not appear to follow a normal pattern of in vitro bacterial growth. In both SS1-A and SS1-B there is little transition between the lag, exponential and stationary phases of growth. The shapes of the curves for SS1-A and SS1-B also differ substantially from each other and from the G27 co-culture TC shown in Fig. 5.5. The growth curve of G27 in co-culture was both more reproducible and more typical of bacterial growth curves than those of the SS1 co-cultures (SS1-A & B). G27 grew more quickly and attained a higher CFU/ml value (~10^10 CFU/ml) than the SS1 co-cultures (~10^8 CFU/ml).

Despite the apparent lack of discernible growth phases in these two time courses, the gene expression patterns appear to be similar to each other (Fig. 5.14). This figure indicates that, as was observed for the G27 co-culture TC, the gene expression profiles in SS1-A and SS1-B change in a fairly time dependent manner. However, the gene expression levels in SS1-A and SS1-B do not change greatly over the TC. The number of genes whose expression varies by more than 2.8 fold over the entire TC is < 1% in both SS1-A and SS1-B as compared with 14% in regular broth culture TCs and 5% in the G27 co-culture. These preliminary experiments of SS1 growth in co-culture indicate that the growth conditions of this strain may need to be optimised before its gene expression profiles in response to co-culture are further analysed. For example, the amount of inoculum, the volume of the co-culture and the percentage of BB supplementation may require adjustment to improve the growth parameters of this strain.

Figure 5.10: Schematic representing the putative enzymatic pathways for pyruvate metabolism and the citric acid cycle in H. pylori. Enzymes are denoted by numbers: 1) Pyruvate:flavodoxin oxidoreductase (porB); 2) flavodoxin:NADP oxidoreductase (HP0642-putative); 3) Acetate kinase; 4) phosphotransacetylase; 5) citrate synthase; 6) aconitase (acnB); 7) isocitrate dehydrogenase; 8) 2-oxoglutarate:ferredoxin oxidoreductase; 9) ferredoxin:NADP oxidoreductase; 10) malate dehydrogenase; 11) fumarase; 12) fumarate reductase (frdA). The circle is incomplete and the exact mechanism for oxaloacetate generation is unknown. Enzymes 2 & 9 are putative at present. The genes for the proteins indicated in bold were induced in the co-culture TC compared to the controls in the present study. The genes circled may be involved in anaerobic metabolism. {Figure modified from Kelly, 1998 (149)}
Gene labels on the figure: porB, HP0642-putative, acnB, ferredoxin, frdA.

Figure 5.11: Schematic diagrams representing putative enzymatic pathways involved in the A) respiratory chain and B) other downstream metabolism pathways. A) The hypothetical arrangement of the major components of the respiratory chains of H. pylori is shown at the top. Reduced substrates (DH) are oxidised (to D) via membrane-bound or membrane-associated dehydrogenases. Integral membrane oxidoreductases include an NDH-1 complex (Nuo) and hydrogenase (Hya also termed Hyd). Peripherally associated oxidoreductases include malate:quinone oxidoreductase (Mqo). Reducing equivalents from all these substrates reduce the sole quinone, menaquinone-6, in the lipid bilayer of the inner membrane. Menaquinol reduces the trimeric cytochrome bc1 complex (Pet), which in turn reduces periplasmic cytochrome c553 (c). Cytochrome c is reoxidised by the sole terminal oxidase of the Cco (or Fix) types, cytochrome cbb3. Cytochrome c may also be reoxidised by hydrogen peroxide in the periplasm through the activity of cytochrome c peroxidase (CCP). Fumarate reductase (Frd) catalyses electron transfer from menaquinol to fumarate. The reactions catalysed by Cco, CCP, FrdA, Mqo, and Nuo show substrate/product conversions. The other arrows indicate directions of electron transfer. B) Some of the downstream effects of the Proton Motive Force (PMF) are shown. In both A and B the genes for the proteins induced in the co-culture TC compared to the control TC in this study are boxed and in bold. The gene names circled may be involved in anaerobic metabolism. {Diagram modified from (150)}.
Panel A labels (respiratory chain): nuoC, nuoG, nuoN; petB; fixP; frdA. Panel B labels: PMF; other dehydrogenases (putA); ATP synthase (atpA, atpC, atpG); solute transport (putP, secA, secD, secY); iron uptake; iron in cytoplasm; oxidative stress (sodB, katA); ATP production; motility (flhF, flgG, fliM); chemotaxis (tlpA, tlpB); ureA, ureB, ureI.

Figure 5.12: Schematic representation of the putative enzymatic pathways for iron uptake and storage in H. pylori. -Fe, iron-restricted conditions; +Fe, iron-replete conditions; OM, outer membrane; PP, periplasm; CM, cytoplasmic membrane. The genes for the proteins indicated with a tick were induced in the co-culture TC compared to the controls in the present study. {Figure modified from van Vliet et al. 2002 (372)}

5.3.7. Genes Induced in Both G27 and SS1 Co-culture

A final analysis was performed to elucidate which of the genes found to be significantly induced or repressed in the G27 co-culture TC, as compared with the two same media parallel controls (Table 5.2), were also induced or repressed in the SS1 co-culture time courses SS1-A and SS1-B (SAM3 in Fig. 5.1). A one-class SAM analysis of the data from SS1-A and SS1-B was performed to find which genes were significantly induced or repressed over the time points T12, T18, T24, and T30 in both SS1-A and SS1-B. There were 121 genes found to be induced while 131 genes were repressed. Of these genes, 21 overlapped with the group of genes significantly induced by at least 1.5 fold in G27 co-culture compared to the controls (Table 5.3). These genes included a number of iron uptake genes, the response regulator HP1021, omp32, the chemotaxis gene tlpA, and the gcpE gene.

Figure 5.13: Growth curves for two time courses, SS1-A and SS1-B, of SS1 co-culture with MDCK cells, grown without shaking in a 5% CO2 incubator. SS1-A was grown in 9 mm dishes with 10 ml media and SS1-B was grown in 20 mm dishes with 20 ml media.

Figure 5.14: A hierarchical cluster of the self-organizing map showing the temporal dependence of gene expression changes in the SS1 MDCK co-cultures, SS1-A and SS1-B, for all genes that passed the filtering criteria, with duplicates averaged. Profiles indicate the extent of gene expression relative to the 6 h time point in each time course. The log2 (R/G) values are represented according to the colour scale shown at the bottom; rows correspond to individual genes and columns correspond to successive time points. Red shades represent an increase and green shades represent a decrease in the level of hybridising cDNA in each sample. Black indicates no difference in expression between sample and reference cDNAs and grey represents missing data. The progression of time in each time course is shown from 6-47 h in SS1-A (blue triangle) and from 6-50 h in SS1-B (pink triangle).

[Figure 5.13. Plot of CFU/ml x 10^7 (log scale, 1 to 100) against time (0 to 60 h) for SS1-A and SS1-B.]

[Figure 5.14. Heat map with columns SS1-A and SS1-B. Colour scale: >2 fold repression, 0, >2 fold induction.]

Table 5.3: Genes specifically induced in SS1 and G27 co-culture.

Gene name | TIGR no.a | Full name | Putative functionb
dapD | HP0626 | tetrahydrodipicolinate N-succinyltransferase | Amino acid biosynthesis
bisC | HP0407 | biotin sulfoxide reductase | Biosynthesis of cofactors OR anaerobic respiration
ureB | HP0072 | urease beta subunit | Central intermediary metabolism
omp32 | HP1501 | outer membrane protein | Cell envelope
tlpA | HP0099 | methyl-accepting chemotaxis protein | Cellular processes
gcpE | HP0625 | protein E, 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase | Isoprene biosynthesis
putative | JHP0585 | putative 3-hydroxyacid dehydrogenase | Energy metabolism
putative | HP0642 | NAD(P)H-flavin oxidoreductase | Energy metabolism
response regulator | HP1021 | response regulator | Regulatory functions
glyS | HP0972 | glycyl-tRNA synthetase, beta subunit | Translation
metG | HP0417 | methionyl-tRNA synthetase | Translation
fecA_1 | HP0686 | iron(III) dicitrate transport protein | Transport and binding proteins
fecA_3 | HP1400 | iron(III) dicitrate transport protein | Transport and binding proteins
frpB_1 | HP0876 | iron-regulated outer membrane protein | Transport and binding proteins
frpB_2/3 | HP0916 | iron-regulated outer membrane protein, putative I | Transport and binding proteins
frpB_4 | HP1512 | iron-regulated outer membrane protein | Transport and binding proteins
exbB | HP1339 | biopolymer transport protein | Transport and binding proteins
tonB | HP1341 | siderophore-mediated iron transport protein | Transport and binding proteins
unknown | HP0118 | |
unknown | HP1454 | |
unknown | HP1455 | |

a Refers to the ORF number assigned by Tomb et al. (361); b Putative function assigned by Tomb et al. and by comparison with the literature.

5.4. Discussion

Described in this chapter is the time course analysis of H. pylori growth in co-culture with MDCK cells. The effect of the presence of these mammalian cells on H. pylori transcriptional regulation was investigated in an effort to gain a preliminary understanding of the possible behaviour of H. pylori in vivo. Comparisons of the expression profiles of bacteria grown in co-culture with those grown in control situations revealed a specific set of genes which were induced under co-culture conditions. Some of these genes were induced in both the H. pylori strains, G27 and SS1, during co-culture.

5.4.1. Advantages of H. pylori Co-culture with MDCK Cells

Previous studies in the field of Helicobacter research have required H. pylori cultures to be grown on solid or liquid media for 2-3 days prior to use in a planned experiment (10). A frequent problem with this procedure is that H. pylori cultures often grow at different rates each time they are cultured, which makes standardisation particularly difficult. Adding to this difficulty, each H. pylori strain may grow at a different rate (10). These issues are particularly problematic when experiments need to be repeated or when multiple strains are being compared. The co-culture technique described in this chapter represents a major advance. Using this technique, H. pylori cultures can be grown and maintained continually on the same dish of MDCK cells for a period of 2-4 months. These co-cultures, washed and replenished with new media every 24 h, can be grown in a reliable and reproducible way (Fig. 5.3). Thus multiple co-cultures can be “synchronised” in their growth and are immediately available for use in further experiments such as the time course experiments outlined in this Chapter. The ability to maintain these co-cultures for up to 4 months is most probably due to the fact that H. pylori infection encourages increased turn-over of the MDCK cells, thus replenishing the monolayer. After this time, however, there is significant ageing of the MDCK monolayer, which eventually causes it to degenerate.

This MDCK co-culture technique has not been reported previously. To our knowledge, only one other technique for in vitro H. pylori co-culture has been described (60). Cottet et al. described an intricate technique for growing H. pylori in co-culture with Caco-2 cells on filter supports (60). In this technique, the basolateral surface of the cells was exposed to regular tissue culture medium (DMEM/FCS) in aerobic conditions while the apical surface of the monolayer was exposed to microaerophilic conditions and bacterial growth medium (Brain Heart Infusion with FCS) inoculated with H. pylori. These culture conditions allowed co-culture to proceed for about 48 h without disruption of the monolayer or significant decreases in bacterial CFU counts. In Cottet’s study the co-cultures were used to study host cell signalling in response to H. pylori adhesion, as well as urease expression in adherent bacteria compared with bacteria suspended in the growth medium (60). While this work represents a significant improvement in the study of chronic H. pylori infection in vitro, the considerable complexity of the culture system makes the technique impractical for wide use.

In contrast, the co-culture system described in the present Chapter is simple, allows co-culture conditions to be maintained for long periods of time (at least 4 months) and does not require special microaerophilic conditions to maintain healthy growth of the H. pylori culture, as it is grown in aerobic conditions with 5% CO2. In addition, pure samples of bacterial cells can be easily harvested from the MDCK monolayer in the present system due to the small level of adherent bacteria.

Comparison of the gene expression profiles of H. pylori grown in co-culture with regular broth grown H. pylori (Chapter 4) indicated that although in both conditions there were some similarities, such as the decreasing expression level of genes involved in translation over time, there were many differences. These differences were likely to be caused by the different growth conditions and made the comparison of these two conditions difficult. Thus, comparison of the co-culture gene expression profile with parallel controls grown in the same environmental conditions was performed to determine which genes were expressed specifically in response to the close proximity to mammalian cells.

This represents the first study investigating the global transcriptional response of H. pylori to mammalian cells in vitro. This lack of data in the literature is most likely due to the same kind of technical difficulties as are faced in detecting in vivo gene expression of bacteria: extracting a sufficient amount of bacterial RNA and separating it from the host cell RNA is formidable. In the present co-culture method the bacteria do not readily attach to the MDCK cells, which allows easy RNA extraction from the bacterial cells and results in an almost pure bacterial RNA sample for downstream analysis.

5.4.2. G27-MDCK Time Courses

The co-culture method described was developed using the G27 strain of H. pylori, as much was known about its growth kinetics in co-culture as well as other parameters such as its ability to inject CagA protein into cultured Caco-2 and AGS cells. In the present study this strain was used to investigate the transcriptional response of H. pylori to growth on MDCK cells. The concurrent growth of G27 in co-culture together with three control cultures allowed individual gene expression profiles to be assigned to the presence of live mammalian cells (Fig. 5.1).

5.4.3. Response to Growth in Different Mediums

Due to the major differences observed between co-culture and regular broth culture, an initial investigation into the differences in gene expression profiles of G27 in co-culture and G27 grown in broth medium under identical conditions was made. The shape of the growth curve for the G27 culture grown in BBF media was very different to that of the co-culture TC (Fig. 5.7). Also, the peak in CFU counts obtained for the BBF grown culture was much lower than for the co-culture TC, indicating significant growth restriction of G27 in BBF media grown in these aerobic conditions. H. pylori in regular broth culture (BBF media with shaking in microaerophilic conditions and 10% CO2) typically multiplies by around 2 log-units from the beginning of lag phase until entry into stationary phase. In the G27 BBF TC the culture multiplied by only ½ log-unit (Fig. 5.7). This culture was grown statically in 5% CO2 in a tissue culture incubator. The lower concentration of CO2, which has been found to be essential for H. pylori growth (150), and the higher O2 content (atmospheric levels) may have increased oxidative stress (150, 319). In conjunction with this different atmosphere, the static culture conditions may have allowed gradients of metabolites to build up, which may have stressed the culture and restricted its growth.
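For readers less familiar with the log-unit convention used in the paragraph above, one log-unit is a ten-fold change in CFU, so 2 log-units correspond to a 100-fold increase and ½ log-unit to roughly a 3-fold increase. A minimal sketch of the conversion follows; the CFU counts are hypothetical and the function names are illustrative, not part of the analysis in this study.

```python
import math


def log_units(start_cfu, peak_cfu):
    """Growth between two CFU counts, expressed in log10 units."""
    return math.log10(peak_cfu / start_cfu)


def fold_increase(units):
    """Fold increase in CFU corresponding to a number of log10 units."""
    return 10 ** units


# Hypothetical counts: a culture growing from 1e6 to 1e8 CFU/ml.
print(log_units(1e6, 1e8))

# The two cases compared in the text.
print(fold_increase(2.0))   # regular broth culture, around 2 log-units
print(fold_increase(0.5))   # G27 BBF time course, around 1/2 log-unit
```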

The difference in growth of the G27 control grown in BBF media compared to in DMEM/FCS/BB media may be explained by the difference in media composition, as all other growth parameters were identical for these cultures. It is possible that the co-culture media may better protect the bacteria from toxic metabolites and/or provide a superior blend of nutrients for growth in this high oxygen tension, low carbon dioxide environment.

5.4.4. Genes Induced Specifically in the Presence of Live Mammalian Cells

The similarity in the growth curves for the co-culture TC and the control cultures grown in DMEM/FCS/BB with and without fixed MDCK cells resulted in all three cultures being in log phase simultaneously. Given this, the microarrays from log phase samples were used for analysis of the specific genes expressed in the presence of live MDCK cells. Comparison of the gene expression ratios during log phase in the live co-culture time course and controls revealed a set of genes which appear to be induced or repressed specifically in the presence of live MDCK cells. Although this set included 126 genes, the differences in gene expression level between the groups were not large (average fold change of 1.6, Table 5.2). This indicates that the MDCK cells may have only a small influence on H. pylori transcription in co-culture and that relatively few large changes in expression levels are needed for H. pylori to survive in this situation. Interestingly, many of the genes found to have significantly different levels in co-culture are genes of unknown function. This may reflect the fact that few experiments investigating the direct transcriptional response of H. pylori to mammalian cells have been performed and thus those genes which may be important specifically in the context of infection have not yet been studied in detail. These genes are also likely to be important in the in vivo situation (6, 361).

The enhanced induction of genes involved in amino acid biosynthesis and transport, co-factor synthesis and energy metabolism in the G27 co-culture time course may reflect both differing and increased biochemical and metabolic needs of the H. pylori culture when grown in the presence of mammalian cells (Table 5.2). Interestingly, genes involved in many aspects of the aerobic respiratory chain, including the quinone oxidoreductase (nuoC, nuoN and nuoG), the cytochrome bc1 complex (petB) and the cb-type terminal oxidase (fixP) (indicated in bold in Fig. 5.11), were induced in live co-culture. The respiratory chain is responsible for maintaining a proton-motive force (PMF) across the inner membrane and the ATP synthase converts this PMF into ATP for use in energy demanding processes such as solute transport and motility (149, 150). Not all of the genes induced in co-culture are involved in aerobic metabolism. A number of genes involved in pyruvate metabolism and the citric acid cycle were also induced in co-culture (Fig. 5.10). Two of these may be responsible for anaerobic respiration: the fumarate reductase gene frdA and the putative N- or S-oxide oxidoreductase {annotated by Tomb et al. (361) as a biotin sulfoxide reductase, bisC}. The oxygen-sensitive gene porB was also induced in co-culture. This suggests that both aerobic and anaerobic respiration may have been proceeding during live co-culture.

This pattern of gene expression may indicate that the bacteria in co-culture were in a more metabolically active state, possibly due to signals from the mammalian cells. These signals may also induce chemotaxis and motility as well as increased solute uptake (such as iron, Fig. 5.12) due to competition with the mammalian cells for nutrients. For example, it is well known that the host sequesters iron as a defence mechanism against pathogens (180, 386). Both of these aspects would be expected to be important in vivo where the bacterium needs to avoid being washed out of the stomach and to acquire nutrients from its environment.

Induction of a number of genes involved in cell surface modification, such as the lipoprotein gene lpp20 and the lipid A biosynthesis gene lpxD, as well as a number of outer membrane proteins which could contribute to adhesion or to changes in surface antigen profile, was also observed. Such changes may contribute to interaction with host cells and possibly the evasion of the host immune response in vivo. Interestingly, the Lpp20 protein is highly antigenic (41) and has been shown to confer protection against H. pylori in mice (148). Thus Lpp20 may be very important in the interaction of H. pylori with the host.

Another gene which is induced in co-culture and which may be involved in immune evasion is the gcpE gene. In E. coli this gene has been shown to be involved in the synthesis of isoprene, which is a molecule capable of stimulating γδ T cell proliferation (9). It has been shown that a subset of γδ T cells reduces the ability of the innate immune system to clear infection of organisms such as Listeria (235). Therefore it is possible that contact with mammalian epithelial cells induces H. pylori to produce isoprene and that this may contribute to immune evasion, allowing a chronic infection to develop.

Finally, in light of the fact that H. pylori has so few regulators and the functions of these are largely unknown, it is intriguing that three putative regulators are induced in co-culture conditions. The first is the gppA gene, which is possibly involved in the stringent response in H. pylori and controls the level of pppGpp. In E. coli, pppGpp and ppGpp levels mediate the stringent response to nutrient starvation in order to conserve energy (46). Thus it is possible that the gppA gene may be induced in response to changes in intracellular concentrations of metabolites due to, or in response to, the enhanced metabolic activity in the co-culture situation. It has been suggested by Scoarughi et al. that the stringent response is not functional in H. pylori because of the absence of the relA gene (309), but it is possible that this response occurs differently in this bacterium. The stringent response has also been linked with the induction of virulence properties in other bacteria such as Legionella pneumophila and Mycobacterium tuberculosis (46). Thus it is possible that nutrient limitation which occurs due to competition with host cells induces the expression of gppA, which in turn triggers virulence related mechanisms in H. pylori. Measurements of the levels of ppGpp and pppGpp in these cell culture systems may help elucidate the possible importance of the stringent response in H. pylori.

The two-component regulators HP0166 and HP1021 were also significantly induced in co-culture (Table 5.2). While a number of studies have suggested that both of these regulators are essential in H. pylori (26, 202), conflicting reports have suggested that HP1021 may not in fact be essential (202). Little is known of the environmental stimuli required for the activation or the targets of these two-component regulators (26). HP1021 is an orphan response regulator with no cognate histidine kinase found in either of the two sequenced genomes and the receiver domain of this protein differs from other response regulators in that it may not require phosphorylation to exert its function (75). Thus a completely different mechanism may be needed for the induction of this regulator. Further investigation of the role of these regulators in cell culture could provide insight into in vivo gene regulation.

5.4.5. Comparison of Genes Induced in G27 and SS1 in Co-culture

Preliminary experiments (SS1-A and SS1-B) investigating the response of the SS1 strain to growth in co-culture were performed for consistency with the work presented in the two previous chapters. However, the growth curves of these two cultures were not consistent, indicating that more extensive optimisation of the co-culture system was required for experiments with this strain. Despite this, a preliminary investigation into the similarity of the transcriptional response of SS1 and G27 to the MDCK cells was made. A parallel analysis of the growth of these two strains in co-culture would be required in the future to confirm these results.

There were only a small number of genes found to be induced in both G27 and SS1 in co-culture (Table 5.3). This may illustrate that these genes were specifically induced in the presence of the MDCK cells and thus are candidate virulence genes. Of particular interest were the iron uptake genes expressed in both co-culture conditions. This further supports the concept that the bacterium is in constant competition with the MDCK cells for available iron in the media and thus expresses its iron uptake genes at a maximal level.

The methyl-accepting chemotaxis protein gene tlpA and the urease structural gene ureB were both induced in co-culture with G27 and SS1. This is of particular interest given that the TlpA protein is thought to sense ligands necessary for chemotaxis towards the epithelial cells in vivo and the urease protein has been found to be essential for chemotaxis, especially in viscous environments (226). These data further illustrate the importance of this system in infection. Finally, the induced expression of the response regulator HP1021 in both H. pylori strains in co-culture is intriguing, as it is possible that close contact with epithelial cells is the required environmental stimulus for the activation of this regulator.

5.4.6. Future Directions

The difficulty in these cell culture studies in separating the effects of growth media from the environmental signals highlights the importance of comparing the gene expression patterns of H. pylori to appropriate controls in order to gain accurate information. It also strongly suggests that the investigation of gene expression in vivo must be viewed with caution for three reasons. First, there are likely to be mixed populations of bacterial cells in vivo, which may dampen gene expression detection. Second, because only a snap-shot of the gene expression can be collected at any one time, it will be unclear which stage of growth the cells are in and whether this state is static or dynamic. Third, the growth condition to which the in vivo gene expression levels should be compared is unknown. It will be difficult to draw any conclusions with respect to growth phase or metabolic state in vivo without the appropriate comparisons. Additionally, understanding which genes are constitutively expressed in vivo and which are transient will be challenging.

5.5. Conclusion

The transcriptional profiles presented in this chapter provide a first step in the understanding of the in vivo transcriptional response of H. pylori to the host environment. However, they also highlight inherent problems in comparisons of bacterial growth in two different conditions, particularly if these conditions change the growth parameters. These points need to be considered for future investigations into in vivo expression profiling of bacteria, particularly when drawing conclusions about differences in the metabolic or growth state of bacterial cultures in response to the host environment. Despite this, the putative regulators HP0166, HP1021 and gppA, and the possible immune evasion gene gcpE, identified in these experiments provide interesting targets for further in vivo experimentation to determine their involvement in infection and pathogenesis.

Chapter 6

THE NEW MOUSE COLONISING STRAIN

6.1. Background

Animal models of H. pylori induced disease have been used extensively in Helicobacter research. Due to the difficulties in studying many of the factors of gastric disease in human subjects, these models are necessary. A number of different animals have been used for this purpose. These include nonhuman primates (83), gnotobiotic piglets (87), beagle dogs (289), guinea pigs (324), Mongolian gerbils (241), suckling mice (123) and adult mice (174). These models have been particularly useful in elucidation of factors required for initial colonisation, distribution and persistence of infection (70, 173, 174), the contribution of various virulence factors of H. pylori (87, 88, 91, 241, 248, 299), as well as the pathogenesis of disease for the development of vaccines (59, 69, 279, 346).

The mouse model is one of the most extensively utilised systems for studying H. pylori induced disease because the animals are small, relatively cheap and specific reagents are available (188). To date, very few H. pylori strains have been shown to be able to colonise the mouse persistently (40, 172). In 1997, Lee et al. isolated the Sydney strain of H. pylori (SS1) by screening a number of clinical isolates for their ability to colonise the mouse. This strain was found to consistently colonise multiple strains of mice and to establish infection which persisted over many months, thus providing an ideal opportunity to study the effects of chronic infection and host specificity in this small rodent model (174). The SS1 strain has been shown to contain the virulence related genes, vacA and cagA (174). Despite high levels of colonisation however, the inflammation caused by this strain was only mild to moderate, depending on the strain of mouse infected (103, 174, 188, 297).

Studies have shown that disease progression of H. pylori infected individuals varies between populations and individuals, suggesting the possibility of host-dependent effects (7, 117, 120, 145, 162, 260, 266, 303). Comparative studies of both H. felis, a close relative of H. pylori that colonises mice to high levels, and SS1 in various mouse strains have contributed greatly to the understanding of host-dependent gastritis. In particular, the concept of responder and non-responder strains has been suggested by Sakagami et al. (296). In their study, C57BL/6 and C3H/He mouse strains were found to develop moderate to severe chronic active gastritis 6 months post infection with SS1 or H. felis and thus were termed responder mice. In contrast, in BALB/c and CBA mice only a mild form of gastritis was found to develop during the time course of infection and thus these were termed non-responder strains (296). Interestingly, after longer periods of colonisation (18-28 months) with H. felis the non-responder BALB/c strain was found to develop pathology resembling MALT lymphoma (97). The different inflammatory responses of C57BL/6 and BALB/c mice have been partially attributed to their T-helper phenotype. In C57BL/6 mice the pro-inflammatory Th1 phenotype dominates, while in the BALB/c mice the largely anti-inflammatory Th2 phenotype is present (102, 291).

In addition to these differences in pathology, the level and distribution of colonisation of H. pylori has been shown to vary in different mouse strains. In C57BL/6 mice there is a high level of antral dominant colonisation. In contrast, BALB/c mice have a lower overall level of colonisation which is localised mainly in the antrum-body transitional zone (174). Thus by using a combination of mouse strains, host dependent colonisation and inflammation studies can be performed.

Along with host factors, it has long been understood that H. pylori strain specific differences can influence disease progression in humans. In particular, the presence of the cag PAI has been associated with more severe disease (33, 264). The fact that most H. pylori strains do not readily colonise mice has hindered the study of the effects of such strain specific differences in the mouse model. Other mouse-adapted strains of H. pylori (191, 204, 293, 332, 335, 351, 370), as well as fresh clinical isolates (269, 359), have been utilised for studies in mice. However, the detailed genotypes and phenotypes of these H. pylori strains are not available and only the effects of short-term infection (< 3 months) were assessed in most cases. In many cases the success rate for colonisation was less than 100% of animals inoculated (351, 370) and the infection level decreased over the course of the experiment (370), suggesting that these strains were unlikely to colonise persistently. In contrast, SS1 was shown to colonise for more than 12 months {(174) and current study} at a consistent level. However, colonisation has been shown by various groups to be somewhat dependent on the strain of mouse used and the in vitro passage number of SS1 (67, 108, 188). In some mouse strains (such as BALB/c mice) there are dramatic differences in the inflammatory response induced after short (296) and long term infection (97). This necessitates the choice of H. pylori strains that can persistently colonise.

Despite the advantages of the Sydney strain mouse model for studying colonisation and distribution of H. pylori in the stomach, there are limitations. In particular, the low level of inflammation produced by this strain during short-term colonisation (< 6 months) makes comparisons with the human disease less than ideal (269). In addition, a number of groups have recently attempted to use SS1 for studies of the effects of deletions in parts of the cag PAI on colonisation and inflammation in the mouse model and have reported conflicting results (89, 191). A possible explanation for these conflicts is that SS1’s cag PAI may not be fully functional, as it does not induce IL-8 secretion from AGS cells in culture (67, 89, 370 and N. Salama, personal communication). Also, a recent publication by Salama et al. (298) revealed that SS1 did not have a complete cag PAI, the ORF7 gene being missing. Regardless, the impact of the cag PAI on inflammation in mice should be viewed with caution as these animals do not appear to have a homolog of the IL-8 cytokine (356). Thus one of the signalling pathways known to be induced by cag PAI positive strains may not be important for inflammation in this host (269). However, it is possible that these effects are mediated through a different signalling pathway such as the macrophage inflammatory protein-2 (MIP-2) (234).

In order to investigate in depth the contributions of host and strain specific effects on colonisation and inflammation the specific aims of these studies were as follows: First, to isolate new mouse colonising strains of H. pylori for use in comparative studies with SS1. It was hypothesised that strains with enhanced ability to colonise mice might cause an increased inflammatory response and thus represent better strains for use in vaccine studies. Second, the colonisation and inflammation induced by new mouse colonising strains in comparison with SS1 over long term infection in two mouse strains, C57BL/6 and BALB/c, was investigated in order to determine the specific contributions of both host and strain differences.

6.2. Experimental Procedures

6.2.1. Bacterial Cultures

To isolate new mouse colonising strains, a selection of 110 clinical H. pylori isolates collected by the UNSW Helicobacter laboratory from patients undergoing endoscopy at a Sydney gastroenterology unit were used. These strains had been stored in liquid N2 after limited passage on CSA plates (less than 10 times). Three additional strains were used as controls: the mouse adapted strain, SS1, and two clinical isolates which had previously been identified as being able to colonise mice in moderately high numbers for at least four weeks, 10319 and 10217 (174). Each culture was revived from liquid N2 as described in Chapter 2, and passaged once on CSA plates before inoculation of animals.

6.2.2. Animal Infections

All animals used in this Chapter were age matched (6-8 weeks old) female C57BL/6 or BALB/c mice obtained from ARC (Australia).

6.2.2.1. Isolation of new mouse colonising strain

Following culture, bacterial strains were harvested from plates in BHI media and approximately equal amounts of 3-7 individual strains were combined (termed groups). A total of 22 groups of strains plus the three individual control strains were used. Two C57BL/6 mice per group of strains (a total of 50 mice) were inoculated intragastrically, twice in a 3 day period. Individual mice in each cage were traced using ear notching.
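The pooling design above can be checked with simple arithmetic: 22 pooled groups plus 3 control strains give 25 inoculation groups, and two mice per group gives the stated total of 50. The sketch below illustrates this. It assumes, purely for illustration, that the 110 isolates are dealt out evenly (5 per group), whereas the actual groups contained between 3 and 7 strains; the isolate names and the helper `pool_isolates` are hypothetical.

```python
def pool_isolates(isolates, n_groups):
    """Deal isolates into n_groups pools in round-robin order."""
    groups = [[] for _ in range(n_groups)]
    for i, isolate in enumerate(isolates):
        groups[i % n_groups].append(isolate)
    return groups


# Placeholder names for the 110 clinical isolates (not the real strain IDs).
isolates = [f"isolate_{i:03d}" for i in range(1, 111)]
groups = pool_isolates(isolates, 22)

controls = ["SS1", "10319", "10217"]  # inoculated individually
mice_per_group = 2
total_mice = (len(groups) + len(controls)) * mice_per_group

print(len(groups), "pooled groups;", total_mice, "mice in total")
```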

6.2.2.1.1. First animal passage (Passage A)

Six weeks post-infection all of the infected mice were euthanased by CO2 asphyxiation and the stomachs harvested and homogenised. Of the two animals in each group, the homogenised stomach of one was passaged to a new group of three C57BL/6 mice by intragastric inoculation, while the homogenate from the second mouse was serially diluted and used for viable colony forming unit (CFU) estimation on GSSA plates. Of those plates on which colonies resembling H. pylori grew, 12 colonies per group were subcultured individually onto new CSA plates. Genomic DNA (gDNA) was extracted from a sample of all cultures which were successfully passaged and these were also harvested and stored at -80ºC in BHI plus 30% glycerol (BHIG) (Passage A isolates).
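Viable counts from serially diluted homogenate are conventionally back-calculated by dividing the colony count by the volume plated and by the dilution of the plated sample. The sketch below shows that standard calculation only; the colony count, dilution and plated volume are hypothetical values chosen for illustration, and the plating volumes actually used in this study are not stated in this section.

```python
def cfu_per_ml(colonies, dilution, volume_plated_ml):
    """Viable count of the undiluted sample in CFU/ml.

    colonies          -- colonies counted on the plate
    dilution          -- dilution of the plated sample, e.g. 1e-4 for a 10^-4 dilution
    volume_plated_ml  -- volume spread on the plate, in ml
    """
    return colonies / (dilution * volume_plated_ml)


# Hypothetical plate: 42 colonies from 0.1 ml of a 10^-4 dilution.
estimate = cfu_per_ml(colonies=42, dilution=1e-4, volume_plated_ml=0.1)
print(f"{estimate:.2e} CFU/ml")
```

In practice, only plates with a countable number of colonies would be used, and counts from replicate plates would be averaged before the back-calculation.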

6.2.2.1.2. Second animal passage (Passage B)

After 4 weeks one of the three mice in each of the infected groups from the previous passage was euthanased and the stomachs harvested and homogenised. One tenth of this homogenate was used for viable CFU estimation and the remaining portion was inoculated intragastrically into the remaining 2 mice in the same group along with a 3rd previously uninfected C57BL/6 mouse, such that each group had 3 infected mice. Again 12 colonies per group in which H. pylori grew were subcultured for further analysis (Passage B isolates).

6.2.2.1.3. Third animal passage (Passage C)

After 7 weeks all three animals in the five remaining infected groups, as assessed after passage B, were euthanased, their stomachs harvested, homogenised and pooled. One third of each pool was frozen in liquid N2 with an equal volume of BHI/glycerol for storage, the second third was used for viable CFU estimation and the final third was used to inoculate 10 C57BL/6 mice for each group. The mice in the remaining groups found to be uninfected after passage B were also euthanased and viable CFU estimation performed to confirm their infection status. Again representative strains that grew on the plates from any of the 25 groups were restruck for further analysis (Passage C strains). Only two groups of mice were still colonised at this time point, SS1 and Group 3/2. The new mouse-adapted strain from Group 3/2 was termed the Sydney Strain 2000 (SS2000).

6.2.2.1.4. Fourth animal passage (Passage D) After 2.5 months all 10 infected C57BL/6 mice from the previous passage were euthanased, their stomachs harvested and homogenised for viable CFU counts. Since the original mouse adapted strain, SS1, had been shown to colonise BALB/c mice at a level ~1 log unit lower than in C57BL/6 mice, the ability of SS2000 to colonise this strain of mouse was investigated. A portion of each of the 10 stomachs from the SS2000 group was pooled and 10 BALB/c mice were inoculated intragastrically. Representative isolates were also collected from plates and stored for further analysis (Passage D-1 isolates).

After a further 3.5 months these 10 BALB/c mice were euthanased and the level of infection measured by determining viable CFU counts of the homogenised stomachs of each individual mouse in the group (Passage D-2 isolates).

6.2.2.2. Colonisation of original clinical isolate and mouse-adapted SS2000 In order to assess the extent of mouse adaptation which may have occurred in the SS2000 strain after the multiple mouse passages described above, the colonisation level of SS2000 was compared with that of the original clinical isolate of SS2000 (termed 2.1) which had been identified with RAPD profiling (described below). Two groups of 10 C57BL/6 mice were inoculated intragastrically with an equal amount (~10^7 CFU/mouse) of either the SS2000 strain or the 2.1 strain. After 4 weeks all 20 mice were euthanased and their stomachs excised. Half of each stomach was used to determine colonisation and the other half for histology (described in Chapter 2).

6.2.2.3. Long term infection with SS2000 in comparison to SS1 The primary experiment of this study was to assess and compare the colonisation ability and inflammation induced by SS1 and SS2000 over long term infection of mice. 90 C57BL/6 and 90 BALB/c mice were inoculated intragastrically twice within a 3 day period with either BHI, SS1, or SS2000 for two time points, 6 and 15 months, as indicated in Table 6.1. Equivalent numbers of bacteria in each strain (~10^8 CFU/ml) were estimated for inoculation using haemocytometer counts and retrospective CFU counts. Samples of these input strains were pelleted, resuspended in Lysis solution (Puregene) and incubated at 80°C for 10 min. These lysed samples were frozen at -20°C for subsequent gDNA extraction (Input strains: SS1-I and SS2000-I) for microarray analysis in the following chapter.

At the 6 month time point the stomachs from each of the animals were treated in the same way as described for the experiment in section 6.2.2.2: half of each stomach was fixed for histopathology and the other half was used for viable CFU counts. From the colony counting plates from each infected mouse 1-3 individual colonies were passaged to new CSA plates, grown in the 10% CO2 incubator, snap-frozen in 1 ml BHIG and stored at -80°C for subsequent gDNA extraction (6M-Output strains, for the following chapter).

At the 15 month time point the mice in Groups A-C and E-G were split into two (see Table 6.1 for details). The stomachs from 10 mice in each of these groups were cut in half; one half was fixed in 10% formalin while the remaining half was placed in a cryo-tube and snap-frozen in liquid N2 for subsequent RNA extraction (RNA used for microarray analysis described in Chapter 8). The stomachs of the remaining animals in each group along with those in Group D were handled as described in the previous two sections for histology and viable counts. Two individual H. pylori colonies were randomly selected from 5-6 mice per infected group and passaged to new CSA plates. The individual strains were snap-frozen and stored at -80°C for subsequent gDNA extraction (15M-Output strains, for the following chapter).

6.2.3. Assessment of Colonisation Level and Distribution Using the silver stained slides from the 6 M and 15 M time points five areas of the stomach (antrum, antrum/body transitional zone, body, body/cardia transitional zone and cardia regions) were assessed for the level and presence of bacteria as described previously (174). The grading system used is as follows: 0, no bacteria observed; 1, mild bacterial colonisation observed although not in every crypt; 2, mild colonisation in the majority of crypts; 3, moderate to heavy colonisation in every crypt; 4, heavy colonisation with all crypts densely packed with bacteria.

6.2.4. Assessment of Histopathology The histopathological features were assessed by light microscopy in blinded Haematoxylin and Eosin (H&E) stained slides by Dr. S. Danon. Antral and body mucosa were graded separately; active inflammation was assessed by the presence of neutrophils and chronic inflammation by the presence of lymphocytes. The scoring system was graded as: 1=mild multifocal; 2=mild widespread or moderate multifocal; 3=mild widespread and moderate multifocal or severe multifocal; 4=moderate widespread; 5=moderate widespread and severe multifocal; and 6=severe widespread. The total number of lymphoid aggregates and follicles in each section were counted. The total number of glands with neutrophil infiltration in the crypt and lumen were also counted to assess the number of gland abscesses. Atrophy was evaluated on the degree of loss of parietal cells and mucus cell hyperplasia and assessed as 0 (none), 1 (mild), 2 (moderate), or 3 (severe). Submucosal inflammation (that is, inflammation below the muscularis mucosae) was assessed on a scale of 0–6 as above. These criteria have been used previously (92).

Table 6.1: Animal groups and euthanasia time points for the long term colonisation experiment.

Mouse strain        BALB/c                            C57BL/6
Group               A         B         C             E           F         G
Inoculation         BHI       SS1       SS2000        BHI         SS1       SS2000
6 months^a          10        10        10            10          10        10
                    A1-A10    B1-B10    C1-C10        E1-E10      F1-F10    G1-G10
15 months (i)^b     10        10        10            10          10        10
                    A11-A20   B11-B20   C11-C20       E10-E21†    F11-F20   G11-G20
15 months (ii)^a    8*        10        10            9*          10        10
                    A21-A28   B21-B30   C21-C30       E22-E30     F21-F30   G21-G30

* A portion of animals in these groups died before the time of euthanasia.
† One control animal in this group excluded from analysis.
a The stomachs from these animals were split in half for CFU counts and histology.
b The stomachs from these animals were split in half for RNA extraction and histology.

The same blinded H&E slides were also assessed for the presence of lymphocytic infiltration (LA) and lymphoepithelial lesions (LEL) by Dr. J. O’Rourke. These features were graded on a 0-3 point scale using the following criteria: for LA, 0 (no change), 1 (mild: single or a few small aggregates of lymphocytes), 2 (moderate: multiple, multifocal large lymphoid aggregates or follicles), 3 (severe: extensive multifocal lymphocytic infiltration often extending through the depth of mucosa resulting in distortion of the epithelial surface); for LEL, 0 (no change), 1 (early lesions or single LELs), 2 (multiple well formed LELs), 3 (multiple LELs resulting in extensive destruction of the epithelium possibly indicative of low-grade lymphoma).

6.2.5. Statistical Analysis To assess significant differences in colonisation levels, unpaired t-tests with Welch’s correction were used because the variances of the groups were not equal. For analysis of differences in colonisation distribution grading the non-parametric Kruskal-Wallis analysis was used because more than two groups were compared. For comparison of the histopathological grades between two groups at a time, the non-parametric Mann-Whitney analysis was used. To assess differences in the counts of lymphoid aggregates and gland abscesses, an unpaired t-test with Welch’s correction was used. Finally, to assess whether there was a significant difference in one group (body LELs in SS2000 infected BALB/c mice) compared to zero (body LELs in SS1 infected BALB/c mice) a Wilcoxon signed rank test was employed.
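To make the Welch correction concrete, the sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom from group summary statistics. It is an illustration only: the analyses in this chapter were run on the individual animal counts, whereas the example feeds in the rounded means and standard deviations reported for SS2000 and PMSS2000 in Table 6.4, and the function name is an assumption. No P value is computed, since that requires the t distribution, which the Python standard library does not provide.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an unpaired t-test with Welch's correction.

    Unlike the pooled-variance t-test, the two groups are allowed to have
    unequal variances; df follows the Welch-Satterthwaite approximation.
    """
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary values from Table 6.4 (CFU/g stomach tissue, 10 mice per group).
t, df = welch_t(2.3e7, 9.7e6, 10, 6.9e6, 3.2e6, 10)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Because the SS2000 group is far more variable than the PMSS2000 group, the approximate degrees of freedom fall well below the 18 that a pooled-variance test would use, which is the practical effect of the correction.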

6.2.6. Randomly Amplified Polymorphic DNA (RAPD) Profiling RAPD profiling based on the method by Akopyanz et al. (2) was performed on the mouse adapted SS2000 strain and the original clinical isolates from the group of input strains (2.1-2.5 and 3.1-3.5) using the two primers {1281 (primer 3) and 1290 (primer 4); sequences shown in Table 2.2}. Profiles were compared to elucidate from which clinical isolate SS2000 originated. Separate PCR reactions were performed in which one of the primers was used for amplification in a 50 µl reaction containing 1 x PCR buffer (67 mM Tris-HCl, 16 mM (NH4)2SO4, 0.45% Triton X-100, 0.2% gelatin), 200 µM each dNTP, 1 U Taq DNA polymerase (Biotech International, Perth, Australia), 3 mM MgCl2, 20 pmol primer and approximately 20-100 ng genomic DNA. The cycling conditions used in these reactions were 5 cycles of 94°C for 5 min, 40°C for 5 min and 72°C for 5 min, followed by 15 cycles of 94°C for 1 min, 40°C for 1 min and 72°C for 2 min, followed by 15 cycles of 94°C for 1 min, 36°C for 1 min and 72°C for 2 min, followed by a final extension step of 72°C for 10 min. The reactions were run on the Corbett FTS 320 thermocycler (Corbett Research, Sydney, N.S.W.) and separated on a 1.5% agarose gel using 50 V for 1 h 30 min.
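The cycling programme and primer amount quoted above can be written down as data and checked arithmetically. The following is a minimal sketch, not part of the original protocol: the variable and function names are assumptions, and the total counts hold times only, ignoring temperature ramping on the thermocycler.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold time in min), ...])
RAPD_PROGRAMME = [
    (5, [(94, 5), (40, 5), (72, 5)]),
    (15, [(94, 1), (40, 1), (72, 2)]),
    (15, [(94, 1), (36, 1), (72, 2)]),
    (1, [(72, 10)]),  # final extension
]


def total_hold_minutes(programme):
    """Sum the hold times over all stages and cycles (ramp time excluded)."""
    return sum(
        cycles * sum(minutes for _temperature, minutes in steps)
        for cycles, steps in programme
    )


def primer_concentration_um(primer_pmol, reaction_volume_ul):
    """Convert a primer amount to a concentration: 1 pmol/µl equals 1 µM."""
    return primer_pmol / reaction_volume_ul


print(total_hold_minutes(RAPD_PROGRAMME), "min of hold time")
print(primer_concentration_um(20, 50), "µM primer")
```

On these figures the programme holds for 205 min in total, and 20 pmol of primer in a 50 µl reaction corresponds to 0.4 µM.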

6.3. Results 6.3.1. Assessment of Colonisation of Clinical Isolates During Passage A, B & C The colonisation level in the mice inoculated with one of the 25 groups of strains (including SS1, 10319 and 10217) for Passage A was assessed and it was found that 84% (21/25) of these mice were colonised to varying degrees (1 x 10^3 to 1.5 x 10^7 CFU/g stomach tissue) after 6 weeks (Table 6.2). This suggests that at least 20 of the clinical isolates {including the two control strains from (174)} were able to colonise C57BL/6 mice for at least 6 weeks. The limit of detection for these experiments was 1 x 10^3 CFU/g stomach tissue. The number of colonising strains in each mouse was assessed by RAPD analysis (results described below) for Group 3/2 and Group 11 only.

For practical purposes groups of 4-8 mice used for Passage A were kept in the same cage. Ear notching was used to identify individual mice in each cage so that each mouse was dosed with one group of strains. However, in one cage the ear notches differentiating the mouse which received the Group 2 strains from the one which received the Group 3 strains were indistinguishable two days later because of tearing. As a result the inocula for these mice were switched, so that both mice received one dose of Group 2 and one of Group 3 strains in opposite orders. These groups are therefore referred to as Group 2/3 and Group 3/2.

When the colonisation of the Passage B mice (4 wks post infection) was assessed there were detectable levels of H. pylori in only 24% (6/25) of the inoculated mice. However, the level of colonisation in the mice from the remaining groups had increased from the level recorded after Passage A by ~½ to 1 log unit, except in the Group 3/2 strain which retained a steady level (Table 6.2). In Passage C (7 wks post infection) only the SS1 and the Group 3/2 infected mice remained colonised with a detectable load of H. pylori. This represents only 4% of the original number of groups inoculated (Table 6.2). The Group 3/2 strain isolated was deemed a new mouse colonising strain and was named Sydney Strain 2000 (SS2000). Finally, in Passage D the ability of SS2000 to colonise 10 C57BL/6 (D-1) for 2 ½ months and 10 BALB/c (D-2) for 3 ½ months, was assessed. The results for the Passage D-1 and D-2 experiments (Table 6.3) show that both SS1 and SS2000 colonised 100% of the mice to comparable levels. In the C57BL/6 mice the level of colonisation appears to be less variable for the SS2000 infected mice (SD 9.1 x 10^6 CFU/g tissue) than it was for SS1 (SD 2.2 x 10^7 CFU/g tissue). The level of colonisation of SS2000 in the BALB/c mice was ~ ½ to 1 log lower than in the C57BL/6 mice. This result has also been observed previously with the SS1 strain (data not shown).
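Differences in colonisation are quoted throughout this chapter in log units, that is, as the base 10 logarithm of the ratio of two CFU/g values. A minimal sketch of that conversion follows, using the SS2000 group means from Table 6.3; the function name is an assumption introduced for the example.

```python
import math


def log_unit_difference(level_a, level_b):
    """Return how many log10 units level_a lies above level_b."""
    return math.log10(level_a / level_b)


# Mean SS2000 colonisation from Table 6.3 (CFU/g stomach tissue).
c57bl6_mean = 2.6e7  # Passage D-1, C57BL/6 mice
balbc_mean = 6.2e6   # Passage D-2, BALB/c mice

difference = log_unit_difference(c57bl6_mean, balbc_mean)
print(f"BALB/c level is {difference:.2f} log units below the C57BL/6 level")
```

For these two means the difference comes to roughly 0.6 log units, which sits inside the ~½ to 1 log range quoted in the text.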

Table 6.2: Colonisation levels in mice infected with each group of strains.

Group     Passage A^a           Passage B^a           Passage C^a
          6 wks p.i. (CFU/g)    4 wks p.i. (CFU/g)    7 wks p.i. (CFU/g)
1         3 x 10^5              -                     -
2/3       4 x 10^5              -                     -
3/2       1.5 x 10^7            9.3 x 10^6            1.5 x 10^7
4         3 x 10^4              -                     -
5         3 x 10^4              -                     -
6         7.1 x 10^4            -                     -
7         -                     -                     -
8         1.6 x 10^4            -                     -
9         2.2 x 10^5            -                     -
10        3 x 10^4              -                     -
11        7.5 x 10^4            1.5 x 10^5            -
12        4 x 10^3              -                     -
13        -                     -                     -
14        5.8 x 10^5            -                     -
15        2.9 x 10^5            -                     -
16        -                     -                     -
17        -                     -                     -
18        2.2 x 10^4            -                     -
19        3.2 x 10^3            -                     -
20        3.3 x 10^3            1.5 x 10^4            -
21        2.5 x 10^4            9 x 10^4              -
22        1 x 10^3              9.2 x 10^5            -
10319^b   1.6 x 10^4            -                     -
10217^b   8.8 x 10^4            -                     -
SS1       3.7 x 10^6            2.2 x 10^6            1.3 x 10^7
Total     21/25 (84%)           6/25 (24%)            1/25 (4%)

a Values represent CFU per gram stomach tissue and p.i. is weeks post infection.
b The two control strains from (174).


6.3.2. Identification of the Original Clinical Isolate of SS2000 To identify the origin of the new mouse colonising strain obtained from Group 3/2 (SS2000), the RAPD profiles (generated with Primer 4) of all the clinical isolates in Group 2 (2.1, 2.2, 2.3, 2.4, 2.5), Group 3 (3.1, 3.2, 3.3, 3.4, 3.5) and SS1 were compared with the profiles for two mouse isolates from Group 3/2, 3B.10 (isolated from Passage B) and SS2000 isolated from Passage D (Fig. 6.1). From this analysis it can be seen that each of the 10 clinical isolates and SS1 have distinct RAPD profiles. The profiles observed for both the SS2000 and 3B.10 strains were the same as that of the 2.1 clinical isolate. Two further RAPD profiles were generated to confirm this observation. In Fig. 6.2A Primer 3 was used to generate the RAPD profiles, while in Fig. 6.2B Primer 4 was used. Both of these RAPD analyses show that the profiles derived from the clinical isolate 2.1 and the SS2000 isolates are identical. There are some differences in the intensity of some of the bands in these gels but these are most likely due to slight differences in the amount of gDNA used to generate the RAPD profiles. Thus the clinical isolate named 2.1 was considered to be the pre-mouse isolate of SS2000 (PMSS2000). This clinical strain was originally isolated from an Australian male diagnosed with gastritis.

All of the isolates from the Group 3/2 mice from all three Passages that were tested (8 from Passage A, 11 from Passage B, and 4 from Passage C; data not shown) had the same RAPD profile using both primer 3 and 4, suggesting that only one strain from this group was able to colonise during all three passages. In contrast the RAPD profiles obtained from the Group 11 Passage A isolates indicated that two strains, with profiles resembling 11.1 and 11.2, colonised this mouse 6 weeks after initial infection (data not shown). However, inspection of the Group 11 strain profiles obtained from the Passage B isolates indicated that only one strain, with the 11.2 profile, remained (data not shown). These data suggest that more than one strain may colonise mice initially but that eventually a single strain dominates.

Figure 6.1: 1.5% agarose gel showing the RAPD profiles obtained using primer 4 (see Table 2.2 for sequence) of all possible clinical isolates from which SS2000 could have originated (2.1-2.5 & 3.1-3.5) along with that of SS1 and two of the strains obtained from group 3/2 infected mice, SS2000 and 3B.10. The strains indicated in bold have the same RAPD pattern that is different from all other patterns. M indicates the FN-1 size marker and the sizes of the bands are indicated on the left, –ve indicates the negative PCR control in which H2O was added instead of genomic DNA.

Figure 6.2: 1.5% agarose gels showing RAPD profiles using primer 3 A) and primer 4 B) (see Table 2.2 for sequences) for H. pylori strains: SS1; two original clinical isolates in the study (2.1 and 2.3); SS2000; and an isolate of SS2000 from an earlier passage 3B.10. Isolates indicated in bold have the same RAPD profiles which are different from SS1 and 2.3. M indicates the FN-1 size marker and the sizes of the bands are indicated on the left, –ve indicates the negative PCR control in which H2O was added instead of genomic DNA.

[Gel images for Figures 6.1 and 6.2 not reproduced; size marker bands are labelled at 1.1 kb, 1.5 kb and 2.7 kb.]

Table 6.3: Colonisation of SS2000 compared with the SS1 in the preliminary experiments.

Experiment                   Group     Mouse strain   Fraction mice infected   Mean (CFU/g)   SD^a (CFU/g)
Passage D-1 (2 ½ M p.i.)^b   SS1       C57BL/6        10/10                    4.0 x 10^7     2.2 x 10^7
                             SS2000    C57BL/6        10/10                    2.6 x 10^7     9.1 x 10^6
Passage D-2 (3 ½ M p.i.)^b   SS2000    BALB/c         10/10                    6.2 x 10^6     4.8 x 10^6

a SD is standard deviation.
b Colonisation level months (M) post infection (p.i.).

Table 6.4: Colonisation of pre-mouse and “mousified” strains of SS1 and SS2000 4 wks post infection in C57BL/6 mice

Group       Fraction mice infected   Mean (CFU/g stomach tissue)   SD^a (CFU/g stomach tissue)   Source of data
10700       4/5                      4.0 x 10^5                    8.9 x 10^5                    (174)
SS1         5/5                      1.9 x 10^6                    2.3 x 10^6                    (174)
PMSS2000    10/10                    6.9 x 10^6                    3.2 x 10^6                    This study
SS2000      10/10                    2.3 x 10^7                    9.7 x 10^6                    This study

a SD is standard deviation.

Table 6.5: Colonisation levels of the SS1 and SS2000 strains in C57BL/6 and BALB/c mice

              SS1                          SS2000
              6M            15M            6M            15M
C57BL/6       6.9 x 10^6    9.8 x 10^6     4.6 x 10^6    2.9 x 10^7
BALB/c        1.3 x 10^5    7.2 x 10^5     2.9 x 10^5    1.6 x 10^6
Sig. Lv.^a    p=0.013       p=0.018        p=0.014       p<0.001

a Significance level of colonisation in C57BL/6 mice compared with BALB/c mice in the strains and times indicated (unpaired t-test with Welch’s correction).


6.3.3. Comparison of Pre-mouse and “Mousified” Strains’ Colonisation Ability In the original publication describing the SS1 strain it was shown that the consistency and level of colonisation of the “mousified” SS1 strain was superior to the original clinical isolate of this strain (10700) (174). The results from both this original study describing SS1 and the results obtained from the present study on SS2000 are shown in Table 6.4. The “mousified” strain of SS2000 colonised to a significantly higher level than the original clinical isolate (PMSS2000) (Fig. 6.3). It is also apparent that PMSS2000 may have naturally colonised at a higher level than the pre-mouse isolate of SS1 (10700), although a parallel comparison of these is required for confirmation.

6.3.4. Assessment of Colonisation and Inflammation After Long Term Infection 6.3.4.1. Colonisation The level and distribution of colonisation of controls, SS1 and SS2000 infected mice were assessed 6 and 15 months post infection. All the control animals were free from gastric Helicobacter infection, except for one animal found to be H. felis infected (animal E17). This animal was excluded from further analysis. The colonisation levels of SS1 and SS2000 after 6 months were comparable in both mouse strains (Fig. 6.4) and in both cases the level of colonisation in the C57BL/6 mice was significantly higher than in the BALB/c mice (Table 6.5). These results are in agreement with previous studies on SS1 colonisation in these two strains of mice (174). The distribution pattern in these animals was also similar for SS1 and SS2000 in the C57BL/6 mice with high levels detected in the antrum, cardia and both transitional zones (Fig. 6.5A). In the BALB/c mice the pattern of distribution of the two H. pylori strains was similar with the highest levels detected in the two transitional zones. However, the level of colonisation of SS2000 in the transitional zones and the cardia region was found to be significantly greater than in the SS1 infected BALB/c mice (Fig. 6.5B). The location of the organisms in the gastric mucosa infected with either of the two mouse-adapted strains was very similar in both mouse types (Fig. 6.6 shows an example of each in C57BL/6 mice).

Figure 6.3: Difference in mean colonisation levels of the original clinical isolate (PMSS2000) and SS2000 in C57BL/6 mice after 1 month. Levels are expressed as CFU/g stomach tissue. * indicates a statistically significant difference (unpaired t-test with Welch’s correction). Error bars are 1 SD. The large error bar for SS2000 is due to a single outlier (4.5 x 10^3 CFU/g stomach tissue).

[Figure 6.3 plot not reproduced: y-axis, CFU/g tissue (x 10^7); bars for SS2000 and PMSS2000 (2.1); * P=0.0003.]

Figure 6.4: Histograms showing the colonisation levels of SS1 and SS2000 in A) C57BL/6 mice and B) BALB/c mice. Levels are expressed as CFU/g stomach tissue. Error bars are 1 SD. * indicates a statistically significant difference between the levels at 15 M and 6 M post infection. † indicates a statistically significant difference in SS2000 infected mice compared to SS1 infected mice at the same time point (unpaired t-test with Welch’s correction).

[Figure 6.4 plots not reproduced: y-axes, CFU/g tissue (x 10^6); bars for SS1 and SS2000 at 6M and 15M. Panel A, C57BL/6 colonisation levels: * P<0.001, † P<0.001. Panel B, BALB/c colonisation levels: * P=0.014.]

Figure 6.5: Histograms showing the grade of colonisation of SS1 and SS2000 after 6 and 15 months infection in each part of the stomach in A) C57BL/6 mice and B) BALB/c mice. Levels are represented as the grade of colonisation from 0-4 (described in the Experimental Procedures). A/B TZ and B/C TZ are the antrum/body and body/cardia transitional zones, respectively. The mean grade is used and error bars represent 1 SD. The colonisation grade of SS2000 is significantly different from that of SS1 in the indicated regions at the same time point. Level of significance is indicated by * (p<0.05), ** (p<0.01) or *** (p<0.001) (Kruskal-Wallis test).

[Figure 6.5 plots not reproduced: y-axes, colonisation grade (0 to 4); bars for antrum, A/B TZ, body, B/C TZ and cardia in the 6M SS1, 15M SS1, 6M SS2000 and 15M SS2000 groups. Panel A, C57BL/6 mice; panel B, BALB/c mice.]

Figure 6.6: Light micrograph showing the antral tissue from C57BL/6 mice colonised with A) H. pylori strain SS1 and B) strain SS2000, 6 months post infection. Large numbers of both H. pylori strains are seen colonising the mucus lining of the gastric pits (open blue arrows) and the epithelial surface (closed blue arrows). Section stained with a modified Steiner silver stain. Magnification 400X. Photos courtesy of S. Danon (UNSW).

The colonisation levels 15 months post infection of both H. pylori strains were increased as compared with the 6 month level (Fig. 6.4), although this increase was only statistically significant in the SS2000 infected groups (indicated by * in Fig. 6.4). The level of SS2000 in the C57BL/6 mice was also significantly higher than the level of SS1 (indicated by † in Fig. 6.4A). At this later time point, both strains retained a statistically significant higher level of colonisation in the C57BL/6 mice compared with the BALB/c mice (Table 6.5). The distribution patterns of both strains in the C57BL/6 mice were similar although, reflecting the higher level of colonisation, the grade in each section of the stomach in the SS2000 infected mice was significantly higher than in the SS1 infected mice (Fig. 6.5A). In the BALB/c mice the higher level of colonisation of SS2000 was reflected only in a significantly higher grade in the antrum as compared with the SS1 group (Fig. 6.5B).

6.3.4.2. Inflammation The histology scores for the 6 month animals indicated that there was very little difference in the inflammation induced by either H. pylori strain in both mouse types (Table 6.6 & 6.7). The only significantly different score was the number of lymphoid aggregates induced in the BALB/c mice by the SS2000 strain (median of 2) as compared with the SS1 infected BALB/c mice (median of 0.5) (Table 6.7). It was also noted on macroscopic examination that the SS2000 infected animals of both the C57BL/6 and BALB/c groups had thickened stomachs compared with the SS1 infected groups after 6 months infection (data not shown).
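The histopathology grades in Tables 6.6 and 6.7 are summarised as median (range). The short sketch below shows how such a summary string can be produced from per-animal scores. The function name and the example scores are hypothetical and are not data from this study.

```python
from statistics import median


def median_range(scores):
    """Format ordinal scores as 'median (min-max)', or 'median (value)'
    when every animal in the group has the same score."""
    low, high = min(scores), max(scores)
    spread = f"{low}" if low == high else f"{low}-{high}"
    # The 'g' format drops a trailing '.0' so a median of 2.0 prints as '2'.
    return f"{median(scores):g} ({spread})"


# Hypothetical per-animal grades for one group of 10 mice.
example_scores = [0, 1, 2, 2, 3, 1, 2, 4, 2, 3]
print(median_range(example_scores))
```

Medians and ranges are used rather than means because the grades are ordinal, which is also why these scores are compared with the non-parametric Mann-Whitney test.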

In contrast the inflammation induced by the two H. pylori strains after 15 months differed substantially in both mouse types. In the C57BL/6 mice the level of inflammation induced by the SS1 strain had increased significantly from that at the 6 month time point. This is indicated by the significantly increased levels of infiltrating neutrophils and monocytes in both the antrum and body regions, as well as a significantly increased amount of submucosal inflammation (indicated by * in Table 6.6). This increase in inflammatory cell infiltration was not simply due to the age of the mice as the levels did not increase in the control animals with the exception of a mild increase in the number of monocytes in the antrum. In the SS2000 infected C57BL/6 mice the histology scores were largely unchanged from the 6 to 15 month time point. Thus SS1 induced a moderate chronic active gastritis in the C57BL/6 mice 15 months post inoculation, while SS2000 induced only a mild chronic active gastritis in these mice. The SS1 infection was characterised by significantly greater infiltration of monocytes in the antrum and body regions and greater submucosal infiltration than in the SS2000 infected mice (indicated by † in Table 6.6 and shown in Fig. 6.7).

Table 6.6: Histopathology grades for long-term infected C57BL/6 mice.

Group            Time      Antrum        Antrum        Body          Body          Lymphoid      Gland         Atrophy       Sub.
                 (months)  neutrophils   monocytes     neutrophils   monocytes     aggregates    abscesses                   inflam.^a
Non-infected     6         0 (0-1)       0.5 (0-2)     0 (0)         0 (0-1)       0 (0)         0 (0)         0 (0)         0 (0)
                 15        0 (0-1)       1 (0-2)*      0 (0-1)       0 (0-2)       0 (0)         0 (0)         0 (0)         0 (0)
SS1 infected     6         0 (0-2)       1 (1-3)       0 (0-1)       2 (1-3)       0.5 (0-8)     0 (0-2)       1.5 (0-3)     1.5 (0-3)
                 15        1 (0-2)*      3 (1-4)*†     1.5 (0-3)*    3 (1-5)*†     1 (0-3)       1 (0-6)       0 (0-2)       2.5 (0-4)*†
SS2000 infected  6         0 (0-2)       2 (1-3)       0 (0-2)       2.5 (1-4)     0.5 (0-8)     0 (0-4)       1.5 (0-3)     2 (0-3)
                 15        1 (0-2)       1.5 (1-3)     1 (0-2)       2 (0-3)       0 (0-3)       0 (0-3)       0 (0-2)       1.5 (0-5)

All scores are median (range). a Sub. inflam. refers to submucosal inflammation. *Significantly greater than corresponding value at 6 months; †significantly greater in SS1 infected mice compared with SS2000 infected mice after 15 months. All significance tests involving neutrophil or monocyte infiltration, atrophy and submucosal inflammation used p<0.05; Mann-Whitney; while tests for the counts of the number of lymphoid aggregates or gland abscesses used un-paired t-test with Welch’s correction p<0.05.

Table 6.7: Histopathology grades for long-term infected BALB/c mice.

Group            Time      Antrum        Antrum        Body          Body          Lymphoid      Gland         Atrophy       Sub.
                 (months)  neutrophils   monocytes     neutrophils   monocytes     aggregates    abscesses                   inflam.^a
Non-infected     6         0 (0)         1 (0-2)       0 (0)         1 (0-2)       0 (0)         0 (0)         0 (0)         0 (0-1)
                 15        0 (0-1)       1 (1-2)       0 (0-1)       1 (1-2)       0 (0)         0 (0)         0 (0)         0 (0-1)
SS1 infected     6         0.5 (0-2)     2 (1-3)       0 (0-2)       2 (2-3)       0.5 (0-2)     0 (0-1)       0 (0-1)       1 (0-3)
                 15        1 (0-2)*      2 (0-2)*      1 (0-2)*      3 (2-4)*      2 (0-5)*      0 (0-3)       0 (0-1)       3 (1-4)*
SS2000 infected  6         1 (0-1)       2 (1-3)       0 (0-1)       2 (2-4)       2 (1-3)¥      0 (0-3)       0 (0-1)       2 (1-3)
                 15        1 (0-2)*      2.5 (1-4)     1 (0-3)*      4 (2-5)*      4 (0-6)*†     1.5 (0-6)†    1 (0-3)*†     3 (1-5)*

All scores are median (range). a Sub. inflam. refers to submucosal inflammation. *Significantly greater than corresponding value at 6 months (p<0.05; Mann-Whitney); †significantly greater in SS2000 infected mice compared with SS1 infected mice after 15 months (p<0.05; Mann-Whitney); ¥ number significantly higher in SS2000 than in SS1 at 6M. All significance tests involving neutrophil or monocyte infiltration, atrophy and submucosal inflammation used p<0.05; Mann-Whitney; while tests for the counts of the number of lymphoid aggregates or gland abscesses used un-paired t-test with Welch’s correction p<0.05.

Figure 6.7: The gastric mucosa of C57BL/6 mice after 15 months infection with SS1 in panels A) (original magnification X50) & B) (original magnification X200) and SS2000 in panels C) (original magnification X50) & D) (original magnification X200) (Haematoxylin and Eosin stain). Panels B and D represent magnified views of the boxed region in the corresponding panel. Significant diffuse inflammatory cell infiltration of the entire mucosa and parts of the submucosa can be seen in the SS1 infected animal in panel A and B. In the SS2000 infected animal in panels C and D the mucosal infiltration is restricted to the region closest to the stomach lumen and less submucosal infiltration is evident. Panel E) shows the normal body mucosa of an uninfected control animal (original magnification X50).

The BALB/c animals infected with either H. pylori strain showed an increase in inflammation at 15 months post infection as compared with that at 6 months (indicated by * in Table 6.7). This increase was particularly evident in the amount of submucosal infiltration and the numbers of lymphoid aggregates detected at 15 months in both SS1 and SS2000 infected animals as compared with the levels of these at 6 months. Interestingly, in the BALB/c mice the level of inflammation in the SS2000 infected mice was greater than that of the SS1 infected animals. This is represented by the significantly greater number of lymphoid aggregates and gland abscesses in the SS2000 infected animals (indicated by † in Table 6.7). The amount of atrophy in the SS2000 infected mice was also higher, but represented only a mild level. Thus at 6 months both strains induced only a mild cellular infiltration in the BALB/c mice which by 15 months, had developed into a MALT type inflammatory response, which was more severe in the SS2000 infected animals. This was characterised by a large number of lymphoid aggregates and infiltrating monocytes mainly in the body region (Fig. 6.8). The percentage of SS2000 infected BALB/c mice with lymphoid aggregates (LA) and lymphoepithelial lesions (LEL) (both early indications of MALT-type pathology) in the body region was higher than in SS1 infected BALB/c mice, while in the cardia region the numbers were similar (Fig. 6.9). This prevalence data mimicked the grade of LA and LEL in these BALB/c mice (data not shown).

Figure 6.8: The gastric mucosa of BALB/c mice after 15 months infection with SS2000 in panels A) (original magnification X50) & B) (original magnification X200) and SS1 in panels C) (original magnification X50) & D) (original magnification X200) (haematoxylin and eosin stain). Panels B and D represent magnified views of the boxed region in the corresponding panel. A large lymphoid aggregate disrupting the mucosa is shown in panels A & B. A diffuse infiltration of lymphocytic cells can be seen in the mucosa shown in panels C & D.

Figure 6.9: Histogram showing the percentage of control, SS1 and SS2000 infected BALB/c animals with lymphoid aggregates (LA), and lymphoepithelial lesions (LEL) in the Body and Cardia regions of the stomach, as well as possible lymphoid follicles (LF) 15 months post infection. † indicates a statistically significant difference compared to SS1 infected animals (Mann-Whitney, P=0.0232); * indicates a statistically significant difference compared to SS1 infected animals (Wilcoxon signed rank test, P=0.0313).

[Figure 6.9 plot not reproduced: y-axis, percentage of animals (0 to 100); bars for Control, SS1 and SS2000 at Body LA, Body LEL, Cardia LA, Cardia LEL and LF.]

6.4. Discussion 6.4.1. Isolation of a New Mouse Colonising Strain The screening of a large number of clinical isolates for their ability to colonise mice resulted in the isolation of a novel mouse colonising strain of H. pylori (SS2000). In the course of this experiment it was found that many of the clinical isolates (at least 20) were able to colonise mice to varying degrees for up to 6 weeks. In some cases two strains were shown to concurrently colonise a single mouse. However, in subsequent passages most of these strains were not able to persist and only single strains were recovered, suggesting that in each case a single strain from a mixed inoculation eventually dominates. This phenomenon has been previously shown by a number of investigators (174, 344, 379).

Colonisation by multiple H. pylori strains for 1-2 months has been shown by a number of researchers (for examples see 269, 359). However, the majority of these studies do not report longer term colonisation, and in many cases fewer than 100% of the inoculated mice remain infected over this time frame (332). Short term infection experiments in mice using genetically modified H. pylori strains can be useful for determining factors required for colonisation (89, 191, 204, 299, 375). However, given that H. pylori infection in humans is chronic, studies in animal models in which the effects of long term colonisation can be assessed are desirable. A major criticism of the SS1 mouse model is that it does not produce extensive inflammation in mice (269). This is particularly true of short-term infection (< 6 months), whereas after longer infection times pathology increases significantly in the majority of mouse strains (174, 297). In some mouse strains, such as the non-responder strain BALB/c, the first signs of inflammation are not evident until after 6 months and, at least in H. felis infections, this can subsequently develop into severe pathology (97).

In contrast to these previous short-term infection studies, the mouse colonising strain isolated in this study, SS2000, colonises both C57BL/6 and BALB/c mice with equivalent or enhanced efficiency to the SS1 strain (100% of mice inoculated) for at least 15 months. This has enabled the assessment of the effects of both strain and host factors during long term colonisation using the SS1 and SS2000 strains. Such an experiment has not been reported previously and should improve our understanding of the contribution of strain and host factors to disease development.

6.4.2. Characterisation of the Pre- and Post-Mouse Colonising Strains

The parental strain of the mouse-adapted SS2000 isolate was determined using RAPD analysis. Interestingly, this pre-mouse strain of SS2000 (PMSS2000) was able to colonise C57BL/6 mice to a higher level (equivalent to colonisation ability of SS1) than the pre-mouse isolate of SS1 (10700), indicating that the PMSS2000 clinical isolate had an enhanced intrinsic ability to colonise mice. Despite this, some “mousification” appears to have occurred during the initial passages to obtain SS2000, as the mouse-adapted strain colonises to a level about ½ log unit higher than PMSS2000 in C57BL/6 mice. This type of “mousification” also occurred during the isolation of the SS1 strain (174).

6.4.3. Effect of Host and Strain Specific Factors on Chronic Colonisation

Although at least 10 mouse adapted strains of H. pylori have been reported previously, few have been shown to persistently colonise multiple strains of mice (269, 370). This is the first extensive comparative investigation of the effects of two H. pylori strains on colonisation and inflammation in two strains of mice during long term infection. Six months after infection with either SS1 or SS2000, the colonisation levels of these two strains in both BALB/c and C57BL/6 mice were comparable, although there was a trend towards higher colonisation of the antrum and TZs by SS2000 in BALB/c mice. In both cases, the colonisation level was significantly higher in the C57BL/6 strain as compared with the BALB/c strain, suggesting that host dependent factors previously described in these mouse types are important. The inflammation produced by the two H. pylori strains after 6 months was almost indistinguishable, with the exception of an increased number of lymphoid aggregates (LAs) induced in BALB/c mice by SS2000. Thus it is possible that the higher colonisation level in the BALB/c mice at this time point may explain the increase in LAs.

By 15 months post infection more distinct differences in colonisation and inflammation produced by these two bacterial strains were apparent. While the level of SS1 colonisation remained constant from the 6 month time point, the level of SS2000 increased significantly in both mouse strains. However, at least in the C57BL/6 mice, the level of inflammation induced by SS2000 did not increase concomitantly. The ability of SS2000 to increase an already high bacterial load in these mice without an increase in gastritis, suggests that this strain may have an enhanced ability to evade the murine immune system. The fact that this strain was isolated from a patient diagnosed with gastritis, and not more serious disease, suggests it was also able to restrict the immune response in the human host. In contrast, the level of inflammation in the C57BL/6, SS1 infected mice was significantly increased from the level at 6 months resulting in a moderate chronic active gastritis at the 15 month time point.

In the BALB/c mice, as was indicated at the 6 month time point, the increased level of SS2000 induced a significantly higher level of inflammation than the SS1 infected animals. This inflammation was characterised by a large number of LA in the body and cardia regions, indicating that this pathology may be of a MALT type inflammatory response. Comparison of histology scores with scores obtained for both SS1 and H. felis infection of BALB/c mice 15 months post-infection indicates that the level of LA induction by SS2000 is intermediate between these two groups (J. O’Rourke, personal communication). A small percentage of the H. felis infected animals of the aforementioned study developed a pathology resembling MALT lymphoma by 24-28 months post-infection (J. O’Rourke, personal communication). Another Helicobacter species, “H. heilmannii”, has also been shown to induce MALT lymphomas in long term infected BALB/c mice (175). Interestingly, the cag PAI is believed to be missing from both H. felis and “H. heilmannii”. In the genome-typing study of the mouse colonising H. pylori strains (described in the following chapter), SS2000 was also found to lack the entire cag PAI. Thus, it is possible that the lack of the cag PAI in SS2000 may be related to the inflammation produced by this strain in BALB/c mice, although other factors are also likely to be involved. Longer term infections of BALB/c mice with SS2000 will be needed to further elucidate the importance of these observations.

The mechanisms underlying the differences in host response between the C57BL/6 and BALB/c mice are largely unknown. However, a number of factors may contribute. First, as mentioned earlier, a pro-inflammatory Th1 response dominates in the C57BL/6 mice, while the anti-inflammatory Th2 phenotype is present in the BALB/c mice. This has been used to explain the general lack of inflammation seen in the BALB/c mice after moderate length infections such as 6 months (102). A possible contribution to this difference in response is the fact that C57BL/6 mice do not express IgG2a, the most commonly utilised marker for the Th1 response in mice. Instead they express the IgG2c isotype (196). It is possible that the expression of the IgG2a isotype by the BALB/c mice is involved in mediating protection against extensive colonisation by H. pylori. However, persistent colonisation in other IgG2a expressing mouse strains has been observed elsewhere (332).

Second, a very recent report has shown differential expression of certain Toll-like receptors by dendritic cells (DC) from BALB/c and C57BL/6 strains. The response of the DC from these two mouse strains varied in that ligands for TLR-4 (LPS), TLR-2 (lipoprotein) and TLR-9 (CpG) induced higher IL-12p40 secretion in cells derived from C57BL/6 mice, whereas in BALB/c mice the induction of monocyte chemoattractant protein 1 (MCP1) was higher (182). This may be closely related to the induction of a Th1 dominant phenotype in the C57BL/6 mice, as IL-12 is known to drive T-cell development towards a Th1 dominant phenotype in response to bacterial products such as LPS (362). Thus, the induction of MCP1 instead of IL-12 in the BALB/c mice suggests a mechanism for the development of the dominant Th2 response in these mice (182, 362).

Third, the C57BL/6 mice have a different Major Histocompatibility Complex (MHC) genotype from the BALB/c mice. The former expresses the class B H-2 complex (H-2b), while the latter expresses the class D H-2 complex (H-2d) (218). Different localisation of the expression of MHC II antigen was also seen in BALB/c (mainly cardia) compared with C57BL/6 (antrum and corpus) strains. However, this is unlikely to contribute to the different responses of these strains to infection with H. pylori, as strains with the same H-2 complex had differing susceptibilities in a previous study (296). In addition, disease susceptibility has in general been linked to non-MHC autosomal genes rather than to MHC class (218).

Last, some investigators have reported differing levels of antibody production in response to H. pylori infection. Kim et al. showed higher serum IgG and IgA, and secreted IgA levels in BALB/c mice compared with C57BL/6. This induced antibody production may explain the lower level of colonisation achieved in the BALB/c mice (159).

6.5. Conclusions

These experiments have provided the opportunity to dissect some of the factors contributing to the host and strain specific effects on disease in H. pylori infection of mice. H. pylori strain specific effects were evident through the differing levels of colonisation and inflammation produced by the two strains. Host specific effects were also shown in the different level of colonisation occurring in C57BL/6 compared to BALB/c mice and in the type of inflammatory response induced.

The complete lack of the cag PAI in the SS2000 strain has shown that this pathogenicity island is not required for mouse colonisation and in fact loss of this factor may provide an advantage for persistent infection of mice. The cag PAI is not the only factor affecting colonisation ability, as it has been shown that many other cag PAI negative strains are not able to persist in mice. It is interesting to note that the SS2000 strain was able to induce significant inflammation, particularly in BALB/c mice, suggesting that there may be other factors which are important in the induction of pathology by H. pylori in mice. Considering that there is not a complete correlation between the severity of disease in an individual and the presence of the cag PAI in the infecting H. pylori strain, this new mouse model will be useful in determining the importance of cag-independent virulence factors. Further analysis of other phenotypic differences between SS1 and SS2000 may also provide insight into strain specific differences that influence pathology such as LPS structure, urease activity, adherence, and motility.

This study has further demonstrated the importance of differences in host factors in the inflammatory response to H. pylori infection. Further investigations into the antibody and cytokine responses of these mice to the two different H. pylori strains may shed light on the mechanism inducing polarised Th1 or Th2 responses in these mice. Measurement of infection induced release of cytokines/chemokines (e.g. MIP-2) in mouse gastric cell lines such as GSM06 (234) and in the serum of infected animals will also shed light on the contribution of both host and strain specific effects.

Chapter 7

GENOME-TYPING OF THE MOUSE COLONISING STRAINS

7.1. Background

The previous chapter investigated the specific contributions of both host and strain differences to disease progression during long term colonisation of mice. Recently, H. pylori microarrays have been used to investigate the genomic variation between different strains in an attempt to link genetic elements with disease progression (32, 136, 298) (described in Chapter 1). In addition, some of these studies have used microarray analysis to investigate the notion that there is microevolution of H. pylori strains within the host. The study by Bjorkholm et al. (32) investigated possible changes in the genes encoded on the cag PAI as well as three other separate genetic loci, of two individual strains colonising mice for up to 10 months. In this study no genetic changes in these genes were discovered during the colonisation period. Thus, the investigators concluded that these strains represented stable clones. Another study has utilised this methodology to investigate changes in the entire genetic content of a single H. pylori strain occurring during human colonisation for 6 years (137). In contrast to Bjorkholm’s study, a number of deletions and acquisitions of genes were found in these strains. The reason for the different results of these two studies may be explained by the time frame of colonisation, the sensitivity of the analysis used, or the narrow range of genes tested by Bjorkholm et al. (32, 137).

A recent paper by Kim et al. (157) has investigated the sensitivity of the analysis of genome typing microarray results used in all of the studies mentioned above. The aforementioned investigators all utilised an empirical determination of cutoffs to assign genes as present or absent/divergent in the strains tested. This inflexible methodology is subject to several problems that can result in the misclassification of many genes (157). Kim et al. have developed a new computational analysis program, GACK, which uses a dynamic cutoff value to assign genes. Thus the program accounts for strain composition and hybridisation quality, rendering it a more reliable technique (157).

The studies presented in this chapter aimed to characterise the genetic differences between SS1 and SS2000 strains. In addition, any changes in genetic content that may have occurred in these strains during initial “mousification” (adaptation to the murine gastric environment) and during long term colonisation were analysed using the GACK program.

7.2. Experimental Procedures

7.2.1. Strains Used to Study “Mousification”

Genomic DNA (gDNA) was extracted from very low passage isolates of the original clinical isolates of both SS1 (10700) and SS2000 (PMSS2000/2.1), as well as from the mouse-adapted SS1 (from the A. Lee laboratory) and SS2000 isolates for analysis of the DNA content of these strains using the H. pylori DNA microarray (Table 7.1).

7.2.1.1. Microarray analysis of the strains in the “mousification” study

gDNA labelling and hybridisation to microarrays were performed as described in Chapter 2. In each case 1 µg of the test gDNA sample was labelled with Cy5 and this was hybridised with 1 µg Cy3 labelled reference DNA. The reference DNA used consisted of equal amounts of gDNA from the two H. pylori strains used to make the H. pylori microarray, 26695 and J99 (see Table 2.1 for details). Two microarrays were performed for each of these 4 strains (8 arrays in total).

The duplicate microarrays for 10700 and PMSS2000 and the single arrays for SS1 and SS2000 were performed by N. Salama, and the data from these arrays were combined for analysis with the data obtained from the array hybridisations for SS1 (SS1-S) and SS2000 (SS2000-S) performed in this study (Table 7.1).

7.2.2. Strains Used to Study the Genomic Changes during Long Term Colonisation

gDNA was extracted from one H. pylori strain from each mouse in groups B, C, F and G from both the 6 and 15 month time points from the long term colonisation experiment described in Chapter 6 (output strains; Table 7.2). In addition, gDNA was extracted from the input strains of SS1 and SS2000 as well as from the in vitro passaged stocks of these strains for comparison with the output strains (Table 7.2). The input strain of SS1 (SS1-I) and the SS1 stock strain, in vitro passaged 9 times (SS1-S) (total of 2 input strains), along with 21 SS1 output strains (13 x 6 month strains and 8 x 15 month strains) were analysed. The input (SS2000-I) and stock, in vitro passaged 3 times, (SS2000-S) strains of SS2000 (total of 2 input strains) along with 16 SS2000 output strains (8 x 6 month strains and 8 x 15 month strains) were also analysed.

Table 7.1: Strains used for microarray hybridisation for analysis of genomic changes during “mousification”.

Strain Name | Strain Origin (in vitro passage no.) | HP array no.
SS1 (1) | SS1 (S9)† (SS1-S)‡ | HP8075a
SS1 (2) | SS1 (S11)† | HP4065
10700 (1) | 10700 (S5)† | HP4043
10700 (2) | 10700 (S5)† | HP4044
SS2000 (1) | SS2000 (S3)† (SS2000-S)‡ | HP8076b
SS2000 (2) | SS2000 (S5)† | HP4073
PMSS2000 (1) | 2.1 (S5)† | HP4072
PMSS2000 (2) | 2.1 (S5)† | HP4087

† Refers to the number of in vitro passages for each strain. This is unknown for the two SS1-SF strains.
‡ Refers to the alternate name used for these strains.

7.2.2.1. Microarrays for long term study

The gDNA (1 µg) from each strain was hybridised to 1 array each using the same reference DNA sample described in the previous section (mixture of gDNA from 26695 and J99) (total of 41 arrays) (Table 7.2). Genomic DNA labelling and microarray hybridisation were performed as described in Chapter 2.

7.2.3. Analysis of Genome Typing Microarray Results using GACK

In order to determine changes in the genomic content of the H. pylori strains described in this chapter, the data obtained from the microarrays described above were analysed separately using the microarray genome analysis program GACK (157). This program generates a dynamic cutoff for assigning genes as present or divergent. Divergent genes may either be completely absent or may have significant sequence divergence from the gene used to generate the microarray. Thus, an individual cutoff was determined by GACK for each separate array hybridisation, instead of using an empirically determined constant cutoff for this purpose. The algorithm used functions independently of any normalisation process, which can be influenced by differences in strain composition and hybridisation quality. The program assigns an Estimated Probability of Presence (EPP) according to the distribution of the log2 (R/G) ratios for each array and thus gives an estimate of how likely a gene is to be present. The EPP ranges from 0%, assigned as ‘divergent’, to 100%, assigned as ‘present’. Those genes falling in the transition region between 0% and 100% EPP are classified as ‘slightly divergent’. Three options are provided for the data output: binary (0 = divergent, 1 = present), trinary (-1 = divergent, 0 = slightly divergent, 1 = present), and graded (continuous range from -0.5 (divergent) to 0.5 (present)). The GACK program is available from the Falkow Lab Website (http://falkow.stanford.edu).
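The three output styles can be illustrated with a short sketch. This is not the GACK implementation: the function name `gack_outputs`, the linear rescaling of EPP onto the graded scale, and the 50% EPP threshold used for the binary call inside the transition region are assumptions made here purely for illustration.

```python
def gack_outputs(epp, binary_cutoff=50.0):
    """Map an Estimated Probability of Presence (0-100 %) to three output styles.

    Assumptions (not taken from the GACK source):
    - graded output is a linear rescaling of EPP onto [-0.5, 0.5];
    - the binary call in the transition region uses `binary_cutoff`.
    """
    if not 0.0 <= epp <= 100.0:
        raise ValueError("EPP must lie between 0 and 100")

    if epp == 100.0:
        label, trinary = "present", 1
    elif epp == 0.0:
        label, trinary = "divergent", -1
    else:
        # Transition region between 0 % and 100 % EPP
        label, trinary = "slightly divergent", 0

    binary = 1 if epp >= binary_cutoff else 0
    graded = epp / 100.0 - 0.5  # assumed linear mapping
    return {"label": label, "binary": binary, "trinary": trinary, "graded": graded}


if __name__ == "__main__":
    for value in (0, 30, 75, 100):
        print(value, gack_outputs(value))
```

Under these assumptions an EPP of 100% gives the graded value 0.5 and an EPP of 0% gives -0.5, matching the end points described above.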

7.2.4. Genomic Changes during “Mousification”

Normalised data from the 10 arrays (Table 7.1) were downloaded from SMD using their log2 (R/G) ratios. Data were filtered to remove elements containing failed PCR reactions, and elements whose Cy3 net mean intensity was ≤100. The resulting data set contained 3350 spots. The duplicate spots within each array were then averaged using the NACK program (156) (described in Chapter 2) and genes with at least 90% good data were retrieved, leaving a total of 1522 unique ORFs for analysis using the GACK program (157) (this data set is available in the Supplementary Material, Table S7.1).
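The filtering and averaging steps can be sketched as follows. This is a minimal illustration, not the SMD or NACK code: the record layout, the function name `filter_and_average`, the treatment of the Cy3 value as a lower intensity bound, and the reading of "90% good data" as the fraction of arrays with a usable value are all assumptions.

```python
from collections import defaultdict


def filter_and_average(spots, n_arrays, min_cy3=100.0, min_good_fraction=0.9):
    """Filter spots, average duplicate spots per array, keep well-measured genes.

    `spots` is an iterable of dicts with the hypothetical keys
    'orf', 'array', 'log2_ratio', 'cy3_net_mean' and 'failed_pcr'.
    """
    per_gene = defaultdict(lambda: defaultdict(list))
    for spot in spots:
        # Drop failed PCR elements and low-intensity reference (Cy3) spots
        if spot["failed_pcr"] or spot["cy3_net_mean"] <= min_cy3:
            continue
        per_gene[spot["orf"]][spot["array"]].append(spot["log2_ratio"])

    averaged = {}
    for orf, by_array in per_gene.items():
        # Keep a gene only if enough arrays retain at least one good spot
        if len(by_array) / n_arrays >= min_good_fraction:
            averaged[orf] = {
                array: sum(values) / len(values)
                for array, values in by_array.items()
            }
    return averaged
```

Each retained gene then carries one averaged log2 (R/G) value per array, which is the form of input assumed for the later sketches.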

The trinary output was first used in this study to assign genes in all arrays to the three groups (present, slightly divergent or divergent). Genes in the slightly divergent category and genes whose assignment differed between the duplicate arrays were then further scrutinised using the graded output. If the assigned grade for a particular gene differed by more than 0.5 units (50% variation) between duplicate arrays for the same strain, it was discarded from analysis because its hybridisation was deemed irreproducible. Two separate comparisons of these data were made. First, the pre-mouse and post-mouse strains of SS1 and SS2000 were compared in order to assess changes that may have occurred during “mousification” of these strains. Second, the differences between SS1 and SS2000 were assessed in order to determine possible reasons for the differences in the pathology produced by these strains in mice (see Supplementary Material Table S7.4 for full list). For these two analyses only those genes in which the GACK assignment was in complete agreement between the duplicate arrays for each strain were used.
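The duplicate-array rules described above can be sketched as a small check. The function names and the returned labels are hypothetical; the only values taken from the text are the graded end points (-0.5 and 0.5) and the 0.5 unit reproducibility limit.

```python
def trinary_call(graded):
    """Collapse a graded value (-0.5 to 0.5) into one of three categories."""
    if graded >= 0.5:
        return "present"
    if graded <= -0.5:
        return "divergent"
    return "slightly divergent"


def compare_duplicates(graded_1, graded_2, max_difference=0.5):
    """Apply the duplicate-array rules to one gene in one strain.

    Returns 'irreproducible' if the grades differ by more than
    `max_difference`, 'agree' if both arrays give the same category,
    and 'disagree' otherwise.
    """
    if abs(graded_1 - graded_2) > max_difference:
        return "irreproducible"
    if trinary_call(graded_1) == trinary_call(graded_2):
        return "agree"
    return "disagree"


if __name__ == "__main__":
    for pair in [(0.5, 0.5), (0.05, 0.45), (0.5, 0.25), (0.5, -0.25)]:
        print(pair, compare_duplicates(*pair))
```

Only genes returning 'agree' for every strain in a comparison would be carried forward under this reading of the method.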

7.2.5. Genomic Changes during Long Term Colonisation

The normalised log2 (R/G) data were obtained from SMD using the same filtering criteria as described in section 7.2.4 and the data within each array were averaged for GACK analysis (the data set used for analysis is available in the Supplementary Material, Table S7.2). The graded assignment was used for further analysis of differences between strains over time. Genes which were deemed present (value of 0.5), representing the core genes, or divergent (value of -0.5) (absent or significantly divergent) in ≥80% of the arrays within each strain were eliminated from the analysis. Those genes remaining (non-core genes) whose values varied across arrays from the same parent strain (see Supplementary Material: Table S7.5, non-core SS1 genes (186 genes); Table S7.6, non-core SS2000 genes (140 genes)) were used in analyses to determine changes occurring: (i) between the input and output strains, (ii) between strains obtained from C57BL/6 and BALB/c mice, and (iii) between the 6 and 15 month isolates. Analyses used for this purpose were Hierarchical Clustering (HC), Self Organising Maps (SOM) and Principal Component Analysis (PCA) using the Cluster and Treeview programs.
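The selection of non-core genes can be sketched as below. The function name, the input layout (one list of graded values per gene, one value per array of the same parent strain) and the example gene names are hypothetical, and the 80% figure is treated as an inclusive threshold.

```python
def non_core_genes(graded_by_gene, core_fraction=0.8):
    """Return genes that are neither consistently present nor consistently divergent.

    `graded_by_gene` maps a gene identifier to its graded GACK values
    across all arrays of one parent strain.
    """
    selected = {}
    for gene, values in graded_by_gene.items():
        n = len(values)
        frac_present = sum(v == 0.5 for v in values) / n
        frac_divergent = sum(v == -0.5 for v in values) / n
        # Genes present or divergent in at least `core_fraction` of arrays are dropped
        if frac_present >= core_fraction or frac_divergent >= core_fraction:
            continue
        selected[gene] = values
    return selected


if __name__ == "__main__":
    example = {
        "geneA": [0.5, 0.5, 0.5, 0.5, 0.5],      # core: always present
        "geneB": [-0.5, -0.5, -0.5, -0.5, 0.1],  # divergent in 80 % of arrays
        "geneC": [0.5, 0.5, 0.1, -0.2, 0.5],     # variable: kept
    }
    print(sorted(non_core_genes(example)))
```

The resulting gene-by-array matrix is the kind of input that would then be passed to clustering tools such as Cluster and Treeview.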

7.2.6. Supplementary Material

The following material is available in the supplementary material (see Appendix):

Table S7.1: Normalised averaged data for the pre- and post-mouse strains of SS1 and SS2000.
Table S7.2: Normalised averaged data for Input and Output strains of SS1 and SS2000 from long term colonisation experiment.
Table S7.3.1: Normalised averaged data for the SS1_AL and SS1_SF strains.
Table S7.3.2: Genes that vary between the SS1 isolate from the A. Lee laboratory (SS1-AL) and the isolate obtained by Stanford University (SS1-SF) as determined by GACK analysis, graded output.
Table S7.4: Genes that differ between SS1 and SS2000 as determined by GACK analysis, graded output.
Table S7.5: Genes that vary in the SS1 Input and Output strains (Non-core genes) as determined by GACK analysis, graded output.
Table S7.6: Genes that vary in the SS2000 Input and Output strains (Non-core genes) as determined by GACK analysis, graded output.

Table 7.2: Strains used for genomic analysis of Input and Output strains from the long term colonisation experiment.

Strain Name (Animal no.) | Strain Origin (in vitro passage no.) | Animal Strain | Time Point | Input or Output | HP array no.
SS1-I | SS1 (S9) | BALB/c & C57BL/6 | 0 M | Input | HP8074b
SS1 S | SS1 (S9) | - | - | Stock | HP8075a
SS2000 I | SS2000-Passage D (S3) | BALB/c & C57BL/6 | 0 M | Input | HP8076a
SS2000 S | SS2000 (Passage D) (S3) | - | - | Stock | HP8076b
B1 | SS1 | BALB/c | 6 M | Output | HP8088a
B3 | SS1 | BALB/c | 6 M | Output | HP8088b
B4 | SS1 | BALB/c | 6 M | Output | HP8077a
B5 | SS1 | BALB/c | 6 M | Output | HP8077b
B6 | SS1 | BALB/c | 6 M | Output | HP8089a
B8 | SS1 | BALB/c | 6 M | Output | HP8089b
B9 | SS1 | BALB/c | 6 M | Output | HP8090a
B10 | SS1 | BALB/c | 6 M | Output | HP8090b
B21 | SS1 | BALB/c | 15 M | Output | HP8066a
B25 | SS1 | BALB/c | 15 M | Output | HP8068a
B27 | SS1 | BALB/c | 15 M | Output | HP8066b
B28 | SS1 | BALB/c | 15 M | Output | HP8067a
C1 | SS2000 | BALB/c | 6 M | Output | HP8091a
C5 | SS2000 | BALB/c | 6 M | Output | HP8091b
C6 | SS2000 | BALB/c | 6 M | Output | HP8092a
C7 | SS2000 | BALB/c | 6 M | Output | HP8092b
C9 | SS2000 | BALB/c | 6 M | Output | HP8093a
C10 | SS2000 | BALB/c | 6 M | Output | HP8093b
C21 | SS2000 | BALB/c | 15 M | Output | HP8068b
C26 | SS2000 | BALB/c | 15 M | Output | HP8069a
C27 | SS2000 | BALB/c | 15 M | Output | HP8069b
C28 | SS2000 | BALB/c | 15 M | Output | HP8070a
F1 | SS1 | C57BL/6 | 6 M | Output | HP8094a
F2 | SS1 | C57BL/6 | 6 M | Output | HP8094b
F3 | SS1 | C57BL/6 | 6 M | Output | HP8097a
F6 | SS1 | C57BL/6 | 6 M | Output | HP8095a
F9 | SS1 | C57BL/6 | 6 M | Output | HP8097b
F10 | SS1 | C57BL/6 | 6 M | Output | HP8095b
F22 | SS1 | C57BL/6 | 15 M | Output | HP8071a
F25 | SS1 | C57BL/6 | 15 M | Output | HP8067b
F26 | SS1 | C57BL/6 | 15 M | Output | HP8070b
F27 | SS1 | C57BL/6 | 15 M | Output | HP8071b
G1 | SS2000 | C57BL/6 | 6 M | Output | HP8096a
G2 | SS2000 | C57BL/6 | 6 M | Output | HP8096b
G21 | SS2000 | C57BL/6 | 15 M | Output | HP8073b
G23 | SS2000 | C57BL/6 | 15 M | Output | HP8072a
G24 | SS2000 | C57BL/6 | 15 M | Output | HP8072b
G26 | SS2000 | C57BL/6 | 15 M | Output | HP8073a

7.3. Results

7.3.1. Microarray Analysis of the Genome Content of Pre- and Post-mouse Strains

Genomic typing of the pre- and post-mouse strains of SS1 and SS2000 using microarrays revealed that there may be modification of a number of genes during the “mousification” process. Analysis of the microarray results was performed with the GACK program which calculates an estimate of how likely a gene is to be present in the strain being tested (157). For these analyses the graded output of the program was used which assigns each gene in each array a value in the range between -0.5 and 0.5 depending on the likelihood of it being present (details in the Experimental Procedures). Since the H. pylori microarray was designed using PCR products derived from the genomic DNA of the 26695 and J99 strains (298), genes which gave a high hybridisation signal were those which had a high sequence identity to the gene on the array (present, 0.5). Those genes which gave a very low hybridisation signal could either have been highly divergent in their sequence identity (thus having poor hybridisation qualities) or could have been absent from the strain being tested (divergent, -0.5). Those genes whose hybridisation signal was intermediate were likely to have had slightly divergent sequence identity from the gene used for the array (slightly divergent, value between -0.5 and 0.5).

Data for the duplicate arrays for each strain were compared and only the genes which were in complete agreement between the two arrays were used. Between 10 and 15% of genes were found to have been assigned differently by GACK in the duplicate arrays for each strain and most of these genes had been assigned to the slightly divergent category. These genes were not considered for further analysis because their hybridisation properties were not reliable.

7.3.1.1. Changes in SS2000 from the original clinical isolate, PMSS2000

When the genomic content of the PMSS2000 strain was compared with the “mousified” SS2000 strain there were 7 genes found to differ between the two strains (Fig. 7.1A, gene details in Table 7.3). Four of these were of unknown function. One gene, found to be divergent in the SS2000 strain but only slightly divergent in the PMSS2000 strain, was a putative lipopolysaccharide (LPS) biosynthesis gene (JHP0562). The ribosomal protein gene rps11 was found to be slightly divergent in SS2000, but present in PMSS2000. Finally, a putative outer membrane protein (OMP) gene (HP0486) was found to be present in SS2000, but slightly divergent in the PMSS2000 strain.

7.3.1.2. Changes in SS1 from the original clinical isolate, 10700

Comparison of the pre-mouse SS1 strain, 10700, and the “mousified” SS1 strain revealed many more differences (58 genes) than between the SS2000 strains (Fig. 7.1B, gene details in Table 7.4). Four genes were found to be present in 10700 but divergent or slightly divergent in SS1 including the Ni/Fe hydrogenase expression/formation protein hypD. In contrast 51 genes were found to be present in SS1 and slightly divergent in 10700. Of these genes the majority were of unknown function (18), while there were also 4 genes involved in cell envelope structure (including babB, omp5 and omp29); 8 transport and binding protein genes (including ureI, the urea transporter, and frpB, tonB and fecA involved in iron uptake); and 5 DNA metabolism genes. Of particular interest were two genes, HP0486 and rps11, which had also been shown to vary in the SS2000 strains described above.

7.3.1.3. Genomic differences between SS1 and SS2000

A large number of genes were present in all the pre- and post-mouse SS1 and SS2000 isolates tested (734/1522). These represent a core set of genes which do not change between these strains and thus were subtracted from the list of genes in this analysis. The genes which were divergent in any of the arrays were clustered in Fig. 7.2 using the trinary output (1= present, 0= slightly divergent, -1= divergent) from the GACK analysis. The main sets of genes which differ between the SS1 strains and the SS2000 strains are indicated by the coloured triangles. These are separated into three clusters which are shown in full in the following figures: the restriction enzyme cluster (Fig. 7.3), the transposon (Tn) cluster (Fig.

Figure 7.1: Genomic changes during mousification. Clusters of the GACK graded assignment of the genes from the duplicate arrays which differ between: A) the pre- and post-mouse SS2000 strains (PMSS2000 and SS2000 respectively), and B) the pre- and post-mouse SS1 strains (10700 and SS1 respectively). Blue represents present genes, yellow represents divergent genes, and the intermediate blue, black and yellow shades represent genes which are slightly divergent. OMP is outer membrane protein.

[Figure 7.1: heat maps. Panel A columns: PMSS2000 (1), PMSS2000 (2), SS2000 (1), SS2000 (2). Panel B columns: 10700 (1), 10700 (2), SS1 (1), SS1 (2). Rows are individual genes; gene details are given in Tables 7.3 and 7.4.]

Table 7.3: Genes that vary between the pre- and post-mouse SS2000 strains.

UID | Name | Putative function | Category | PMSS2000 Array 1a | PMSS2000 Array 2a | SS2000 Array 1a | SS2000 Array 2a
JHP0562 | put.b | lipopolysaccharide biosynthesis protein | Cell envelope | -0.05 | 0 | -0.5 | -0.5
JHP0829 | vapD | putative virulence-associated protein D | | -0.4 | -0.45 | -0.5 | -0.5
JHP0929 | | unknown | | 0.5 | 0.5 | -0.05 | 0
HP1295 | rps11 | ribosomal protein S11, 30S RIBOSOMAL PROTEIN | Translation | 0.5 | 0.5 | 0.45 | 0.35
HP0080 | | unknown | | 0.45 | 0.15 | 0.5 | 0.5
HP0486 | OMP | Outer membrane protein | Cell envelope | 0.05 | 0.45 | 0.5 | 0.5
HP1388 | | unknown | | -0.5 | -0.5 | -0.4 | -0.2

a The numbers quoted in these four columns refer to the values assigned by the graded GACK analysis to each of the genes in each of the four microarrays shown (described in Materials and Methods section).
b put. stands for putative function.
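As a worked illustration, the graded values in Table 7.3 can be converted back into category changes between the pre- and post-mouse strains. The values below are transcribed from the table; the helper names and the rule that a strain receives a call only when both duplicate arrays fall in the same category are assumptions of this sketch rather than part of the GACK program.

```python
# (PMSS2000 array 1, array 2), (SS2000 array 1, array 2), from Table 7.3
TABLE_7_3 = {
    "JHP0562": ((-0.05, 0.0), (-0.5, -0.5)),
    "JHP0829": ((-0.4, -0.45), (-0.5, -0.5)),
    "JHP0929": ((0.5, 0.5), (-0.05, 0.0)),
    "HP1295": ((0.5, 0.5), (0.45, 0.35)),
    "HP0080": ((0.45, 0.15), (0.5, 0.5)),
    "HP0486": ((0.05, 0.45), (0.5, 0.5)),
    "HP1388": ((-0.5, -0.5), (-0.4, -0.2)),
}


def category(graded):
    """Name the category implied by a graded value."""
    if graded >= 0.5:
        return "present"
    if graded <= -0.5:
        return "divergent"
    return "slightly divergent"


def strain_call(duplicates):
    """Return the shared category of duplicate arrays, or None if they disagree."""
    categories = {category(value) for value in duplicates}
    return categories.pop() if len(categories) == 1 else None


# Map each gene to its (pre-mouse, post-mouse) category pair
transitions = {
    gene: (strain_call(pre), strain_call(post))
    for gene, (pre, post) in TABLE_7_3.items()
}

if __name__ == "__main__":
    for gene, (pre, post) in transitions.items():
        print(f"{gene}: {pre} -> {post}")
```

With these values every gene receives a call in both strains, and each of the seven genes changes category between PMSS2000 and SS2000.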


Table 7.4: Genes that vary between the pre- and post-mouse SS1 strains.

UNIQID | NAME | Putative Functionb | Category | 10700 Array 1a | 10700 Array 2a | SS1 Array 1a | SS1 Array 2a
HP1265 | nuoF | putative NADH oxidoreductase | Energy metabolism | -0.5 | -0.5 | 0.1 | 0.5
HP0079 | omp3 | outer membrane protein | Cell envelope | -0.5 | -0.5 | 0.5 | 0.05
HP1432 | | histidine and glutamine-rich protein | Transport and binding | -0.5 | -0.5 | 0.05 | 0.5
HP1298 | infA | translation initiation factor EF-1 | Translation | -0.5 | -0.3 | 0.5 | 0.5
HP0354 | dxs | deoxyxylulose-5-phosphate synthase | Biosynthesis cofactors | 0.4 | 0.5 | 0.5 | 0.5
HP1058 | panB | 3-methyl-2-oxobutanoate hydroxymethyltransferase | Biosynthesis cofactors | 0.25 | 0.2 | 0.5 | 0.5
HP0843 | thiB | thiamin phosphate pyrophosphorylase/hydroxyethylthiazole kinase | Biosynthesis cofactors | 0.35 | 0.15 | 0.5 | 0.5
HP1342 | omp29 | outer membrane protein | Cell envelope | 0.05 | 0 | 0.5 | 0.5
HP0227 | omp5 | outer membrane protein | Cell envelope | 0.45 | 0.1 | 0.5 | 0.5
HP0486 | | Outer membrane protein | Cell envelope | -0.2 | -0.4 | 0.5 | 0.5
HP0896 | babB | outer membrane protein | Cell envelope | -0.1 | 0.15 | 0.5 | 0.5
HP0536 | cag15 | cag pathogenicity island protein | Cellular processes | 0.15 | -0.05 | 0.5 | 0.5
HP0979 | ftsZ | cell division protein, GTPase | Cellular processes | -0.4 | -0.35 | 0.5 | 0.5
HP1452 | tdhF | thiophene and furan oxidizer | Cellular processes | 0.1 | 0.35 | 0.5 | 0.5
HP0010 | groEL | chaperone and heat shock protein, 60kDa | Cellular processes | 0.3 | 0.25 | 0.5 | 0.5
HP0599 | hylB | methyl-accepting chemotaxis protein (MCP) | Cellular processes | 0.05 | 0.35 | 0.5 | 0.5
HP1370 | mod | type III restriction enzyme M | DNA metabolism | -0.3 | -0.4 | 0.5 | 0.5
HP0464 | hsdR | type I restriction enzyme R | DNA metabolism | -0.2 | -0.3 | 0.5 | 0.5
HP1347 | ung | uracil-DNA glycosylase | DNA metabolism | -0.1 | 0.35 | 0.5 | 0.5
HP1383 | | restriction modification system S subunit | DNA metabolism | 0.35 | 0.05 | 0.5 | 0.5
JHP1297 | res_1 | putative TYPE III RESTRICTION ENZYME | DNA metabolism | 0.2 | 0.25 | 0.5 | 0.5
HP1269 | nuoJ | NADH-ubiquinone oxidoreductase, NQO10 | Energy metabolism | 0.25 | 0.2 | 0.5 | 0.5
HP0723 | ansB | L-asparaginase II energy metabolism | Energy metabolism | -0.45 | -0.45 | 0.5 | 0.5
HP0903 | ackA | ACETATE KINASE energy metabolism | Energy metabolism | 0.45 | 0.5 | 0.5 | 0.5
HP1212 | atpE | ATP synthase F0, subunit c energy metabolism | Energy metabolism | -0.1 | -0.1 | 0.5 | 0.5
HP0961 | gpsA | glycerol-3-phosphate dehydrogenase | Energy metabolism | 0 | 0.3 | 0.5 | 0.5
HP0690 | fadA | acetyl coenzyme A acetyltransferase (thiolase) | FA metabolism | -0.05 | 0.2 | 0.5 | 0.5
HP0557 | accA | acetyl-coenzyme A carboxylase | FA metabolism | 0.3 | -0.05 | 0.5 | 0.5
HP0919 | carB | carbamoyl-phosphate synthase (glutamine-hydrolysing) | Pyrimidine ribonucleotide biosynthesis | -0.15 | 0.1 | 0.5 | 0.5
HP1572 | dniR | regulatory protein DniR | Regulator | 0.15 | 0.3 | 0.5 | 0.5
HP1213 | pnp | polynucleotide phosphorylase | Transcription | 0.35 | 0.5 | 0.5 | 0.5
HP0835 | hup | histone-like DNA-binding protein | Translation | -0.05 | -0.3 | 0.5 | 0.5
HP1241 | alaS | alanyl-tRNA synthetase | Translation | 0.45 | 0.3 | 0.5 | 0.5
HP0169 | prtC | collagenase | Translation | 0.3 | 0.45 | 0.5 | 0.5
HP1295 | rps11 | ribosomal protein S11, 30S | Translation | 0.1 | -0.25 | 0.5 | 0.5
HP1153 | valS | valyl-tRNA synthetase | Translation | -0.15 | 0.2 | 0.5 | 0.5
HP0686 | fecA | iron(III) dicitrate transport protein | Transport and binding | -0.2 | -0.4 | 0.5 | 0.5
HP0939 | yckJ | amino acid ABC transporter, permease protein | Transport and binding | 0.1 | 0.25 | 0.5 | 0.5
HP0071 | ureI | urease accessory protein, urea transporter | Transport and binding | 0 | 0.25 | 0.5 | 0.5
HP1341 | tonB | siderophore-mediated iron transport protein | Transport and binding | -0.2 | -0.35 | 0.5 | 0.5
HP0872 | phnA | alkylphosphonate uptake protein | Transport and binding | 0.5 | 0.5 | 0.5 | 0.5
HP0915 | frpB | iron-regulated outer membrane protein | Transport and binding | 0.5 | 0.25 | 0.5 | 0.5
HP1506 | gltS | glutamate permease, Sodium/Glutamate Symporter | Transport and binding | 0.45 | 0.25 | 0.5 | 0.5
HP0853 | yheS | ABC transporter, ATP-binding protein | Transport and binding | 0 | -0.35 | 0.5 | 0.5
HP0710 | | cons. hypoth. protein, putative OMP | | -0.1 | 0.2 | 0.5 | 0.5
HP1337 | | cons. hypoth. protein | | -0.2 | 0.15 | 0.5 | 0.5
HP1484 | | cons. hypoth. integral membrane protein | | 0.25 | 0.15 | 0.5 | 0.5
HP0102 | | cons. hypoth. protein | | -0.05 | 0.2 | 0.5 | 0.5
HP0105 | | cons. hypoth. protein | | 0.1 | 0.25 | 0.5 | 0.5
HP0186 | | | | -0.35 | -0.3 | 0.5 | 0.5
HP0720 | | | | -0.15 | 0.3 | 0.5 | 0.5
HP0189 cons. hypoth. 
integral -0.4 -0.05 0.5 0.5 membrane protein HP0902 0.45 0.45 0.5 0.5 HP0762 0.4 0.3 0.5 0.5 HP1566 0.4 0.5 0.5 0.5 HP0408 -0.05 -0.4 0.5 0.5 HP1353 -0.3 0.05 0.5 0.5 HP0594 0 -0.15 0.5 0.5 HP0057 -0.4 -0.2 0.5 0.5 HP1250 0.5 0.05 0.5 0.5 HP0924 dmpI 4-oxalocrotonate tautomerase -0.3 -0.4 0.5 0.5 HP0783 -0.2 -0.35 0.5 0.5 HP0311 -0.45 -0.45 0.5 0.5 HP0823 cons. hypoth. protein 0.5 0.5 0 0.45 JHP0950 0.5 0.35 -0.5 -0.5 HP1097 0.5 0.5 -0.2 -0.5 HP0898 hypD hydrogenase Central 0.5 0.5 0.4 0.2 expression/formation protein intermediary metabolism a The numbers quoted in these four columns refer to the values assigned by the graded GACK analysis to each of the genes in each of the four microarrays shown (described in Materials and Methods section) b Cons. hypoth. stands for conserved hypothetical. Chapter 7 201

7.4A) and the cag PAI cluster (Fig. 7.4B) (the full set of differences between SS1 and SS2000 is listed in the Supplementary Material, Table S7.4).

The restriction enzyme cluster represents 51 genes that were called present or slightly divergent in SS2000, but were divergent to varying degrees in SS1. There were 11 genes involved in DNA recombination, replication or repair as well as the putative LPS biosynthesis protein (HP1578) and a lipoprotein gene (HP1436). The Tn cluster represents genes which were also called present in SS2000 and are divergent in the SS1 strains. In this set of 24 genes there were 4 Tn-like genes, 2 insertion sequences, 1 topoisomerase and 2 genes involved in DNA recombination and metabolism. Finally the cag PAI cluster consisted of 55 genes which were assigned as present in SS1 but divergent in SS2000. The majority of this group represents the genes of the cag pathogenicity island (27). The rest of the genes found divergent in SS2000 included 6 genes involved in DNA metabolism, recombination and repair, 2 OMPs, and 2 LPS proteins. Three of these genes (HP0483, HP0484 and omp27) were also found to covary with the cag PAI genes by Salama et al. (298).

Thus, the genes which differ between these two mouse colonising strains include a large complement of DNA metabolism genes (19), genes involved in cell envelope structure and synthesis (7), genes involved in Tn-related functions (7) and genes of unknown function (60). Many of the genes in the latter group are located in one of the two ‘plasticity’ zones (6).

7.3.2. Microarray Analysis of the Genomic Content of Strains Isolated from the Long Term Colonisation Study

The genomic content of the input strains (those used to infect the mice) and output strains (obtained from mice after euthanasia at the 6 or 15 month time points) from the long term colonisation experiment was assessed using GACK analysis of microarrays performed on the gDNA of these strains. Analysis of the GACK data for the SS1 strains showed that 186 genes were assigned varying levels in more than 20% of the strains (listed in Supplementary Material: SS1 genes that vary). The majority of these genes were of unknown function, 115

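[Figure 7.2 image. Columns: 10700 (1), 10700 (2), SS1 (1), SS1 (2), PMSS2000 (1), PMSS2000 (2), SS2000 (1), SS2000 (2). Marked clusters: restriction enzyme cluster, Tn cluster, cag PAI cluster.]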

Figure 7.2: Genomic differences between SS1 and SS2000. Unsupervised hierarchical cluster of the trinary GACK analysis of the non-core genes (genes with values < 1 in at least one array), in the duplicate arrays representing the pre-mouse (10700 and PMSS2000) and post-mouse (SS1 and SS2000) strains. Blue represents genes called as present (value 1), yellow represents genes called as divergent (value -1) and black represents genes called as slightly divergent (value 0) by GACK. Grey represents missing data. The coloured triangles on the right represent the position of the clusters shown in full in the following figures for the genes that are divergent between SS1 and SS2000.

Figure 7.3: Restriction enzyme cluster. Trinary GACK data for the genes of the restriction enzyme cluster in the duplicate arrays representing the pre-mouse (10700 and PMSS2000) and post-mouse (SS1 and SS2000) strains. Blue represents genes called as present (value 1), yellow represents genes called as divergent (value -1) and black represents genes called as slightly divergent (value 0). Grey represents missing data. M/T stands for methyltransferase; cons. hypoth. stands for conserved hypothetical; LPS is lipopolysaccharide.

[Figure 7.3 image: restriction enzyme cluster. Columns: 10700 (1), 10700 (2), SS1 (1), SS1 (2), PMSS2000 (1), PMSS2000 (2), SS2000 (1), SS2000 (2).
Row labels as extracted: JHP004 putative type II DNA M/T; JHP0929; HP1397; HP0054 adenine/cytosine DNA M/T; HP0340; HP0059; HP0456; HP0667; HP1404 hsdS; HP0262; HP0341; JHP0587; HP0344; HP0513; HP0765; JHP0945; HP0600 spaB; JHP0414 hsdS_1a; JHP0946; JHP0960; HP0052; HP0428 terY; HP0431 ptc1; HP0432 protein kinase; HP0433; HP0434; HP0435 putative mRNA decay factor; HP0449; HP0451; HP0452 putative restriction enzyme; HP0454 hypothetical protein; HP0592 res; HP0766; HP0856; HP0986; HP1283; HP1366 MBOIIR; HP1367 mod; HP1438 mod; HP1438 cons. hypoth. lipoprotein; HP1499; HP1578 LPS biosynthesis protein; HP0593 mod; HP0342; HP0453; HP1009 site-specific recombinase; HP0053; HP0339; JHP0937; HP0343; HP1276.
Unplaced label fragments: "(adenine specific DNA M/T)", "C", "-like protein", "-like protein".]

Figure 7.4: A) Transposon cluster and B) cag PAI cluster of trinary GACK data of genes in the indicated clusters for the duplicate arrays representing the pre-mouse (10700 and PMSS2000) and post-mouse (SS1 and SS2000) strains. Blue represents genes called as present (value 1), yellow represents genes called as divergent (value -1) and black represents genes called as slightly divergent (value 0). Grey represents missing data. M/T is methyltransferase; cons. hypoth. prot. is conserved hypothetical protein; OMP is outer membrane protein; Tn is transposon or transposase; IS is insertion sequence; put. is putative; LPS is lipopolysaccharide; trans. is transferase; bios. is biosynthesis; dehyd. is dehydrogenase.

[Figure 7.4 image. Columns in both panels: 10700 (1), 10700 (2), SS1 (1), SS1 (2), PMSS2000 (1), PMSS2000 (2), SS2000 (1), SS2000 (2).
Panel A (Tn cluster), row labels as extracted: JHP0927; JHP0928; HP1209 iceA; JHP0826 tnpB; JHP0931 topA_3; JHP0932; JHP0933; JHP0934; JHP0935; JHP0936; HP0058; HP0413 PS3IS; HP0414 IS200; HP0994; HP0995 xerD; HP1007 Tn; HP1008 IS200; HP1142; JHP0827; HP0373 cons. hyp. prot.; HP1145; HP0725 putative OMP; HP1224 hemD; HP1265 nuoF. Unplaced label fragment: "tnpA -related (IS606 tn)".
Panel B (cag PAI cluster), row labels as extracted: HP0887 vacA; HP1342 omp29; JHP0616; JHP1132; JHP0318; HP0030; HP1324; HP1381; HP0521 orf7; HP0548 DNA metabolism; JHP1049; JHP1429 alcohol dehyd.; JHP0562 put. LPS bios. prot.; HP0846 hsdR; HP0536 cag15; HP1353; HP1079; HP1354 put. DNA M/T; HP0850 hsdM; HP0520 cag1; HP0526 cag6; HP1520; HP0525 virB11_1; HP0517 cag7; HP0528 cag8; HP0529 cag9; HP0531 cag11; HP0532 cag12; HP0534 cag13; HP0535 cag14; HP0537 cag16; HP0538 cag17; HP0539 cag18; HP0540 cag19; HP0541 cag20; HP0543 cag22; HP0544 cag23; HP0545 cag24; HP0546 cag25; HP0547 cagA; HP1177 omp27; HP1519; JHP0540 put. sugar trans.; HP0091 hsdR; HP0217; HP0484; HP0522 cag3; HP0523 cag4; HP0524 cag5; HP0542 cag21; HP0483 type II DNA M/T; HP0530 cag10; HP1417 hypoth. prot.; HP0893; JHP0820 put. LPS prot.]

(62%), while there were also 12 genes encoding cell envelope proteins, 10 DNA metabolism genes, 10 genes encoding proteins involved in translation (mainly ribosomal) and 9 transport and binding protein genes. A large number of these genes appear to have been assigned values close to 0.5 (indicated by a purple bracket in Fig. 7.5). These genes probably represent those which have only slightly varying hybridisation efficiencies across all the strains.
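The selection of genes that vary "in more than 20% of the strains" can be illustrated with a short sketch. The exact rule applied to the GACK output is not stated here, so the sketch assumes that a gene "varies" when its graded value differs from that gene's most common value in more than 20% of the strains with data. The function name `varying_genes` and the example values are hypothetical and are not data from this study.

```python
from collections import Counter


def varying_genes(graded, min_fraction=0.20):
    """Return gene IDs whose graded value departs from the gene's most
    common value in more than `min_fraction` of strains.

    `graded` maps gene ID -> list of graded values, one per strain.
    None marks missing data and is ignored. Using the most common value
    as the reference is an assumption, not the documented GACK rule.
    """
    selected = []
    for gene, values in graded.items():
        observed = [v for v in values if v is not None]
        if not observed:
            continue  # no usable data for this gene
        modal_count = Counter(observed).most_common(1)[0][1]
        fraction_different = (len(observed) - modal_count) / len(observed)
        if fraction_different > min_fraction:
            selected.append(gene)
    return selected


# Made-up graded values for five strains (hypothetical gene names).
example = {
    "geneA": [0.5, 0.5, 0.5, 0.5, 0.5],    # identical in every strain
    "geneB": [0.5, 0.5, 0.5, 0.5, -0.5],   # differs in 1/5, not more than 20%
    "geneC": [0.5, 0.5, -0.5, 0.0, None],  # differs in 2/4 strains with data
}
variable = varying_genes(example)
```

With these made-up values only `geneC` is selected, because `geneB` differs in exactly 20% of strains and the criterion is strictly "more than".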

The graded output of these genes is shown in the hierarchical cluster (HC) in Fig. 7.5 (the entire list is in the Supplementary Material, Table S7.5). This HC shows that 5/8 of the 15 month output strains cluster together on a separate branch of the dendrogram from the other strains. The SS1 input strains SS1-I and SS1-S cluster with both 6 and 15 month strains. In order to further elucidate which genes differed between sets of SS1 strains, Principal Component Analysis (PCA) was performed and a selection of interesting genes is shown in Fig. 7.6 A-C. PCA, which is closely related to singular value decomposition (SVD), was used as it is useful for finding underlying patterns in array data (8). In Fig. 7.6A, genes which were present in the two input strains but were divergent or slightly divergent in the output strains are shown. These include mainly genes of unknown function as well as 2 conserved hypothetical proteins (HP1286 and HP1343), a putative OMP (HP0358) and a methyl-accepting chemotaxis protein gene hylB. In Fig. 7.6B, genes which appear to be more divergent in strains from BALB/c mice (Group B strains) than in strains from C57BL/6 mice (Group F strains), and vice versa, are shown. These include 2 ribosomal proteins (rpl34 and rps20) and a putative LPS biosynthesis protein (JHP1032). Finally, in Fig. 7.6C, genes which appear to differentiate the 6 and 15 month output strains are shown. These include some unknown proteins, 2 DNA metabolism genes (mod_1 and HP0054) and a putative LPS biosynthesis gene (JHP0820).
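To make the idea behind PCA concrete, the following minimal sketch estimates the first principal component of a small genes-by-strains matrix of graded values. It is not the software used in this study: it assumes power iteration on the covariance matrix between strains rather than a full SVD, and it returns only the first component. An SVD of the same column-centred matrix would give the same direction as its first right singular vector. The function name and the toy matrix are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the first principal component of `rows` by power iteration.

    `rows` is a list of observations (here: genes), each a list of values
    (here: one graded value per strain). Returns a unit-length loading
    vector with one entry per strain; its overall sign is arbitrary.
    """
    n, d = len(rows), len(rows[0])
    # Centre each column (strain) on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix between columns (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Arbitrary non-zero start vector, normalised to unit length.
    vec = [1.0 / (j + 1) for j in range(d)]
    start_norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / start_norm for x in vec]
    # Power iteration: repeatedly multiply by cov and renormalise.
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # the data have no variance; keep the current vector
        vec = [x / norm for x in nxt]
    return vec


# Toy matrix: 3 genes (rows) x 3 strains (columns), made-up values.
# Strains 1 and 2 vary together; strain 3 is constant across genes.
toy = [
    [-0.5, -0.4, 0.5],
    [0.5, 0.4, 0.5],
    [0.0, 0.0, 0.5],
]
pc1 = first_principal_component(toy)
```

In this toy case the loadings fall on the first two strains in the ratio 5:4 and the constant third strain receives a loading of zero, which is the kind of pattern used to pick out groups of strains that differ together.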

There were 140 genes in the SS2000 group of isolates that were shown to vary across the strains (different in more than 20% of strains) and the graded output of these genes is shown in the HC in Fig. 7.7 (the entire list is in the Supplementary Material, Table S7.6). As was shown for the SS1 strains, many of these genes have been assigned values only slightly less than 0.5, indicating that these may have small variations in hybridisation efficiency (purple bracket in Fig. 7.7). The clustering analysis separated the 6 and 15 month output strains of SS2000 into 2 clusters, as for the SS1 strains, although in this case the input strains clustered with the 15 month strains only. A SOM analysis was then used to elucidate sets of genes which differentiated the 6 and 15 month strains, and a selection of these clusters is shown in Figs 7.8 and 7.9. In contrast with the SS1 analysis, differences in the genetic content of the input strains as compared with the output strains were not obvious. Genes that differed between strains isolated from different mouse types were also not apparent. However, Fig. 7.8 shows genes which appeared to be present (blue-black shades) in the majority of the 15 month strains, but divergent or slightly divergent (yellow shades) in the 6 month strains. Included in this group were a number of cell envelope genes (murB, fliE, putative OMP HP0009, imp, omp29, putative LPS biosynthesis gene HP0619, and a vacA paralog) and the putative transport and binding genes (hpn and HP1432). In Fig. 7.9 a further set of genes which differentiate 6 and 15 month strains is shown, but in this case the genes in the input strains appear to be more like the 6 month strains than the 15 month strains. These were mainly genes of unknown function but also included the virulence associated protein vapD and the cell envelope protein omp3. Genes indicated in orange text in Figs 7.6, 7.8 and 7.9 were found to vary in both the SS1 and SS2000 output strains.
[Figure 7.5 image. Column labels as extracted, in dendrogram order: B9, B6, B27, B21, F27, B5, B4, SS1-S, SS1-I, B8, F1, F3, B3, B1, B10, F6, F10, F9, F25, B28, B25, F26, F22.]

Figure 7.5: Genes which vary in SS1 output strains. An unsupervised hierarchical cluster of the graded values from the GACK program showing the non-core genes of the SS1 input (green) and output strains (6M black; 15M red) from the long-term colonisation experiment. Only genes which varied in GACK assignment in more than 80% of the arrays were used. The purple bracket indicates a set of genes which have been assigned values close to 0.5 across the majority of strains. Blue colours represent present genes, yellow represents divergent genes and the blue, black and yellow intermediate shades represent genes which are slightly divergent, indicated by the scale shown at the bottom.

Figure 7.6: A selection of clusters from the principal component analysis of the genomic content of the SS1 input and output strains. A) Genes present in the input strains but varying in the output strains, B) genes which have different values in the Group B (BALB/c) strains compared to the Group F (C57BL/6) strains, C) genes different in the 6 M and 15 M output strains. In each case strains in: green are the input strains; black are 6 M isolates; and red are 15M isolates. Genes shown in orange text were found to vary in SS2000 output strains also. Blue colours represent present genes, yellow represents divergent genes and the blue, black and yellow intermediate shades represent genes which are slightly divergent. M/T is methyltransferase; cons. hypoth. prot. is conserved hypothetical protein; OMP is outer membrane protein; LPS is lipopolysaccharide; bios. is biosynthesis; dehyd. is dehydrogenase; rest. is restriction; secr. is secreted.

[Figure 7.6 image. Columns in all panels: SS1-I, SS1-S, B1, B10, B21, B25, B27, B28, B3, B4, B5, B6, B8, B9, F1, F10, F22, F25, F26, F27, F3, F6, F9.
Panel A, row labels: HP1145; HP0335; HP1412; HP0962 acpP (acyl carrier prot.); HP1001; HP1409; HP0849; HP0731; HP1286 cons. hypoth. prot.; HP0218; HP0462 hsdS (type I rest. enzyme S); HP1343 cons. hypoth. prot.; HP0488; HP0060; HP0358 putative OMP; HP0965; HP0161; HP0357 short chain alcohol dehyd.; HP0730; HP0599 hylB (hemolysin secr. prot.); HP1474 tmk (thymidylate kinase).
Panel B, row labels: JHP1032 putative LPS bios. prot.; HP0380 gdhA (glutamate dehyd.); HP1447 rpl34 (50S ribosomal prot.); JHP1437; HP0076 rps20 (30S ribosomal prot.); HP0225; HP0723 ansB (L-asparaginase II); HP1158 proC (reductase); HP0850 hsdM (type I rest. enzyme M).
Panel C, row labels: HP0054 adenine/cytosine DNA M/T; HP0426; HP0338; JHP0820 putative LPS biosynthesis; JHP0956; JHP0585 putative 3-hydroxyacid dehyd.; JHP0955; JHP1296 mod_1 (type III DNA M/T); JHP1306; HP1064; JHP0616.]

Figure 7.7: Genes which vary in SS2000 output strains. An unsupervised hierarchical cluster of the graded values from the GACK program showing the non-core genes of the SS2000 input (green) and output strains (6M black; 15M red) from the long-term colonisation experiment. Only genes which varied in GACK assignment in more than 80% of the arrays were used. The purple bracket indicates a set of genes which have been assigned values close to 0.5 across the majority of strains. Blue colours represent present genes, yellow represents divergent genes and the blue, black and yellow intermediate shades represent genes which are slightly divergent, as shown in the scale at bottom.

[Figure 7.7 image. Column labels as extracted, in dendrogram order: G2, G1, C9, C7, C10, C1, C6, C5, C27, G23, G24, C28, C26, G26, G21, SS2000-S, SS2000-I, C21.]

Figure 7.8: Genes more divergent in 6M than 15M SS2000 output strains. A cluster of genes from the SOM analysis of the SS2000 input and output strains from the long-term infection experiment. Strains in: green are the input strains; black are 6 M isolates; and red are 15M isolates. Genes shown in orange text were found to vary in SS1 output strains also. Blue colours represent present genes, yellow represents divergent genes and the blue, black and yellow intermediate shades represent genes which are slightly divergent. OMP is outer membrane protein; put. is putative; trans. is transferase; bios. is biosynthesis; dehyd. is dehydrogenase; prot. is protein; syn. is synthesis; transp. is transporter; cyt. is cytochrome; FA is fatty acid.

[Figure 7.8 image. Columns: C1, C5, C6, C7, C9, C10, G1, G2, C21, C26, C27, C28, G21, G23, G24, G26, SS2000-I, SS2000-S.
Row labels: HP0107 cysK (cysteine syn.); HP1418 murB (bios. of murein sacculus); HP1557 fliE (flagellar basal-body prot.); HP1281 trpD (anthranilate syn.); HP0009 putative OMP; HP1211; HP1138 plasmid related prot.; HP0300 dppC (dipeptide ABC transp.); HP0265 ccdA (cyt. c biogenesis prot.); HP0380 gdhA (glutamate dehyd.); HP1215 imp (OM permeability); HP1427 hpn (metal binding prot.); HP1342 omp29; HP0201 plsX (FA/phospholipid syn. prot.); HP1295 rps11 (30S ribosomal prot.); HP0333 dprA (DNA processing chain A); HP0574 lacA (galactosidase acetyltrans.); HP0620 ppa (inorganic pyrophosphatase); HP1269 nuoJ (NADH oxidoreductase I); HP0126 rpl20 (50S ribosomal prot.); HP0710 cons. hypoth. prot.; HP0932; HP1447 rpl34 (50S ribosomal prot.); HP0681; HP0551 rpl31 (50S ribosomal prot.); HP1204 rpl33 (50S ribosomal prot.); HP1286 cons. hypoth. secreted prot.; JHP0954; JHP0961; JHP1306; HP0003 kdsA (synthase); HP0041 comB3 (transformation compet.); HP0499 pldA (phospholipase A1); HP0491 rpl28 (50S ribosomal prot.); HP0365; HP1257 pyrE (phosphoribosyltransferase); HP0619 put. LPS bios. prot.; HP0928; JHP0585 put. 3-hydroxyacid dehyd.; HP1212 atpE (ATP synthase F0); HP1064; HP1432 put. metal-binding prot.; HP0289 vacA paralog.]

[Figure 7.9 image. Columns: C1, C5, C6, C7, C9, C10, G1, G2, C21, C26, C27, C28, G21, G23, G24, G26, SS2000-I, SS2000-S.
Row labels: HP1396; JHP0955; HP1289; HP0089 pfs protein; JHP0958; JHP0828; JHP0929; JHP0937; JHP0956; HP0425; HP0447 cons. hypoth. prot.; HP0261; HP0314; HP1410; HP0488; HP0315 vapD (virulence assoc. protein D); HP0462 hsdS (type I rest. enzyme S prot.); HP0346; HP0079 omp3.]

Figure 7.9: Genes different in the 6M and 15M output SS2000 strains. A cluster of genes from the SOM analysis of the SS2000 input and output strains from the long-term infection experiment. Strains in: green are the input strains; black are 6 M isolates; and red are 15M isolates. Genes shown in orange text were found to vary in SS1 output strains also. Blue colours represent present genes, yellow represents divergent genes and the blue, black and yellow intermediate shades represent genes which are slightly divergent. Cons. hypoth. prot. is conserved hypothetical protein; rest. is restriction.

7.4. Discussion

Host adaptation of H. pylori strains has been reported by a number of other groups in mice (269), gerbils (136), primates (84) and humans (100, 137, 153). The mechanisms of these adaptations are largely unknown; however, they are likely to involve genomic variations given the high rate of genetic recombination and mutation in H. pylori (32, 100). Genomic variation is also likely to induce changes in phenotype such as cag PAI functionality, LPS and Lewis antigen expression (220, 269, 381). Thus, in this study the variations in genetic content between the pre-mouse isolates, 10700 and PMSS2000, and the post-mouse isolates, SS1 and SS2000, were investigated using an H. pylori microarray.

The recently described GACK microarray analysis program was used to assign genes in each of the strains tested as present, slightly divergent or highly divergent/absent (157). Kim et al. reported that for the H. pylori microarray used in the original H. pylori genomic strain comparison (298) (the same one as used in the present analysis), the genes classified as present had more than 92% sequence identity with the region of the gene represented on the array (derived from 26695 or J99). Those genes classified as divergent had up to 89% sequence similarity to the arrayed gene, and those with 89-92% identity were assigned as slightly divergent. Consequently, a gene whose hybridisation was not detected on the H. pylori microarray may have been completely absent or may simply have been too divergent to hybridise (<89% sequence identity). This is significant considering that, in general, genes with 80-85% sequence identity are considered to be homologues (157). It was also found that hybridisation efficiency to these arrays correlates with sequence identity, and thus more information than simply “present” versus “absent” genes can be obtained from gDNA array hybridisations using GACK analysis (157).
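The reported correspondence between sequence identity and GACK category can be written down as a small lookup. This is only an illustration of the thresholds quoted above: GACK itself assigns categories from the hybridisation signal, not from sequence identity, and the function name is hypothetical. Treating exactly 89% as slightly divergent is an assumption, since the text leaves that boundary open.

```python
def category_from_identity(percent_identity):
    """Map percent sequence identity with the arrayed gene to the GACK
    category it was reported to correspond to (thresholds from the text).

    Assumption: exactly 89% is treated as slightly divergent.
    """
    if percent_identity > 92.0:
        return "present"
    if percent_identity >= 89.0:
        return "slightly divergent"
    return "divergent or absent"


# An 85% identical homologue would not be distinguished from a missing gene.
examples = {pid: category_from_identity(pid) for pid in (97.0, 90.5, 85.0)}
```

The last example shows why a "divergent" call cannot by itself establish that a gene is absent.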

In the original genomic strain comparison using the H. pylori microarray, Salama et al. (298) calculated the number of false positives and false negatives obtained using a calculated constant cutoff value. In that study up to 7% of the strain specific genes (7/105) in the two strains on the array, J99 and 26695, were detected as false positives, while up to 2% of the core genes (32/1570) were assigned as false negatives. Using the GACK analysis instead of the constant cutoff analysis, this error rate may have been improved; however, this was not empirically determined. Therefore, it is expected that some of the genes in the present study may be falsely called and that specific PCR and sequencing analyses are required to indisputably establish changes in the genetic content of the tested strains.
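As a quick check of the quoted error rates, the percentages can be recomputed from the counts given above. The variable and function names are hypothetical; the counts are those quoted from Salama et al. (298).

```python
def percent(count, total):
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total


# Counts quoted in the text for the constant cutoff analysis.
false_positive_pct = percent(7, 105)    # strain specific genes
false_negative_pct = percent(32, 1570)  # core genes

summary = (round(false_positive_pct, 1), round(false_negative_pct, 1))
```

Rounded to one decimal place these are 6.7% and 2.0%, consistent with the "up to 7%" and "up to 2%" figures.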

7.4.1. Genomic Changes during “Mousification”

The GACK analysis of the pre- and post-mouse isolates revealed that there were changes in the genetic content of the strains after adaptation to colonisation of C57BL/6 mice. A larger number of changes were detected between 10700 and SS1 (58) than between PMSS2000 and SS2000 (7). It is possible that this difference reflects the fact that the pre-mouse isolate of SS2000 had higher intrinsic mouse colonisation ability than 10700 (see Chapter 6). Some of these changes may be accounted for by false positive or negative calls, particularly in the SS2000 strain where the differences between the strains were mostly in the slightly divergent category (Fig. 7.1A). In addition, many of the genes which appeared to change during mousification, especially in SS1, were found to be divergent in the pre-mouse strain and present in the post-mouse strain. It is unlikely that this result indicates acquisition of genes during mousification, but rather that the sequence of these genes in 10700 may have had less identity with the portion of the gene used for the array than the same gene in the mousified SS1 strain. Thus, changes in these genes that occurred during mousification may have resulted in increased hybridisation efficiency through nucleotide modifications that made the sequence more like that of the gene on the array.

Interestingly, most of these possible changes in both strains occurred in genes of unknown function, which suggests that these genes may be important in colonisation and persistence in vivo in different hosts. One of these, HP0486, a putative OMP gene, was found to be slightly divergent in both of the pre-mouse isolates of SS1 and SS2000, while in both the post-mouse isolates it was deemed to be present, and thus is likely to have a high sequence similarity with the arrayed gene.

Many of the other genes which were divergent to varying degrees were genes involved in cell envelope modifications. These include: a gene for a putative LPS biosynthesis protein; two genes involved in fatty acid biosynthesis; five OMP genes and nine genes coding for transport and binding proteins (Fig. 7.3). These changes in the cell envelope may mediate improved interactions with the specific host (i.e. human versus murine host) and may include epithelial cell adhesion, host-cell signalling, immune evasion or variations in metabolite transport.

Recombination has been previously shown to occur in three of the OMP genes, babB, omp5 and omp29 (153, 274). These genes were found in this study to vary in 10700 and SS1. All three were found to be slightly divergent in 10700, but present (highly similar to the arrayed gene) in SS1. In the case of omp5 and omp29, Kersulyte et al. (153) found various different alleles of these genes in strains isolated from the same human patient. These two OMP genes are duplicate genes in the H. pylori genome (5). These are duplicated in the majority of H. pylori strains, which suggests a mechanism of homologous recombination may have resulted in the different alleles (5, 153). For the babB gene, Pride and Blaser (275) found that recombination occurs between the babB and babA loci in H. pylori, thus resulting in variation in both of these genes. Thus changes in the alleles of these three genes during mouse passaging may have changed the hybridisation efficiencies of these genes between the pre- and post-mouse SS1 isolates. Further PCR and sequencing analysis of these loci are needed to confirm this possibility. Although babA has been shown to be an adhesin, binding to Lewis b antigens, mice do not express this antigen (275, 351) and thus the changes in these strains are unlikely to affect adhesion in mice.

It is also intriguing that many transport and binding protein genes appear to have changed sequence identity between 10700 and SS1, especially the ureI gene and the iron uptake gene frpB. The ureI gene has been shown to code for a urea transporter required for survival in acid pH environments and colonisation (382). Since gastric acid secretion in rodents differs from that in humans (163), it is possible that differential activity of the urease enzyme is required for mouse colonisation. This may also require variation in urea transport. Similarly, iron uptake is particularly important for in vivo survival. The frpB iron uptake gene is thought to be involved in heme and/or lactoferrin binding (388, 389). Thus, changes in the frpB gene may be necessary for H. pylori to adapt to the murine forms of these two proteins.

Together with the differences in genome content found in this microarray study there is a distinct possibility that some of the genetic changes occurring in these strains during mouse adaptation may occur in strain specific genes which are not represented on the microarray. Addition of newly discovered strain specific genes to the H. pylori microarray would help elucidate the importance of these genes in multiple strains adapted to different hosts.

7.4.2. Genomic Differences between SS1 and SS2000

The comparison of genomic content between the two mouse colonising strains SS1 and SS2000 revealed a number of differences in genes of unknown function. This type of variation between strains is not unusual, as most of these genes are located in a “plasticity” zone of the chromosome which is variable between many strains (6, 298). There were also differences found in the presence/divergence of OMP genes, LPS genes and DNA metabolism genes. These differences may contribute to variations in host responses to infection by these strains via immune evasion or host cell interaction properties. It is possible that the structure of the LPS is particularly important for these mouse colonising strains. LPS derived from SS1 has been shown to induce acid secretion from gastric glands derived from mice, whereas the other clinical strains tested did not (252). Also, SS1 has been shown to express the smooth LPS form required for mouse colonisation (220). These properties require further investigation and comparison to the SS2000 LPS.

The entire cag PAI was found to be divergent/absent in the SS2000 strain. Since this divergence occurs in all 27 genes of the cag PAI, it is likely that this locus is completely absent from SS2000. Preliminary RT-PCR detection of the cagA gene confirms that at least this gene appears to be missing in SS2000 (data not shown). The absence of the cag PAI in this mouse adapted strain illustrates that these genes are not required for mouse colonisation. Furthermore, all colonisation and inflammation properties of this strain can be assumed to be cag PAI independent. Interestingly, all 27 genes of the cag PAI were found to be present in the SS1 strain tested. A recent report by Salama et al. (298) showed that the SS1 strain used in this Stanford laboratory was missing the orf7 gene in the cag PAI. Additional reports have shown that the cag PAI of SS1 was non-functional because it was unable to induce IL-8 secretion from AGS cells (67). The particular strain used by Salama et al. (298) (SS1-SF) was obtained through A. Covacci in Siena, Italy, rather than from the original source in Australia (174). Thus, a combination of movement between laboratories, extensive in vitro passaging and mouse passaging may have caused changes in the genome (see Supplementary Material for data from both SS1_AL and SS1_SF, Table S7.3.1 and Table S7.3.2). Full sequencing analysis of the genes in the PAI would be necessary to interpret the reason for the lack of ability of SS1 to induce IL-8 production from AGS cells in vitro.

7.4.3. Changes in the Genomic Content of SS1 and SS2000 after Long Term Colonisation

Investigations into the genomic content of both the 6 and 15 M output strains of SS1 and SS2000 showed that differences in many genes could be detected using microarray analysis. In this analysis, though, most changes were likely to be due to differing hybridisation properties of the genes rather than gene deletions, as most differences were assigned in the slightly divergent category. In addition, only one array hybridisation was performed for each of these isolates. Since the GACK program assigned about 10-15% of genes in the duplicate arrays for the “mousification” study differently, it is possible that the large number of genes which were slightly divergent across many of the strains in the long term study (purple bracket in Fig. 7.5 and Fig. 7.7) may have occurred due to artefacts of hybridisation. Thus, duplicate arrays need to be performed on these strains to assess this possibility. Also, only one strain per mouse was isolated and tested rather than multiple isolates from individual mice. Determining variation within individuals would help to establish whether the changes seen in this analysis were simply due to the limited sampling or reflected real changes in these genes during colonisation.
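The 10-15% disagreement between duplicate arrays mentioned above is the kind of figure that can be obtained by comparing the trinary calls of two replicate hybridisations gene by gene. The following is a minimal sketch of that comparison, assuming calls coded as 1 (present), 0 (slightly divergent) and -1 (divergent), with None for missing data; the function name and the example calls are hypothetical and are not data from this study.

```python
def discordance(calls_a, calls_b):
    """Fraction of genes whose trinary call differs between two replicate
    arrays. Genes with missing data (None) in either array are skipped."""
    compared = 0
    differing = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no genes with data in both arrays")
    return differing / compared


# Made-up calls for ten genes on two replicate arrays of one strain.
replicate_1 = [1, 1, 0, -1, 1, None, 1, 0, 1, 1]
replicate_2 = [1, 0, 0, -1, 1, 1, 1, 0, 1, 1]
rate = discordance(replicate_1, replicate_2)  # 1 of 9 comparable genes differ
```

A discordance rate of this size between replicates sets a floor below which apparent differences between single arrays of different isolates cannot be distinguished from hybridisation noise.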

Despite these caveats, the difference between pre- and post-mouse isolates of these strains occurred mostly in genes of unknown function, genes involved in the cell envelope, and DNA metabolism genes, as was the case in the “mousification” study. These changes in genetic content varied enormously across all of the output strains tested, even in strains from the same parental origin. For this reason the graded output from the GACK program was used rather than the trinary output in order to ascertain possible trends in the changes which occurred in these strains. With the duplication of arrays for each of these strains and the investigation of more strains, it is likely that the trinary output assigning genes to the divergent, slightly divergent and present categories will be of more use.

Variations in both the SS1 and SS2000 groups of output strains showed that to some extent the 6 month strains clustered separately from the 15 month strains. This suggests that there may have been some consistent time dependent changes during infection. Also, a small number of genes were shown to vary in both the SS1 and SS2000 output strains, indicating some consistency in these genetic changes due to mouse colonisation. For example, HP1064, JHP0955 and JHP0956 were found to be less divergent from the arrayed gene in the 15 M strains than in the 6 M strains of the same parental origin. In the SS1 group of strains some changes could be loosely correlated to the difference between input and output strains (Fig. 7.6A), and between strains obtained from different mouse types (Fig. 7.6B). These correlations were not obvious in the SS2000 group of strains, possibly due to the smaller number of isolates tested. Also, although there were significant variations in the level of pathology produced by infection with either strain in individual mice, the genetic differences of the output strains from these animals could not be correlated with severity of pathology.

Together these data suggest that there may be microevolution of H. pylori strains within each individual host over time. They also show that significant changes in genetic makeup can be detected within 6 months of infection in mice. It is possible that there are also significant variations in genetic content of the strains within each individual, but in the present study only one output strain (one single colony) per infected mouse was analysed. The clustering of the 6 and 15 month strains independently using an unsupervised clustering technique suggests that differences between animals and time points are probably more important and consistent than differences within clones from one individual. More array analyses of these output strains together with sequence information are needed to confirm these observations.

The large numbers of genomic changes detected in the output strains in this study appear to be in disagreement with the study by Bjorkholm et al., which did not find any genetic changes during 3 or 10 month infection of mice (32). This group used a membrane array containing just the genes of the cag PAI and also sequenced 3 other genetic loci to detect changes. Thus, the reason for this discrepancy may be related to the genetic loci which this group tested. In addition, Bjorkholm and colleagues used a constant cutoff analysis to assess the presence/absence of genes in the strains tested. The use of the GACK program to analyse hybridisation results from genome typing arrays increases the ability to detect subtle changes in sequence identity which are missed when using a constant cutoff approach (157). However, as discussed, some of the changes detected in the present analysis may be related to hybridisation artefacts and thus duplicate arrays are required.

In another study, Israel et al. were able to detect considerable changes in the genomic content of the same strain isolated from an individual patient 6 yr apart (137). In addition, recent multiple isolates of this strain showed variation, indicating substantial microevolution of this H. pylori strain within the individual host (137). This analysis was also done using a constant cutoff analysis of the microarray data and thus is probably a conservative estimate of the changes which may have occurred in this isolate over time. Some of the same genes found to vary in the study by Israel et al. were also found to change in the output strains in the present study such as JHP0929 and JHP0937. Thus, this comparison suggests that even in the comparatively short time of infection in the present study similar changes in genetic content may have occurred as they did in the human host.

7.5. Conclusions

The use of microarrays for H. pylori genome typing concurrent with infection studies has highlighted some of the dynamics of the host pathogen relationship. These studies have shown that mouse adaptation of two H. pylori strains included changes in genetic content which were partly strain specific. In addition, further genomic changes occurred in a time dependent fashion during long term colonisation of mice, as has been observed in humans. The level of colonisation achieved and the inflammation produced were shown to vary in this study even between two highly mouse adapted strains and this may be directly related to differences in their genomic content. It is also possible that the factors affecting these differences may be due to strain specific genes not present on the H. pylori microarray used.

Chapter 8

HOST TRANSCRIPTIONAL RESPONSE TO H. PYLORI INFECTION

8.1. Background

The aberrant expression of a number of individual genes in the gastric mucosa has already been linked to H. pylori infection. These include somatostatin (393), actin (255), and the Major Histocompatibility Complex class two (MHC II) encoding genes (101). However, little is known about the global expression response of the gastric mucosa to infection with H. pylori. To attempt to address this, a number of groups have used cultured human gastric epithelial cells to investigate the effect of H. pylori infection. These studies compared the effects of strains containing functional and non-functional cag pathogenicity islands (cag PAI) (20, 66, 122, 185). In all of these studies a relatively small number of genes appear to have been differentially expressed during infection, indicating that the response to H. pylori was narrow and specific. Of these genetic changes, some may be attributed to a generalised response to bacterial contact rather than being specific to H. pylori infection (66).

There was little overlap in the genes found to be differentially regulated in the global in vitro analyses mentioned above. This may stem from the divergent conditions and experimental protocols used by each group, in particular the multiplicity of infection, the cell line used and the length of infection. All these factors could result in important differences in the transcriptional response of the infected cells. These types of in vitro cell culture array experiments are primarily useful as a screen for genes involved in the infection response rather than representing a complete picture of the pathogen-induced changes in host gene expression. In vivo confirmation of the induced expression of the genes identified in vitro is essential. In the study by Cox et al. (66), patient gastric biopsies were tested for the induction of a selection of genes suspected to be involved in the host response to H. pylori. The induction of some of these genes in vivo was confirmed using this method.

Although transcriptional profiling of specific host cells can be informative, it is desirable to capture the response of all host cells within the stomach tissue of an infected animal as this may be more clinically relevant. Using whole tissue from an infected animal, though, introduces the extra complication of a mixture of cell types being present. In an attempt to address this issue, Mills et al. (214) infected mice with H. pylori for 2 or 8 weeks and then used lectin panning to isolate the parietal cell population in the stomachs of these mice. The transcriptional profile of the parietal cell (PC) population was then compared with the non-PC population in infected versus uninfected mice (214). This approach enables the complexity of gastric samples to be reduced while providing an in vivo profile of the host response to infection.

Recently, it has been shown that RNA extracted from whole stomachs of H. heilmannii infected mice can be used in microarray analysis without separation of the cell types. Despite the complexity of these samples, differences in disease state could be linked to the genetic profile of these samples (224).

The two previous chapters (Chapters 6 & 7) outlined the results from an experiment in which C57BL/6 and BALB/c mice were infected with the H. pylori strains SS1 or SS2000. The pathology detected 15 months post infection in these mice varied. The C57BL/6 mice developed a mild to moderate chronic active gastritis, while the BALB/c mice accumulated MALT tissue. These results indicated that both host and pathogen strain specific effects influence disease outcome in the murine model. Also, in the previous chapter (Chapter 7) a number of genetic differences were detected between the SS1 and SS2000 strains and these may influence the type of pathology produced in the mouse model. An understanding of the host’s transcriptional responses to H. pylori infection of the two different mouse strains may help further explain the diverse disease outcomes seen in the mouse model and in infected human populations and individuals (215).

Thus, the aim of the present study was to investigate the global transcriptional basis for the differences in the host response seen in infection with the two H. pylori strains SS1 and SS2000 (Chapter 6 & 7). We predicted that specific transcriptional signatures could be detected depending on the mouse strain, the infecting H. pylori strain, and the pathology produced.

8.2. Experimental Procedures

To investigate the host transcriptional response to infection with H. pylori, total RNA samples extracted from a portion of the animals euthanased after 15 M infection (included in the long term colonisation study described in Chapter 6) were analysed by microarray hybridisation (shaded in Table 8.1). The samples were hybridised to two separate sets of mouse microarrays. Stanford style cDNA murine microarrays containing 23 000 spotted elements derived from the RIKEN mouse clone set (named SMK and SML arrays) were used for the samples from the C57BL/6 animals: control animals (Group E- 5 samples), SS1 infected animals (Group F- 5 samples), and SS2000 infected animals (Group G- 10 samples). A total of 20 arrays for the C57BL/6 animals were analysed.

Due to supply problems with these SMK & SML arrays, a different murine cDNA microarray set supplied by the Stanford Functional Genomics Facility, containing 38 000 elements derived from the RIKEN (213) and NIA (353) mouse clone sets (named MMM arrays), was used for the samples from the BALB/c animals: control animals (Group A- 10 samples), SS1 infected animals (Group B- 10 samples), and SS2000 infected animals (Group C- 10 animals). To assess the reproducibility of these array hybridisations, four repeat arrays were done for samples A16, B11, B15 and C20 (indicated by r in all Figures). Thus a total of 34 arrays were performed for the BALB/c animals.

In both cases the reference sample for array hybridisations consisted of pooled RNA extracted from all 10 control animals of the appropriate mouse strain. The experimental sample RNAs were labelled with Cy5 dye while the reference RNAs were labelled with Cy3 dye, with the exception of the F12 sample for which the dyes were swapped to check for labelling bias {log2 (G/R) ratio for F12 was obtained from SMD and then converted to log2 (R/G) for analysis}. For the BALB/c array hybridisations, mixtures of samples from the control, SS1 and SS2000 infected animals were completed on the same day to control for day to day variations in labelling and hybridisation parameters.

Table 8.1: Animal groups and euthanasia time points for the long term colonisation experiment.

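Mouse strain | BALB/c | BALB/c | BALB/c | C57BL/6 | C57BL/6 | C57BL/6
Group | A | B | C | E | F | G
Inoculation | BHI | SS1 | SS2000 | BHI | SS1 | SS2000
Time point: 6 months (a), number of animals | 10 | 10 | 10 | 10 | 10 | 10
Time point: 6 months (a), animals | A1-A10 | B1-B10 | C1-C10 | E1-E10 | F1-F10 | G1-G10
Time point: 15 months (i) (b), number of animals | 10 | 10 | 10 | 10 | 10 | 10
Time point: 15 months (i) (b), animals | A11-A20 | B11-B20 | C11-C20 | E10-E21† | F11-F20 | G11-G20
Time point: 15 months (i) (b), analysed by array | 10 (c) | 10 (c) | 10 (c) | 5 (d) | 5 (d) | 10 (d)
Time point: 15 months (ii) (a), number of animals | 8* | 10 | 10 | 9* | 10 | 10
Time point: 15 months (ii) (a), animals | A21-A28 | B21-B30 | C21-C30 | E22-E30 | F21-F30 | G21-G30

* A portion of animals in these groups died before the time of euthanasia.
† One control animal in this group excluded from analysis.
(a) The stomachs from these animals were split in half for CFU counts and histology.
(b) The stomachs from these animals were split in half for RNA extraction and histology.
(c) Analysed on the MMM arrays.
(d) Analysed on the SMK or SML arrays.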

8.2.1. RNA Labelling and Microarray Hybridisation

The detailed protocols for murine RNA labelling and microarray hybridisation are described in Chapter 2. In brief, 40 µg total RNA (for samples from C57BL/6 animals) or 20 µg total RNA (for samples from BALB/c animals) was used to generate labelled cDNA probes in which aminoallyl-dUTP was first incorporated into the first strand cDNA and then coupled with the appropriate Cy-dye. The experimental and reference samples were then combined and hybridised to the appropriate murine array for 48 h at 65 °C in 3.5× SSC and 0.3% SDS.

8.2.2. Data Retrieval

Arrays were scanned using a GenePix 4000A scanner (Axon Instruments) and the images analysed with the GenePix Pro software. These data were stored in the Stanford Microarray Database (SMD) (320). The data for the SMK and SML arrays (C57BL/6 samples) were retrieved from SMD in a separate analysis from the MMM arrays (BALB/c animals) because the latter array type contained elements not present in the SMK/L arrays (the NIA set of clones). In each case the normalised log2 (R/G) ratios were retrieved from SMD using filtering criteria under which spots were excluded due to obvious spot abnormality, spot quality {a regression correlation of < 0.6; a standard deviation of pixel intensity ratios of > 2.5; and the percent saturated pixels > 30% in either Channel 1 (green) or Channel 2 (red)}, and low signal (sum of the median intensities from both channels = 350). Finally, only those genes whose log2 (R/G) ratios were more than 1.5 (for the SMK and SML arrays) or 2 (for the MMM arrays) standard deviations away from the mean in at least 2 arrays were retrieved for analysis, and those genes and arrays with < 80% good data across all the arrays of the same type were excluded. For the SMK and SML array set the filtering resulted in 3146 elements and only 17/20 arrays; two control arrays and one SS1 array were found to contain too much missing data to be included in the analysis (full data set available in the Supplementary Material, Table S8.1). For the MMM set of arrays the filtered gene set contained 3141 elements and the full 34 arrays (full data set available in the Supplementary Material, Table S8.2).
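The gene-level part of this filter can be illustrated with a minimal Python sketch. It is not the SMD implementation. It assumes that the mean and standard deviation are computed per gene across arrays (the text does not state this), that spots failing the quality criteria have already been replaced by None, and that the function name, parameter names and toy data are hypothetical. The array-level exclusion is not shown.

```python
from statistics import mean, stdev

def filter_genes(log_ratios, sd_cutoff=1.5, min_arrays=2, min_good_fraction=0.8):
    """Keep genes with enough good spots and enough outlying arrays.

    log_ratios maps a gene name to its log2(R/G) values, one per array,
    with None marking a spot that failed the quality filters.
    ASSUMPTION: mean and SD are taken per gene across arrays.
    """
    kept = {}
    for gene, values in log_ratios.items():
        good = [v for v in values if v is not None]
        # Exclude genes with too little usable data (e.g. < 80% good spots).
        if len(good) < 2 or len(good) / len(values) < min_good_fraction:
            continue
        centre, spread = mean(good), stdev(good)
        if spread == 0:
            continue
        # Count arrays in which the gene lies beyond the SD cutoff.
        outlying = sum(1 for v in good if abs(v - centre) > sd_cutoff * spread)
        if outlying >= min_arrays:
            kept[gene] = values
    return kept

# Hypothetical toy data: ten arrays, three genes.
example = {
    "geneA": [0.0] * 8 + [3.0, -3.0],                   # varies strongly in two arrays
    "geneB": [0.1, -0.1] * 5,                           # flat profile
    "geneC": [None, None, None] + [0.0] * 5 + [3.0, -3.0],  # only 70% good data
}
print(sorted(filter_genes(example)))
```

With the SMK/SML settings (1.5 standard deviations, at least 2 arrays, at least 80% good data) only the strongly varying gene with complete data is retained; passing sd_cutoff=2 would correspond to the MMM setting.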

8.2.2.1. Comparison of data from the two array sets

To assess the differences between the samples from the C57BL/6 and BALB/c mice, the data for both sets of arrays (51 arrays in total) were retrieved from SMD using the same filtering criteria described above, except that a cutoff of genes with log2 (R/G) ratios greater than 2 standard deviations away from the mean in at least 4 arrays was applied and only those genes with > 90% good data were used (full data set available in the Supplementary Material, Table S8.3).

8.2.3. Data Analysis

Hierarchical clustering of the data was performed using the Cluster and Treeview programs. Statistical analysis was performed using the Significance Analysis of Microarrays (SAM) program (described in Chapter 4) (364) or with Student's t-tests using a cutoff P-value of < 0.01 or 0.001. For the SAM analyses a cutoff false discovery rate (FDR) of < 5% and a 2 fold minimum change in expression level between groups of arrays were used in all cases. For annotation of all the retrieved genes the on-line databases used were the SOURCE database (79) and the Database Referencing of Array Genes On-line (DRAGON) database (http://pevsnerlab.kennedykrieger.org/dragon.htm), using Clone IDs or Accession numbers as references, respectively (results for all statistical analyses are available in the Supplementary Material, Tables S8.4-8.9).
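As an illustration of what a 2 fold minimum change means for log2 ratios, and of the statistic underlying an unpaired t-test, the following Python sketch may help. It is not SAM and does not compute P-values or false discovery rates; the function names and the example values are hypothetical, and equal variances in the two groups are assumed for the pooled estimate.

```python
from math import sqrt
from statistics import mean, variance

def fold_change(group_a, group_b):
    """Fold change of group_b relative to group_a, from log2(R/G) values.

    A difference of 1 in mean log2 ratio corresponds to a 2 fold change.
    """
    return 2 ** (mean(group_b) - mean(group_a))

def student_t(group_a, group_b):
    """Unpaired two-sample Student t statistic (pooled variance)."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical log2(R/G) values for one gene in control and infected arrays.
control = [0.0, 0.2, -0.2]
infected = [1.0, 1.4, 1.2, 1.2]

fc = fold_change(control, infected)
t = student_t(control, infected)
print(f"fold change {fc:.2f}, t = {t:.2f}, passes 2 fold cutoff: {fc >= 2}")
```

In practice the t statistic would be converted to a P-value using the t distribution with na + nb - 2 degrees of freedom before applying the 0.01 or 0.001 cutoff.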

8.2.4. Supplementary Material

The following material is available in the supplementary material (see Appendix):

Table S8.1: Normalised data for the microarrays (SMK/L arrays) used to investigate the C57BL/6 samples.
Table S8.2: Normalised data for the microarrays (MMM arrays) used to investigate the BALB/c samples.
Table S8.3: Normalised data for the overlapping clones in the microarrays (SMK/L & MMM arrays) used to investigate both the C57BL/6 and BALB/c samples.
Table S8.4: Results for the SAM analysis investigating the expression pattern of the uninfected versus infected C57BL/6 samples.
Table S8.5: Results for the SAM analysis investigating the expression pattern of the C57BL/6 samples with and without lymphoid aggregates.
Table S8.6: Results for the t-test analysis investigating the expression pattern of the SS1 versus SS2000 infected C57BL/6 samples with high monocytic infiltration.
Table S8.7: Results for the SAM analysis investigating the expression pattern of the uninfected versus infected BALB/c samples.
Table S8.8: Results for the SAM analysis investigating the expression pattern of the SS1 versus SS2000 infected BALB/c samples.
Table S8.9A: Genes in the cluster in which expression was induced in both C57BL/6 and BALB/c infected animals compared to the uninfected controls (Fig. 8.10, turquoise).
Table S8.9B: Genes in the cluster in which expression was repressed in both C57BL/6 and BALB/c infected animals compared to the uninfected controls (Fig. 8.10, turquoise).
Table S8.9C: Genes in the cluster in which expression was induced in infected BALB/c animals compared to infected C57BL/6 animals (Fig. 8.10, purple).
Table S8.9D: Genes in the three clusters in which expression was induced in infected C57BL/6 animals compared to infected BALB/c animals (Fig. 8.10, orange).

8.3. Results and Discussion

The host transcriptional response to infection with H. pylori was investigated using microarray hybridisations. Whole stomach RNA samples extracted from either C57BL/6 or BALB/c mice infected for 15 M with either SS1 or SS2000 were compared with samples from age-matched control animals on murine microarrays. The samples from each mouse type were analysed separately and the results were compared with the histopathological scores for each mouse (assessed as described in Chapter 6).

8.3.1. Transcriptional Response of C57BL/6 Mice to Infection

As described in Chapter 6, the histopathological scores for the C57BL/6 animals showed that although the level of colonisation by SS2000 was significantly higher than for SS1, the animals infected with the latter strain had the most severe pathology. These SS1 infected animals produced a moderate level of chronic active gastritis, characterised by significantly greater infiltration of monocytes into the antrum and body regions of the gastric mucosa, and greater submucosal inflammation, than the SS2000 infected animals. In order to assign a level of severity to the pathology in each animal for which array data were obtained, the scores representing the amount of monocytic infiltration in the antrum and body were summed (Table 8.2). This cumulative score was used to order the mice (in all three groups) according to their level of pathology. This order also closely resembled the level of submucosal infiltration in these animals. Thus those animals with a cumulative score of < 5 were considered to have low monocytic infiltration while those with a score of ≥ 5 had high monocytic infiltration (H in Table 8.2).
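The scoring rule can be written out as a short Python sketch. The antrum and body L/P scores below are copied from Table 8.2 for five of the arrays; the function and variable names are hypothetical, and the cutoff of 5 or above for high infiltration is taken from the H marks in Table 8.2.

```python
# Antrum and body L/P (monocyte) scores for a subset of arrays (Table 8.2).
lp_scores = {
    "E15": (1, 0),
    "G11": (1, 2),
    "G20": (2, 2),
    "F12": (2, 3),
    "F14": (4, 3),
}

HIGH_CUTOFF = 5  # cumulative score at or above which infiltration is called high

def classify(antrum_lp, body_lp, cutoff=HIGH_CUTOFF):
    """Return the cumulative L/P score and its low/high category."""
    total = antrum_lp + body_lp
    return total, ("high" if total >= cutoff else "low")

# Order the arrays by cumulative score, as done for Table 8.2.
ranked = sorted(lp_scores, key=lambda name: sum(lp_scores[name]))
for name in ranked:
    total, category = classify(*lp_scores[name])
    print(name, total, category)
```

For these five arrays the ordering is E15, G11, G20, F12, F14, with F12 and F14 falling in the high category.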

To assess whether the gene expression data from the C57BL/6 mice could be linked with the individual animal’s infection status (uninfected- Group E, SS1- Group F or SS2000- Group G) or with the level of pathology, a series of hierarchical clusters (HC) and statistical analyses of the data were performed. First, an “unsupervised” HC was performed in order to group all the genes and experimental samples (17 arrays and 3146 genes) based on the similarities in gene expression profiles (Fig. 8.1). The dendrogram at the top of the Figure indicates the relatedness of the profiles from each of the arrays and the length of the arms correlates with the degree of similarity. The array names are colour coded to indicate uninfected animals (blue), animals with low monocytic infiltrate (green), and those with high monocytic infiltrate (red), regardless of the infecting strain. The experimental samples cluster on two separate branches of the dendrogram, one containing all the uninfected animals, and the other all the animals with high monocytic infiltrate. The animals with low monocytic infiltration are divided between these two arms. This incomplete separation of infected and uninfected animals may be due either to the relatively small number of uninfected controls used in this analysis, or to there being relatively few changes in transcriptional profile in animals with low monocytic infiltrate.

Table 8.2: C57BL/6 histopathology scores (details of scoring criteria in Chapter 6 Experimental Procedures).

Array Name (a) | Antrum PMN (b) Score | Antrum L/P (c) Score | Body PMN (b) Score | Body L/P (c) Score | LA (d) | GA (e) | Atrophy (f) | Sub. Inflam. (g) | Total L/P Score (h)
E15 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
E21 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
G15 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
E13 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2
G13 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2
G18 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 2
G11 | 1 | 1 | 1 | 2 | 2 | 0 | 0 | 1 | 3
G14 | 0 | 1 | 2 | 2 | 0 | 0 | 2 | 3 | 3
G17 | 2 | 2 | 0 | 2 | 0 | 0 | 0 | 2 | 4
G20 | 2 | 2 | 2 | 2 | 3 | 5 | 0 | 2 | 4
F12 | 0 | 2 | 2 | 3 | 2 | 1 | 2 | 3 | 5 H
G19 | 1 | 2 | 2 | 3 | 2 | 3 | 0 | 3 | 5 H
F11 | 0 | 2 | 2 | 4 | 3 | 2 | 2 | 4 | 6 H
G12 | 2 | 3 | 2 | 3 | 3 | 3 | 2 | 4 | 6 H
G16 | 2 | 3 | 2 | 3 | 2 | 3 | 0 | 2 | 6 H
F13 | 2 | 3 | 2 | 4 | 2 | 2 | 0 | 3 | 7 H
F14 | 2 | 4 | 2 | 3 | 2 | 2 | 0 | 4 | 7 H

(a) Gr. E animals were uninfected, Gr. F animals were SS1 infected, Gr. G animals were SS2000 infected (Table 8.1). Arrays ordered according to total L/P score.
(b) PMN is polymorphonuclear cells.
(c) L/P is monocytes.
(d) LA is lymphoid aggregates.
(e) GA is gland abscesses.
(f) Atrophy refers to functional atrophy (loss of parietal and chief cells from body mucosa).
(g) Sub. Inflam. refers to submucosal inflammation.
(h) Total L/P score is the sum of antrum and body L/P scores; H is high monocytic infiltrate.

Figure 8.1: Unsupervised Hierarchical Cluster of microarrays hybridised with the samples from the C57BL/6 animals. Blue represents uninfected samples, green represents animals with low monocytic infiltrate and red represents animals with high monocytic infiltrate (see Table 8.2 for details). Group F samples are from SS1 infected animals and Group G samples are from SS2000 infected animals. A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data.
Array labels in Fig. 8.1: G17 G20 E21 E13 E15 G14 G13 G18 G15 G19 F11 G12 F12 G11 F14 F13 G16
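The idea behind the unsupervised hierarchical clustering of arrays can be sketched in a few lines of Python. This is only an illustration, not the Cluster program: average linkage and a distance of 1 minus the Pearson correlation are assumptions, since the linkage method and similarity metric are not stated here, and the four expression profiles are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def average_linkage(profiles):
    """Agglomerative clustering; returns the tree as nested tuples of names.

    ASSUMPTION: distance = 1 - Pearson correlation, average linkage.
    """
    # Each cluster is (subtree, list of member array names).
    clusters = [(name, [name]) for name in profiles]

    def distance(c1, c2):
        return mean([1 - pearson(profiles[a], profiles[b])
                     for a in c1[1] for b in c2[1]])

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]

# Hypothetical log2(R/G) profiles over four genes for four arrays.
profiles = {
    "control_1": [0.1, -0.2, 0.0, 0.1],
    "control_2": [0.0, -0.1, 0.1, 0.2],
    "infected_1": [2.0, 1.5, -1.0, 0.1],
    "infected_2": [1.8, 1.7, -0.8, 0.0],
}
tree = average_linkage(profiles)
print(tree)
```

In this toy example the two hypothetical infected profiles join first, then the two control profiles, and the two pairs form the two main branches of the tree, which mirrors how arrays with similar expression profiles end up on the same arm of a dendrogram.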

8.3.1.1. Genes with significantly different expression in samples from uninfected versus infected C57BL/6 mice

To identify specific genes that differ between infected and uninfected animals, a second analysis using “supervised” clustering was employed. This supervised approach uses statistical analyses to distinguish a set of “signature” genes that are divergently expressed in the two groups of arrays. In the first case a two-class unpaired SAM analysis was performed to find significant differences in the data from uninfected as compared with infected animals. Using a cutoff of at least a 2 fold difference between groups, 202 genes were found to be expressed at a higher level in the infected animals as compared with the controls (see Table 8.3 for representative set of genes and Fig. 8.2). No genes were found to be induced in the controls as compared with the infected animals. This is most likely due to the small number of control animals analysed. The HC of these genes resulted in good separation of the control arrays from those of the infected samples with two exceptions, G17 and G20. Also, all of the animals with high monocytic infiltration were clustered on a distinct branch (red), together with one sample with low monocytic infiltration, G11. Interestingly, G11 was one of only two animals with low monocytic infiltration (the other being G20) that were also found to have lymphoid aggregates (LA) (Table 8.2).

A number of the genes found to be upregulated in the infected animals have been previously shown to be induced in response to H. pylori infection. In particular, a number of genes that encode components of the MHC II were induced, likely indicating upregulation of antigen presentation. This increased expression of MHC II molecules due to H. pylori infection in mice and humans has been previously reported (214, 365). Other genes whose products may be involved in antigen processing and presentation were also induced. The gene for the endocytosis related protein, dynamin, was induced and this may be important for the endocytosis of H. pylori cells for antigen presentation. The D and the proteasome subunit genes, Psmb8 and Psmb9, were also induced, and these may be involved in degradation of proteins to peptides for presentation. Interestingly, it has been shown that MHC II molecules can also increase H. pylori attachment to gastric epithelial cells in vitro and possibly induce IFN-γ mediated apoptosis (101).

The expression of an IFN-γ induced gene, Bcl2-associated X protein (Bax), which encodes a pro-apoptotic protein (357), was also increased. Expression of the Bax protein has been found to be increased at the site of gastric ulcers in humans (323) as well as in the stomach (mainly the antrum) of patients with duodenal ulcer (164). Bax was also induced in endothelial cells infected with H. pylori in vitro, which may suggest a mechanism by which microcirculation in ulcers is disrupted, leading to inhibition of ulcer healing (168).

Another group of genes whose products are involved in host defence against invading organisms was induced in the infected animals. One of these, the lysozyme C gene, encodes a bacteriostatic molecule expressed by the monocyte/macrophage system, and is thought to enhance the activity of other immunoagents. Two genes encoding molecules possibly involved in fortifying the gastric epithelial barrier against invasion by microorganisms were induced. The first, small proline-rich protein 2A (Sprr2A), has been shown to be expressed in response to protein kinase C (PKC) signalling induced by epidermal growth factor (301). Sprr2A is significantly induced during colonisation of the mouse gastrointestinal tract by the commensal organism Bacteroides thetaiotaomicron (132). The other gene, encoding crp-ductin, is also thought to have epithelial barrier functions, but is not usually expressed in the stomach of mice (48).

Figure 8.2: Supervised Hierarchical Cluster of genes significantly induced in infected versus uninfected C57BL/6 mice (using SAM). Blue represents uninfected samples, green represents animals with low monocytic infiltrate and red represents animals with high monocytic infiltrate (see Table 8.2). Representative genes are indicated on the right side. Group F samples are from SS1 infected animals and Group G samples are from SS2000 infected animals. A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data.
Array labels in Fig. 8.2: G17 E21 E13 E15 G20 G15 G18 G13 G14 F11 F14 F13 G16 G11 F12 G12 G19
Representative genes labelled in Fig. 8.2: crp-ductin, IL-18, claudin 11, CD9 antigen, Mapk8ip3, Bax, Ig joining chain, lysozyme C, SOCS5, Tnfrsf5/CD40, ly6e, proteasome, B2m, Stat1, MHC II Dmb1

Figure 8.3: Dendrogram resulting from the supervised Hierarchical Cluster of genes expressed at significantly different levels in infected C57BL/6 mice with low (green) versus high (red) monocytic infiltrate (see Table 8.2).
Array labels in Fig. 8.3: F14 F13 G11 G19 G12 F11 F12 G16 G20 G14 G15 G17 G13 G18

Table 8.3: A selection of the genes with significantly different levels of expression in control versus infected C57BL/6 mice by SAM analysis (see Supplementary Material Table S8.4 for full list).

Accession | Symbol | Name | Classification (a) | Score (b) | Fold Δ (c)
AV135945 | Cd9 | CD9 antigen | Antigen | 0.58 | 2.2
AV086706 | Ly6d | lymphocyte antigen 6 complex, locus D | Antigen | 0.65 | 2.8
AV036454 | Ly6e | lymphocyte antigen 6 complex, locus E | Antigen | 0.51 | 2.2
AA104861 | Bax | Bcl2-associated X protein | Apoptosis | 0.73 | 2.9
AA572306 | Tnfaip3 | tumour necrosis factor, alpha-induced protein 3 | Apoptosis | 0.72 | 2.5
AA170279 | Itgb7 | integrin beta 7 | Cell adhesion | 1.17 | 8.5
AV013361 | Jup | junction plakoglobin | Cell adhesion | 0.63 | 2.2
AV029846 | Cgef2-pending | cAMP-regulated guanine nucleotide exchange factor II | Cell signalling | 0.63 | 2.2
AA636506 | Ksr | kinase suppressor of ras | Cell signalling | 0.60 | 2.2
AV039876 | Mapk8ip3 | mitogen-activated protein kinase 8 interacting protein 3 | Cell signalling | 0.63 | 2.3
AV031472 | Cldn11 | claudin 11 | Cell structure | 0.56 | 2.2
AV094452 | Dnclc1 | dynein, cytoplasmic, light chain 1 | Cell structure | 0.57 | 2.0
AV089432 | Sprr2a | small proline-rich protein 2A | Cell structure | 0.58 | 2.5
AV012852 | C3 | complement component 3 | Complement | 0.56 | 2.7
NM_008360 | IL-18 | interleukin 18 | Cytokine | 0.51 | 2.2
AV084904 |  | small inducible cytokine A6 | Cytokine | 0.56 | 2.0
AV058500 | Lyzs | lysozyme | Defence | 0.58 | 2.3
AA170626 | Igj | immunoglobulin joining chain | Immunoglobulin | 0.74 | 3.6
AI505981 | Chi3l3 | chitinase 3-like 3 | Inflammatory response | 0.90 | 3.6
AV069980 | B2m | beta-2 microglobulin | MHC I | 0.81 | 4.8
AI841614 | H2-Eb1 | histocompatibility 2, class II antigen E beta | MHC II | 0.68 | 4.2
AV070793 | Ii | Ia-associated invariant chain | MHC II | 0.63 | 2.9
AV094664 | Slc25a4 | solute carrier family 25 (adenine nucleotide translocator), member 4 | Mitochondrion | 0.51 | 2.0
AI838985 | Cirbp | cold inducible RNA binding protein | Nuclear protein | 0.62 | 2.4
AV127307 | G1rzfp-pending | g1-related zinc finger protein | Nuclear protein | 0.50 | 2.1
AV034679 | Npm1 | nucleophosmin 1 | Nuclear protein | 0.53 | 2.1
AV093449 | Ptma | prothymosin alpha | Nuclear protein | 0.56 | 2.1
AV040590 | Sdh1 | sorbitol dehydrogenase 1 | Oxidoreductase | 0.52 | 2.2
AV061782 | Psmb8 | proteosome (prosome, macropain) subunit, beta type 8 | Proteasome | 0.57 | 3.1
AA125374 | Psmb9 | proteosome (prosome, macropain) subunit, beta type 9 | Proteasome | 0.75 | 5.7
AA013561 | Ap3s2 | adaptor-related protein complex AP-3, sigma 2 subunit | Protein transport | 0.62 | 2.1
AV074655 | Rab11b | RAB11B, member RAS oncogene family | Protein transport | 0.55 | 2.1
AV069368 | Ubd |  | Protein transport | 0.55 | 7.7
AV073586 | Cpe | carboxypeptidase E | Signal | 0.69 | 2.9
AV029386 | Ier2 | immediate early response 2 | Signal | 0.51 | 2.2
AA175527 | Tnfrsf5/CD40 | tumour necrosis factor receptor superfamily, member 5 | Signal | 1.36 | 13.4
AA220816 | Cish5 | cytokine inducible SH2-containing protein 5 | Signal transduction | 0.80 | 4.3
W63975 | Nfkb1 | nuclear factor of kappa light chain gene enhancer in B-cells 1, p105 | Transcriptional regulation | 0.70 | 2.4
AA879612 | Stat1 | signal transducer and activator of transcription 1 | Transcriptional regulation | 0.65 | 3.4
AV095202 | Naca | nascent polypeptide-associated complex alpha polypeptide | Translation | 0.56 | 2.1
AV098980 | Atp1b1 | ATPase, Na+/K+ transporting, beta 1 polypeptide | Transmembrane | 0.54 | 2.1
AI838784 | Tm9sf2 | transmembrane 9 superfamily member 2 | Transmembrane | 0.66 | 2.4

(a) Classification was determined in this study in order to group genes by function.
(b) Score was assigned by SAM and indicates the degree of significance in the different levels between control and infected samples.
(c) Fold change was assigned by SAM and refers to the difference in level between the control and infected samples.

8.3.1.2. Genes with significantly different expression in C57BL/6 samples with different levels of pathology

The pattern of expression of many of the genes induced in the infected animals was not uniform (Fig. 8.2). Thus, a second statistical analysis was performed to identify genes that showed differential expression in animals with low compared with high monocytic infiltration. Using a two-class unpaired SAM analysis and a 2-fold cutoff, 15 genes were found to be significantly induced in the samples with a high level of monocytic infiltration. Clustering of these genes did not result in complete separation of the samples with low and high monocytic infiltration (Fig. 8.3). The dendrogram shows that two animals in the low category, G11 and G20, clustered with the animals with high monocytic infiltration. In fact, the two branches of this dendrogram clearly separated animals that had developed LA from those that had not (Table 8.2). Therefore, another statistical analysis was performed to identify a signature of genes that indicated the presence of LA. A two-class unpaired t-test comparing the animals with and without LA was done. A total of 63 genes were differentially expressed between the groups. Of these genes, 44 were expressed at a higher level in animals with LA and 19 genes were repressed in these samples (Fig. 8.4). The dendrogram derived from the HC of these selected genes clearly separates samples with (purple) and without (brown) LA on two distinct branches.

Some of the genes induced in animals with LA (21 clones representing 18 different genes, * in Table 8.4) had also been identified as induced in the infected as compared with the uninfected samples. For example, genes encoding components of the MHC I, the MHC II, the proteasome, and Sprr2A were

Figure 8.4: Supervised Hierarchical Cluster of genes expressed at significantly different levels in samples containing LA (purple) compared to those with none (brown) in the infected C57BL/6 mice (see Table 8.2). Representative genes are indicated on the right side. Group F samples are from SS1 infected animals and Group G samples are from SS2000 infected animals. A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data.

Sample order in the cluster: G16 G20 G12 G11 F14 F13 F11 F12 G19 G17 G14 G18 G13 G15

Table 8.4: A selection of the genes with significantly different levels of expression in samples with lymphoid aggregates as compared with those with none in infected C57BL/6 mice (see Supplementary Material, Table S8.5 for full list and gene details).

Induced in samples with LA

Accession | Name | Symbol | Classification | P-value a
AV086706* | lymphocyte antigen 6 complex, locus D | Ly6d | Antigen | 0.0031
AV060427* | lymphocyte antigen 6 complex, locus E | Ly6e | Antigen | 0.0057
AA198716 | caspase 7 | Casp7 | Apoptosis | 0.0081
AV149922 | clusterin | Clu | Apoptosis | 0.0008
AV134862 | erythrocyte protein band 4.2 | Epb4.2 | Cell structure | 0.0012
AV074243 | keratin complex 1, acidic, gene 19 | Krt1-19 | Cell structure | 0.0045
AV089432* | small proline-rich protein 2A | Sprr2a | Cell structure | 0.0036
AV093449* | prothymosin alpha | Ptma | Defence | 0.0024
AV078853 | carboxylesterase precursor | | Hydrolase | 0.0079
AA387869 | T-cell, immune regulator 1 | Tcirg1 | Immune regulation | 0.0042
AI838340* | chitinase 3-like 3 | Chi3l3 | Inflammatory response | 0.0002
AI838345* | histocompatibility 2, D region locus 1 | H2-D1 | MHC I | 0.0005
AV069980* | beta-2 microglobulin | B2m | MHC I | 0.0058
AV070793* | Ia-associated invariant chain | Ii | MHC II | 0.0004
AV062404 | solute carrier family 25 (adenine nucleotide translocator), member 13 | Slc25a13 | Mitochondrion | 0.0020
AV054688 | episialin (MUC1) | | Mucin | 0.0055
AV127307* | g1-related zinc finger protein | G1rzfp-pending | Nuclear | 0.0014
AV080368 | mast cell protease 4 | Man2a1 | Protease | 0.0052
AV065218* | ESTs, Highly similar to PSB8_MOUSE Proteasome subunit beta type 8 precursor | PSB8 | Proteasome | 0.0009
AV073997 | glucose regulated protein, 58 kDa | Grp58 | Signal | 0.0068
AA125147 | manic fringe homolog (Drosophila) | Mfng | Signal | 0.0088
AV023736 | similar to Beta-arrestin 2 | LOC216869 | Signal transduction | 0.0008
AI838299 | chloride intracellular channel 1 | Clic1 | Transport | 0.0047
W41212* | syntaxin binding protein 1 | Stxbp1 | Transport | 0.0077
AI851033 | cysteine and histidine rich 1 | Cyhr1 | | 0.0092
AV094890 | haematological and neurological expressed sequence 1 | Hn1 | | 0.0062
AV048950* | onzin | onzin | | 0.0095

* These genes were significantly induced in infected as compared with uninfected animals
a Unpaired t-test used to calculate level of significance

Repressed in samples with LA

Accession | Name | Symbol | Classification | P-value a
AV030957 | myelin-associated glycoprotein | Mag | Cell adhesion | 0.0029
AV084614 | amylase 2, pancreatic | Amy2 | Hydrolase | 0.0083
AV073586* | carboxypeptidase E | Cpe | Hydrolase | 0.0005
AA062019 | protein tyrosine phosphatase, non-receptor type 11 | Ptpn11 | Signal transduction | 0.0045
AV064590 | histocompatibility 2, class II antigen A, alpha | H2-Aa | MHC II | 0.0049
AV051534 | branched chain ketoacid dehydrogenase E1, beta polypeptide | Bckdhb | Mitochondrion | 0.0082
AV090110 | cAMP inducible gene 1 | Cil-pending | Signal | 0.0073
AV059045 | hepcidin antimicrobial peptide | Hamp | Signal | 0.0054
AV060541 | somatostatin | Smst | Signal | 0.0062
AV035003 | transthyretin | Ttr | Transport | 0.0030

* These genes were significantly induced in infected as compared with uninfected animals
a Unpaired t-test used to calculate level of significance

upregulated in the presence of LA. A number of genes known to be expressed in lymphocytes were also induced in samples with LA. These included the transcription factor T-cell immune regulator 1 and a gene encoding chitinase 3-like 3 protein, which may act as an eosinophil chemotactic chemokine (251). The genes encoding the lymphocyte antigen 6 complex locus D and E (ly6D and ly6E) were also highly expressed in LA-containing samples. The ly6E protein is involved in T-cell development, activating MHC restricted T-cells. Its expression in B-cells is enhanced by IFN stimulation (154). The gene encoding prothymosin alpha was also highly expressed; this protein has been implicated in the control of lymphocyte proliferation (118), contributing to resistance to opportunistic infections.

A number of the genes specifically induced in the presence of LA encode proteins that may be involved in apoptosis, such as caspase 7 and clusterin. Caspase 7 is involved in the activation of the cascade of caspases responsible for the execution of apoptosis (366). Increased apoptosis of gastric epithelial cells has been associated with H. pylori infection. The induced expression of a number of caspases (caspase-3, -6, -8, and -9) has been shown in numerous studies to occur in response to H. pylori infection in vitro (147, 158, 322). One recent study showed induced expression of caspase-3 in infected human gastric mucosa (16). Two separate studies have measured the H. pylori induced expression of caspase-7 in vitro in numerous cell lines; the first reported no effect on caspase-7 expression (273) and the second reported induced expression (186). This inconsistency may have been due to the cell lines used. The present study is the first to show induction of caspase-7 mRNA expression in response to H. pylori infection in vivo.

The exact function of the clusterin molecule is unclear at present. It has been implicated in wide ranging functions such as membrane lipid recycling and stress-induced secretion as a chaperone protein (139). Clusterin also appears to be involved in programmed cell death (363). Increased expression of clusterin has been shown in a number of diseases such as Alzheimer’s disease and prostate cancer, and has been linked to either abnormal cell death or proliferation (139). The role of clusterin may depend on the cellular context of the molecule, and thus functional studies of this molecule in the stomach are necessary to improve understanding of its function in H. pylori induced disease.

The group of 19 genes specifically repressed in LA-containing samples included the gene encoding the gastric hormone somatostatin. Somatostatin, gastrin and gastric acid form a negative feedback loop that controls the level of acid release in the stomach (Fig. 8.5). Gastrin acts on parietal cells to promote gastric acid secretion (294), while somatostatin is secreted by D cells in the stomach and negatively regulates gastrin release. Although it did not make the imposed cutoff in the present study, gastrin transcription levels were found to be higher in the samples containing LA than in those containing none (P=0.02). Also increased was the expression of another gene, encoding beta-arrestin 2, which inhibits beta-adrenergic receptor function. Activation of beta-adrenergic receptors can lead to the upregulation of enterochromaffin-like (ECL) cell release of cholecystokinin (CCK), which in turn induces somatostatin release from D cells (Fig. 8.5) (294). Thus, the inhibition of somatostatin expression in mice with LA may have occurred due to a reduction in ECL activation. This may have caused an increase in gastrin release due to the cessation of the negative feedback loop, in turn increasing gastric acid secretion in these animals. Alternatively, the reduced somatostatin expression may have occurred due to a decrease in the number of D cells in the gastric mucosa of the animals with LA, as this has been shown to occur in H. pylori infected humans (257).

Dysregulation of the gastrin/somatostatin loop has been previously reported in relation to H. pylori infection. In patients with DU the level of serum gastrin is increased, leading to stimulation of gastric acid secretion (39, 212). This increase in acid secretion does not occur in patients that develop gastric ulcer, probably because the pangastritis in this disease involves the body mucosa, leading to the loss of parietal and chief cells and thus to diminished gastric acid secretion (39). This difference in outcome may be related to many factors, but colonisation

Figure 8.5: Schematic showing some of the major relationships between the gastric hormones, endocrine cells and gastric acid in the stomach. The regions of the stomach are indicated, antrum, body and cardia as well as the duodenum (Duod.). Endocrine cell types are indicated in shaded boxes: P are parietal cells, ECL are enterochromaffin-like cells, D are somatostatin releasing cells and G are gastrin releasing cells. Lines of inhibition are indicated and black arrows indicate activation of endocrine secretion. Vagus is the vagus nerve, ACh is acetylcholine released from the vagus nerve, CCK is cholecystokinin.

Table 8.5A: Correlation between colonisation distribution and presence of lymphoid aggregates in SS1 infected C57BL/6 mice. (Details of distribution scoring described in Chapter 6)

Animal no.† | Antrum | A/B TZ a | Body | B/C TZ b | Cardia | Total Score c | LA d
Animals with lymphoid aggregates
F11 | 0.5 | 1 | 0 | 2 | 1 | 4.5 | 3
F12 | 0 | 0.5 | 0.5 | 1.5 | 0 | 2.5 | 2
F13 | 0.5 | 0 | 0.5 | 0 | 0 | 1 | 2
F14 | 1 | 1 | 0.5 | 1 | 2 | 5.5 | 2
F18 | 1 | 2 | 0 | 1 | 1 | 5 | 1
F20 | 0 | 1 | 0 | 1 | 1 | 3 | 1
F24 | 1 | 0.5 | 0.5 | 1 | 1 | 4 | 1
F25 | 2 | 2 | 1 | 0.5 | 0.5 | 6 | 2
F26 | 1.5 | 1 | 1 | 2 | 2 | 7.5 | 2
F28 | 0.5 | 1 | 0 | 0 | 0 | 1.5 | 2
Mean | 0.8 | 1 | 0.4 | 1 | 0.85 | 4.05 |
Animals with no lymphoid aggregates
F15 | 1 | 2 | 1 | 0.5 | 0.5 | 5 | 0
F17 | 3 | 3 | 2 | 2.5 | 3 | 13.5 | 0
F19 | 0 | 1 | 0 | 1 | 1 | 3 | 0
F21 | 1 | 1 | 0 | 1 | 2 | 5 | 0
F23 | 3 | 3 | 2 | 2.5 | 2.5 | 13 | 0
F27 | 3 | 3 | 2 | 2 | 2 | 12 | 0
F29 | 3 | 3 | 2.5 | 2 | 3 | 13.5 | 0
F30 | 1 | 2 | 0 | 2 | 1 | 6 | 0
Mean | 1.9 | 2.3 | 1.2 | 1.7 | 1.9 | 8.9 |
P-value e | 0.0299 | 0.0029 | 0.0454 | 0.0637 | 0.0210 | 0.0080 |

a refers to the transitional zone between the antrum and body regions of the stomach mucosa
b refers to the transitional zone between the body and cardia regions of the stomach mucosa
c sum of the colonisation scores for all regions of the stomach (see Chapter 2 for explanation of scoring)
d refers to the number of lymphoid aggregates
e level of significance for the two-class unpaired t-test between the colonisation scores of animals with LA as compared with those with no LA; scores in bold are considered significant.
† only animals with complete colonisation distribution scores included in analysis

Table 8.5B: Correlation between colonisation distribution and presence of lymphoid aggregates in SS2000 infected C57BL/6 mice.

Animal no.† | Antrum | A/B TZ a | Body | B/C TZ b | Cardia | Total Score c | LA d
Animals with lymphoid aggregates
G11 | 3 | 2.5 | 2 | 2 | 1 | 10.5 | 2
G12 | 1 | 1 | 0.5 | 0.5 | 0.5 | 3.5 | 3
G16 | 3 | 3 | 0.5 | 2 | 2 | 10.5 | 2
G19 | 2 | 2.5 | 1 | 2 | 1 | 8.5 | 2
G20 | 3 | 2 | 1 | 1 | 3 | 10 | 3
G23 | 2.5 | 2 | 1.5 | 2.5 | 1 | 9.5 | 2
G28 | 3 | 3 | 2 | 2 | 1 | 11 | 1
Mean | 2.5 | 2.3 | 1.2 | 1.7 | 1.4 | 9.1 |
Animals with no lymphoid aggregates
G14 | 3 | 2 | 1 | 2 | 2 | 10 | 0
G15 | 3 | 3 | 2 | 2 | 2 | 12 | 0
G17 | 3 | 2.5 | 2 | 2.5 | 1.5 | 11.5 | 0
G18 | 2 | 3 | 2 | 2 | 3 | 12 | 0
G21 | 3 | 2 | 2 | 3 | 3 | 13 | 0
G22 | 3 | 3 | 2 | 2 | 2 | 12 | 0
G24 | 3 | 2 | 1 | 3 | 2 | 11 | 0
G25 | 3 | 2 | 1.5 | 2 | 2 | 10.5 | 0
G26 | 3 | 3 | 2 | 2 | 2 | 12 | 0
G27 | 3 | 3 | 2 | 2.5 | 2 | 12.5 | 0
G29 | 3 | 3 | 2.5 | 3 | 3 | 14.5 | 0
G30 | 3 | 3 | 2 | 3 | 2 | 13 | 0
Mean | 3.0 | 2.6 | 1.8 | 2.4 | 2.2 | 12 |
P-value e | 0.1038 | 0.2264 | 0.0228 | 0.0174 | 0.0130 | 0.0035 |

a refers to the transitional zone between the antrum and body regions of the stomach mucosa
b refers to the transitional zone between the body and cardia regions of the stomach mucosa
c sum of the colonisation scores for all regions of the stomach (see Chapter 2 for explanation of scoring)
d refers to the number of lymphoid aggregates
e level of significance for the two-class unpaired t-test between the colonisation scores of animals with LA as compared with those with no LA; scores in bold are considered significant.
† only animals with complete colonisation distribution scores included in analysis

distribution differences are thought to play a role, as increased colonisation of the antrum is likely to lead to increased gastric acid secretion and DU, while colonisation of both the corpus and antral areas is more likely to lead to reduced gastric acid secretion, thus predisposing the individual to GU and gastric cancer (171, 198).

Interestingly, comparison of the level and distribution of H. pylori colonisation in all the C57BL/6 animals included in this long term experiment (i.e. all animals in Groups F and G collected 15 M post infection with complete colonisation distribution data, shown in Table 8.5A and B) showed that animals with LA had a significantly lower level of overall colonisation (t-test, Group G P=0.0035; Group F P=0.008) than those animals with no LA. The level of overall colonisation was measured by the sum of all the distribution scores, as colony forming unit counts were only available for a subset of these animals (Table 8.5). In the SS2000 infected animals (Group G) the level of bacteria in the body, body/cardia TZ and cardia regions was significantly lower in the animals with LA (t-test, P<0.05), while in the SS1 infected animals (Group F) the level of colonisation in all areas except the body/cardia TZ was significantly lower in the animals with LA (t-test, P<0.05). Therefore, it appears that there is an inverse relationship between the level of overall colonisation and the presence of LA, and thus those genes differentially expressed in animals with and without LA may also relate to colonisation level.
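The comparison of total colonisation scores can be sketched in a few lines of Python. The example below is a minimal illustration, not the analysis pipeline used in this study: it assumes the "two-class unpaired t-test" was the equal-variance (Student) form, which the text does not state, and it obtains the two-sided P-value by numerical integration of the t density. The scores are the Total Score column of Table 8.5A; the function names are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P-value using Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central_half)


def unpaired_t_test(a, b):
    """Pooled-variance two-sample t-test (assumed variant); returns (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Total colonisation scores from Table 8.5A (SS1 infected, Group F)
with_la = [4.5, 2.5, 1, 5.5, 5, 3, 4, 6, 7.5, 1.5]
without_la = [5, 13.5, 3, 5, 13, 12, 13.5, 6]

t_stat, dof, p_value = unpaired_t_test(with_la, without_la)
print(f"t = {t_stat:.2f}, df = {dof}, P = {p_value:.4f}")
```

A negative t statistic here corresponds to lower total colonisation in the animals with LA. If the original analysis used an unequal-variance test instead, the P-value would differ somewhat from the one this sketch prints.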

This is the first study showing a correlation between transcription of gastrin and somatostatin mRNA, level of colonisation and pathology in the same strain of mouse. The same trend linking colonisation level and LA could not be found in the BALB/c mice. This was probably because of the different type of pathology produced, in which nearly all infected BALB/c mice developed LA. Thus, these outcomes are probably related to subtle variations in individual C57BL/6 mice, possibly at the time of infection.

Figure 8.6: Supervised Hierarchical Cluster of genes expressed at significantly different levels in SS1 (pink) compared to SS2000 (black) infected samples from C57BL/6 mice with high monocytic infiltrate (see Table 8.2) (t-test P<0.01). Named genes are indicated on the right side. A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data.

Sample order in the cluster: F11 F14 F13 F12 G16 G12 G19

Table 8.6: Genes expressed at a significantly different level in SS1 as compared with SS2000 infected animals with high monocytic infiltrate (Table 8.2) (See Supplementary Material Table S8.6 for complete list of genes with details).

Accession | Symbol | Name | Classification | P-value a
AV123103* | Ndufs2 | NADH dehydrogenase (ubiquinone) Fe-S protein 2 | Mitochondrion | 0.0093
AV025235 | Ly6a | lymphocyte antigen 6 complex, locus A | Antigen | 0.0049
AV060427 | Ly6e | lymphocyte antigen 6 complex, locus E | Antigen | 0.0012
AV093808 | Tctex1 | t-complex testis expressed 1 | Cell structure | 0.0001
AV140495 | Tmsb4x | thymosin, beta 4, X chromosome | Cell structure | 0.0000
AA198716 | Casp7 | caspase 7 | Apoptosis | 0.0070
W41371 | Mmp2 | matrix metalloproteinase 2 | Hydrolase | 0.0097
AV173851 | Ppp3cc | protein phosphatase 3, catalytic subunit, gamma isoform | Hydrolase | 0.0013
AI838345 | H2-D1 | histocompatibility 2, D region locus 1 | MHC I | 0.0019
AV062404 | Slc25a13 | solute carrier family 25 (adenine nucleotide translocator), member 13 | Mitochondrion | 0.0075
AV066159 | Por | P450 (cytochrome) oxidoreductase | Oxidoreductase | 0.0099
AV065218 | PSB8 | ESTs, Highly similar to PSB8_MOUSE Proteasome subunit beta type 8 precursor (Proteasome component C13) | Proteasome | 0.0085
AV133785 | Psme2 | proteasome (prosome, macropain) 28 subunit, beta | Proteasome | 0.0095
AV092823 | Rpl29 | ribosomal protein L29 | Ribosomal | 0.0069
AV140568 | Rpl39 | ribosomal protein L39 | Ribosomal | 0.0078
AV066119 | Rpl41 | ribosomal protein L41 | Ribosomal | 0.0061
AV067804 | Rps27l | ribosomal protein S27-like | Ribosomal | 0.0075
AV056014 | Eif2ak1 | eukaryotic translation initiation factor 2 alpha kinase 1 | Signal | 0.0087
AV032392 | Ndr4 | N-myc downstream regulated 4 | Signal | 0.0017
AV041056 | Cutl1 | cut-like 1 (Drosophila) | Transcriptional regulation | 0.0053
AA051137 | Sin3b | transcriptional regulator, SIN3B (yeast) | Transcriptional regulation | 0.0059
AV013696 | Chst7 | carbohydrate (N-acetylglucosamino) sulfotransferase 7 | Transferase | 0.0028
AV085357 | Acyp2 | acylphosphatase 2, muscle type | | 0.0095
AI841611 | Brd4 | bromodomain-containing 4 | | 0.0017

* gene repressed in SS1 infected animals as compared with SS2000 infected animals. All other genes induced in SS1 infected animals as compared with SS2000 infected animals.
a Unpaired t-test used to calculate significance level.

8.3.1.3. Genes with significantly different expression in samples from SS1 versus SS2000 infected C57BL/6 mice

Since the pathology produced by SS1 in the C57BL/6 mice was significantly more severe than that in the SS2000 infected mice, the signature of genes representing the difference in the host response to these two strains was sought. The differential gene expression in the samples with high monocytic infiltrate (antrum and body L/P score = 5) in the SS1 infected animals (4 samples, pink) as compared with the SS2000 infected animals (3 samples, black) was analysed using an unpaired t-test (P<0.01) (Fig. 8.6). This analysis revealed 36 genes that were differentially expressed in SS1 as compared with SS2000 infected animals (Table 8.6). The HC of these genes separates the samples infected with the two different strains onto two branches (Fig. 8.6). Only one of these genes, an EST highly similar to NADH-ubiquinone oxidoreductase, had reduced expression in the SS1 infected animals, while the rest of the genes were induced in SS1 as compared with SS2000 infected animals. Some of these genes encode products involved in cell structure, such as matrix metalloproteinase 2, cytoplasmic dynein light chain, and prothymosin beta. It is tempting to speculate that the presence of the cag PAI in SS1 may have caused these transcriptional changes, as attachment and CagA translocation have been linked to changes in the cytoskeleton (312). However, investigation of a larger number of samples from SS1 and SS2000 infected C57BL/6 mice is required before further conclusions can be drawn from these data.

In summary, the data from the C57BL/6 animals suggest that the expression signature derived from whole stomach samples is related primarily to the type and severity of the pathology produced, particularly the presence of LA, rather than the infecting strain. This is particularly evident from the induced expression of markers for some cell types. For example, induced expression of lysozyme suggests the presence of macrophages, while the expression of chitinase 3-like 3 indicates the presence of T-cells and the expression of lymphocyte antigen 6e and 6d suggests the presence of B-cells in these samples with LA.

8.3.2. Transcriptional Response of BALB/c Mice to Infection

The set of arrays used to analyse the samples from the BALB/c animals (MMM arrays) was assessed in a separate analysis. Stomach RNA samples from 10 SS1 (Group B) and 10 SS2000 (Group C) infected animals along with 10 control (Group A) animals were analysed on individual arrays using a reference consisting of a pool of all 10 RNA samples from the control animals. Repeat arrays for four samples were also performed to assess reproducibility of the microarray results. In the set of genes that passed the filtering criteria (3141 genes), there were two large sets of genes with considerable redundancy. That is, these genes were represented by multiple elements on the microarray representing different clones derived from the same ORF. These two sets of redundant genes appeared to be driving the clustering of the entire set. The first set consisted of 91 haemoglobin-related genes, and the second set contained 62 genes encoding a number of enzymes such as trypsin, lipase, elastase, amylase, and ribonuclease. In both cases a representative example of each of the different genes was retained while all of the redundant copies were deleted from the data set for further analysis.
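The removal of redundant array elements can be illustrated with a short Python sketch. This is only an illustration of the idea: the clone identifiers and gene names below are hypothetical, and keeping the first clone encountered as the "representative" is an assumption, since the criterion used to choose the representative is not stated here.

```python
def collapse_redundant(elements, redundant_genes):
    """Keep one clone per gene for genes flagged as redundant.

    elements: list of (clone_id, gene_name) pairs in array order.
    redundant_genes: set of gene names whose extra clones should be dropped.
    The first clone seen is kept as the representative (an assumption).
    """
    seen = set()
    kept = []
    for clone_id, gene in elements:
        if gene in redundant_genes:
            if gene in seen:
                continue  # a representative for this gene is already kept
            seen.add(gene)
        kept.append((clone_id, gene))
    return kept


# Hypothetical array elements; clone identifiers are invented for illustration
elements = [
    ("clone_001", "haemoglobin alpha"),
    ("clone_002", "haemoglobin alpha"),
    ("clone_003", "trypsin"),
    ("clone_004", "Stat1"),
    ("clone_005", "trypsin"),
    ("clone_006", "Stat1"),
]
redundant = {"haemoglobin alpha", "trypsin"}

filtered = collapse_redundant(elements, redundant)
print(filtered)
```

Genes outside the flagged sets are left untouched, so only the two over-represented groups are thinned before clustering.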

An unsupervised HC of all the genes and arrays (34 arrays) in the BALB/c set was produced (Fig. 8.7). The resulting dendrogram separated the uninfected animals (blue) onto two branches distinct from the branch containing all the infected animals. The repeat arrays (denoted by an r) cluster beside their partner in all cases except for C20. This is possibly because the first array for C20 contains a large amount of missing data (grey colour in cluster). Clustering without the repeat arrays resulted in a very similar cluster; thus the repeats were not driving the clustering.
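As a rough illustration of how arrays are grouped in a hierarchical cluster, the sketch below implements agglomerative clustering with average linkage and a distance of one minus the Pearson correlation. Both choices are assumptions made for the example (they are common for expression data but are not specified in this section), and the log ratios are invented toy values, not measurements from this study.

```python
import math
from itertools import combinations


def pearson_distance(x, y):
    """1 - Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order they occur."""
    clusters = {name: [name] for name in profiles}
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            # Average of all pairwise distances between members of a and b
            dists = [pearson_distance(profiles[i], profiles[j])
                     for i in clusters[a] for j in clusters[b]]
            d = sum(dists) / len(dists)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters[f"({a},{b})"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, round(d, 3)))
    return merges


# Hypothetical log ratios for five genes on four arrays (toy values)
profiles = {
    "A11": [0.1, -0.1, 0.2, -0.2, 0.0],
    "A12": [0.2, -0.2, 0.3, -0.1, 0.1],
    "B11": [-1.0, 1.5, -0.5, 2.0, 1.0],
    "B12": [-0.8, 1.6, -0.6, 1.8, 1.2],
}

merges = average_linkage(profiles)
for left, right, dist in merges:
    print(left, right, dist)
```

In this toy set the two infected arrays join first, the two controls join next, and the two groups meet only at the final merge, which is the pattern a dendrogram with two distinct branches represents. Missing data, which affected the first C20 array, are not handled in this sketch.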

The transcriptional response of the BALB/c mice appears to be quite uniform across all the infected animals (Fig. 8.7). The differences between samples lie in the degree of induction or repression in expression of genes as compared with the uninfected controls. One group of genes appears to vary enormously across individual animals regardless of their infection status (shown by a star in Fig. 8.7). This set of genes includes enzymes such as amylase, trypsin and elastase as

Table 8.7: BALB/c histopathology scores (details of scoring criteria in Chapter 6 Experimental Procedures)

Array Name a | Antrum PMN b Score | Antrum L/P c Score | Body PMN b Score | Body L/P c Score | LA d | GA e | Atrophy f | Sub. Inflam. g | Total L/P Score h
A11 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2
A12 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2
A13 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 2
A15 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2
A16 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2
A20 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 2
A17 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 3
A19 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 3
A14 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 4
A18 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 4
B11 | 1 | 2 | 2 | 3 | 0 | 0 | 0 | 0 | 5 H
B13 | 1 | 3 | 1 | 3 | 1 | 0 | 0 | 2 | 6 H
B15 | 1 | 3 | 1 | 3 | 1 | 1 | 0 | 2 | 6 H
B17 | 1 | 2 | 2 | 4 | 1 | 1 | 0 | 3 | 6 H
B18 | 0 | 2 | 0 | 2 | 1 | 0 | 0 | 2 | 4
B19 | 0 | 2 | 0 | 2 | 1 | 0 | 0 | 2 | 4
B20 | 0 | 2 | 0 | 2 | 1 | 0 | 0 | 3 | 4
B12 | 1 | 3 | 1 | 3 | 2 | 1 | 0 | 3 | 6 H
C15 | 2 | 3 | 2 | 3 | 2 | 0 | 0 | 3 | 6 H
C16 | 2 | 3 | 2 | 3 | 2 | 0 | 0 | 2 | 6 H
C18 | 1 | 2 | 1 | 3 | 2 | 0 | 0 | 2 | 5 H
B14 | 1 | 3 | 1 | 4 | 3 | 1 | 0 | 3 | 7 H
C12 | 1 | 2 | 1 | 2 | 3 | 0 | 1 | 2 | 4
C13 | 1 | 2 | 2 | 4 | 3 | 2 | 2 | 3 | 6 H
B16 | 2 | 3 | 2 | 4 | 4 | 3 | 0 | 4 | 7 H
C14 | 2 | 3 | 1 | 4 | 4 | 2 | 2 | 4 | 7 H
C19 | 0 | 1 | 0 | 2 | 4 | 0 | 0 | 2 | 3
C11 | 2 | 1 | 3 | 3 | 5 | 1 | 1 | 3 | 4
C20 | 1 | 2 | 1 | 2 | 5 | 2 | 0 | 1 | 4
C17 | 1 | 2 | 1 | 3 | 6 | 2 | 0 | 3 | 5 H

a See Table 8.1 for infection status of each animal analysed on the indicated array. Arrays ordered according to number of lymphoid aggregates
b PMN is polymorphonuclear cells
c L/P is monocytes
d LA is lymphoid aggregates
e GA is gland abscesses
f Atrophy refers to functional atrophy (loss of parietal and chief cells from body mucosa)
g Sub. Inflam. refers to submucosal inflammation
h Total L/P score is the sum of antrum and body L/P scores; H refers to those animals with high monocytic infiltrate
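The Total L/P score and the H designation in Table 8.7 can be reproduced with a few lines of Python. The sum is as defined in footnote h; the cutoff of 5 for "high" is an assumption inferred from the table (every row marked H has a total of 5 or more, and no unmarked row does) rather than a stated rule. Only a handful of rows are copied here for illustration.

```python
# (array name, antrum L/P score, body L/P score) for selected rows of Table 8.7
lp_scores = [
    ("A11", 1, 1),
    ("B11", 2, 3),
    ("B18", 2, 2),
    ("B14", 3, 4),
    ("C19", 1, 2),
    ("C17", 2, 3),
]

HIGH_CUTOFF = 5  # assumed threshold, inferred from which rows carry an H


def total_lp(antrum_lp, body_lp):
    """Total L/P score: sum of the antrum and body L/P scores (footnote h)."""
    return antrum_lp + body_lp


def is_high(total, cutoff=HIGH_CUTOFF):
    """True when the total meets the assumed cutoff for high infiltrate."""
    return total >= cutoff


summary = {name: (total_lp(a, b), is_high(total_lp(a, b)))
           for name, a, b in lp_scores}

for name, (total, high) in summary.items():
    print(name, total, "H" if high else "")
```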

Figure 8.7: Unsupervised Hierarchical Cluster of all the genes which passed the filtering criteria in the arrays analysing the BALB/c mice. Blue indicates uninfected animals (Group A), pink indicates SS1 infected animals (Group B), black indicates SS2000 infected animals (Group C). A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data. Star indicates a cluster of genes that shows enormous individual variation regardless of infection status.

Sample order in the cluster: A17 A11 A15 A12 A14 A16 A16r B15 B15r B18 B16 B14 B11 B11r C20 C18 C19 C17 B13 B19 B20 B12 B17 C20r C11 C13 C14 C12 C15 C16 A18 A19 A13 A20

well as a number of growth factors. The reason for this individual variation is unclear, but it may indicate a difference in digestive state of the animal at the time of euthanasia.

The analysis in Chapter 6 indicated that the pathology produced by the SS2000 strain in these mice was greater than that produced by SS1. This is the reverse of the situation observed in the infected C57BL/6 mice, in which SS1 produced the greatest pathology. In particular, the numbers of LAs and gland abscesses in the SS2000 infected BALB/c animals were greater than in the SS1 infected animals (Table 8.7). However, the unsupervised cluster (Fig. 8.7) does not appear to group samples by the severity of pathology, but rather by the infecting strain. It is possible that this difference is directly related to the colonisation levels of these two H. pylori strains, since SS2000 colonised at a significantly higher level than SS1 in these mice (see Chapter 6 for details).

8.3.2.1. Genes with significantly different expression in samples from uninfected versus infected BALB/c mice

To identify the set of “signature” genes induced or repressed in infected BALB/c animals, a two-class unpaired SAM analysis was performed on the uninfected controls as compared with all the infected animals. Using a cutoff of at least a 2-fold change between groups, 203 genes were found to differ significantly between the groups (Table 8.8). An HC of these genes and arrays is shown in Fig. 8.8. The dendrogram for this cluster separates most of the controls onto one branch, with 3 exceptions: A11, A12 and A15. This dendrogram also suggests that there was much more individual variation within the uninfected animals than there was in the infected ones (indicated by the length of the arms). Within the infected group of animals, samples are separated onto two arms based on the infecting strain (SS1, pink, versus SS2000, black), with the exception of C20 and C13, rather than on the level of pathology.
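The kind of per-gene score that SAM assigns can be sketched as follows. This is a simplified illustration under stated assumptions, not the SAM software itself: the relative difference d is computed as the difference in group means divided by a pooled standard error plus a small constant s0, with s0 and the score cutoff fixed by hand here, whereas SAM estimates s0 from the data and sets its threshold by permutation. The fold change assumes that expression values are log2 ratios, and all values below are invented.

```python
import math
from statistics import mean


def relative_difference(infected, control, s0):
    """SAM-style score: mean difference over (pooled standard error + s0)."""
    n1, n2 = len(infected), len(control)
    m1, m2 = mean(infected), mean(control)
    ss = (sum((x - m1) ** 2 for x in infected)
          + sum((x - m2) ** 2 for x in control))
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m1 - m2) / (s + s0)


def fold_change(infected, control):
    """Magnitude of the fold change, assuming values are log2 ratios."""
    return 2 ** abs(mean(infected) - mean(control))


# Hypothetical log2 ratios; gene names and values are invented
control = [0.0, 0.1, -0.1, 0.0]
genes = {
    "gene_up": [1.9, 2.1, 2.0, 2.2],
    "gene_flat": [0.1, 0.0, -0.1, 0.2],
    "gene_down": [-1.5, -1.4, -1.6, -1.5],
}

S0 = 0.1        # hypothetical constant; SAM derives this from the data
D_CUTOFF = 2.0  # hypothetical score cutoff; SAM sets this by permutation
FOLD_CUTOFF = 2.0

scores = {g: relative_difference(v, control, S0) for g, v in genes.items()}
selected = [g for g, v in genes.items()
            if abs(scores[g]) >= D_CUTOFF and fold_change(v, control) >= FOLD_CUTOFF]
print(selected)
```

Reporting the fold change as a magnitude while the sign is carried by the score mirrors the layout of Table 8.8, where repressed genes have negative scores but positive fold changes.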

8.3.2.1.1. Genes induced in infected BALB/c mice

The set of genes induced in the infected BALB/c mice, like that in the infected C57BL/6 animals, included a large number of genes involved in the inflammatory response. These included genes encoding components of the MHC II molecules and many genes coding for immunoglobulins, particularly the IgM and IgA subtypes. This latter finding is particularly pertinent given that BALB/c mice secrete higher levels of IgA and IgM in response to H. pylori infection (159). Many IFN regulated genes were also induced, such as ly6a, ly6d, IFN regulatory factor 1 (IRF-1), a number of IFN induced GTPases (5), and three transcriptional regulators: signal transducer and activator of transcription 1 (STAT1), and cytokine inducible SH2-containing protein 1 and 5 (CISH1/SOCS1 and SOCS5).

The Stat1 protein is a transcriptional activator that is part of the JAK/STAT signal transduction pathway induced by cytokine stimulation, mainly IFNγ. The SOCS family of proteins is part of a negative feedback loop that regulates the JAK/STAT pathway so that the cytokine stimulation cycle is terminated and the cells may respond to a second round of stimulation (165, 397). Both the SOCS1 and SOCS5 proteins are negative regulators of IL-6 and Leukaemia inhibitory factor (LIF) cytokines. The induced expression of these proteins in response to H. pylori infection suggests tight regulation of the levels of IFNγ and IL-6 cytokines.

Developing inflammatory lesions are characterised by the excess migration of recruited phagocytes as well as enhanced expression of cell adhesion molecules (152). One such cell adhesion molecule, integrin beta 7 (Itgb7), was induced 4.4-fold in these infected mice. Itgb7 is a mucosal homing receptor for leukocytes. The mucosal vascular addressin cell adhesion molecule 1 (MAdCAM-1) is the counter receptor for this integrin on endothelial cells. It has been shown that gastric endothelial cells express MAdCAM-1 and that both B and T cells, activated by antigens delivered to the gastric mucosa, express Itgb7 (278). Further, it has been shown that CD4 lymphocytes in the body of H. pylori infected mice express Itgb7 and that MAdCAM-1 expression is most prominent adjacent

Figure 8.8: Supervised hierarchical cluster of genes significantly induced in infected versus uninfected BALB/c mice (using SAM). Blue represents uninfected samples (Group A), pink represents SS1 infected animals (Group B) and black represents SS2000 infected animals (Group C). Representative genes are indicated on the right side. A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data.

Sample order in the cluster: A14 A18 A19 A20 A13 A16 A16r A17 A12 A15 B18 B13 B15 B15r B19 B20 B12 B14 B11 B11r B17 C20r C13 C20 B16 C16 C17 C11 C18 C15 C14 C12 C19 A11

Table 8.8: A selection of the genes with significantly different levels of expression in uninfected as compared with infected BALB/c mice (see Supplementary Material Table S8.7 for full list and gene details).

Genes Induced in Infected Samples

Accession | Symbol | Name | Classification a | Score b | Fold change c
BG072793 | Ly6a | lymphocyte antigen 6 complex, locus A | Antigen | 3.80 | 2.1
AV086706 | Ly6d | lymphocyte antigen 6 complex, locus D | Antigen | 5.76 | 2.4
AI838607 | Thbs1 | thrombospondin 1 | Cell adhesion | 2.84 | 2.1
 | Sprr2a | small proline-rich protein 2A | Cell structure/defence | 4.15 | 2.1
AV081711 | Sprr2j | small proline-rich protein 2J | Cell structure/defence | 4.35 | 2.1
AV035281 | Bmp8a | bone morphogenetic protein 8a | Cytokine | 2.87 | 2.6
AV088069 | Tnfsf13 | tumour necrosis factor (ligand) superfamily, member 13 | Cytokine | 4.55 | 2.0
AV084558 | Dmbt1 | deleted in malignant brain tumours 1 | Defence | 5.36 | 3.4
AV058500 | Lyzs | lysozyme | Defence | 2.84 | 2.0
AV050073 | S100A9 | S100 calcium binding protein A9 (calgranulin B) | Defence | 5.57 | 2.7
BG072353 | Sftpd | surfactant associated protein D | Defence | 5.58 | 5.5
BG064772 | Hsp86-1 | heat shock protein | Heat-shock | 4.47 | 2.1
BG065920 | Vnn1 | vanin 1 | Hydrolase | 8.46 | 6.3
AV372067 | Ceacam10 | CEA-related cell adhesion molecule 10 | Immunoglobulin | 8.71 | 3.8
AV074733 | Igh-6 | immunoglobulin heavy chain 6 (heavy chain of IgM) | Immunoglobulin | 3.95 | 2.4
AV060216 | Igj | immunoglobulin joining chain | Immunoglobulin | 11.27 | 5.8
AV061673 | Igk-V28 | immunoglobulin kappa chain variable 28 (V28) | Immunoglobulin | 7.86 | 5.1
AV060304 | Igl-V1 | immunoglobulin lambda chain, variable 1 | Immunoglobulin | 4.91 | 4.0
BG064651 | Mail-pending | molecule possessing ankyrin-repeats induced by lipopolysaccharide | Inflammatory response | 5.72 | 2.2
AV059081 | Reg1 | regenerating islet-derived 1 | Lectin | 1.81 | 2.0
AV065971 | Fabp1 | fatty acid binding protein 1, liver | Lipid binding | 2.80 | 2.1
AW538652 | Acly | ATP citrate lyase | Lyase | 5.50 | 2.5
BG063994 | B2m | beta-2 microglobulin | MHC I | 3.16 | 2.1
AU041598 | H2-K | histocompatibility 2, K region | MHC I | 4.74 | 2.4
BG072346 | H2-DMa | histocompatibility 2, class II, locus DMa | MHC II | 4.31 | 3.2
AA709560 | H2-DMb1 | histocompatibility 2, class II, locus Mb1 | MHC II | 3.27 | 2.3
AV043418 | Ii | Ia-associated invariant chain | MHC II | 5.36 | 4.2
AV108211 | Bcrp1-pending | breakpoint cluster region protein 1 | Nuclear | 3.83 | 2.3
AV113541 | Magoh | mago-nashi homolog, proliferation-associated (Drosophila) | Nuclear | 5.15 | 2.9
AV135036 | Pmscl1 | polymyositis/scleroderma autoantigen 1 | Nuclear | 3.38 | 2.7
AV154117 | Top2a | topoisomerase (DNA) II alpha | Nuclear | 3.73 | 2.1
C77965 | Fgfrp | fibroblast growth factor regulated protein | Oxidoreductase | 5.10 | 2.2
AV063215 | Psmb8 | proteosome (prosome, macropain) subunit, beta type 8 (large multifunctional protease 7) | Proteasome | 5.43 | 4.1
AA125374 | Psmb9 | proteosome (prosome, macropain) subunit, beta type 9 (large multifunctional protease 2) | Proteasome | 3.54 | 2.6
AA170279 | Itgb7 | integrin beta 7 | Receptor | 9.78 | 4.4
AV257618 | Trfr | transferrin receptor | Receptor | 6.27 | 2.5
AA175527 | Tnfrsf5/CD40 | tumour necrosis factor receptor superfamily, member 5 | Receptor | 11.92 | 6.7
AA163925 | Bcl3 | B-cell leukaemia/lymphoma 3 | Signal | 5.18 | 2.3
AV004153 | Igtp | interferon gamma induced GTPase | Signal | 5.40 | 6.7
AA163154 | Ifi47 | interferon gamma inducible protein, 47 kDa | Signal | 5.01 | 4.6
BG070106 | Lcn2 | lipocalin 2 | Signal | 6.25 | 3.4
AA288982 | Cish1 | cytokine inducible SH2-containing protein 1 | Signal transduction | 3.50 | 2.3
AA220816 | Cish5 | cytokine inducible SH2-containing protein 5 | Signal transduction | 7.96 | 4.4
AV088747 | Irf1 | interferon regulatory factor 1 | Transcriptional regulation | 4.18 | 2.4
AA879612 | Stat1 | signal transducer and activator of transcription 1 | Transcriptional regulation | 3.23 | 2.0
BG073108 | G7e-pending | G7e protein | Viral envelope | 4.36 | 2.4


Genes Repressed in Infected Samples
Accession | Symbol | Name | Classification (a) | Score (b) | Fold Δ (c)
2210003A22 | Spp1 | secreted phosphoprotein 1 | Cell adhesion | -3.73 | 2.6
2300001L14 | Krt2-4 | keratin complex 2, basic, gene 4 | Cell structure | -2.95 | 2.2
AV053031 | Scd1 | stearoyl-Coenzyme A desaturase 1 | Fatty acid biosynthesis | -3.87 | 2.9
AV073358 | Hsp86-1 | heat shock protein, 86 kDa 1 | Heat-shock | -4.07 | 2.2
AV077926 | Gast | gastrin | Hormone | -5.21 | 3.2
AA050761 | Smst | somatostatin | Hormone | -4.93 | 2.3
AV074533 | Pla2g1b | phospholipase A2, group IB, pancreas | Hydrolase | -4.25 | 2.2
AV073536 | Lgals4 | lectin, galactose binding, soluble 4 | Lectin | -4.92 | 2.2
AV014227 | Pitpnb | phosphotidylinositol transfer protein, beta | Lipid binding | -4.29 | 2.5
AV088103 | Clps | colipase, pancreatic | Lipid degradation | -4.92 | 2.4
AV006290 | Lpl | lipoprotein lipase | Lipid degradation | -4.17 | 2.0
AV014754 | Car3 | carbonic anhydrase 3 | Lyase | -4.28 | 2.4
AV073495 | Muc5ac | mucin 5, subtypes A and C, tracheobronchial/gastric | Mucin | -3.05 | 2.1
AV006091 | Smtn | smoothelin | Muscle structure | -3.55 | 2.0
AV081405 | Mup1 | major urinary protein 1 | Signal | -4.50 | 2.4
AW551388 | E2f6 | E2F transcription factor 6 | Transcriptional regulation | -2.83 | 2.0
AV059814 | Slc12a4 | solute carrier family 12, member 4 | Transport | -3.90 | 2.1
AV073600 | Slc31a2 | solute carrier family 31, member 2 | Transport | -4.20 | 2.5
(a) Classification was determined in this study in order to group genes by function
(b) Score was assigned by SAM and indicates the degree of significance in the different levels between control and infected samples
(c) Fold change was assigned by SAM and refers to the difference in level between the control and infected samples

to these cells in the lamina propria and the submucosa (126). Thus, the induction of Itgb7 in the infected mucosa is most likely due to the presence of recruited B and T cells. In the recent study by Mueller et al. (224), a number of genes expressed in response to “H. heilmannii” infection in BALB/c mice were assigned to either the lymphocytic or the mucosal fraction of the infected stomach through the use of laser capture microdissection and microarray analysis. The expression of Itgb7 was associated solely with the lymphocytic fraction, which is consistent with this interpretation.

8.3.2.1.2. Genes repressed in infected BALB/c mice

Among the genes whose expression was significantly reduced in infected animals was muc5ac, which encodes a major gastric mucin. One of the major effects of H. pylori infection is the loss of mucus coat continuity (331). The expression and synthesis of muc5ac have been shown to be inhibited by H. pylori both in vitro (38) and in vivo (221). The H. pylori LPS is thought to be the factor causing this decline in mucin synthesis (331). Interestingly, H. pylori was found to be co-localised with muc5ac, or with the cells producing this mucin component, in infected patients, indicating that this mucin may also act as an adhesin for H. pylori (367).

The decreased expression of the genes for both of the gastric hormones gastrin (and its precursor) and somatostatin is curious. H. pylori infection usually leads to increased secretion of gastrin and decreased secretion of somatostatin, leading to induced acid secretion, as was observed in the C57BL/6 animals with lymphoid aggregates (Fig. 8.4). It has been hypothesised that, since colonisation in BALB/c mice is mostly limited to the antrum/body and body/cardia transitional zones, the basal acid secretion in these mice may be less than that in C57BL/6 mice, which have significant colonisation in the antral and cardia regions (A. Lee, personal communication). Therefore, it appears that both the basal level of gastric acid and the endocrine response to H. pylori infection in BALB/c mice may differ from those in C57BL/6 mice and may contribute significantly to the differences in pathology observed. Further measurement of the relative numbers of G and D cells in the stomachs of these two strains of mice, and of the basal gastric acid, will help elucidate the importance of this phenomenon.

8.3.2.2. Genes with significantly different expression in BALB/c samples with different levels of pathology

8.3.2.2.1. Total monocyte infiltration

In order to identify a transcriptional signature that distinguishes the severity of the pathology produced in infected BALB/c animals, a number of t-tests were performed. First, using the same criteria used for the C57BL/6 mice, all mice with a total monocyte score of less than 5 (low monocyte infiltration) were compared with those with a score ≥ 5 (high monocyte infiltration) (Table 8.7). Using a stringent cutoff (P<0.001), 11 genes were found to be expressed at a significantly higher level in animals with high monocyte infiltration. Only three of these genes are named: vitronectin, hepcidin antimicrobial peptide (Hamp), and haemoglobin beta chain. Vitronectin belongs to a family of cell adhesion molecules and also functions as a regulator of several proteolytic enzyme cascades such as the complement cascade and coagulation. Its expression in mice is induced by IL-6 in response to LPS (314). It has been hypothesised that this protein may provide a link between cell adhesion, humoral defence mechanisms and cell invasion (314). The Hamp protein has been shown to have antimicrobial activity and may also function in maintaining iron homeostasis in the host (270). The physiological significance of haemoglobin synthesis in the stomach will be discussed in the following sections.
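The grouping and comparison described above can be sketched for a single gene as follows. This is only an illustrative sketch: the t-test variant and software used are not stated here, so a pooled-variance (Student's) two-sample t statistic is assumed, and the function names, pathology scores and log ratios are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def split_by_score(log_ratios, scores, threshold=5):
    """Split one gene's values into low (< threshold) and high (>= threshold) groups."""
    low = [x for x, s in zip(log_ratios, scores) if s < threshold]
    high = [x for x, s in zip(log_ratios, scores) if s >= threshold]
    return low, high


def student_t(low, high):
    """Return (t, df) for a pooled-variance two-sample t-test of high versus low.

    Assumption: equal variances in the two groups (Student's t-test).
    """
    n_low, n_high = len(low), len(high)
    df = n_low + n_high - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((n_low - 1) * variance(low) + (n_high - 1) * variance(high)) / df
    se = sqrt(pooled * (1 / n_low + 1 / n_high))
    return (mean(high) - mean(low)) / se, df


# Hypothetical data: one gene measured in six animals with monocyte scores 2 to 8.
scores = [2, 3, 4, 5, 7, 8]
log_ratios = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
low, high = split_by_score(log_ratios, scores)
t, df = student_t(low, high)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The P value would then be read from a t distribution with the returned degrees of freedom (`scipy.stats.ttest_ind` reports it directly), and a gene retained only when P<0.001.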

8.3.2.2.2. Number of gland abscesses and lymphoid aggregates

A second t-test was performed to compare animals with and without gland abscesses. In this test only one gene, that encoding the calcyclin binding protein, was found to be significantly induced (P<0.001) in animals with gland abscesses. The function of this protein is largely unknown, but it may be involved in cyclin-dependent protein kinase regulation, indicating a role in cell cycle progression.

A third t-test was performed to differentiate animals with a large number of lymphoid aggregates (≥ 4) from those with only a few LA (< 4). Again, this difference in pathology was not well separated by gene expression patterns in these mice. Only two genes, Moloney leukaemia virus 10 (Mov10) and an EST (Accession Number BG072089), were found to be significantly repressed in the samples with numerous LA. The Mov10 protein has been shown in humans to be important in development and control of cell proliferation. Thus, down-regulation of Mov10 may allow the aberrant lymphoid cell proliferation necessary for LA formation. In summary, in contrast to the C57BL/6 animals, the gene expression signature detected for infected BALB/c mice with different levels of pathology was limited to only a few genes. This may reflect a more uniform inflammatory response to infection in the BALB/c mice.

8.3.2.3. Genes with significantly different expression in samples from SS1 versus SS2000 infected BALB/c mice

An unpaired two-class SAM analysis (using a 2-fold cutoff) was performed to identify differences in expression induced by SS1 as compared with SS2000 in the BALB/c mice. In this analysis 8 genes were found to be significantly induced in SS2000 infected mice, and 68 genes were found to be repressed in these animals (Table 8.9). A HC of these genes separated the SS2000 infected animals from most of the SS1 infected animals (Fig. 8.9), confirming the pattern seen in the unsupervised cluster (Fig. 8.7). Of the genes expressed at a higher level in the SS2000 infected animals, only thrombospondin and a fatty acid binding protein appeared to be universally expressed in these mice (Fig. 8.9). The other 6 genes, including the kallikrein 5 and 11 and albumin genes, all show large variations in the level of expression in these animals that do not appear to be related to the level of pathology or the infecting H. pylori strain. These genes also vary in expression in the control animals (data not shown) and thus are likely to be subject to individual variation.
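As a rough illustration of the kind of statistic behind a two-class SAM comparison with a fold cutoff, the sketch below computes a SAM-style relative difference, d = (difference in group means) / (s + s0), together with a fold change for one gene. It is not the SAM software itself: the permutation-based estimate of the false discovery rate is omitted, the values are assumed to be log2 ratios against the reference, and the constant s0, the cutoff on d, and all names and numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def sam_d(group_a, group_b, s0=0.1):
    """SAM-style relative difference of group_b versus group_a for one gene.

    s0 is a small positive constant that damps genes with tiny variance;
    its value here is an arbitrary placeholder, not a fitted one.
    """
    n_a, n_b = len(group_a), len(group_b)
    m_a, m_b = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations around each group's own mean.
    ss = sum((x - m_a) ** 2 for x in group_a) + sum((x - m_b) ** 2 for x in group_b)
    s = sqrt((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2))
    return (m_b - m_a) / (s + s0)


def fold_change(group_a, group_b):
    """Unsigned fold difference between group means, assuming log2 ratios."""
    return 2 ** abs(mean(group_b) - mean(group_a))


def call_gene(group_a, group_b, d_cutoff, min_fold=2.0, s0=0.1):
    """Label a gene as induced or repressed in group_b relative to group_a."""
    d = sam_d(group_a, group_b, s0)
    if abs(d) < d_cutoff or fold_change(group_a, group_b) < min_fold:
        return "unchanged"
    return "induced" if d > 0 else "repressed"


# Hypothetical log2 ratios for one gene in SS1 and SS2000 infected animals.
ss1 = [1.0, 1.2, 0.8]
ss2000 = [-1.0, -1.2, -0.8]
print(call_gene(ss1, ss2000, d_cutoff=2.0))
```

In SAM proper the cutoff on d is not fixed in advance but chosen so that the permutation-estimated false discovery rate falls below a target, such as the 5% quoted for Figure 8.9.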

Of the genes significantly repressed in the SS2000 infected animals, two were apoptotic factors: cytotoxic granule-associated RNA binding protein 1 (Tia1) and a serine protease (HtrA2). Considering that SS2000 infected animals have significantly more LA than SS1 infected animals, the decreased expression of these apoptotic factors may be directly related to the accumulation of lymphocytes through dysregulation of cell death in these cells. Another set of genes significantly decreased in the SS2000 infected mice comprises the haemoglobin alpha adult and beta adult minor chains, which were repressed by up to 12-fold. This is a curious result, as it is widely believed that haemoglobin is selectively expressed in cells of erythroid lineage. However, it has recently been shown that at least the beta minor chain of haemoglobin can be induced in mouse macrophages in response to LPS and IFN-γ and may be involved in functions relating to oxygen or NO sensing (181). This provides a possible explanation for the regulation of haemoglobin expression in the stomach of infected mice.

Interestingly, the gene for the ferritin heavy chain is also repressed significantly in these animals. Ferritin is a protein that stores iron in an available, non-toxic form in the cytoplasm of cells. H. pylori infection has been associated with cases of iron-deficiency anaemia (IDA) that cannot be explained by gastrointestinal bleeding alone (12). In these patients the levels of serum ferritin and haemoglobin are significantly lower than in non-IDA patients, and these levels are normalised after H. pylori eradication (50). The reduced expression of both haemoglobin and ferritin in these mice suggests there is a direct effect of H. pylori on the expression of these genes at the site of colonisation in the gastric mucosa. The difference in the amount of repression of these genes by the two different H. pylori strains indicates that strain differences, along with susceptibility to IDA, may both play a role in the development of this disease in response to H. pylori infection.

Overall, the results for SS1 and SS2000 infection of BALB/c animals suggest that, for this set of animals, the gene expression patterns distinguish the infecting strain rather than the level of pathology produced by infection. This is the opposite of what was found in the C57BL/6 animals, where the level of pathology, particularly the presence of LA, was the main determining factor for the gene expression signature. The reason for the apparent differences in these expression signatures

Figure 8.9: Supervised hierarchical cluster showing genes expressed at a significantly different level in SS1 (pink) as compared with SS2000 (black) infected samples. A selection of named genes is shown on the right side. A red colour indicates a gene induced and a green colour indicates a gene repressed compared to the pooled control sample (reference). Grey colour indicates missing data. (SAM, FDR < 5% and 2-fold cutoff)

[Figure 8.9 image (heat map) not reproduced. Sample order (columns): B15, B15r, B20, C18, C19, C15, C17, C20r, C12, C11, C14, C13, C16, C20, B17, B19, B13, B12, B18, B16, B14, B11, B11r. Legible gene labels (rows) include: palmitoyl-protein thioesterase, MHC II H2-Ab1, thrombospondin, fatty acid binding protein, stearoyl-coenzyme A desaturase 1, microtubule-associated protein 1 light chain 3, haemoglobin beta chain, tubulin beta 4, ferritin heavy chain, lamp2, elongation factor 2, erythroblast membrane-associated protein, alpha globin, synaptotagmin, albumin, kallikrein 5, epidermal growth factor, kallikrein 11, nerve growth factor, mast-cell IgE receptor.]

Table 8.9: A selection of the genes with significantly different levels of expression in SS1 as compared with SS2000 infected BALB/c mice (see Supplementary Material Table S8.8 for full list and gene details).

Genes Induced in SS2000 Infected BALB/c Mice
Accession | Symbol | Name | Classification (a) | Score (b) | Fold Δ (c)
AI838607 | Thbs1 | thrombospondin 1 | Cell adhesion | 3.10 | 2.1
J05020 | - | Fc receptor, IgE, high affinity I, gamma polypeptide | Immunoglobulin | 2.51 | 2.2
AV065971 | Fabp1 | fatty acid binding protein 1, liver | Lipid binding | 3.05 | 2.1
AV049334 | Klk11 | kallikrein 11 | Signal | 2.60 | 3.0
AV059489 | Klk5 | kallikrein 5 | Signal | 2.63 | 3.5
AV059404 | Ngfg | nerve growth factor, gamma | Signal | 2.53 | 2.6
BG066824 | Alb1 | albumin 1 | Transport | 2.51 | 6.8

Genes Repressed in SS2000 Infected BALB/c Mice
Accession | Symbol | Name | Classification (a) | Score (b) | Fold Δ (c)
BG073264 | Tia1 | cytotoxic granule-associated RNA binding protein 1 | Apoptosis | -4.65 | 3.7
AI847573 | - | protease, serine, 25 | Apoptosis | -4.73 | 5.4
BG065535 | Fsd1 | fibronectin type 3 and SPRY domain-containing protein | Cell structure | -6.41 | 4.2
AV103588 | Map1lc3 | microtubule-associated protein 1 light chain 3 | Cell structure | -5.33 | 3.6
AV171076 | Col3a1 | procollagen, type III, alpha 1 | Cell structure | -3.61 | 4.0
AV035995 | Slap | sarcolemmal-associated protein | Cell structure | -4.77 | 3.0
AV171034 | Tubb3 | tubulin, beta 3 | Cell structure | -5.23 | 8.2
AV105246 | Gc | group specific component | Complement | -6.46 | 3.5
AV005269 | Hba-a1 | haemoglobin alpha, adult chain 1 | Defence | -5.23 | 5.0
BG072639 | Hbb-b2 | haemoglobin, beta adult minor chain | Defence | -9.46 | 12.3
AV106988 | Prodh2 | proline dehydrogenase (oxidase) 2 | Dehydrogenase | -3.85 | 2.7
AV053031 | Scd1 | stearoyl-Coenzyme A desaturase 1 | Fatty acid biosynthesis | -5.41 | 5.6
AV107769 | Golph3 | golgi phosphoprotein 3 | Golgi apparatus | -3.08 | 2.6
AV073817 | Ppt | palmitoyl-protein thioesterase | Hydrolase | -3.63 | 2.4
AV106908 | Fth | ferritin heavy chain | Iron storage | -6.81 | 7.5
AV103171 | Apoa1 | apolipoprotein A-I | Lipid transport | -2.69 | 2.1
AV078152 | H2-Ab1 | histocompatibility 2, class II antigen A, beta 1 | MHC II | -2.92 | 2.3
AV105208 | Nt5c3 | 5'-nucleotidase, cytosolic III | Nuclear | -4.72 | 5.1
C75991 | Cbx5 | chromobox homolog 5 (Drosophila HP1a) | Nuclear | -5.13 | 5.2
AV030893 | Thrsp | thyroid hormone responsive SPOT14 homolog (Rattus) | Nuclear | -3.94 | 2.4
AA059746 | Pam | peptidylglycine alpha-amidating monooxygenase | Oxidoreductase | -5.76 | 4.2
AV105115 | Ermap | erythroblast membrane-associated protein | Plasma membrane | -6.10 | 5.8
AV103103 | Gnai2 | guanine nucleotide binding protein, alpha inhibiting 2 | Signal transduction | -3.55 | 2.3
AA967233 | Lamp2 | lysosomal membrane glycoprotein 2 | Signal transduction | -7.10 | 7.1
AV006208 | Ywhae | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, epsilon polypeptide | Signal transduction | -6.22 | 7.0
AW551388 | E2f6 | E2F transcription factor 6 | Transcriptional regulator | -8.56 | 10.4
AV108627 | Eef2 | eukaryotic translation elongation factor 2 | Translation | -6.30 | 9.6
AI322332 | Syt5 | synaptotagmin 5 | Transport | -4.92 | 4.2
(a) Classification was determined in this study in order to group genes by function
(b) Score was assigned by SAM and indicates the degree of significance in the different levels between control and infected samples
(c) Fold change was assigned by SAM and refers to the difference in level between the control and infected samples

may be due to the relative number of lymphocytes in the stomachs of the BALB/c as compared with C57BL/6 mice 15 months post infection. Nearly all infected BALB/c mice accumulated LA, while only about half of the C57BL/6 animals developed LA.

8.3.3. Direct Comparison of the Transcriptional Responses of C57BL/6 and BALB/c mice to Infection

The analyses of the C57BL/6 and BALB/c samples were done on two separate sets of murine microarrays (as described in Materials and Methods). However, a portion of the elements on these arrays overlapped, so a direct comparison of the expression patterns in the two mouse strains was done using all of the genes that had values at least 2 SD from the mean in at least 4 arrays (total of 51 arrays). Using these filtering criteria, 558 genes were extracted from the linked data sets and a HC algorithm was used to cluster these genes and arrays in an unsupervised fashion (Fig. 8.10). This cluster separated the C57BL/6 animals from the BALB/c animals on two branches of the dendrogram. This is not surprising given the difference in the arrays used as well as the difference in the reference samples (pools of the uninfected animals of each strain).
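The filtering step described above might look like the following minimal sketch. It rests on an assumption about how the criterion is defined: here each gene's mean and SD are computed from that gene's own values across the arrays, and missing spots are represented by None. The function names and the example table are hypothetical, and the hierarchical clustering that follows the filter is not shown.

```python
from statistics import mean, stdev


def passes_filter(values, n_sd=2.0, min_arrays=4):
    """True if at least min_arrays values lie n_sd or more SDs from the gene's mean.

    Assumption: the mean and SD are taken over this gene's own measurements,
    ignoring missing values (None).
    """
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return False
    m, sd = mean(present), stdev(present)
    if sd == 0:
        return False  # a completely flat gene carries no signal
    hits = sum(1 for v in present if abs(v - m) >= n_sd * sd)
    return hits >= min_arrays


def filter_genes(table, n_sd=2.0, min_arrays=4):
    """Keep only the genes (rows) of a {gene: [values per array]} table that pass."""
    return {gene: vals for gene, vals in table.items()
            if passes_filter(vals, n_sd, min_arrays)}


# Hypothetical table of log ratios for three genes across 51 arrays.
table = {
    "geneA": [5.0] * 4 + [0.0] * 47,           # strongly changed in 4 arrays
    "geneB": [0.0] * 51,                       # unchanged everywhere
    "geneC": [5.0] * 2 + [0.0] * 48 + [None],  # changed in only 2 arrays
}
print(sorted(filter_genes(table)))
```

A filter of this kind removes genes that barely vary before clustering, so that the dendrogram is driven by genes with a clear change in at least a few samples.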

Regardless, there are a number of interesting clusters resulting from this analysis (Fig. 8.10). There is one cluster of genes that are induced in the majority of the infected animals as compared with the controls in both mouse strains (Table 8.10A). Included in this group of genes are some of the MHC class I and II molecules; the mucosal defence molecules Sprr2a and crp-ductin; the lymphocyte antigens ly6a, ly6d and ly6e; the receptors CD40 and integrin beta 7; and finally the signal transduction molecules Stat1 and Socs5. There was also a cluster of genes repressed in all the infected animals (Table 8.10B), including muc5ac, galactose binding lectin, somatostatin, Hamp, and phospholipase A2 (PLA2 1b). This loss of expression is likely to reflect functional atrophy in the infected animals, in which parietal and chief cells are lost.

There were also a number of clusters of genes induced in the infected animals of one mouse strain as compared with the other, which may suggest a basis for the different host responses. One cluster of genes induced in BALB/c infected animals as compared with C57BL/6 animals is shown in purple in Fig. 8.10 (Table 8.10C). This cluster includes some interesting genes related to host defence mechanisms, including lysozyme C, cryptdin 4 (defensin 5) and calgranulin B. Calgranulin B has been associated with the anti-pathogen response. Interestingly, the calgranulin B gene has been shown to inhibit the expression of casein kinase II, which was found to be induced in C57BL/6 animals but not BALB/c animals (Table 8.10D).

The gene encoding the receptor for IL-1 (IL-1R) was also induced in BALB/c infected mice as compared with C57BL/6 mice. This receptor is known to bind IL-1, TNF-α and IL-6 cytokines (167). The IL-1R is expressed on rat parietal cells, and induced expression of this receptor was shown in vitro to mediate inhibition of H+ secretion in response to IL-1β (305). Infection by H. pylori has also been shown to increase IL-1β, which suggests that the BALB/c mice may have reduced acid output in response to infection (305, 350). Alternatively, cytokine binding to IL-1R may induce activation of the NF-κB pathway (167). This pathway can increase the expression of adhesion molecules on endothelial cells and increase vascular permeability and coagulation. Interestingly, the genes encoding various proteins involved in thrombin production, thrombomodulin and serine proteinase inhibitor-1, as well as the angiotensin converting enzyme that increases vasoconstriction, were also highly expressed. This suggests that in the BALB/c mice there may be changes in vascular permeability induced by activation of the NF-κB pathway. In humans with ulcer disease and mice with induced ulcers, changes in vascularisation at the affected site have been shown to inhibit ulcer healing (163).

The group of genes expressed at a higher level in the infected C57BL/6 mice as compared with the BALB/c mice is quite different from the previous group (Table 8.10D). There was induced expression of a number of genes encoding cell structure, cell-cell interaction and extracellular matrix related proteins such as junction plakoglobin, transgelin, dystroglycan-1, procollagen and sortilin-related receptor. The expression of these genes may indicate an increased amount of

Figure 8.10: Unsupervised hierarchical cluster of all genes that passed the filtering criteria in both the SML & SMK arrays (C57BL/6 samples) and the MMM arrays (BALB/c samples), shown by the green lines at the top of the figure. Blue are controls (A & E), pink are SS1 infected (B & F), and black are SS2000 infected (C & G). Clusters are denoted by lines on the right side: purple is induced in BALB/c v C57BL/6, orange clusters 1-3 are induced in C57BL/6 v BALB/c, and turquoise clusters are induced in both or repressed in both mouse strains. A red colour indicates gene induced and a green colour indicates gene repressed compared to pooled control sample (reference). Grey colour indicates missing data. Repress. is repressed.

[Figure 8.10 image (heat map) not reproduced. Sample order (columns), C57BL/6: G18, E21, E15, E13, G15, G13, G19, G14, G17, F11, G11, G16, F14, F13, G20, G12, F12; BALB/c: A15, A12, B15r, B15, B11r, B11, B14, B16, B13, C16, C15, C12, C11, C14, C13, C20r, C17, C19, C18, C20, B20, B17, B12, B19, B18, A19, A20, A13, A18, A14, A16r, A16, A17, A11. Cluster labels (right side): Repress. both; Induced both; Induced C57BL/6 1, 2 and 3; Induced BALB/c.]

Table 8.10A: Selection of genes in cluster (turquoise in Figure 8.10) showing induced expression in both C57BL/6 and BALB/c infected mice as compared with uninfected controls (see Supplementary Material Table S8.9A for full list and details).

Accession | Symbol | Name | Classification
AV025235 | Ly6a | lymphocyte antigen 6 complex, locus A | Antigen
AV086706 | Ly6d | lymphocyte antigen 6 complex, locus D | Antigen
AI840560 | Ly6e | lymphocyte antigen 6 complex, locus E | Antigen
AV134243 | Calr | calreticulin | Calcium-binding
AA170279 | Itgb7 | integrin beta 7 | Cell adhesion
AV074243 | Krt1-19 | keratin complex 1, acidic, gene 19 | Cell structure
AV072082 | Sprr2a | small proline-rich protein 2A | Cell structure
AV084558 | Dmbt1 | deleted in malignant brain tumours 1 | Defence
AI841358 | Hspa8 | heat shock 70kD protein 8 | Heat-shock protein
AV095159 | Hsp86-1 | heat shock protein, 86 kDa 1 | Heat-shock protein
AV073586 | Cpe | carboxypeptidase E | Hydrolase
AV056755 | Wars | tryptophanyl-tRNA synthetase | ligase
AV112804 | Mest | mesoderm specific transcript | Mesoderm
AV037118 | B2m | beta-2 microglobulin | MHC I
AI841614 | H2-Eb1 | histocompatibility 2, class II antigen E beta | MHC II
AV070793 | Ii | Ia-associated invariant chain | MHC II
AV073418 | Vdac3 | voltage-dependent anion channel 3 | Mitochondrion
AV135036 | Pmscl1 | polymyositis/scleroderma autoantigen 1 | Nuclear
AV006327 | Pdha1 | pyruvate dehydrogenase E1 alpha 1 | Oxidoreductase
AI385718 | Psmb8 | proteosome (prosome, macropain) subunit, beta type 8 | Proteasome
W41212 | Stxbp1 | syntaxin binding protein 1 | Protein transport
AA175527 | Tnfrsf5/CD40 | tumour necrosis factor receptor superfamily, member 5 | Receptor
AV149922 | Clu | clusterin | Signal
AA220816 | Cish5 | cytokine inducible SH2-containing protein 5 | Signal transduction
AA879612 | Stat1 | signal transducer and activator of transcription 1 | Transcriptional regulator
AV085802 | Brd4 | bromodomain-containing 4 | -
AV103442 | onzin | onzin | -


Table 8.10B: Selection of genes in cluster (turquoise in Figure 8.10) showing reduced expression in both C57BL/6 and BALB/c infected mice as compared with uninfected controls (see Supplementary Material Table S8.9B for full list and details).

Accession | Symbol | Name | Classification
AV140431 | Calb3 | calbindin-D9K | Calcium-binding
AV085039 | Cd36 | CD36 antigen | Cell adhesion
AV034324 | Spp1 | secreted phosphoprotein 1 | Cell adhesion
AA105880 | Egf | epidermal growth factor | Cell growth
AV059045 | Hamp | hepcidin antimicrobial peptide | Defence
AA796822 | Siat4a | sialyltransferase 4A (beta-galactosidase alpha-2,3-sialytransferase) | Golgi apparatus
AV060541 | Smst | somatostatin | Hormone
AV072325 | Chia-pending | chitinase, acidic | Hydrolase
AA086839 | Mcpt5 | mast cell protease 5 | Hydrolase
AV088650 | Pla2g1b | phospholipase A2, group IB, pancreas | Hydrolase
AV005737 | Acrp30 | adipocyte complement related protein of 30 kDa | Inflammatory response
AV073536 | Lgals4 | lectin, galactose binding, soluble 4 | Lectin
AV083944 | Lgals7 | lectin, galactose binding, soluble 7 | Lectin
AV087797 | Clps | colipase, pancreatic | Lipid degradation
AV061738 | Fabp2 | fatty acid binding protein 2, intestinal | Lipid-binding
AV095036 | Fabp3 | fatty acid binding protein 3, muscle and heart | Lipid-binding
AV078594 | Clps | colipase, pancreatic | Lipid degradation
AV040804 | Car3 | carbonic anhydrase 3 | Lyase
AV057600 | Mrpl2 | mitochondrial ribosomal protein L2 | Mitochondrion
AV073495 | Muc5ac | mucin 5, subtypes A and C, tracheobronchial/gastric | mucin
AV140164 | Tpm2 | tropomyosin 2, beta | Muscle protein
AV043494 | Tnni3 | troponin I, cardiac | Muscle protein
AV087698 | Nupr1 | nuclear protein 1 | Nuclear
AV006245 | Ube2b | ubiquitin-conjugating enzyme E2B, RAD6 homology (S. cerevisiae) | Nuclear
AV073600 | Slc31a2 | solute carrier family 31, member 2 | Transport
AV056794 | Txnl | thioredoxin-like (32kD) | -
AV034964 | Upa | uterine-specific proline-rich acidic protein | -
AV005660 | Pln | phospholamban | -


Table 8.10C: Selection of genes in cluster (purple in Figure 8.10) showing induced expression in BALB/c as compared with C57BL/6 infected mice (see Supplementary Material Table S8.9C for full list and details).

Accession | Symbol | Name | Classification
AV058287 | Ly64 | lymphocyte antigen 64 | Antigen
AA003942 | Tnc | tenascin C | Cell adhesion
AA033406 | Mad2l1 | MAD2 (mitotic arrest deficient, homolog)-like 1 (yeast) | Cell cycle
AA166217 | Krt1-16 | keratin complex 1, acidic, gene 16 | Cell structure
AA395947 | Il1r1 | interleukin 1 receptor, type I | Cytokine receptor
- | Lyzs | lysozyme | Defence
AV050073 | S100a9 | S100 calcium binding protein A9 (calgranulin B) | Defence
AV066151 | cryptdin 4 | ESTs, Moderately similar to defensin related cryptdin 4 | Defence
AV064563 | Gyk | glycerol kinase | Glycerol metabolism
AV140236 | Hsp60 | heat shock protein, 60 kDa | Heat-shock
AI316346 | Mmp13 | matrix metalloproteinase 13 | Hydrolase
AV025177 | Pla2g4a | phospholipase A2, group IVA (cytosolic, calcium-dependent) | Hydrolase
AV065971 | Fabp1 | fatty acid binding protein 1, liver | Lipid-binding
AV140565 | Idh2 | isocitrate dehydrogenase 2 (NADP+), mitochondrial | Mitochondrion
AA108913 | Mut | methylmalonyl-Coenzyme A mutase | Mitochondrion
AV083542 | Slc25a4 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 4 | Mitochondrion
AI838463 | RAP1B | member of RAS oncogene family | Nuclear
AI837750 | Idh1 | isocitrate dehydrogenase 1 (NADP+), soluble | Oxidoreductase
AV093705 | Prdx5 | peroxiredoxin 5 | Oxidoreductase
AV093698 | Psma7 | proteasome (prosome, macropain) subunit, alpha type 7 | Proteasome
AV014739 | Sec61g | SEC61, gamma subunit | Protein transport
AI385582 | Thbd | thrombomodulin | Receptor
AV034921 | Spi1-1 | serine protease inhibitor 1-1 | Serpin
AA163925 | Bcl3 | B-cell leukaemia/lymphoma 3 | Signal
AA002985 | Tgfb1i1 | transforming growth factor beta 1 induced transcript 1 | Signal
AI574416 | Tgfb2 | transforming growth factor, beta 2 | Signal
AA545513 | Cish3 | cytokine inducible SH2-containing protein 3 | Signal transduction
AA017831 | Gnaq | guanine nucleotide binding protein, alpha q polypeptide | Signal transduction
AA116377 | Mfng | manic fringe homolog (Drosophila) | Signal transduction
AA771105 | Mapkapk5 | MAP kinase-activated protein kinase 5 | Signal transduction
AI325884 | Rgs19 | regulator of G-protein signalling 19 | Signal transduction
AV087433 | Srp9 | signal recognition particle 9 kDa | Signal transduction
AA268754 | Elk1 | ELK1, member of ETS oncogene family | Transcriptional regulation
AV025941 | Aqp1 | aquaporin 1 | Transport
AV035003 | Ttr | transthyretin | Transport

Table 8.10D: Selection of genes in clusters (nodes 1, 2 and 3; orange in Figure 8.10) showing induced expression in C57BL/6 as compared with BALB/c infected mice (see Supplementary Material Table S8.9D for full list and details).

NODE 1: Induced in C57BL/6
Accession | Symbol | Name | Classification
AV003063 | C1qa | complement component 1, q subcomponent, alpha polypeptide | Complement
AV012852 | C3 | complement component 3 | Complement
AV095108 | Lap3 | leucine aminopeptidase 3 | Hydrolase
AI838340 | Chi3l3 | chitinase 3-like 3 | Inflammatory response
AA155094 | Mpeg1 | macrophage expressed gene 1 | Macrophage
AV021300 | Arbp | acidic ribosomal phosphoprotein PO | Ribosomal
AV074721 | Clu | clusterin | Signal
AV104049 | Kpna1 | karyopherin (importin) alpha 1 | Transport

NODE 2: Induced in C57BL/6
Accession | Symbol | Name | Classification
AV057243 | Sh3bp5 | SH3-domain binding protein 5 (BTK-associated) | Apoptosis
AI183128 | Ppp1cb | protein phosphatase 1, catalytic subunit, beta isoform | Cell cycle
AV085570 | Dag1 | dystroglycan 1 | Cell structure
AA162273 | Col4a1 | procollagen, type IV, alpha 1 | Extracellular matrix
AV082012 | Ctsb | cathepsin B | Hydrolase
AV104017 | Ephx1 | epoxide hydrolase 1, microsomal | Hydrolase
AV006145 | Fabp3 | fatty acid binding protein 3, muscle and heart | Lipid-binding
AV004305 | Tagln | transgelin | Muscle protein
AV104088 | Aldh3a2 | aldehyde dehydrogenase family 3, subfamily A2 | Oxidoreductase
AI882471 | Sorl1 | sortilin-related receptor, LDLR class A | Receptor
AV149987 | Cst3 | cystatin C | Signal
AV081925 | Csnk2b | casein kinase II, beta subunit | Signal transduction
AV074473 | Atp4b | ATPase, H+/K+ transporting, beta polypeptide, gastric specific | Transport
AV025577 | Vcp | valosin containing protein | Transport

NODE 3: Induced in C57BL/6
Accession | Symbol | Name | Classification
AV013361 | Jup | junction plakoglobin | Cell adhesion
AV015972 | Cox7c | cytochrome c oxidase, subunit VIIc | Mitochondrion
AV095084 | Mrpl15 | mitochondrial ribosomal protein L15 | Mitochondrion
AV084372 | Timm8b | translocase of inner mitochondrial membrane 8 homolog b (yeast) | Mitochondrion
AV149974 | Idb1 | inhibitor of DNA binding 1 | Nuclear
AV111598 | Dhfr | dihydrofolate reductase | Oxidoreductase
AI841372 | Ndr4 | N-myc downstream regulated 4 | Signal
AV074655 | Rab11b | RAB11B, member RAS oncogene family | Signal transduction
AV113950 | Kpna3 | karyopherin (importin) alpha 3 | Transport

tissue remodelling occurring in these mice as compared with the infected BALB/c mice. Tissue remodelling is an important factor in the healing of human gastric ulcers as this process involves cell migration and proliferation that are dependent on the cell cytoskeleton. Epithelial “barrier” function is also dependent on an intact cell cytoskeleton. Recent studies have shown that the VacA protein of H. pylori may interfere with cytoskeleton-dependent cell functions (253, 254).

8.4. Conclusion

This study is the first to investigate the global transcriptional response to long-term infection of mice with H. pylori. It is also the first to show differing expression signatures in response to parallel infection of two different mouse strains with two different H. pylori isolates. The use of this animal model to investigate host responses to infection allowed much more control over the parameters of both host and strain effects than can be achieved in human studies. Thus, these results have provided insight into the mechanisms responsible for the differing pathology observed in the C57BL/6 and BALB/c mice due to H. pylori infection.

In the C57BL/6 mice the expression signature appeared to differentiate animals with and without LA regardless of the infecting H. pylori strain. As discussed, this may occur because of the differing proportions of cell types that this type of infiltration produces. It also suggests that, although the histopathology indicated that the SS1 strain induced more inflammation in the C57BL/6 mice than SS2000, these differences did not translate into significantly different expression signatures in these mice. In contrast, in the BALB/c mice, where the majority of infected animals developed LA, the differences in pathology were not evident in the expression signatures. This suggests that the inflammatory response to H. pylori infection in these mice may be quite uniform compared with that in the C57BL/6 mice. Instead, significant differences in the expression signature of animals infected with SS1 as compared with SS2000 could be determined. Most of these differences were evident in the extent of induction or repression of transcription levels compared with that in uninfected controls. Gene expression in the SS2000 infected animals, which had the greatest inflammation, responded more strongly to infection than that in the SS1 infected mice. This result confirms the involvement of transcription in the host responses to differing H. pylori strains in infected mice.

These expression studies have begun to highlight many of the universal cellular responses to H. pylori infection in an animal model, such as induction of the expression of IFN responsive genes and MHC II molecules. In addition, some unexpected patterns of expression were uncovered using this approach. For example, the difference in the regulation of expression in the gastric endocrine system, particularly somatostatin and gastrin transcription, may highlight the mechanism by which an individual’s gastric acid secretion responds to infection and results in a different colonisation level and distribution, in turn resulting in widely differing pathologies. Thus, previously unrecognised relationships between physiology and gene expression were revealed by these experiments.

Uncovering new patterns of expression and using techniques such as LCM to assign individual gene expression patterns to particular cell types allows for predictions as to the cellular responses of the gastric tissue to infection. From these analyses hypotheses can be generated regarding the mechanisms of H. pylori induced pathology and these can then be tested in this animal model. Finally, these data can be utilised to investigate the response to infection in humans and perhaps discover new ways to treat the infection and/or prevent progression to serious disease, especially gastric cancer.

Chapter 9

GENERAL DISCUSSION AND FUTURE DIRECTIONS

This thesis has demonstrated the utility of microarrays for investigating a number of aspects of the complex interaction between microbe and host. Transcriptional profiling of H. pylori grown in vitro, first in regular culture conditions and second in the presence of a mammalian epithelial cell line, was performed (Chapters 4 and 5). The host side of this interaction was then investigated using the mouse model of H. pylori infection with two persistent mouse colonising strains of H. pylori (Chapter 6). The colonisation parameters of these strains were compared and genome-typing was used to investigate dynamic changes in the bacteria that occurred after 3, 6, and 15 months of colonisation (Chapter 7). Finally, the host inflammatory and global transcriptional responses after 15 months infection were examined (Chapter 8). Using these experiments some specific aspects contributing to both host and strain specific differences were examined.

9.1. Bacterial Transcript Profiling Methodology

Transcript profiling of mammalian cells and genome-typing of bacterial strains are both well established methods for microarray analysis; however, bacterial transcript profiling is still in its infancy. Thus, reliable methods for H. pylori transcriptional profiling were developed (Chapter 3). These included protocols by which the bacteria could be harvested and RNA extracted quickly and reliably to minimise RNA degradation and to avoid changes in the message due to stress response. Thus the use of cold centrifugation of the bacterial culture was avoided and instead the cells were collected on a membrane filter that was immediately snap-frozen. Using this method, up to 2.5 OD600 nm of H. pylori cells could be harvested in less than 1 min. The RNA extraction procedure developed involved a combination of Trizol RNA extraction and the Qiagen RNeasy clean-up columns with an on-column DNase digestion. This procedure resulted in clean, undegraded RNA that was free of DNA and could be efficiently labelled for microarray analysis.

A number of different methods for the labelling of bacterial RNA for microarray analysis were investigated. The method chosen consisted of the indirect incorporation of amino-allyl dUTP into the second strand cDNA that was subsequently coupled with the appropriate cyanine dye. This indirect labelling method represented the most efficient, reliable and sensitive protocol tested. It required as little as 0.5 µg RNA and resulted in the non-biased incorporation of the cyanine dyes. Biased incorporation of Cy5 compared with Cy3 dyes has been shown to be a problem with the currently accepted protocol of direct incorporation of cy-dUTP moieties into the first strand cDNA using reverse transcriptase (72, 183). It was also established that Gene Specific Primers designed for H. pylori reverse transcription were very efficient and resulted in complete coverage of the vast majority of mRNA transcripts. Finally, the use of genomic DNA as a reference for transcript profiling experiments was deemed inappropriate because the resulting bias in fluorescent signalling negatively affected the normalisation procedure. The developed procedures were used for all bacterial transcript profiling experiments in the remainder of the thesis. Scrutiny of the data obtained from transcript profiling experiments (Chapters 4 and 5) revealed that the developed methods and the H. pylori microarray used were very efficient at reliably representing the levels of RNA species in the samples tested. This was evident from the low variability between duplicate measurements from the same array and from the RNase-protection assays (Chapter 4), which showed the same trends in expression level as were detected with microarray analysis.
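As an illustration of how agreement between duplicate measurements on the same array might be quantified, the following minimal Python sketch computes log2(Cy5/Cy3) ratios for duplicate spots and reports their mean and absolute difference. It is a sketch under stated assumptions rather than the analysis procedure used in this work: the function names, gene labels and intensity values are hypothetical, and no background subtraction or normalisation is included.

```python
import math


def log_ratio(cy5, cy3):
    """Return log2(Cy5/Cy3) for a single spot; both intensities must be positive."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


def duplicate_agreement(spots):
    """Summarise duplicate spots.

    spots maps a gene label to a list of two (cy5, cy3) intensity pairs,
    one pair per duplicate spot on the same array.
    Returns gene -> (mean log2 ratio, absolute difference between duplicates).
    """
    summary = {}
    for gene, ((a5, a3), (b5, b3)) in spots.items():
        r1 = log_ratio(a5, a3)
        r2 = log_ratio(b5, b3)
        summary[gene] = ((r1 + r2) / 2, abs(r1 - r2))
    return summary


# Hypothetical intensities, not taken from the thesis data.
example_spots = {
    "geneA": [(2000.0, 1000.0), (4000.0, 2000.0)],  # duplicates agree
    "geneB": [(500.0, 2000.0), (1000.0, 1000.0)],   # duplicates disagree
}
agreement = duplicate_agreement(example_spots)
```

A small absolute difference between duplicates, as for the hypothetical geneA, corresponds to the low within-array variability described above, whereas a large difference, as for geneB, would flag a spot for closer inspection.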

The bacterial transcript profiling methods developed in the present work should be of use to the broader bacterial microarray community. The new protocols for bacterial harvesting, RNA extraction and RNA labelling all represent more efficient and reliable methods than are presently used. In particular the use of indirect cy-dye incorporation for labelling represents a significantly cheaper and more sensitive method than the direct incorporation methods currently used. These protocols are also widely applicable to many other bacterial species. Some modifications to the RNA extraction protocol may need to be incorporated for use with Gram-positive bacteria, or those bacterial species with capsules. In fact, at present these methods are being utilised in the Falkow lab for transcript profiling of Campylobacter jejuni, Salmonella typhimurium, and Streptococcus pneumoniae.

In future studies these methods may be further streamlined for high throughput. For example it may be possible to increase the efficiency of bacterial harvesting by using disposable sterilised filter units in which the filters are removable or where the entire unit can be snap-frozen to preserve the RNA for extraction. These units may further benefit the RNA extraction process by improving the lysis and RNA collection procedure in Trizol. Finally, the bacterial labelling protocol may be shortened by developing a method by which the amino-allyl dUTP (aa-dUTP) moieties may be directly incorporated into the first strand cDNA by reverse transcriptase, instead of into the second strand. This requires investigation into the appropriate ratios of dTTP and aa-dUTP in the reaction mixture to ensure efficient incorporation of aa-dUTP by reverse transcriptase.

9.2. H. pylori Transcriptional Regulation in vitro

9.2.1. Broth Culture

To date, most of the studies investigating H. pylori transcriptional regulation and coordination have concentrated on just a few of the virulence or stress related genes. With the exception of a few recent microarray studies (4, 11, 142, 209), investigators have not extensively studied the global transcription of H. pylori. In order to understand H. pylori transcriptional regulation on a global scale, in-depth time course analyses are required. Analyses using only a single time point present a snap-shot of the transcription profile of the bacterium rather than illustrating the dynamic nature of gene expression. In the present study, an extensive time-course of H. pylori transcriptional responses in the well controlled environment of broth culture was used (Chapter 4). These experiments uncovered previously unappreciated relationships between genes; enabled predictions of the function for some unknown genes; highlighted the transcriptional control of some operon structures; and enabled the investigation of the roles of some regulatory factors. These experiments have shown, above all, that despite the paucity of regulatory factors in H. pylori, this organism controls transcription in a dynamic fashion, which is both time and growth phase dependent.

A major switch in virulence gene expression was observed during the transition from log to stationary phase during the time course, suggesting that H. pylori may be most virulent in the late log phase of growth. In line with this idea we observed maximum motility, peak expression of virulence factors such as napA and cagA, as well as expression of stress response proteins, such as dnaK in this phase of growth. Further investigation of the growth phase in which H. pylori is best able to adapt, colonise and cause disease may indicate key targets for vaccines and intervention measures. Such knowledge may help in the development of methods to prevent or treat infection.

One of the most interesting and previously unrealised relationships discovered in these experiments is the tight regulation of iron uptake and storage systems. The dynamic switch in expression of these gene products during the transition from log to stationary phase further implicates this phase of growth as important in colonisation and persistence. The uncoupling of this relationship, possibly through the targeting of the Fur regulon, would certainly render the bacterium unable to persistently infect. This is due to the extensive iron scavenging mechanisms present in the host designed to starve bacteria of this essential nutrient. Further studies into the response of H. pylori to iron deplete conditions or conditions of excess iron will help elucidate the regulatory mechanisms involved in the control of iron homeostasis. These experiments are currently underway, and thus far have highlighted an enhanced ability for stationary phase cells to survive in iron deplete conditions and to remain motile for much longer periods than log phase cells. This may provide a mechanism by which the bacteria can sense the environment and move in order to find more favourable conditions. A similar phenomenon has also been observed by Merrell et al. (209) in organisms exposed to low pH conditions, thus suggesting that H. pylori senses and responds to environments encountered within the host.

In order to associate these microarray results with one aspect of the biology of the organism, the gene expression patterns of the flagella regulon were compared to actual changes in H. pylori motility. The results obtained in this study have recently been mirrored by a separate study using reporter constructs instead of microarray analysis (230). These types of experiments help to highlight regulatory controls and to further understand the hierarchy of flagella gene regulation. Future time course microarray studies in which key regulators of the flagella regulon have been disrupted will further elucidate these mechanisms.

Future experiments using time course analysis of H. pylori growth in different environmental conditions mimicking the in vivo environment, such as high viscosity, differing concentrations of oxygen and carbon dioxide and other nutrients will further highlight the regulatory mechanisms of this bacterium. In particular, the specific effects of the known regulators may be found through transcriptional analysis under various environmental conditions, or for non-essential regulators, through gene mutation analysis. Understanding the direct and downstream effects of the proposed stringent response genes, spoT and gppA, and the two-component regulators will be of particular interest. These microarray analyses in conditions mimicking the stomach environment, in combination with protein profiling, computational analyses of promoter regions and footprint analysis to show evidence of specific protein/promoter interactions will help elucidate H. pylori’s mechanisms of gene regulation in vivo.

Since about one third of the ORFs in H. pylori are still of unknown function, expression profiling of these genes may help elucidate functions for these gene products. This may be particularly important as many of these products are H. pylori specific and thus are likely directly related to the unique environment in which this organism lives and the disease manifestations it causes. An example may be the two genes encoding OMPs, omp5/29, found to be co-regulated with cagA. Mutation analysis of these genes may uncover a previously unappreciated function for these gene products in virulence. Finally, testing these ideas in cell culture or animal models may further elucidate colonisation and disease causing factors.

9.2.2. Co-culture

One particular in vitro environment to which H. pylori can be exposed is co-culture with mammalian epithelial cells, as this mimics an in vivo situation. Interaction of H. pylori with epithelial cells is a dynamic process. The organism has been shown to adhere to cells in culture and in vivo (55, 95, 138, 187, 227, 238, 358), induce vacuolisation (64, 197, 211, 267, 299), inject bacterial products that cause actin rearrangement and release of cytokines (15, 130, 217, 237, 287, 311, 315, 316, 340, 394), increase cellular permeability (256, 348), and induce apoptosis (47, 147, 158, 161, 164, 168, 186, 223, 273, 322, 330). However, little is known about what induces H. pylori to do these things and the benefits it gains by these actions. Thus, in the present work, the transcriptional response of H. pylori to a mammalian epithelial cell line (MDCK) was investigated (Chapter 5). MDCK cells were chosen because H. pylori did not attach in large numbers to the monolayer. However, H. pylori infection of the MDCK cells caused the cell to cell junctions to become more apparent and eventually induced a change in cell shape indicating interaction between the bacterial and mammalian cell types. Therefore, these interactions represent a highly simplified version of the in vivo environment that H. pylori encounters.

The fact that the H. pylori culture does not adhere to the MDCK cells was beneficial for two main reasons. First, the bacterial culture was easily harvested from the cell monolayer and the bacterial RNA could be extracted without contamination of mammalian RNA. Second, the co-culture could be maintained in a semi-continuous culture system for extended periods of time, much longer than any other cell-culture system previously described {for example, AGS and Caco-2 cell co-culture (21, 89, 312, 317)}. Considering the chronic nature of H. pylori infections, this latter point is particularly pertinent.

The transcriptional response of H. pylori in co-culture not surprisingly highlighted many differences from the previous broth culture. Comparison with parallel control cultures was thus necessary and revealed a specific expression signature induced by the presence of the mammalian cells. These differences in expression were relatively small, indicating that the influence of the MDCK cells was narrow. The genes specifically induced in co-culture were involved primarily in metabolism and respiration, suggesting that the presence of cells induced competition for nutrients and the need for increased energy production in H. pylori. This suggests that metabolic functions in the bacterium are probably important for colonisation and persistence. Supporting this notion are a number of deletion mutant studies performed on various metabolic factors, such as the gene encoding the urea transporter, ureI (329), the gene encoding arginase, rocF (203) and the gamma-glutamyltransferase gene, g-GGT (204), which resulted in reduced ability of these mutants to colonise mice.

Interestingly, many genes of unknown function and some regulatory factors, for which no function has been established, were also induced in co-culture. This perhaps suggests that these gene products may be required specifically in the context of infection. Three of the genes found to be induced in co-culture in both SS1 and G27 cultures are of presently unknown function. First, the outer membrane protein gene, omp32, may encode an adhesin necessary for colonisation or perhaps a porin required for uptake of nutrients in short supply in vivo. Second, the expression of the 2-component regulator, HP1021, was induced in co-culture. This is the first environmental condition to which this gene has been shown to respond. Thus this regulator may respond specifically to signals from host cells and provide an important in vivo function. Finally, the gene encoding a possible immune response regulator, gcpE, was also induced. This is interesting considering that modulation of the immune response appears to be important for persistent colonisation. To elucidate the roles of these genes further, gene mutation analysis and specific expression detection techniques in cell culture systems, as well as investigation of colonisation abilities in a mouse model, should be conducted.

This MDCK co-culture system may be used for many purposes. For example it would be interesting to investigate the differential transcriptional response of individual H. pylori strains or isogenic mutants in key genes, to the mammalian cells. In addition, H. pylori cultures grown in MDCK co-culture are better adapted to cell culture conditions than broth or plate grown cultures and thus, when these are used in cell culture infection experiments they respond more quickly. H. pylori cells grown in MDCK co-culture have been shown to inject CagA protein into cultured Caco-2 cells within 2 h whereas H. pylori grown on solid media takes much longer (~24 h) (M. Amieva, personal communication). The reason for this delay in CagA delivery by the plate grown bacteria is most likely due to a much lower percentage of motile bacteria (~1%). The MDCK grown bacteria always have a high percentage (> 50%) of motile organisms 12-24 h after media replenishment. In addition the plate grown bacteria require adaptation to the tissue culture medium (DMEM/FCS/BB) and incubation conditions (5% CO2) that may extend the lag phase of growth. The ability of log phase H. pylori cells from either broth culture medium (BBF) or tissue culture medium (DMEM/FCS/BB) (without MDCK cells) to inject CagA into Caco-2 cells has not been thoroughly investigated. However, since the pattern of growth of H. pylori in DMEM/FCS/BB is similar to the co-cultures, bacteria grown in this way may also be able to adapt to further cell culture experiments more quickly than those grown on solid media or in regular broth culture.

The next step will be to investigate the transcriptional response of H. pylori in co-culture with human gastric cell lines such as AGS. These studies will then elucidate further effects of the more intimate interaction H. pylori has with these mammalian cells through adherence. At present these experiments are difficult because there are no appropriate methods for separating and/or detecting the bacterial RNA without contamination from the mammalian RNA species from such co-cultures.

The ability to investigate both the bacterial and mammalian transcriptional responses from a single experiment is the ultimate goal for understanding these interactions. For the bacterial side, various possibilities have recently arisen. One study investigated the global in vivo gene expression of H. pylori for the first time (121). The method used in this study involved enrichment of cDNA molecules from infected tissues using a non-linear PCR based procedure that was then detected on a DNA microarray. Thus, the results of this study were non-quantitative. To determine genes specifically induced in vivo, the authors compared the in vivo expression profile to the in vitro expression of the same strain at a single time point in mid-log phase of growth in regular broth culture (121). The studies presented in Chapter 5 have shown that this type of comparison can be misleading. It was interesting however, that Graham and colleagues (121) reported that a large number of the genes expressed specifically in vivo were H. pylori specific genes of currently unknown function, just as was found in the co-cultures in the present study (for example: HP0228, HP0661 and HP0667).

The approach used by Graham et al. (121) in parallel with more effective linear bacterial RNA amplification techniques and/or new ways to separate bacterial and mammalian RNA from a mixed sample will enable more detailed quantitative analysis of the in vivo expression of this bacterium. Another promising technique that could be used to specifically dissect H. pylori cells from the crypts of infected stomachs is laser capture microdissection (LCM) (283). RNA extracted from these cells could then be amplified for microarray analysis. However, it must be noted that understanding the significance of the in vivo expression profile of H. pylori will be challenging as this is just a snap-shot of the expression of the bacteria that are likely to be in a dynamic state of flux. In addition, individual bacteria from differing regions of the stomach are likely to have different expression profiles, thus separating these will be important. Finally, microarray analysis is not at present completely quantitative and thus other methods, such as real-time RT-PCR analysis, may need to be used to confirm the levels of expression of genes of interest.

9.3. The New Mouse Model

9.3.1. Colonisation and Inflammation

The difficulty of using human studies to unravel the specific contributions of bacterial and host specific differences to disease development has revealed the requirement for comparative animal studies. Animal models allow the investigator to control various aspects of infection such as host and strain differences. One of the problems with this type of study lies in the general lack of animal models in which multiple strains of H. pylori can persistently colonise. Thus, we isolated a new mouse colonising strain of H. pylori, SS2000, that exceeded the colonisation ability of the established strain, SS1. Use of a combination of these two H. pylori isolates and two different mouse strains provided some insight into the specific contributions of strain and host differences. Indications of the strain specific effects on inflammation were evident in the fact that while colonisation distribution of the two strains was similar, the colonisation level of SS2000 was significantly higher than SS1 in both mouse types. In addition, while the level of pathology produced by SS1 and SS2000 was similar after 6 months infection, there were significant differences by 15 months post-infection. This latter point illustrates the importance of long term colonisation studies in understanding these phenomena.

The mouse strains utilised were chosen because of their previously demonstrated differences in levels of inflammation induced by Helicobacter infection (296, 297). The C57BL/6 strain has been shown to have a “responder” phenotype in which gastritis is induced early in infection. The BALB/c strain has a “non-responder” phenotype, showing little inflammation in the early stages of infection (296). The differences in the host response of the C57BL/6 and BALB/c mice were evident in the present work due to both the different levels of colonisation induced in the two mice (C57BL/6 mice having significantly higher levels of colonisation), and also in the inflammatory response. Chronic active gastritis was induced in responder C57BL/6 mice after 6 months. This increased in severity by 15 months regardless of the infecting strain. In contrast, the non-responder BALB/c mice produced a very mild level of inflammation (after 6 months infection) followed by accumulation of MALT tissue in response to both H. pylori strains after 15 months infection. Strain specific differences were also evident due to the fact that SS1 induced a more severe inflammation than SS2000 in the C57BL/6 mice, and that SS2000 induced a significantly larger number of lymphoid aggregates and gland abscesses as well as a small amount of functional atrophy, in the BALB/c mice. These results also indicate that a higher level of colonisation does not necessarily induce an increased inflammatory response in mice, in contrast to the case in human infections in which there appears to be a correlation between colonisation level and severity of inflammation (80).

Differences in host response have largely been attributed to the different T-helper response of these mouse strains. C57BL/6 mice have a dominant Th1 type pro-inflammatory response and BALB/c mice have a Th2 type, anti-inflammatory response. A number of other factors that may contribute to differences in host response have been suggested. These include differences in the level of antibody production (159), MHC class (218), and Toll receptor responsiveness to specific ligands leading to differing levels of the Th1 cytokine IL-12 (182). Thus, future studies with this animal model should include measurements of the levels of serum and gastric antibody and cytokine production to confirm these theories in the context of H. pylori infection. To date, there have been difficulties in demonstrating differences in the levels of some cytokines at the gastric mucosa, for example IL-4, due to the small quantity of protein produced. Therefore, another more sensitive technique is required, such as real-time RT-PCR which can be used to detect levels of specific cytokine mRNA in the gastric mucosa. Along with showing differences in the host response to infection, such studies may demonstrate a significant difference in the levels of IgA and IgM antibodies and cytokines induced by SS2000 as compared with SS1. It is possible that SS2000 induces a less pronounced antibody response, thus enabling enhanced colonisation levels that in turn affect the level of cytokine production and inflammation produced. Other phenotypic differences between the host strains, such as basal gastric acid secretion, as well as phenotypic and genetic differences between the H. pylori strains may explain these host and strain specific effects.

9.3.2. Genome Typing of Mouse Colonising Strains

Adaptation of H. pylori to the human host probably occurred many thousands of years ago. The organism does not readily infect other animal species and cannot survive in the environment. Thus, after transmission to a new individual it must be able to adapt effectively in order to establish persistent infection. Although macrodiversity and microdiversity have been shown among strains isolated from different individuals, the factors that enable some strains to colonise other animal species have not yet been identified. Some studies have suggested enhanced motility and in vivo passaging can increase the level of infectivity, while in vitro passaging decreases the ability of strains to infect. In the present study we have shown evidence of genomic changes which may have enhanced the ability of two H. pylori isolates to colonise mice.

Genome typing of the two mouse colonising strains, SS1 and SS2000, and their original clinical isolates revealed that, as evidenced by the enhanced colonisation ability of the mousified strains, there were significant changes in genetic makeup post-colonisation. One gene that was altered in both SS1 and SS2000, as compared with their respective clinical isolates, was the putative OMP gene HP0486. Changes in this gene therefore represent a possible reason for the enhanced colonisation ability of the mousified strains. Sequencing and mutational analysis of HP0486 will show the importance of this OMP in mouse colonisation. Identification of such a colonisation factor may allow more efficient screening for additional mouse colonising strains for use in further comparative studies.

Comparison of the genomic content of “mousified” SS1 and SS2000 revealed a number of specific differences that, with further research, may be linked to the different pathologies induced by these strains. In particular, the complete lack of the cag PAI in SS2000 and the presence of the entire island in SS1 are of interest. Despite the presence of the entire PAI in SS1, it has been found by a number of investigators that this strain is unable to induce IL-8 secretion from AGS cells in vitro (67, 89). Whether this functionality translates to the in vivo situation in infected mice is unclear. It appears from a number of studies that the lack of the cag PAI may in fact increase the ability of H. pylori strains to infect mice (269). This suggests that in mice limiting the immune response to infection is important for persistence.

In the SS1 strain two genes that were shown to change during “mousification” were the duplicate genes omp5/29. As discussed previously, the expression of these genes was found to covary with cagA transcription (Chapter 4). Thus further investigation into the sequence changes in these genes and the effect of deletion mutants may elucidate a function for these proteins in relation to the non-functional cag PAI present in SS1. Many of the differences between SS1 and SS2000 were present in genes encoding cell envelope proteins and LPS biosynthesis genes. Considering the importance of LPS in mouse colonisation ability and immune evasion, these changes warrant investigation (182).

The genome typing experiments of the strains collected after long term colonisation revealed that, as in humans (137), there appears to be microevolution of H. pylori over time during infection of mice. This was evident in that bacteria recovered after 6 months infection were somewhat separated by cluster analysis of their genetic content from those collected after 15 months infection. Although the genetic changes were not universal, several classes of genes were shown to be the most susceptible to genetic changes. The majority of the genes which appeared to be altered were of unknown function and, as in the “mousification” analysis, a significant number of LPS and OMP genes were particularly prone to genetic variability. Many of the changes detected in the analysis of these isolates only appear to change the hybridisation efficiency of the gene slightly. Thus it is important to perform duplicate arrays on these gDNA samples in order to account for artefacts of hybridisation. Also, analysis of a larger number of individual clones would be necessary to identify candidate genes for knockout studies in order to illustrate particular functions that enable a specific strain to persist for long periods in mice. In addition some ongoing adaptation of isolates over time in each individual host may be required for persistent colonisation. These adaptations may include mechanisms to cope with individual variations in physical features such as basal gastric acid secretion, endocrine hormone levels, mucus viscosity, and the level of mucosal defence response.

The results from these genome typing experiments highlight the sensitivity of microarrays for the detection of genetic changes in H. pylori using the GACK analysis program. However, sequencing analysis of individual genes of interest is needed to confirm the level of divergence these genes acquired during colonisation, and to determine whether these changes are related to phenotypic differences observed during infection. It is tempting to speculate that genetic changes in genes encoding proteins expressed on the cell surface may enable H. pylori to avoid recognition by the host immune system or to further adapt to persistent colonisation of the stomach through enhanced adhesion mechanisms. There is also a distinct possibility that the most significant genetic changes that occur in these strains may be in strain specific genes that are not present on the H. pylori microarray utilised in this study. Therefore continual updating of this array with newly discovered strain specific genes would be beneficial for this type of analysis.

9.3.3. Transcriptional Profiling of the Host Response to Infection

Using transcriptional profiling of the infected mouse stomachs from the long term infection study (Chapter 6), further factors that may contribute to the type of inflammatory response observed in these mice were identified (Chapter 8). Statistical analysis of the transcriptional responses in these animals was used to perform supervised hierarchical clustering of the data. This type of clustering enabled the identification of specific expression signatures. In the C57BL/6 mice the expression signatures appeared to represent the presence/absence of lymphoid aggregates (LA), rather than the infecting H. pylori strain or any other measure of pathology. Particular cell markers for T- and B-lymphocytes were up-regulated in samples with LA. These include chitinase 3-like 3 and the lymphocyte antigens D & E, respectively. Interestingly, the presence of LA was also linked with the level of overall colonisation in these animals; colonisation was significantly lower in animals with LA. It is unclear whether the inflammatory response or the colonisation level is driving this relationship. However, the fact that somatostatin was significantly repressed in animals with LA suggests that the level of basal gastric acid may be higher in these animals causing reduced colonisation. It is possible that the reason for this change in somatostatin transcription is the loss of D cells in animals with LA, but this needs to be specifically investigated using a technique such as immunohistochemistry.
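The way in which hierarchical clustering can group samples by expression signature may be sketched as follows. This minimal Python example applies average-linkage agglomerative clustering with a Pearson correlation distance to a handful of samples. It is illustrative only: the distance metric, the linkage rule, the sample labels and the log ratios are all assumptions rather than the settings or data of Chapter 8, and the prior statistical selection of genes that makes the clustering "supervised" is taken as already done.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering of samples.

    profiles maps a sample label to its list of log ratios (one per gene).
    Returns the merge history as (members_a, members_b, distance) tuples,
    in the order in which the clusters were joined.
    """
    clusters = [[name] for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(pearson_distance(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical log ratios for four genes in four samples.
example_profiles = {
    "LA_1": [2.0, 1.5, -1.0, 0.2],
    "LA_2": [1.8, 1.7, -1.2, 0.1],
    "noLA_1": [-0.5, 0.1, 1.0, 0.9],
    "noLA_2": [-0.4, 0.0, 1.2, 1.0],
}
merges = average_linkage(example_profiles)
```

With these hypothetical values the two samples labelled as having LA are joined first, the two without LA second, and the two groups are joined only in the final step, which is the pattern that would be read as an LA-associated expression signature.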

In contrast with the C57BL/6 analysis, the BALB/c samples showed a significant expression signature that distinguished the samples by the infecting strain, rather than by the level of pathology produced. In these animals it was apparent that the SS2000 strain induced a stronger transcriptional response than SS1. This may account for the enhanced pathology induced by the former strain. The majority of differences occurred in down-regulated genes, perhaps suggesting some contribution of atrophy to the strain differences. Further specific quantification of particular cell types in these animals’ stomachs would be necessary to confirm this possibility. This is especially true as loss of parietal and chief cells is thought to be minimal in BALB/c mice. One interesting gene that was significantly more repressed in the SS2000 mice was the gene encoding ferritin. The repression of ferritin expression may suggest that SS2000 has an enhanced ability to inhibit the iron sequestration properties of the host. This suggests a mechanism for the enhanced colonisation level of this strain, and also a mechanism by which particular strains may be linked with the induction of iron deficiency anaemia in infected individuals. Measurement of serum iron and ferritin levels in mice infected with these two strains may confirm this notion.

The direct comparison of the array results from the C57BL/6 and BALB/c mice is preliminary, because low numbers of controls and SS1-infected samples were analysed from the C57BL/6 set of animals and because two different array types were used for these studies. However, specific differences have been highlighted by this analysis. In particular, the different expression of the gastric endocrine hormones gastrin and somatostatin indicates a possible difference in the gastric acid secretion response of these two mouse strains to infection. Also, enhanced expression of cell remodelling-associated genes such as junction plakoglobin in the C57BL/6 mice may contribute to the understanding of the different type of inflammation induced in these animals. A more direct analysis of these samples may provide more details regarding the host-dependent effects on the inflammatory response to H. pylori infection.

Assigning these gene expression signatures to individual cell types in the stomach is important for understanding the overall cellular response to H. pylori. The recent study by Mueller et al. used LCM to begin this process (224). These investigators were able to assign expression patterns of a number of genes to either the lymphocytic fractions obtained from lymphoid aggregates or the mucosal fractions from Helicobacter-infected BALB/c mice. More precise LCM procedures will enable the investigation of expression in more individual cell types such as the parietal cells, chief cells, and endocrine cells (G, D, and ECL). This cellular information can then be compared to the human situation, in which the expression pattern in whole biopsy samples will be analysed. Because of the small size of these human biopsies, division of the transcriptional response into cell types is impractical; thus comparison with animal models is essential.

Some preliminary studies by Parsonnet and colleagues have shown that host mRNA obtained from tiny gastric biopsies can be amplified for microarray analysis (K. Guillemin, personal communication). Information resulting from these in-depth global analyses will help broaden our understanding of the specific cellular events leading to disease progression in humans. This will be particularly useful for identifying gene expression signatures present in the stomachs of individuals at risk of developing gastric cancer and/or those individuals who will not respond to therapy. In addition, it will soon be possible to detect the transcriptional responses of both the bacterial and mammalian cells concurrently in human biopsy samples. This will provide the means to investigate individual host and strain variations.

Some factors need to be taken into consideration before attempting to interpret transcriptional data from human biopsy samples. The finding that, in the C57BL/6 mice, the expression signatures most effectively indicated the presence or absence of LA in the sample suggests that these transcription studies are particularly prone to variations in the type and proportions of cell types present in the sample. In future investigations of transcriptional responses in biopsies it will therefore be imperative to carefully characterise the histology of each biopsy. Only expression signatures from samples with similar proportions of different cell types can be compared in order to understand individual host variations and responses to different types of H. pylori strains. For example, it was possible in the present study to discern a different response to SS1 and SS2000 in the BALB/c animals because these samples all had a similar proportion of cell types, due to the presence of LA in all animals. The small size of human gastric biopsies will make this a particularly difficult task, and this further highlights the importance of LCM studies of different cell types in animal models to gain a better understanding of the cellular responses of the host to H. pylori infection. In addition, the comparative mouse model may provide an excellent means to investigate the temporal changes occurring in both host and pathogen during the course of infection and as a function of modifications in environmental or host factors.
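The requirement that only samples of similar cellular composition be compared could, for instance, be applied as a simple screening step before any expression analysis. The following is a minimal sketch under stated assumptions: the biopsy names, the cell-type categories, the proportions and the tolerance are all hypothetical, and the thesis does not prescribe a quantitative scoring scheme of this kind.

```python
from itertools import combinations


def comparable_pairs(histology, tolerance):
    """Return sample pairs whose cell-type proportions all differ by <= tolerance."""
    pairs = []
    for a, b in combinations(sorted(histology), 2):
        # A cell type missing from one sample is treated as a proportion of 0.
        cell_types = set(histology[a]) | set(histology[b])
        largest_gap = max(
            abs(histology[a].get(c, 0.0) - histology[b].get(c, 0.0))
            for c in cell_types
        )
        if largest_gap <= tolerance:
            pairs.append((a, b))
    return pairs


# Hypothetical histological scores: fraction of each tissue type per biopsy.
histology = {
    "biopsy_1": {"epithelial": 0.70, "lymphoid": 0.30},
    "biopsy_2": {"epithelial": 0.65, "lymphoid": 0.35},
    "biopsy_3": {"epithelial": 0.95, "lymphoid": 0.05},
}

pairs = comparable_pairs(histology, tolerance=0.10)
print(pairs)
```

In this toy example only the first two biopsies would be compared directly; the third, with far less lymphoid tissue, would be set aside or analysed with other samples of similar histology.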

9.4. Concluding Remarks

H. pylori is a unique and highly evolved pathogen. It is unique in that it resides in the very constant, specialised niche of the stomach without significant competition from other pathogens. Yet work in our laboratory has shown how susceptible the organism is to changes in the local environment, such as gastric pH, and these changes have important impacts on disease manifestations (373). It is critical that we better understand the effects of changes in the environment on the organism both in vitro and in vivo. Perhaps even more important is an understanding of the specific cellular events in the host that lead to PUD and gastric cancer, and how these relate to specific virulence factors in H. pylori. Genomic data provided by the sequencing projects and the advent of microarray technology have provided a powerful methodology with which to increase understanding of these aspects. This thesis for the first time takes us along that pathway. Appropriate methods have been developed; tips and traps have been identified; and tantalising areas for future research have been uncovered. A reliable comparative mouse model was also developed that will be useful for further understanding the contributions of differing strains to pathological processes. Most importantly, however, this thesis shows the power of global transcriptional analysis using microarrays and confirms the approach to be feasible and fruitful. Hopefully, these first findings will be built on by others and so become the foundation of a new area in Helicobacter research. In the future, the ultimate goal is the ability to investigate the contributions of both bacterial and host transcriptional responses in animal tissues and human biopsy samples from many different individuals, which will further advance the search for appropriate drug and vaccine candidates.

APPENDIX

All Supplementary Material for this thesis is stored on the enclosed CD on the back cover. The entire text and figures are also included. The list of files and the programs on which they were created are as follows:

Chapter 4
Table S4.1: Normalised data set for the first and second TC. File name: Supplementary Table S4.1_normalised data.txt (Excel-XP)
Table S4.2: Full list of the genes from the Induced Set indicated in Fig. 4.1B that vary by at least two fold over time in both TCs. File name: Supplementary Table S4.2_Induced Set.xls (Excel-XP)
Table S4.3: Full list of the genes from the Repressed Set indicated in Fig. 4.1B that vary by at least two fold over time in both TCs. File name: Supplementary Table S4.3_Repressed Set.xls (Excel-XP)

Chapter 5
Movie S5.1: Movie showing the movement of H. pylori cells infecting MDCK cells maintained for two weeks in co-culture. File name: Supplementary Movie S5.1_2wk old culture (QuickTime Movie)
Movie S5.2: Movie showing the movement of H. pylori cells infecting MDCK cells maintained for 8 weeks in co-culture. File name: Supplementary Movie S5.1_8wk old culture (QuickTime Movie)
Table S5.1: Results from the SAM1 analysis comparing expression in co-culture to static broth culture. File name: Supplementary Tables S5.1 & S5.2_SAM1 and SAM2.doc (Word-XP)
Table S5.2: Results from the SAM2 analysis comparing expression in co-culture to growth in media alone and in the presence of fixed MDCK cells. File name: Supplementary Tables S5.1 & S5.2_SAM1 and SAM2.doc (Word-XP)
Table S5.3A: Results from the SAM3 analysis showing genes which were significantly induced in log phase in the co-cultures SS1-A and SS1-B. File name: Supplementary Tables S5.3A_induced in SS1 coculture.xls (Excel-XP)
Table S5.3B: Results from the SAM3 analysis showing genes which were significantly repressed in log phase in the co-cultures SS1-A and SS1-B. File name: Supplementary Tables S5.3B_repressed in SS1 coculture.xls (Excel-XP)
Table S5.4A: Normalised averaged data for the G27 time courses used in this analysis. TC6 is the live co-culture, TC7 is the culture in media alone, TC8 is culture in static BB, and TC9 is culture with fixed MDCK cells. File name: Supplementary Tables S5.4A_SS1 data.txt (Excel-XP)
Table S5.4B: Normalised averaged data for the SS1-A (TC4) and SS1-B (TC5) time courses used in this analysis. File name: Supplementary Tables S5.4B_G27 data.txt (Excel-XP)

Chapter 7
Table S7.1: Normalised averaged data for the pre- and post-mouse strains of SS1 and SS2000. File name: Supplementary Tables S7.1_pre and post mouse.txt (Excel-XP)
Table S7.2: Normalised averaged data for Input and Output strains of SS1 and SS2000 from long term colonisation experiment. File name: Supplementary Tables S7.2_Input&Output strains.txt (Excel-XP)
Table S7.3.1: Normalised averaged data for the SS1_AL and SS1_SF strains. File name: Supplementary Tables S7.3.1_SS1_AL and SS1_SF.txt (Excel-XP)
Table S7.3.2: Genes that vary between the SS1 isolate from the A. Lee laboratory (SS1-AL) and the isolate obtained by Stanford University (SS1-SF) as determined by GACK analysis, graded output. File name: Supplementary Tables S7.3.2.doc (Word-XP)
Table S7.4: Genes that differ between SS1 and SS2000 as determined by GACK analysis, graded output. File name: Supplementary Tables S7.4_Diff between SS1 and SS2000.xls (Excel-XP)
Table S7.5: Genes that vary in the SS1 Input and Output strains (Non-core genes) as determined by GACK analysis, graded output. File name: Supplementary Tables S7.5_SS1 non-core genes.xls (Excel-XP)
Table S7.6: Genes that vary in the SS2000 Input and Output strains (Non-core genes) as determined by GACK analysis, graded output. File name: Supplementary Tables S7.6_SS2000 non-core genes.xls (Excel-XP)

Chapter 8
Table S8.1: Normalised data for the microarrays (SMK/L arrays) used to investigate the C57BL/6 samples. File name: Supplementary Tables S8.1_C57BL data.txt (Excel-XP)
Table S8.2: Normalised data for the microarrays (MMM arrays) used to investigate the BALB/c samples. File name: Supplementary Tables S8.2_BALB data.txt (Excel-XP)
Table S8.3: Normalised data for the overlapping clones in the microarrays (SMK/L & MMM arrays) used to investigate both the C57BL/6 and BALB/c samples. File name: Supplementary Tables S8.3_C57 & BALB data.txt (Excel-XP)
Table S8.4: Results for the SAM analysis investigating the expression pattern of the uninfected versus infected C57BL/6 samples. File name: Supplementary Tables S8.4_C57BL_controlvinfect.xls (Excel-XP)
Table S8.5: Results for the SAM analysis investigating the expression pattern of the C57BL/6 samples with and without lymphoid aggregates. File name: Supplementary Tables S8.5_C57BL_LA.xls (Excel-XP)
Table S8.6: Results for the t-test analysis investigating the expression pattern of the SS1 versus SS2000 infected C57BL/6 samples with high monocytic infiltration. File name: Supplementary Tables S8.6_C57BL_SS1vSS2000.xls (Excel-XP)
Table S8.7: Results for the SAM analysis investigating the expression pattern of the uninfected versus infected BALB/c samples. File name: Supplementary Tables S8.7_BALB_controlvinfect.xls (Excel-XP)
Table S8.8: Results for the SAM analysis investigating the expression pattern of the SS1 versus SS2000 infected BALB/c samples. File name: Supplementary Tables S8.8_BALB_SS1vSS2000.xls (Excel-XP)
Table S8.9A: Genes in the cluster in which expression was induced in both C57BL/6 and BALB/c infected animals compared to the uninfected controls (Fig. 8.10, turquoise). File name: Supplementary Tables S8.9A_induced both.xls (Excel-XP)
Table S8.9B: Genes in the cluster in which expression was repressed in both C57BL/6 and BALB/c infected animals compared to the uninfected controls (Fig. 8.10, turquoise). File name: Supplementary Tables S8.9B_repressed both.xls (Excel-XP)
Table S8.9C: Genes in the cluster in which expression was induced in infected BALB/c animals compared to infected C57BL/6 animals (Fig. 8.10, purple). File name: Supplementary Tables S8.9C_induced BALB v C57BL.xls (Excel-XP)
Table S8.9D: Genes in the three clusters in which expression was induced in infected C57BL/6 animals compared to infected BALB/c animals (Fig. 8.10, orange). File name: Supplementary Tables S8.9D_induced C57BL v BALB.xls (Excel-XP)

Thesis Text
File name: PhD Thesis_Lucinda Thompson_text.pdf (Word-XP)

Thesis Figures
Black and White Figures. File name: PhD Thesis_Lucinda Thompon_B&W figures.pdf (Powerpoint-XP)
Colour Figures. File name: PhD Thesis_Lucinda Thompon_Colour figures.pdf (Adobe Acrobat 5)

BIBLIOGRAPHY

1. Akada JK, Shirai M, Takeuchi H, Tsuda M, Nakazawa T. 2000. Identification of the urease operon in Helicobacter pylori and its control by mRNA decay in response to pH. Mol Microbiol. 36:1071-84.
2. Akopyanz N, Bukanov N, Westblom T, Kresovich S, Berg D. 1992. DNA diversity among clinical isolates of Helicobacter pylori detected by PCR-based RAPD fingerprinting. Nucl. Acids. Res. 20:5137-5142.
3. Akopyanz N, Bukanov NO, Westblom TU, Berg DE. 1992. PCR-based RFLP analysis of DNA sequence diversity in the gastric pathogen Helicobacter pylori. Nucleic Acids Research. 20:6221-5.
4. Allan E, Clayton CL, McLaren A, Wallace DM, Wren BW. 2001. Characterization of the low-pH responses of Helicobacter pylori using genomic DNA arrays. Microbiology. 147:2285-2292.
5. Alm RA, Bina J, Andrews BM, Doig P, Hancock RE, Trust TJ. 2000. Comparative genomics of Helicobacter pylori: analysis of the outer membrane protein families. Infect Immun. 68:4155-68.
6. Alm RA, Ling LS, Moir DT, King BL, Brown ED, Doig PC, Smith DR, Noonan B, Guild BC, deJonge BL, Carmel G, Tummino PJ, Caruso A, Uria-Nickelsen M, Mills DM, Ives C, Gibson R, Merberg D, Mills SD, Jiang Q, Taylor DE, Vovis GF, Trust TJ. 1999. Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature. 397:176-80.
7. al-Moagel MA, Evans DG, Abdulghani ME, Adam E, Evans DJ, Jr., Malaty HM, Graham DY. 1990. Prevalence of Helicobacter (formerly Campylobacter) pylori infection in Saudia Arabia, and comparison of those with and without upper gastrointestinal symptoms. Am J Gastroenterol. 85:944-8.
8. Alter O, Brown PO, Botstein D. 2000. Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA. 97:10101-6.

9. Altincicek B, Moll J, Campos N, Foerster G, Beck E, Hoeffler J, Grosdemange-Billiard C, Rodriguez-Concepcion M, Rohmer M, Boronat A, Eberl M, Jomaa H. 2001. Cutting Edge: Human γδ T cells are activated by intermediates of the 2-C-methyl-D-erythritol 4-phosphate pathway of isoprenoid biosynthesis. The Journal of Immunology. 166:3655-3658.
10. Andersen LP, Wadstrom T. 2001. Basic Bacteriology and Culture. In: Mobley HL, Mendz GL, Hazell SL (eds) Helicobacter pylori physiology and genetics. ASM Press, Washington, DC:27-38
11. Ang S, Lee CZ, Peck K, Sindici M, Matrubutham U, Gleeson MA, Wang JT. 2001. Acid-induced gene expression in Helicobacter pylori: study in genomic scale by microarray. Infect Immun. 69:1679-86.
12. Annibale B, Capurso G, Martino G, Grossi C, Delle Fave G. 2000. Iron deficiency anaemia and Helicobacter pylori infection. Int J Antimicrob Agents. 16:515-9.
13. Anonymous, Liver flukes and Helicobacter pylori. IARC working group on the evaluation of carcinogenic risks to humans, Lyon, 1994
14. Arfin SM, Long AD, Ito ET, Tolleri L, Riehle MM, Paegle ES, Hatfield GW. 2000. Global gene expression profiling in Escherichia coli K12. The effects of integration host factor. J Biol Chem. 275:29672-84.
15. Asahi M, Azuma T, Ito S, Ito Y, Suto H, Nagai Y, Tsubokawa M, Tohyama Y, Maeda S, Omata M, Suzuki T, Sasakawa C. 2000. Helicobacter pylori CagA protein can be tyrosine phosphorylated in gastric epithelial cells. J Exp Med. 191:593-602.
16. Ashktorab H, Neapolitano M, Bomma C, Allen C, Ahmed A, Dubois A, Naab T, Smoot DT. 2002. In vivo and in vitro activation of caspase-8 and -3 associated with Helicobacter pylori infection. Microbes Infect. 4:713-22.
17. Atherton JC. 2002. The chips are down for Helicobacter pylori. Gut. 50:293-4.
18. Atherton JC, Cover TL, Papini E, Telford JL. 2001. Vacuolating Cytotoxin. In: Mobley HLT, Mendz GL, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, D. C.:97-110

19. Atherton JC, Tham KT, Peek RM, Jr., Cover TL, Blaser MJ. 1996. Density of Helicobacter pylori infection in vivo as assessed by quantitative culture and histology. J Infect Dis. 174:552-6.
20. Bach S, Makristathis A, Rotter M, Hirschl AM. 2002. Gene Expression Profiling in AGS Cells Stimulated with Helicobacter pylori Isogenic Strains (cagA Positive or cagA Negative). Infect. Immun. 70:988-992.
21. Backert S, Ziska E, Brinkmann V, Zimny-Arndt U, Fauconnier A, Jungblut PR, Naumann M, Meyer TF. 2000. Translocation of the Helicobacter pylori CagA protein in gastric epithelial cells by a type IV secretion apparatus. Cell Microbiol. 2:155-64.
22. Barabino A. 2002. Helicobacter pylori-related iron deficiency anemia: A review. Helicobacter. 7:71-75.
23. Barrett MT, Glogovac J, Porter P, Reid BJ, Rabinovitch PS. 1999. High yields of RNA and DNA suitable for array analysis from cell sorter purified epithelial cell and tissue populations. Nat Genet. 23:32-33.
24. Bayerdorffer E, Lehn N, Hatz R, Mannes GA, Oertel H, Sauerbruch T, Stolte M. 1992. Difference in expression of Helicobacter pylori gastritis in antrum and body. Gastroenterology. 102:1575-82.
25. Bayerdorffer E, Neubauer A, Rudolph B, Thiede C, Lehn N, Eidt S, Stolte M. 1995. Regression of primary gastric lymphoma of mucosa-associated lymphoid tissue type after cure of Helicobacter pylori infection. MALT Lymphoma Study Group. Lancet. 345:1591-4.
26. Beier D, Frank R. 2000. Molecular characterization of two-component systems of Helicobacter pylori. Journal of Bacteriology. 182:2068-76.
27. Bereswill S, Neuner O, Strobel S, Kist M. 2000. Identification and molecular analysis of superoxide dismutase isoforms in Helicobacter pylori. FEMS Microbiol Lett. 183:241-5.
28. Bereswill S, Waidner U, Odenbreit S, Lichte F, Fassbinder F, Bode G, Kist M. 1998. Structural, functional and mutational analysis of the pfr gene encoding a ferritin from Helicobacter pylori. Microbiology. 144:2505-16.

29. Berg DJ, Lynch NA, Lynch RG, Lauricella DM. 1998. Rapid development of severe hyperplastic gastritis with gastric epithelial dedifferentiation in Helicobacter felis-infected IL-10(-/-) mice. Am J Pathol. 152:1377-86.
30. Bergthorsson U, Ochman H. 1995. Heterogeneity of genome sizes among natural isolates of Escherichia coli. J Bacteriol. 177:5784-9.
31. Bijlsma JJE, Waidner B, Vliet AHMv, Hughes NJ, Hag S, Bereswill S, Kelly DJ, Vandenbroucke-Grauls CMJE, Kist M, Kusters JG. 2002. The Helicobacter pylori Homologue of the Ferric Uptake Regulator Is Involved in Acid Resistance. Infect. Immun. 70:606-611.
32. Bjorkholm B, Lundin A, Sillen A, Guillemin K, Salama N, Rubio C, Gordon JI, Falk P, Engstrand L. 2001. Comparison of genetic divergence and fitness between two subclones of Helicobacter pylori. Infect Immun. 69:7832-8.
33. Blaser MJ, Perez-Perez GI, Kleanthous H, Cover TL, Peek RM, Jr., Chyou PH, Stemmermann GN, Nomura A. 1995. Infection with Helicobacter pylori strains possessing cagA is associated with an increased risk of developing adenocarcinoma of the stomach. Cancer Research. 55:2111-2115.
34. Borén T, Falk P, Roth KA, Larson G, Normark S. 1993. Attachment of Helicobacter pylori to human gastric epithelium mediated by blood group antigens. Science. 262:
35. Borody TJ, George LL, Brandl S, Andrews P, Ostapowicz N, Hyland L, Devine M. 1991. Helicobacter pylori-negative duodenal ulcer. Am J Gastroenterol. 86:1154-7.
36. Brumbaugh JA, Middendorf LR, Grone DL, Ruth JL. 1988. Continuous, on-line DNA sequencing using oligodeoxynucleotide primers with multiple fluorophores. Proc Natl Acad Sci USA. 85:5610-4.
37. Bury-Mone S, Skouloubris S, Labigne A, De Reuse H. 2001. The Helicobacter pylori UreI protein: role in adaptation to acidity and identification of residues essential for its activity and for acid activation. Mol Microbiol. 42:1021-34.

38. Byrd JC, Yunker CK, Xu QS, Sternberg LR, Bresalier RS. 2000. Inhibition of gastric mucin synthesis by Helicobacter pylori. Gastroenterology. 118:1072-9.
39. Calam J. 1998. Helicobacter pylori and somatostatin cells. European Journal of Gastroenterology & Hepatology. 10:281-3.
40. Cantorna MT, Balish E. 1990. Inability of human clinical strains of Helicobacter pylori to colonize the alimentary tract of germfree rodents. Can. J. Microbiol. 36:237-241.
41. Cao P, McClain MS, Forsyth MH, Cover TL. 1998. Extracellular release of antigenic proteins by Helicobacter pylori. Infect Immun. 66:2984-6.
42. Carpousis AJ, Vanzo NF, Raynal LC. 1999. mRNA degradation. A tale of poly(A) and multiprotein machines. Trends in Genetics. 15:24-8.
43. Caselli M, Figura N, Trevisani L, Pazzi P, Guglielmetti P, Bovolenta MR, Stabellini G. 1989. Patterns of physical modes of contact between Campylobacter pylori and gastric epithelium: implications about the bacterial pathogenicity. Am J Gastroenterol. 84:511-513.
44. Cassels R, Oliva B, Knowles D. 1995. Occurrence of the regulatory nucleotides ppGpp and pppGpp following induction of the stringent response in Staphylococci. Journal of Bacteriology. 177:5161-5165.
45. Censini S, Lange C, Xiang Z, Crabtree JE, Ghiara P, Borodovsky M, Rappuoli R, Covacci A. 1996. cag, a pathogenicity island of Helicobacter pylori, encodes type I-specific and disease-associated virulence factors. Proc Natl Acad Sci USA. 93:14648-53.
46. Chatterji D, Ojha AK. 2001. Revisiting the stringent response, ppGpp and starvation signaling. Curr Opin Microbiol. 4:160-5.
47. Chen G, Sordillo EM, Ramey WG, Reidy J, Holt PR, Krajewski S, Reed JC, Blaser MJ, Moss SF. 1997. Apoptosis in gastric epithelial cells is induced by Helicobacter pylori and accompanied by increased expression of BAK. Biochemical & Biophysical Research Communications. 239:626-32.
48. Cheng H, Bjerknes M, Chen H. 1996. CRP-ductin: a gene expressed in intestinal crypts and in pancreatic and hepatic ducts. Anat Rec. 244:327-43.

49. Chiou CC, Chan CC, Sheu DL, Chen KT, Li YS, Chan EC. 2001. Helicobacter pylori infection induced alteration of gene expression in human gastric cells. Gut. 48:598-604.
50. Choe YH, Kwon YS, Jung MK, Kang SK, Hwang TS, Hong YC. 2001. Helicobacter pylori-associated iron-deficiency anemia in adolescent female athletes. J Pediatr. 139:100-4.
51. Chomczynski P, Sacchi N. 1987. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem. 162:156-9.
52. Chowdhury A, Berg DE, Jeong JY, Mukhopadhyay AK, Nair GB. 2002. Metronidazole resistance in Helicobacter pylori: magnitude, mechanism and implications for India. Indian Journal of Gastroenterology. 21:23-8.
53. Chu S, DeRisi J, Eisen M, Mulholland J, Botstein D, Brown PO, Herskowitz I. 1998. The Transcriptional Program of Sporulation in Budding Yeast. Science. 282:699-705.
54. Clarkson KS, West KP. 1993. Gastric cancer and Helicobacter pylori infection. J Clin Pathol. 46:997-9.
55. Clyne M, Ocroinin T, Suerbaum S, Josenhans C, Drumm B. 2000. Adherence of isogenic flagellum-negative mutants of Helicobacter pylori and Helicobacter mustelae to human and ferret gastric epithelial cells. Infect Immun. 68:4335-9.
56. Correa P. 1983. The gastric precancerous process. Cancer Surv. 2:437-450.
57. Correa P. 1988. A human model of gastric carcinogenesis. Cancer Research. 48:3554-3560.
58. Correa P. 1991. Is gastric carcinoma an infectious disease? N Engl J Med. 325:1170-1.
59. Corthesy-Theulaz I. 2000. Vaccination against Helicobacter pylori. Recent Results Cancer Res. 156:55-9.
60. Cottet S, Corthesy-Theulaz I, Spertini F, Corthesy B. 2002. Microaerophilic Conditions Permit to Mimic in vitro Events Occurring during in vivo Helicobacter pylori Infection and to Identify Rho/Ras-associated Proteins in Cellular Signaling. J. Biol. Chem. 277:33978-33986.
61. Covacci A, Falkow S, Berg DE, Rappuoli R. 1997. Did the inheritance of a pathogenicity island modify the virulence of Helicobacter pylori? Trends Microbiol. 5:205-208.
62. Covacci A, Kennedy GC, Cormack B, Rappuoli R, Falkow S. 1997. From Microbial Genomics to Meta-Genomics. Drug Development Research. 41:180-192.
63. Covacci A, Telford JL, Del Giudice G, Parsonnet J, Rappuoli R. 1999. Helicobacter pylori virulence and genetic geography. Science. 284:1328-33.
64. Cover TL. 1996. The vacuolating cytotoxin of Helicobacter pylori. Mol Microbiol. 20:241-6.
65. Cox JM, Clayton CL, Tomita T, Wallace DM, Robinson PA, Crabtree JE. 2001. cDNA array analysis of cag pathogenicity island-associated Helicobacter pylori epithelial cell response genes. Infection & Immunity. 69:6970-80.
66. Cox JM, Clayton CL, Tomita T, Wallace DM, Robinson PA, Crabtree JE. 2001. cDNA array analysis of cag pathogenicity island-associated Helicobacter pylori epithelial cell response genes. Infect and Immun. 69:6970-6980.
67. Crabtree JE, Ferrero RL, Kusters JG. 2002. The mouse colonizing Helicobacter pylori strain SS1 may lack a functional cag pathogenicity island. Helicobacter. 7:139-140.
68. Cummings CA, Relman DA. 2000. Using DNA microarrays to study host-microbe interactions. Emerg Infect Dis. 6:513-25.
69. Czinn S, Cai A, Nedrud JG. 1993. Protection of germ-free mice from infection by Helicobacter felis after active oral or passive IgA immunization. Vaccine. 11:637-42.
70. Danon SJ, O'Rourke J, Moss ND, Lee A. 1995. The importance of local acid production in the distribution of Helicobacter felis in the mouse stomach. Gastroenterology. 108:1386-1395.

71. De Groote D, Ducatelle R, Van Doorn LJ, Tilmant K, Quint WGV, Verschuurn A, Haesebrouck F. 2000. Detection of "Candidatus Helicobacter suis" in gastric samples of pig by PCR: comparison with other invasive diagnostic techniques. J. Clin. Microbiol. 38:1131-1135.
72. De Risi J. 2001. Amino-allyl dye coupling protocol
73. de Saizieu A, Gardes C, Flint N, Wagner C, Kamber M, Mitchell TJ, Keck W, Amrein KE, Lange R. 2000. Microarray-based identification of a novel Streptococcus pneumoniae regulon controlled by an autoinduced peptide. J Bacteriol. 182:4696-703.
74. de Vries N, Kuipers EJ, Kramer NE, van Vliet AHM, Bijlsma JJE, Kist M, Bereswill S, Vandenbroucke-Grauls CMJE, Kusters JG. 2001. Identification of environmental stress-regulated genes in Helicobacter pylori by a lacZ reporter gene fusion system. Helicobacter. 6:300-309.
75. de Vries N, van Vliet AH, Kusters JG. 2001. Gene regulation. In: Mobley HL, Mendz GL, Hazell SL (eds) Helicobacter pylori physiology and genetics. ASM Press, Washington, DC:321-334
76. Delany I, Spohn G, Rappuoli R, Scarlato V. 2001. The Fur repressor controls transcription of iron-activated and -repressed genes in Helicobacter pylori. Mol Microbiol. 42:1297-1309.
77. Dempsey JA, Litaker W, Madhure A, Snodgrass TL, Cannon JG. 1991. Physical map of the chromosome of Neisseria gonorrhoeae FA1090 with locations of genetic markers, including opa and pil genes. J Bacteriol. 173:5476-86.
78. Detweiler CS, Cunanan DB, Falkow S. 2001. Host microarray analysis reveals a role for the Salmonella response regulator phoP in human macrophage cell death. Proc Natl Acad Sci USA. 98:5850-5.
79. Diehn M, Sherlock G, Binkley G, Jin H, Matese JC, Hernandez-Boussard T, Rees CA, Cherry JM, Botstein D, Brown PO, Alizadeh AA. 2003. SOURCE: a unified genomic resource of functional annotations, ontologies, and gene expression data. Nucleic Acids Res. 31:219-23.

80. Dixon M. 2001. Pathology of gastritis and peptic ulceration. In: Mobley HLT, Mendz G, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, DC:459-469
81. Dooley CP, Cohen H, Fitzgibbons PL, Bauer M, Appleman MD, Perez-Perez GI, Blaser MJ. 1989. Prevalence of Helicobacter pylori infection and histologic gastritis in asymptomatic persons. N Engl J Med. 321:1562-6.
82. Du MQ, Isaacson PG. 2002. Gastric MALT lymphoma: from aetiology to treatment. Lancet Oncology. 3:97-104.
83. Dubois A, Berg D, Incecik E, Fiala N, Heman-Ackah L, Perez-Perez G, Blaser M. 1996. Transient and persistent experimental infection of nonhuman primates with Helicobacter pylori: implications for human disease. Infect. Immun. 64:2885-2891.
84. Dubois A, Berg DE, Incecik E, Fiala N, Heman-Ackah L, del Valle J, Yang M, Wirth HP, Perez-Perez GI, Blaser MJ. 1999. Host Specificity of Helicobacter pylori Strains and Host Responses in Experimentally Challenged Nonhuman Primates. Gastroenterol. 116:90-96.
85. Duggan DJ, Bittner M, Chen Y, Meltzer P, Trent JM. 1999. Expression profiling using cDNA microarrays. Nat Genet. 21:10-4.
86. Dundon WG, Polenghi A, Del Giudice G, Rappuoli R, Montecucco C. 2001. Neutrophil-activating protein (HP-NAP) versus ferritin (Pfr): comparison of synthesis in Helicobacter pylori. FEMS Microbiol Lett. 199:143-9.
87. Eaton KA, Brooks CL, Morgan DR, Krakowka S. 1991. Essential role of urease in pathogenesis of gastritis induced by Helicobacter pylori in gnotobiotic piglets. Infect and Immun. 59:2470-2475.
88. Eaton KA, Gilbert JV, Joyce EA, Wanken AE, Thevenot T, Baker P, Plaut A, Wright A. 2002. In vivo Complementation of ureB Restores the Ability of Helicobacter pylori To Colonize. Infect. Immun. 70:771-611.
89. Eaton KA, Kersulyte D, Mefford M, Danon SJ, Krakowka S, Berg DE. 2001. Role of Helicobacter pylori cag region genes in colonization and gastritis in two animal models. Infect Immun. 69:2902-8.

90. Eaton KA, Krakowka S. 1994. Effect of gastric pH on urease-dependent colonization of gnotobiotic piglets by Helicobacter pylori. Infect and Immun. 62:3604-3607.
91. Eaton KA, Morgan DR, Krakowka S. 1992. Motility as a factor in the colonisation of gnotobiotic piglets by Helicobacter pylori. J Med Microbiol. 37:123-7.
92. Eaton KA, Radin MJ, Krakowka S. 1995. An animal model of gastric ulcer due to bacterial gastritis in mice. Vet Pathol. 32:489-97.
93. Eisen MB, Brown PO. 1999. DNA arrays for analysis of gene expression. Methods Enzymol. 303:179-205.
94. Eisen MB, Spellman PT, Brown PO, Botstein D. 1998. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 95:14863-8.
95. el-Shoura SM. 1995. Helicobacter pylori: I. Ultrastructural sequences of adherence, attachment, and penetration into the gastric mucosa. Ultrastruct Pathol. 19:323-33.
96. Enno A, O'Rourke J, Braye S, Howlett R, Lee A. 1998. Antigen-dependent progression of mucosa-associated lymphoid tissue (MALT)-type lymphoma in the stomach. Effects of antimicrobial therapy on gastric MALT lymphoma in mice. Am J Pathol. 152:1625-1632.
97. Enno A, O'Rourke J, Howlett CR, Jack A, Dixon MF, Lee A. 1995. MALToma-like lesions in the murine gastric mucosa after long-term infection with Helicobacter felis - a mouse model of Helicobacter pylori-induced gastric lymphoma. Am J Pathol. 147:217-222.
98. Evans DJ, Jr., Evans DG, Lampert HC, Nakano H. 1995. Identification of four new prokaryotic bacterioferritins, from Helicobacter pylori, Anabaena variabilis, Bacillus subtilis and Treponema pallidum, by analysis of gene sequences. Gene. 153:123-7.
99. Falkow S. 1997. Invasion and intracellular sorting of bacteria - searching for bacterial genes expressed during host/pathogen interactions. Journal of Clinical Investigation.

100. Falush D, Kraft C, Taylor NS, Correa P, Fox JG, Achtman M, Suerbaum S. 2001. Recombination and mutation during long-term gastric colonization by Helicobacter pylori: Estimates of clock rates, recombination size, and minimal age. PNAS. 98:15056-15061.
101. Fan X, Crowe SE, Behar S, Gunasena H, Ye G, Haeberle H, Van Houten N, Gourley WK, Ernst PB, Reyes VE. 1998. The Effect of Class II Major Histocompatibility Complex Expression on Adherence of Helicobacter pylori and Induction of Apoptosis in Gastric Epithelial Cells: A Mechanism for T Helper Cell Type 1-mediated Damage. J. Exp. Med. 187:1659-1669.
102. Ferrero RL, Ave P, Radcliff FJ, Labigne A, Huerre MR. 2000. Outbred mice with long-term Helicobacter felis infection develop both gastric lymphoid tissue and glandular hyperplastic lesions. J Pathol. 191:333-40.
103. Ferrero RL, Thiberge J-M, Huerre M, Labigne A. 1998. Immune Responses of Specific-Pathogen-Free Mice to Chronic Helicobacter pylori (Strain SS1) Infection. Infect. Immun. 66:1349-1355.
104. Fiocca R, Luinetti O, Villani L, Chiaravalli AM, Capella C, Solcia E. 1994. Epithelial cytotoxicity, immune responses, and inflammatory components of Helicobacter pylori gastritis. Scand J Gastroenterol Suppl. 205:11-21.
105. Fischer W, Puls J, Buhrdorf R, Gebert B, Odenbreit S, Haas R. 2001. Systematic mutagenesis of the Helicobacter pylori cag pathogenicity island: essential genes for CagA translocation in host cells and induction of interleukin-8. Mol Microbiol. 42:1337-1348.
106. Forman D, Newell DG, Fullerton F, Yarnell JW, Stacey AR, Wald N, Sitas F. 1991. Association between infection with Helicobacter pylori and risk of gastric cancer: evidence from a prospective investigation. BMJ. 302:1302-5.
107. Fox JG, Beck P, Dangler CA, Whary MT, Wang TC, Shi HN, Nagler-Anderson C. 2000. Concurrent enteric helminth infection modulates inflammation and gastric immune responses and reduces helicobacter-induced gastric atrophy. Nat Med. 6:536-42.

108. Fox JG, Dangler CA, Taylor NS, King A, Koh TJ, Wang TC. 1999. High-salt diet induces gastric epithelial hyperplasia and parietal cell loss, and enhances Helicobacter pylori colonization in C57BL/6 mice. Cancer Research. 59:4823-4828.
109. Francis CA, Lockley AC, Sartory DP, Watkins J. 2001. A simple modified membrane filtration medium for the enumeration of aerobic spore-bearing bacilli in water. Water Research. 35:3758-61.
110. Fulkerson JF, Garner RM, Mobley H. 1998. Conserved residues and motifs in the NixA protein of Helicobacter pylori are critical for the high affinity transport of nickel ions. Journal of Biological Chemistry. 273:235-241.
111. Furuta T, El-Omar EM, Xiao F, Shirai N, Takashima M, Sugimura H. 2002. Interleukin 1beta polymorphisms increase risk of hypochlorhydria and atrophic gastritis and reduce risk of duodenal ulcer recurrence in Japan. Gastroenterology. 123:92-105.
112. Garcia H, Rosendo-Gomis R, Franceshi D, Fabrega JM. 1992. Prevalence of Helicobacter pylori in patients with gastric cancer in Panama. Rev Med Panama. 17:203-7.
113. Genta RM, Hamner HW, Graham DY. 1993. Gastric lymphoid follicles in Helicobacter pylori infection: frequency, distribution, and response to triple therapy. Hum Pathol. 24:577-583.
114. Genta RM, Huberman RM, Graham DY. 1994. The gastric cardia in Helicobacter pylori infection. Hum Pathol. 25:915-9.
115. Glupczynski Y, Burette A. 1990. Drug therapy for Helicobacter pylori infection: problems and pitfalls. Am J Gastroenterol. 85:1545-51.
116. Glupczynski Y, Burette A, De Koster E, Nyst JF, Deltenre M, Cadranel S, Bourdeaux L, De Vos D. 1990. Metronidazole resistance in Helicobacter pylori. Lancet. 335:976-7.
117. Goh KL, Peh SC, Wong NW, Parasakthi N, Puthucheary SD. 1990. Campylobacter pylori infection: experience in a multiracial population. J Gastroenterol Hepatol. 5:277-80.

118. Gomez-Marquez J, Segade F, Dosil M, Pichel JG, Bustelo XR, Freire M. 1989. The expression of prothymosin alpha gene in T lymphocytes and leukemic lymphoid cells is tied to lymphocyte proliferation. J Biol Chem. 264:8451-4.
119. Graham DY. 1996. Helicobacter pylori and perturbations in acid secretion: the end of the beginning. Gastroenterology. 110:1647-50.
120. Graham DY, Adam E, Reddy GT, Agarwal JP, Agarwal R, Evans DJ, Jr., Malaty HM, Evans DG. 1991. Seroepidemiology of Helicobacter pylori infection in India. Comparison of developing and developed countries. Dig Dis Sci. 36:1084-8.
121. Graham JE, Peek RM, Jr., Krishna U, Cover TL. 2002. Global analysis of Helicobacter pylori gene expression in human gastric mucosa. Gastroenterol. 123:1637-1648.
122. Guillemin K, Salama NR, Tompkins LS, Falkow S. 2002. Cag pathogenicity island-specific responses of gastric epithelial cells to Helicobacter pylori infection. PNAS. 99:15136-15141.
123. Guo BP, Mekalanos JJ. 2002. Rapid genetic analysis of Helicobacter pylori gastric mucosal colonization in suckling mice. PNAS. 99:8354-8359.
124. Guruge JL, Falk P, Lorenz RG, Dans M, Wirth H-P, Blaser MJ, Berg DE, Gordon JI. 1998. Epithelial attachment alters the outcome of Helicobacter pylori infection. Proc Natl Acad Sci USA. 95:3925-3930.
125. Handfield M, Levesque RC. 1999. Strategies for isolation of in vivo expressed genes from bacteria. FEMS Microbiology Reviews. 23:69-91.
126. Hatanaka K, Hokari R, Matsuzaki K, Kato S, Kawaguchi A, Nagao S, Suzuki H, Miyazaki K, Sekizuka E, Nagata H, Ishii H, Miura S. 2002. Increased expression of mucosal addressin cell adhesion molecule-1 (MAdCAM-1) and lymphocyte recruitment in murine gastritis induced by Helicobacter pylori. Clin Exp Immunol. 130:183-9.
127. Hazell SL. 1993. Cultural techniques for the growth and isolation of Helicobacter pylori. In: Goodwin CS, Worsley BW (eds) Helicobacter pylori: Biology and Clinical Practice. CRC Press, Boca Raton, FL:273-283.

128. Hazell SL, Lee A, Brady L, Hennessy W. 1986. Campylobacter pyloridis and gastritis: association with intercellular spaces and adaptation to an environment of mucus as important factors in colonization of the gastric epithelium. J Inf Dis. 153:658-663.
129. Hazell SL, Mendz GL. 1997. How Helicobacter pylori works: an overview of the metabolism of Helicobacter pylori. Helicobacter. 2:1-12.
130. Higashi H, Tsutsumi R, Muto S, Sugiyama T, Azuma T, Asaka M, Hatakeyama M. 2002. SHP-2 tyrosine phosphatase as an intracellular target of Helicobacter pylori CagA protein. Science. 295:683-6.
131. Homuth G, Domm S, Kleiner D, Schumann W. 2000. Transcriptional analysis of major heat shock genes of Helicobacter pylori. J Bacteriol. 182:4257-63.
132. Hooper LV, Wong MH, Thelin A, Hansson L, Falk PG, Gordon JI. 2001. Molecular analysis of commensal host-microbial relationships in the intestine. Science. 291:881-4.
133. Huang J, Lih CJ, Pan KH, Cohen SN. 2001. Global analysis of growth phase responsive gene expression and regulation of antibiotic biosynthetic pathways in Streptomyces coelicolor using DNA microarrays. Genes Dev. 15:3183-92.
134. Huesca M, Borgia S, Hoffman PS, Lingwood CA. 1996. Acidic pH changes receptor binding specificity of Helicobacter pylori: a binary adhesion model in which surface heat shock (stress) proteins mediate sulfatide recognition in gastric colonization. Infect and Immun. 64:
135. Huesca M, Goodwin A, Bhagwansingh A, Hoffman PS, Lingwood CA. 1998. Characterization of an acidic-pH-inducible stress protein (hsp70), a putative sulfatide binding adhesin, from Helicobacter pylori. Infect and Immun. 66:4061-4067.
136. Israel DA, Salama N, Arnold CN, Moss SF, Ando T, Wirth HP, Tham KT, Camorlinga M, Blaser MJ, Falkow S, Peek RM, Jr. 2001. Helicobacter pylori strain-specific differences in genetic content, identified by microarray, influence host inflammatory responses. J Clin Invest. 107:611-20.

137. Israel DA, Salama N, Krishna U, Rieger UM, Atherton JC, Falkow S, Peek RM, Jr. 2001. Helicobacter pylori genetic diversity within the gastric niche of a single human host. Proc Natl Acad Sci USA. 98:14625-30.
138. Jones AC, Logan RP, Foynes S, Cockayne A, Wren BW, Penn CW. 1997. A flagellar sheath protein of Helicobacter pylori is identical to HpaA, a putative N-acetylneuraminyllactose-binding hemagglutinin, but is not an adhesin for AGS cells. J Bacteriol. 179:5643-7.
139. Jones SE, Jomary C. 2002. Molecules in focus. Clusterin. The International Journal of Biochemistry & Cell Biology. 34:427-431.
140. Josenhans C, Eaton KA, Thevenot T, Suerbaum S. 2000. Switching of Flagellar Motility in Helicobacter pylori by Reversible Length Variation of a Short Homopolymeric Sequence Repeat in fliP, a Gene Encoding a Basal Body Protein. Infect. Immun. 68:4598-4603.
141. Josenhans C, Labigne A, Suerbaum S. 1995. Comparative ultrastructural and functional studies of Helicobacter pylori and Helicobacter mustelae flagellin mutants: both flagellin subunits, FlaA and FlaB, are necessary for full motility in Helicobacter species. J Bacteriol. 177:3010-20.
142. Josenhans C, Niehus E, Amersbach S, Horster A, Betz C, Drescher B, Hughes KT, Suerbaum S. 2002. Functional characterization of the antagonistic flagellar late regulators FliA and FlgM of Helicobacter pylori and their effects on the H. pylori transcriptome. Mol Microbiol. 43:307-322.
143. Joyce EA, Gilbert JV, Eaton KA, Plaut A, Wright A. 2001. Differential gene expression from two transcriptional units in the cag pathogenicity island of Helicobacter pylori. Infect Immun. 69:4202-9.
144. Jungblut PR, Bumann D, Haas G, Zimny-Arndt U, Holland P, Lamer S, Siejak F, Aebischer A, Meyer TF. 2000. Comparative proteome analysis of Helicobacter pylori. Mol Microbiol. 36:710-25.
145. Kang JY, Wee A, Math MV, Guan R, Tay HH, Yap I, Sutherland IH. 1990. Helicobacter pylori and gastritis in patients with peptic ulcer and non-ulcer dyspepsia: ethnic differences in Singapore. Gut. 31:850-3.

146. Karita M, Tummuru MK, Wirth HP, Blaser MJ. 1996. Effect of growth phase and acid shock on Helicobacter pylori cagA expression. Infect Immun. 64:4501-7.
147. Kawahara T, Teshima S, Kuwano Y, Oka A, Kishi K, Rokutan K. 2001. Helicobacter pylori lipopolysaccharide induces apoptosis of cultured guinea pig gastric mucosal cells. American Journal of Physiology - Gastrointestinal & Liver Physiology. 281:G726-34.
148. Keenan J, Oliaro J, Domigan N, Potter H, Aitken G, Allardyce R, Roake J. 2000. Immune response to an 18-kilodalton outer membrane antigen identifies lipoprotein 20 as a Helicobacter pylori vaccine candidate. Infect Immun. 68:3337-43.
149. Kelly DJ. 1998. The physiology and metabolism of the human gastric pathogen Helicobacter pylori. Advances in Microbial Physiology. 40:137-89.
150. Kelly DJ, Hughes NJ, Poole RK. 2001. Microaerobic Physiology: Aerobic Respiration, Anaerobic Respiration, and Carbon Dioxide Metabolism. In: Mobley HL, Mendz GL, Hazell SL (eds) Helicobacter pylori physiology and genetics. ASM Press, Washington, DC:113-124.
151. Kennedy BP, Payette P, Mudgett J, Vadas P, Pruzanski W, Kwan M, Tang C, Rancourt DE, Cromlish WA. 1995. A natural disruption of the secretory group II phospholipase A2 gene in inbred mouse strains. J Biol Chem. 270:22378-85.
152. Kerkhoff C, Eue I, Sorg C. 1999. The regulatory role of MRP8 (S100A8) and MRP14 (S100A9) in the transendothelial migration of human leukocytes. Pathobiology. 67:230-2.
153. Kersulyte D, Chalkauskas H, Berg DE. 1999. Emergence of recombinant strains of Helicobacter pylori during human infection. Molecular Microbiology. 31:31-43.
154. Khodadoust MM, Khan KD, Park EH, Bothwell AL. 1998. Distinct regulatory mechanisms for interferon-alpha/beta (IFN-alpha/beta)- and IFN-gamma-mediated induction of Ly-6E gene in B cells. Blood. 92:2399-409.
155. Khodursky AB, Peter BJ, Cozzarelli NR, Botstein D, Brown PO, Yanofsky C. 2000. DNA microarray analysis of gene expression in response to physiological and genetic changes that affect tryptophan metabolism in Escherichia coli. Proc Natl Acad Sci USA. 97:12170-5.
156. Kim C. 2002. Microarray software.
157. Kim C, Joyce EA, Chan K, Falkow S. 2002. Improved analytical methods for microarray-based genome composition analysis. Genome Biology. 3:research0065.1-0065.17.
158. Kim JM, Kim JS, Jung HC, Song IS, Kim CY. 2000. Apoptosis of human gastric epithelial cells via caspase-3 activation in response to Helicobacter pylori infection: possible involvement of neutrophils through tumor necrosis factor alpha and soluble Fas ligands. Scandinavian Journal of Gastroenterology. 35:40-8.
159. Kim JS, Chang JH, Chung SI, Yum JS. 2001. Importance of the host genetic background on immune responses to Helicobacter pylori infection and therapeutic vaccine efficacy. FEMS Immunol Med Microbiol. 31:41-46.
160. Kim JS, Chang JH, Chung SI, Yum JS. 1999. Molecular cloning and characterization of the Helicobacter pylori fliD gene, an essential factor in flagellar structure and motility. J Bacteriol. 181:6969-76.
161. Kim JS, Kim JM, Jung HC, Song IS. 2001. Caspase-3 activity and expression of Bcl-2 family in human neutrophils by Helicobacter pylori water-soluble proteins. Helicobacter. 6:207-15.
162. Kneller RW, Guo WD, Hsing AW, Chen JS, Blot WJ, Li JY, Forman D, Fraumeni JF, Jr. 1992. Risk factors for stomach cancer in sixty-five Chinese counties. Cancer Epidemiol Biomarkers Prev. 1:113-8.
163. Konturek PC, Brzozowski T, Konturek SJ, Stachura J, Karczewska E, Pajdo R, Ghiara P, Hahn EG. 1999. Mouse model of Helicobacter pylori infection: studies of gastric function and ulcer healing. Alimentary Pharmacology & Therapeutics. 13:333-46.
164. Konturek PC, Pierzchalski P, Konturek SJ, Meixner H, Faller G, Kirchner T, Hahn EG. 1999. Helicobacter pylori induces apoptosis in gastric mucosa through an upregulation of Bax expression in humans. Scand J Gastroenterol. 34:375-83.

165. Krebs DL, Hilton DJ. 2001. SOCS proteins: negative regulators of cytokine signaling. Stem Cells. 19:378-87.
166. Kubicka S, Claas C, Staab S, Kuhnel F, Zender L, Trautwein C, Wagner S, Rudolph KL, Manns M. 2002. p53 mutation pattern and expression of c-erbB2 and c-met in gastric cancer - Relation to histological subtypes, Helicobacter pylori infection, and prognosis. Digestive Diseases & Sciences. 47:114-121.
167. Kuby J. 1994. Immunology. Second ed. W. H. Freeman and Company, New York.
168. Kurosawa A, Miwa H, Hirose M, Tsune I, Nagahara A, Sato N. 2002. Inhibition of cell proliferation and induction of apoptosis by Helicobacter pylori through increased phosphorylated p53, p21 and Bax expression in endothelial cells. J Med Microbiol. 51:385-91.
169. Kusters JG. 2001. Recent developments in Helicobacter pylori vaccination. Scand J Gastroenterol Suppl. 15-21.
170. Laub MT, McAdams HH, Feldblyum T, Fraser CM, Shapiro L. 2000. Global analysis of the genetic network controlling a bacterial cell cycle. Science. 290:2144-8.
171. Lee A, Dixon MF, Danon SJ, Kuipers E, Megraud F, Larsson H, Mellgard B. 1995. Local Acid production and H. pylori: A unifying hypothesis of gastroduodenal disease. Eur J Gastroenterol and Hepatol. 7:461-465.
172. Lee A, Fox JG, Otto G, Murphy J. 1990. A small animal model of human Helicobacter pylori active chronic gastritis. Gastroenterol. 99:1315-1323.
173. Lee A, Mitchell H, O'Rourke J. 2002. The mouse colonizing Helicobacter pylori strain SS1 may lack a functional cag pathogenicity island - Response. Helicobacter. 7:140-141.
174. Lee A, O'Rourke J, De Ungria MC, Robertson B, Daskalopoulos G, Dixon MF. 1997. A standardized mouse model of Helicobacter pylori infection - introducing the Sydney strain. Gastroenterology. 112:1386-1397.
175. Lee A, O'Rourke J, Enno A. 2000. Gastric mucosa-associated lymphoid tissue lymphoma: implications of animal models on pathogenic and therapeutic considerations--mouse models of gastric lymphoma. Recent Results Cancer Res. 156:42-51.
176. Lee JJ, Smith HO. 1988. Sizing of the Haemophilus influenzae Rd genome by pulsed-field agarose gel electrophoresis. J Bacteriol. 170:4402-5.
177. Lillis TO, Bissonnette GK. 2001. Detection and characterization of filterable heterotrophic bacteria from rural groundwater supplies. Lett Appl Microbiol. 32:268-72.
178. Lin JT, Wang JT, Wang TH, Wu MS, Lee TK, Chen CJ. 1993. Helicobacter pylori infection in a randomly selected population, healthy volunteers, and patients with gastric ulcer and gastric adenocarcinoma. A seroprevalence study in Taiwan. Scand J Gastroenterol. 28:1067-72.
179. Lind T, Veldhuyzen van Zanten S, Unge P, Spiller R, Bayerdorffer E, O'Morain C, Bardhan KD, Bradette M, Chiba N, Wrangstadh M, Cederberg C, Idstrom JP. 1996. Eradication of Helicobacter pylori using one-week triple therapies combining omeprazole with two antimicrobials: the MACH I Study. Helicobacter. 1:138-44.
180. Litwin C, Calderwood S. 1993. Role of iron in regulation of virulence genes. Clin. Microbiol. Rev. 6:137-149.
181. Liu L, Zeng M, Stamler JS. 1999. Hemoglobin induction in mouse macrophages. Proc Natl Acad Sci U S A. 96:6643-7.
182. Liu T, Matsuguchi T, Tsuboi N, Yajima T, Yoshikai Y. 2002. Differences in expression of toll-like receptors and their reactivities in dendritic cells in BALB/c and C57BL/6 mice. Infect Immun. 70:6638-45.
183. Lucchini S, Thompson A, Hinton JC. 2001. Microarrays for microbiologists. Microbiology. 147:1403-14.
184. Lundberg U, Vinatzer U, Berdnik D, von Gabain A, Baccarini M. 1999. Growth phase-regulated induction of Salmonella-induced macrophage apoptosis correlates with transient expression of SPI-1 genes. J Bacteriol. 181:3433-7.
185. Maeda S, Otsuka M, Hirata Y, Mitsuno Y, Yoshida H, Shiratori Y, Masuho Y, Muramatsu M, Seki N, Omata M. 2001. cDNA microarray analysis of Helicobacter pylori-mediated alteration of gene expression in gastric cancer cells. Biochem Biophys Res Commun. 284:443-9.
186. Maeda S, Yoshida H, Mitsuno Y, Hirata Y, Ogura K, Shiratori Y, Omata M. 2002. Analysis of apoptotic and antiapoptotic signalling pathways induced by Helicobacter pylori. Mol Pathol. 55:286-93.
187. Mahdavi J, Sonden B, Hurtig M, Olfat FO, Forsberg L, Roche N, Angstrom J, Larsson T, Teneberg S, Karlsson KA, Altraja S, Wadstrom T, Kersulyte D, Berg DE, Dubois A, Petersson C, Magnusson KE, Norberg T, Lindh F, Lundskog BB, Arnqvist A, Hammarstrom L, Boren T. 2002. Helicobacter pylori SabA adhesin in persistent infection and chronic inflammation. Science. 297:573-8.
188. Mähler M, Janke C, Wagner S, Hedrich HJ. 2002. Differential susceptibility of inbred mouse strains to Helicobacter pylori infection. Scand J Gastroenterol. 37:267-278.
189. Maier RJ, Fu C, Gilbert J, Moshiri F, Olson J, Plaut AG. 1996. Hydrogen uptake hydrogenase in Helicobacter pylori. FEMS Microbiol Lett. 141:71-6.
190. Marais A, Mendz GL, Hazell SL, Megraud F. 1999. Metabolism and genetics of Helicobacter pylori: the genome era. Microbiol Mol Biol Rev. 63:642-74.
191. Marchetti M, Rappuoli R. 2002. Isogenic mutants of the cag pathogenicity island of Helicobacter pylori in the mouse model of infection: effects on colonization efficiency. Microbiology. 148:1447-1456.
192. Marshall BJ. 1994. Helicobacter pylori. Am J Gastroenterol. 89:S116-28.
193. Marshall BJ, Goodwin CS, Warren JR, Murray R, Blincow D, Blackbourn SJ, Phillips M, Waters TE, Sanderson CR. 1988. Prospective double-blind trial of duodenal ulcer relapse after eradication of Campylobacter pylori. Lancet. 2:1437-1442.
194. Marshall BJ, Warren JR. 1984. Unidentified curved bacilli in the stomach of patients with gastritis and peptic ulceration. Lancet. 1:1311-5.
195. Marshall BJ, Armstrong JA, McGechie DB, Glancy RJ. 1985. Attempt to fulfil Koch's postulates for pyloric campylobacter. Med. J. Aust. 142:436-439.

196. Martin RM, Brady JL, Lew AM. 1998. The need for IgG2c specific antiserum when isotyping antibodies from C57BL/6 and NOD mice. J Immunol Methods. 212:187-92.
197. McClain MS, Schraw W, Ricci V, Boquet P, Cover TL. 2000. Acid activation of Helicobacter pylori vacuolating cytotoxin (VacA) results in toxin internalization by eukaryotic cells. Mol Microbiol. 37:433-42.
198. McColl K, El-Omar EM, Gillen D. 1997. Alterations in gastric physiology in Helicobacter pylori infection - causes of different diseases or all epiphenomena. Italian Journal of Gastroenterology & Hepatology. 29:459-464.
199. McColl KE. 1997. Helicobacter pylori and acid secretion: where are we now? Eur J Gastroenterol Hepatol. 9:333-5.
200. McColl KE. 1997. What remaining questions regarding Helicobacter pylori and associated diseases should be addressed by future research? View from Europe. Gastroenterology. 113:S158-62.
201. McColl KE, el-Nujumi AM, Chittajallu RS, Dahill SW, Dorrian CA, el-Omar E, Penman I, Fitzsimons EJ, Drain J, Graham H, et al. 1993. A study of the pathogenesis of Helicobacter pylori negative chronic duodenal ulceration. Gut. 34:762-8.
202. McDaniel TK, Dewalt KC, Salama NR, Falkow S. 2001. New approaches for validation of lethal phenotypes and genetic reversion in Helicobacter pylori. Helicobacter. 6:15-23.
203. McGee DJ, Radcliff FJ, Mendz GL, Ferrero RL, Mobley HL. 1999. Helicobacter pylori rocF is required for arginase activity and acid protection in vitro but is not essential for colonization of mice or for urease activity. Journal of Bacteriology. 181:7314-22.
204. McGovern KJ, Blanchard TG, Gutierrez JA, Czinn SJ, Krakowka S, Youngman P. 2001. γ-Glutamyltransferase Is a Helicobacter pylori Virulence Factor but Is Not Essential for Colonization. Infect. Immun. 69:4168-4173.

205. Mégraud F, Hazell SL, Glupczynski Y. 2001. Antibiotic susceptibility and Resistance. In: Mobley HLT, Mendz G, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, D. C.:511-530.
206. Mendz GL, Hazell SL. 1993. Fumarate catabolism in Helicobacter pylori. Biochem Mol Biol Int. 31:325-332.
207. Merrell DS, Camilli A. 1999. The cadA gene of Vibrio cholerae is induced during infection and plays a role in acid tolerance. Mol Microbiol. 34:836-49.
208. Merrell DS, Falkow S. 2003. Expression Profiling in H. pylori Infection. In: Appasani K (ed) Perspectives in Gene Expression. Eaton Publishing, Westboro, MA.
209. Merrell DS, Goodrich ML, Otto G, Tompkins L, Falkow S. 2003 in press. pH regulated gene expression of the gastric pathogen Helicobacter pylori. Infect and Immun.
210. Michetti P, Wadstrom T, Kraehenbuhl JP, Lee A, Kreiss C, Blum AL. 1996. Frontiers in Helicobacter pylori research - pathogenesis, host response, vaccine development and new therapeutic approaches. Europ J Gastroenterol Hepatol. 8:717-722.
211. Miehlke S, Meining A, Morgner A, Bayerdorffer E, Lehn N, Stolte M, Graham DY, Go MF. 1998. Frequency of vacA genotypes and cytotoxin activity in Helicobacter pylori associated with low-grade gastric mucosa-associated lymphoid tissue lymphoma. J. Clin. Microbiol. 36:2369-2370.
212. Mihaljevic S, Katicic M, Karner I, Vuksic-Mihaljevic Z, Dmitrovic B, Ivandic A. 2000. The influence of Helicobacter pylori infection on gastrin and somatostatin values present in serum. Hepato-Gastroenterology. 47:1482-4.
213. Miki R, Kadota K, Bono H, Mizuno Y, Tomaru Y, Carninci P, Itoh M, Shibata K, Kawai J, Konno H, Watanabe S, Sato K, Tokusumi Y, Kikuchi N, Ishii Y, Hamaguchi Y, Nishizuka I, Goto H, Nitanda H, Satomi S, Yoshiki A, Kusakabe M, DeRisi JL, Eisen MB, Iyer VR, Brown PO, Muramatsu M, Shimada H, Okazaki Y, Hayashizaki Y. 2001. Delineating developmental and metabolic pathways in vivo by expression profiling using the RIKEN set of 18,816 full-length enriched mouse cDNA arrays. PNAS. 98:2199-2204.
214. Mills JC, Syder AJ, Hong CV, Guruge JL, Raaii F, Gordon JI. 2001. A molecular profile of the mouse gastric parietal cell with and without exposure to Helicobacter pylori. Proc Natl Acad Sci USA. 98:13687-92.
215. Mitchell H. 2001. Epidemiology of Infection. In: Mobley HLT, Mendz GL, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, D. C.
216. Mizote T, Yoshiyama H, Nakazawa T. 1997. Urease-independent chemotactic responses of Helicobacter pylori to urea, urease inhibitors, and sodium bicarbonate. Infect and Immun. 65:1519-1521.
217. Mobley H. 1997. Helicobacter pylori factors associated with disease development. Gastroenterology.
218. Mohammadi M, Redline R, Nedrud J, Czinn S. 1996. Role of the host in pathogenesis of Helicobacter-associated gastritis - H. felis infection of inbred and congenic mouse strains. Infect and Immun. 64:
219. Montecucco C, Papini E, de Bernard M, Zoratti M. 1999. Molecular and cellular activities of Helicobacter pylori pathogenic factors. FEBS Lett. 452:16-21.
220. Moran AP, Sturegård E, Sjunnesson H, Wadstrom T, Hynes SO. 2000. The relationship between O-chain expression and colonisation ability of Helicobacter pylori in a mouse model. FEMS Immunol Med Microbiol. 29:263-270.
221. Morgenstern S, Koren R, Moss SF, Fraser G, Okon E, Niv Y. 2001. Does Helicobacter pylori affect gastric mucin expression? Relationship between gastric antral mucin expression and H. pylori colonization. Eur J Gastroenterol Hepatol. 13:19-23.
222. Morris A, Nicholson G. 1987. Ingestion of Campylobacter pyloridis causes gastritis and raised fasting gastric pH. Am J Gastroenterol. 82:192-199.
223. Moss SF, Sordillo EM, Abdalla AM, Makarov V, Hanzely Z, Perez-Perez GI, Blaser MJ, Holt PR. 2001. Increased gastric epithelial cell apoptosis associated with colonization with cagA+ Helicobacter pylori strains. Cancer Research. 61:1406-11.
224. Mueller A, O'Rourke J, Grimm J, Guillemin K, Dixon MF, Lee A, Falkow S. 2003. Distinct gene expression profiles characterize the histopathological stages of disease in Helicobacter-induced mucosa-associated lymphoid tissue lymphoma. Proc Natl Acad Sci U S A. 27:27.
225. Nabwera HM, Logan RP. 1999. Epidemiology of Helicobacter pylori: transmission, translocation and extragastric reservoirs. J Physiol Pharmacol. 50:711-22.
226. Nakamura H, Yoshiyama H, Takeuchi H, Mizote T, Okita K, Nakazawa T. 1998. Urease Plays an Important Role in the Chemotactic Motility of Helicobacter pylori in a Viscous Environment. Infect. Immun. 66:4832-4837.
227. Namavar F, Sparrius M, Veerman EC, Appelmelk BJ, Vandenbroucke-Grauls CM. 1998. Neutrophil-activating protein mediates adhesion of Helicobacter pylori to sulfated carbohydrates on high-molecular-weight salivary mucin. Infect Immun. 66:444-7.
228. Nardone G, Staibano S, Rocco A, Mezza E, D'Armiento FP, Insabato L, Coppola A, Salvatore G, Lucariello A, Figura N, De Rosa G, Budillon G. 1999. Effect of Helicobacter pylori infection and its eradication on cell proliferation, DNA status, and oncogene expression in patients with chronic gastritis. Gut. 44:789-799.
229. Neubauer A, Thiede C, Morgner A, Alpen B, Ritter M, Neubauer B, Wundisch T, Ehninger G, Stolte M, Bayerdorffer E. 1997. Cure of Helicobacter pylori infection and duration of remission of low-grade gastric mucosa-associated lymphoid tissue lymphoma. J Natl Cancer Inst. 89:1350-5.
230. Niehus E, Ye F, Suerbaum S, Josenhans C. 2002. Growth phase-dependent and differential transcriptional control of flagellar genes in Helicobacter pylori. Microbiology. 148:3827-37.
231. Nolan KJ, McGee DJ, Mitchell HM, Kolesnikow T, Harro JM, O'Rourke J, Wilson JE, Danon SJ, Moss ND, Mobley HLT, Lee A. 2002. In vivo Behavior of a Helicobacter pylori SS1 nixA Mutant with Reduced Urease Activity. Infect. Immun. 70:685-611.
232. Nomura A, Stemmermann GN, Chyou PH, Kato I, Perez-Perez GI, Blaser MJ. 1991. Helicobacter pylori infection and gastric carcinoma among Japanese Americans in Hawaii. N Engl J Med. 325:1132-6.
233. Obonyo M, Guiney DG, Harwood J, Fierer J, Cole SP. 2002. Role of gamma interferon in Helicobacter pylori induction of inflammatory mediators during murine infection. Infect Immun. 70:3295-9.
234. Obonyo M, Guiney DG, Harwood J, Fierer J, Cole SP. 2002. Role of Gamma Interferon in Helicobacter pylori Induction of Inflammatory Mediators during Murine Infection. Infect. Immun. 70:3295-3299.
235. O'Brien RL, Yin X, Huber SA, Ikuta K, Born WK. 2000. Depletion of a γδ T cell subset can increase host resistance to a bacterial infection. The Journal of Immunology. 165:6472-6479.
236. Oda T, Murakami K, Nishizono A, Kodama M, Nasu M, Fujioka T. 2002. Long-term Helicobacter pylori infection in Japanese monkeys induces atrophic gastritis and accumulation of mutations in the p53 tumor suppressor gene. Helicobacter. 7:143-51.
237. Odenbreit S, Puls J, Sedlmaier B, Gerland E, Fischer W, Haas R. 2000. Translocation of Helicobacter pylori CagA into gastric epithelial cells by type IV secretion. Science. 287:1497-500.
238. Odenbreit S, Till M, Haas R. 1996. Optimized BlaM-transposon shuttle mutagenesis of Helicobacter pylori allows the identification of novel genetic loci involved in bacterial virulence. Molecular Microbiology. 20:361-73.
239. Odenbreit S, Till M, Hofreuter D, Faller G, Haas R. 1999. Genetic and functional characterization of the alpAB gene locus essential for the adhesion of Helicobacter pylori to human gastric tissue. Molecular Microbiology. 31:1537-1548.
240. Ogram A, Sun W, Brockman FJ, Fredrickson JK. 1995. Isolation and characterization of RNA from low-biomass deep-subsurface sediments. Appl Environ Microbiol. 61:763-8.

241. Ogura K, Maeda S, Nakao M, Watanabe T, Tada M, Kyutoku T, Yoshida H, Shiratori Y, Omata M. 2000. Virulence Factors of Helicobacter pylori Responsible for Gastric Diseases in Mongolian Gerbil. J. Exp. Med. 192:1601-1610.
242. Ogura M, Yamaguchi H, Yoshida K, Fujita Y, Tanaka T. 2001. DNA microarray analysis of Bacillus subtilis DegU, ComA and PhoP regulons: an approach to comprehensive analysis of B. subtilis two-component regulatory systems. Nucleic Acids Res. 29:3804-13.
243. Ohana M, Okazaki K, Oshima C, Andra's D, Nishi T, Uchida K, Uose S, Nakase H, Matsushima Y, Chiba T. 2001. A critical role for IL-7R signaling in the development of Helicobacter felis-induced gastritis in mice. Gastroenterology. 121:329-36.
244. Oi M, Oshida K, Sugimura S. 1959. The location of gastric ulcer. Gastroenterology. 36:45-59.
245. Olfat FO, Naslund E, Freedman J, Boren T, Engstrand L. 2002. Cultured human gastric explants: a model for studies of bacteria-host interaction during conditions of experimental Helicobacter pylori infection. Journal of Infectious Diseases. 186:423-7.
246. O'Toole PW, Kostrzynska M, Trust TJ. 1994. Non-motile mutants of Helicobacter pylori and Helicobacter mustelae defective in flagellar hook production. Mol Microbiol. 14:691-703.
247. O'Toole PW, Lane MC, Porwollik S. 2000. Helicobacter pylori motility. Microbes Infect. 2:1207-14.
248. Ottemann KM, Lowenthal AC. 2002. Helicobacter pylori uses motility for initial colonization and to attain robust infection. Infect Immun. 70:1984-1990.
249. Ottlecz A, Romero JJ, Lichtenberger LM. 2001. Helicobacter infection and phospholipase A2 enzymes: effect of Helicobacter felis-infection on the expression and activity of sPLA2 enzymes in mouse stomach. Mol Cell Biochem. 221:71-7.
250. Owen RJ. 2002. Molecular testing for antibiotic resistance in Helicobacter pylori. Gut. 50:285-9.

251. Owhashi M, Arita H, Hayai N. 2000. Identification of a novel eosinophil chemotactic cytokine (ECF-L) as a chitinase family protein. J Biol Chem. 275:1279-86.
252. Padol IT, Moran AP, Hunt RH. 2001. Effect of purified Lipopolysaccharides from strains of Helicobacter pylori and Helicobacter felis on acid secretion in mouse gastric glands in vitro. Infect and Immun. 69:3891-3896.
253. Pai R, Cover TL, Tarnawski AS. 1999. Helicobacter pylori vacuolating cytotoxin (VacA) disorganizes the cytoskeletal architecture of gastric epithelial cells. Biochemical & Biophysical Research Communications. 262:245-50.
254. Pai R, Sasaki E, Tarnawski AS. 2000. Helicobacter pylori vacuolating cytotoxin (VacA) alters cytoskeleton-associated proteins and interferes with re-epithelialization of wounded gastric epithelial monolayers. Cell Biology International. 24:291-301.
255. Palovuori R, Perttu A, Yan Y, Karttunen R, Eskelinen S, Karttunen TJ. 2000. Helicobacter pylori induces formation of stress fibers and membrane ruffles in AGS cells by rac activation. Biochemical & Biophysical Research Communications. 269:247-53.
256. Papini E, Satin B, Norais N, de Bernard M, Telford JL, Rappuoli R, Montecucco C. 1998. Selective increase of the permeability of polarized epithelial cell monolayers by Helicobacter pylori vacuolating toxin. J Clin Invest. 102:813-20.
257. Park SM, Lee HR, Kim JG, Park JW, Jung G, Han SH, Cho JH, Kim MK. 1999. Effect of Helicobacter pylori infection on antral gastrin and somatostatin cells and on serum gastrin concentrations. Korean Journal of Internal Medicine. 14:15-20.
258. Parsonnet J, Friedman GD, Vandersteen DP, Chang Y, Vogelman JH, Orentreich N, Sibley RK. 1991. Helicobacter pylori infection and the risk of gastric carcinoma. N Engl J Med. 325:1127-31.
259. Parsonnet J, Hansen S, Rodriguez L, Gelb AB, Warnke RA, Jellum E, Orentreich N, Vogelman JH, Friedman GD. 1994. Helicobacter pylori infection and gastric lymphoma. N Engl J Med. 330:1267-71.
Bibliography 323

260. Pateraki E, Mentis A, Spiliadis C, Sophianos D, Stergiatou I, Skandalis N, Weir DM. 1990. Seroepidemiology of Helicobacter pylori infection in Greece. FEMS Microbiol Immunol. 2:129-36.
261. Paustian ML, May BJ, Kapur V. 2001. Pasteurella multocida gene expression in response to iron limitation. Infect Immun. 69:4109-15.
262. Peck B, Ortkamp M, Diehl KD, Hundt E, Knapp B. 1999. Conservation, localization and expression of HopZ, a protein involved in adhesion of Helicobacter pylori. Nucl. Acids. Res. 27:3325-3333.
263. Peek RM, Jr., Blaser MJ. 2002. Helicobacter pylori and gastrointestinal tract adenocarcinomas. Nature Reviews. Cancer. 2:28-37.
264. Peek RM, Jr., Miller GG, Tham KT, Perez-Perez GI, Zhao XM, Atherton JC, Blaser MJ. 1995. Heightened inflammatory response and cytokine expression in vivo to cagA(+) Helicobacter pylori strains. Laboratory Investigation. 73:760-770.
265. Perez-Perez GI, Israel DA. 2000. Role of iron in Helicobacter pylori: its influence in outer membrane protein expression and in pathogenicity. Eur J Gastroenterol Hepatol. 12:1263-5.
266. Perez-Perez GI, Taylor DN, Bodhidatta L, Wongsrichanalai J, Baze WB, Dunn BE, Echeverria PD, Blaser MJ. 1990. Seroprevalence of Helicobacter pylori infections in Thailand. J Infect Dis. 161:1237-41.
267. Petersen AM, Sorensen K, Blom J, Krogfelt KA. 2001. Reduced intracellular survival of Helicobacter pylori vacA mutants in comparison with their wild-types indicates the role of VacA in pathogenesis. FEMS Immunol Med Microbiol. 30:103-8.
268. Peura D. 1998. Helicobacter pylori: rational management options. Am J Med. 105:424-30.
269. Philpott DJ, Belaid D, Troubadour P, Thiberge JM, Tankovic J, Labigne A, Ferrero RL. 2002. Reduced activation of inflammatory responses in host cells by mouse-adapted Helicobacter pylori isolates. Cellular Microbiology. 4:285-96.
270. Pigeon C, Ilyin G, Courselaud B, Leroyer P, Turlin B, Brissot P, Loreal O. 2001. A new mouse liver-specific gene, encoding a protein homologous to human antimicrobial peptide hepcidin, is overexpressed during iron overload. J Biol Chem. 276:7811-9.
271. Piotrowski J, Czajkowski A, Yotsumoto F, Slomiany A, Slomiany BL. 1993. Helicobacter pylori lipopolysaccharide inhibition of gastric mucosal laminin receptor: effect of sulglycotide. Gen. Pharmacol. 24:1467-1472.
272. Piotrowski J, Majka J, Murty VLN, Czajkowski A, Slomiany A, Slomiany BL. 1994. Inhibition of gastric mucosal mucin receptor by Helicobacter pylori lipopolysaccharide: effect of sulglycotide. Gen. Pharmacol. 25:969-976.
273. Potthoff A, Ledig S, Martin J, Jandl O, Cornberg M, Obst B, Beil W, Manns MP, Wagner S. 2002. Significance of the Caspase Family in Helicobacter pylori Induced Gastric Epithelial Apoptosis. Helicobacter. 7:367-77.
274. Pride DT, Blaser MJ. 2002. Concerted evolution between duplicated genetic elements in Helicobacter pylori. J Mol Biol. 316:629-42.
275. Pride DT, Meinersmann RJ, Blaser MJ. 2001. Allelic Variation within Helicobacter pylori babA and babB. Infect Immun. 69:1160-71.
276. Queiroz DM, Quintao JG, Rocha GA, Barbosa AJ, Mendes EN. 1991. Helicobacter pylori density on antral mucosa of patients with and without duodenal ulceration. Gastroenterol Clin Biol. 15:558.
277. Queiroz DM, Rocha GA, Mendes EN, Carvalho AS, Barbosa AJ, Oliveira CA, Lima GF, Jr. 1991. Differences in distribution and severity of Helicobacter pylori gastritis in children and adults with duodenal ulcer disease. J Pediatr Gastroenterol Nutr. 12:178-81.
278. Quiding-Jarbrink M, Ahlstedt I, Lindholm C, Johansson EL, Lonroth H. 2001. Homing commitment of lymphocytes activated in the human gastric and intestinal mucosa. Gut. 49:519-25.
279. Radcliff F, Hazell S, Kolesnikow T, Doidge C, Lee A. 1997. Catalase, a novel antigen for Helicobacter pylori vaccination. Infect. Immun. 65:4668-4674.

280. Randolph JB, Waggoner AS. 1997. Stability, specificity and fluorescence brightness of multiply-labeled fluorescent DNA probes. Nucleic Acids Res. 25:2923-9.
281. Rauhut R, Klug G. 1999. mRNA degradation in bacteria. FEMS Microbiol Rev. 23:353-70.
282. Reindel JF, Fitzgerald AL, Breider MA, Gough AW, Yan C, Mysore JV, Dubois A. 1999. An epizootic of lymphoplasmacytic gastritis attributed to Helicobacter pylori infection in cynomolgus monkeys (Macaca fascicularis). Vet Pathol. 36:1-13.
283. Rekhter MD, Chen J. 2001. Molecular analysis of complex tissues is facilitated by laser capture microdissection: critical role of upstream tissue processing. Cell Biochemistry & Biophysics. 35:103-13.
284. Reynolds DJ, Penn CW. 1994. Characteristics of Helicobacter pylori growth in a defined medium and determination of its amino acid requirements. Microbiology. 140:2649-2656.
285. Rhodius V. 2000. Isolation of total RNA from E. coli for microarrays
286. Richmond CS, Glasner JD, Mau R, Jin H, Blattner FR. 1999. Genome-wide expression profiling in Escherichia coli K-12. Nucleic Acids Res. 27:3821-35.
287. Rieder G, Einsiedl W, Hatz RA, Stolte M, Enders GA, Walz A. 2001. Comparison of CXC chemokines ENA-78 and interleukin-8 expression in Helicobacter pylori-associated gastritis. Infect Immun. 69:81-8.
288. Rimini R, Jansson B, Feger G, Roberts TC, de Francesco M, Gozzi A, Faggioni F, Domenici E, Wallace DM, Frandsen N, Polissi A. 2000. Global analysis of transcription kinetics during competence development in Streptococcus pneumoniae using high density DNA arrays. Mol Microbiol. 36:1279-1292.
289. Rossi G, Rossi M, Vitali CG, Fortuna D, Burroni D, Pancotto L, Capecchi S, Sozzi S, Renzoni G, Braca G, Del Giudice G, Rappuoli R, Ghiara P, Taccini E. 1999. A Conventional Beagle Dog Model for Acute and Chronic Infection with Helicobacter pylori. Infect. Immun. 67:3112-3120.

290. Rota CA, Pereira-Lima JC, Blaya C, Nardi NB. 2001. Consensus and variable region PCR analysis of Helicobacter pylori 3' region of cagA gene in isolates from individuals with or without peptic ulcer. J Clin Microbiol. 39:606-12.
291. Roth KA, Kapadia SB, Martin SM, Lorenz RG. 1999. Cellular immune responses are essential for the development of Helicobacter felis-associated gastric pathology. J Immunol. 163:1490-7.
292. Ruzsovics A, Molnar B, Unger Z, Tulassay Z, Pronai L. 2001. Determination of Helicobacter pylori cagA, vacA genotypes with real-time PCR melting curve analysis. J Physiol Paris. 95:369-77.
293. Sabarth N, Hurwitz R, Meyer TF, Bumann D. 2002. Multiparameter Selection of Helicobacter pylori Antigens Identifies Two Novel Antigens with High Protective Efficacy. Infect. Immun. 70:6499-6503.
294. Sachs G, Zeng N, Prinz C. 1997. Physiology of isolated gastric endocrine cells. Annual Review of Physiology. 59:243-56.
295. Saeed ZA, Evans DJ, Jr., Evans DG, Cornelius MJ, Maton PN, Jensen RT, Graham DY. 1991. Helicobacter pylori and Zollinger-Ellison syndrome. Dig Dis Sci. 36:15-8.
296. Sakagami T, Dixon MF, O'Rourke J, Howlett R, Alderuccio F, Vella J, Shimoyama T, Lee A. 1996. Atrophic gastric changes in both Helicobacter felis and Helicobacter pylori infected mice are host dependent and separate from antral gastritis. Gut. 39:639-648.
297. Sakagami T, Vella J, Dixon M, O'Rourke J, Radcliff F, Sutton P, Shimoyama T, Beagley K, Lee A. 1997. The endotoxin of Helicobacter pylori is a modulator of host-dependent gastritis. Infect. Immun. 65:3310-3316.
298. Salama N, Guillemin K, McDaniel TK, Sherlock G, Tompkins L, Falkow S. 2000. A whole-genome microarray reveals genetic diversity among Helicobacter pylori strains. Proc Natl Acad Sci USA. 97:14668-73.
299. Salama NR, Otto G, Tompkins L, Falkow S. 2001. Vacuolating cytotoxin of Helicobacter pylori plays a role during colonization in a mouse model of infection. Infect Immun. 69:730-6.

300. Sambrook J, Fritsch EF, Maniatis T. 1989. Molecular cloning, a laboratory manual. second ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor
301. Sark MW, Borgstein AM, Medema JP, van de Putte P, Backendorf C. 1999. Opposite effects of Ras or PKC activation on the expression of the SPRR2A keratinocyte terminal differentiation marker. Exp Cell Res. 250:475-84.
302. Sarkar N. 1996. Polyadenylation of mRNA in bacteria. Microbiology. 142:3125-33.
303. Satti MB, Twum-Danso K, al-Freihi HM, Ibrahim EM, al-Gindan Y, al-Quorain A, al-Ghassab G, al-Hamdan A, al-Idrissi HY. 1990. Helicobacter pylori-associated upper gastrointestinal disease in Saudi Arabia: a pathologic evaluation of 298 endoscopic biopsies from 201 consecutive patients. Am J Gastroenterol. 85:527-34.
304. Saunders KE, McGovern KJ, Fox JG. 1997. Use of pulsed-field gel electrophoresis to determine genomic diversity in strains of Helicobacter hepaticus from geographically distant locations. Journal of Clinical Microbiology. 35:2859-2863.
305. Schepp W, Dehne K, Herrmuth H, Pfeffer K, Prinz C. 1998. Identification and functional importance of IL-1 receptors on rat parietal cells. Am J Physiol. 275:G1094-105.
306. Schoolnik GK. 2002. Functional and comparative genomics of pathogenic bacteria. Curr Opin Microbiol. 5:20-26.
307. Schreiber S, Stuben M, Josenhans C, Scheid P, Suerbaum S. 1999. In vivo distribution of Helicobacter felis in the gastric mucus of the mouse: experimental method and results. Infect Immun. 67:5151-6.
308. Schroder G, Krause S, Zechner EL, Traxler B, Yeo HJ, Lurz R, Waksman G, Lanka E. 2002. TraG-like proteins of DNA transfer systems and of the Helicobacter pylori type IV secretion system: inner membrane gate for exported substrates? J Bacteriol. 184:2767-79.
309. Scoarughi GL, Cimmino C, Donini P. 1999. Helicobacter pylori: a eubacterium lacking the stringent response. J Bacteriol. 181:552-5.

310. Scott DR, Weeks D, Hong C, Postius S, Melchers K, Sachs G. 1998. The role of internal urease in acid resistance of Helicobacter pylori. Gastroenterology. 114:58-70.
311. Segal ED, Cha J, Lo J, Falkow S, Tompkins LS. 1999. Altered states: involvement of phosphorylated CagA in the induction of host cellular growth changes by Helicobacter pylori. Proc Natl Acad Sci USA. 96:14559-64.
312. Segal ED, Falkow S, Tompkins LS. 1996. Helicobacter pylori attachment to gastric cells induces cytoskeletal rearrangements and tyrosine phosphorylation of host cell proteins. Proc Natl Acad Sci USA. 93:1259-64.
313. Segal ED, Lange C, Covacci A, Tompkins LS, Falkow S. 1997. Induction of host signal transduction pathways by Helicobacter pylori. Proc. Natl. Acad. Sci. USA. 94:7595-7599.
314. Seiffert D, Curriden SA, Jenne D, Binder BR, Loskutoff DJ. 1996. Differential Regulation of Vitronectin in Mice and Humans in vitro. J. Biol. Chem. 271:5474-5480.
315. Selbach M, Moese S, Hauck CR, Meyer TF, Backert S. 2002. Src is the kinase of the Helicobacter pylori CagA protein in vitro and in vivo. J Biol Chem. 277:6775-8.
316. Selbach M, Moese S, Hurwitz R, Hauck CR, Meyer TF, Backert S. 2003. The Helicobacter pylori CagA protein induces cortactin dephosphorylation and actin rearrangement by c-Src inactivation. EMBO J. 22:515-528.
317. Selbach M, Moese S, Meyer TF, Backert S. 2002. Functional Analysis of the Helicobacter pylori cag Pathogenicity Island Reveals Both VirD4-CagA-Dependent and VirD4-CagA-Independent Mechanisms. Infect. Immun. 70:665-671.
318. Sepulveda AR, Tao H, Carloni E, Sepulveda J, Graham DY, Peterson LE. 2002. Screening of gene expression profiles in gastric epithelial cells induced by Helicobacter pylori using microarray analysis. Aliment Pharmacol Ther. 16:145-57.
319. Seyler RW, Jr., Olson JW, Maier RJ. 2001. Superoxide Dismutase-Deficient Mutants of Helicobacter pylori Are Hypersensitive to Oxidative Stress and Defective in Host Colonization. Infect. Immun. 69:4034-4040.

320. Sherlock G, Hernandez-Boussard T, Kasarskis A, Binkley G, Matese JC, Dwight SS, Kaloper M, Weng S, Jin H, Ball CA, Eisen MB, Spellman PT, Brown PO, Botstein D, Cherry JM. 2001. The Stanford Microarray Database. Nucleic Acids Res. 29:152-5.
321. Sherman DR, Voskuil M, Schnappinger D, Liao R, Harrell MI, Schoolnik GK. 2001. Regulation of the Mycobacterium tuberculosis hypoxic response gene encoding alpha -. Proc Natl Acad Sci USA. 98:7534-9.
322. Shibayama K, Doi Y, Shibata N, Yagi T, Nada T, Iinuma Y, Arakawa Y. 2001. Apoptotic signaling pathway activated by Helicobacter pylori infection and increase of apoptosis-inducing activity under serum-starved conditions. Infection & Immunity. 69:3181-9.
323. Shimada M, Ina K, Kyokane K, Imada A, Yamaguchi H, Nishio Y, Hayakawa M, Iinuma Y, Ohta M, Ando T, Kusugami K. 2002. Upregulation of mucosal soluble fas ligand and interferon-gamma may be involved in ulcerogenesis in patients with Helicobacter pylori-positive gastric ulcer. Scand J Gastroenterol. 37:501-11.
324. Shomer NH, Dangler CA, Whary MT, Fox JG. 1998. Experimental Helicobacter pylori Infection Induces Antral Gastritis and Gastric Mucosa-Associated Lymphoid Tissue in Guinea Pigs. Infect. Immun. 66:2614-2618.
325. Sierra R, Munoz N, Pena AS, Biemond I, van Duijn W, Lamers CB, Teuchmann S, Hernandez S, Correa P. 1992. Antibodies to Helicobacter pylori and pepsinogen levels in children from Costa Rica: comparison of two areas with different risks for stomach cancer. Cancer Epidemiol Biomarkers Prev. 1:449-54.
326. Sipponen P. 1992. Helicobacter pylori infection--a common worldwide environmental risk factor for gastric cancer? Endoscopy. 24:424-7.
327. Sipponen P, Seppala K. 1992. Gastric carcinoma: failed adaptation to Helicobacter pylori. Scand J Gastroenterol Suppl. 193:33-8.
328. Skouloubris S, Labigne A, De Reuse H. 2001. The AmiE aliphatic amidase and AmiF formamidase of Helicobacter pylori: natural evolution of two enzyme paralogues. Mol Microbiol. 40:596-609.

329. Skouloubris S, Thiberge JM, Labigne A, De Reuse H. 1998. The Helicobacter pylori UreI protein is not involved in urease activity but is essential for bacterial survival in vivo. Infect Immun. 66:4517-21.
330. Slomiany BL, Piotrowski J, Slomiany A. 1998. Induction of caspase-3 and nitric oxide synthase-2 during gastric mucosal inflammatory reaction to Helicobacter pylori lipopolysaccharide. Biochemistry & Molecular Biology International. 46:1063-70.
331. Slomiany BL, Slomiany A. 2002. Disruption in gastric mucin synthesis by Helicobacter pylori lipopolysaccharide involves ERK and p38 mitogen-activated protein kinase participation. Biochem Biophys Res Commun. 294:220-4.
332. Smythies LE, Chen J, Lindsey JR, Ghiara P, Smith PD, Waites KB. 2000. Quantitative analysis of Helicobacter pylori infection in a mouse model. Journal of Immunological Methods. 242:67-78.
333. Sobala GM, Schorah CJ, Shires S, Lynch AF, Gallacher B, Dixon MF, Axon ATR. 1993. Effect of eradication of Helicobacter pylori on gastric juice ascorbic acid concentrations. Gut. 34:1038-1041.
334. Sommer F, Faller G, Rollinghoff M, Kirchner T, Mak TW, Lohoff M. 2001. Lack of gastritis and of an adaptive immune response in interferon regulatory factor-1-deficient mice infected with Helicobacter pylori. Eur J Immunol. 31:396-402.
335. Sozzi M, Crosatti M, Kim S-K, Romero J, Blaser MJ. 2001. Heterogeneity of Helicobacter pylori cag genotypes in experimentally infected mice. FEMS Microbiol Lett. 203:109-114.
336. Spohn G, Scarlato V. 1999. The autoregulatory HspR repressor protein governs chaperone gene transcription in Helicobacter pylori. Mol Microbiol. 34:663-74.
337. Spohn G, Scarlato V. 2001. Motility, Chemotaxis and Flagella. In: Mobley HL, Mendz GL, Hazell SL (eds) Helicobacter pylori physiology and genetics. ASM Press, Washington, DC:239-248

338. Spohn G, Scarlato V. 1999. Motility of Helicobacter pylori is coordinately regulated by the transcriptional activator FlgR, an NtrC homolog. J Bacteriol. 181:593-9.
339. Stadelmann K, Elster K, Stolte M, Miederer SE, Deyhle P, Demling L, Siegenthaler W. 1971. The peptic gastric ulcer-histotopographic and functional investigations. Scand J Gastroenterol. 6:613-623.
340. Stein M, Bagnoli F, Halenbeck R, Rappuoli R, Fantl WJ, Covacci A. 2002. c-Src/Lyn kinases activate Helicobacter pylori CagA through tyrosine phosphorylation of the EPIYA motifs. Mol Microbiol. 43:971-80.
341. Stein M, Rappuoli R, Covacci A. 2001. The cag Pathogenicity Island. In: Mobley HL, Mendz GL, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, DC:345-353
342. Stein M, Rappuoli R, Covacci A. 2000. Tyrosine phosphorylation of the Helicobacter pylori CagA antigen after cag-driven host cell translocation. Proc Natl Acad Sci USA. 97:1263-8.
343. Stolte M, Eidt S, Ohnsmann A. 1990. Differences in Helicobacter pylori associated gastritis in the antrum and body of the stomach. Z Gastroenterol. 28:229-33.
344. Sturegård E, Sjunnesson H, Nilsson HO, Andersson R, Areskoug C, Wadström T. 2001. Infection with cagA- and vacA-positive and -negative strains of Helicobacter pylori in a mouse model. FEMS Immunol Med Microbiol. 30:115-120.
345. Sutton P. 2001. Progress in vaccination against Helicobacter pylori. Vaccine. 19:2286-90.
346. Sutton P, Danon SJ, Walker M, Thompson LJ, Wilson J, Kosaka T, Lee A. 2001. Post-immunisation gastritis and Helicobacter infection in the mouse: a long term study. Gut. 49:467-73.
347. Sutton P, Lee A. 2000. Review article: Helicobacter pylori vaccines-the current status. Aliment Pharmacol Ther. 14:1107-18.
348. Suzuki K, Kokai Y, Sawada N, Takakuwa R, Kuwahara K, Isogai E, Isogai H, Mori M. 2002. SS1 Helicobacter pylori disrupts the paracellular barrier of the gastric mucosa and leads to neutrophilic gastritis in mice. Virchows Archiv An International Journal of Pathology. 440:318-324.
349. Takami S, Hayashi T, Tonokatsu Y, Shimoyama T, Tamura T. 1993. Chromosomal heterogeneity of Helicobacter pylori isolates by pulsed-field gel electrophoresis. Zentralblatt fur Bakteriologie. 280:120-7.
350. Takashima M, Furuta T, Hanai H, Sugimura H, Kaneko E. 2001. Effects of Helicobacter pylori infection on gastric acid secretion and serum gastrin levels in Mongolian gerbils. Gut. 48:765-73.
351. Takata T, El-Omar E, Camorlinga M, Thompson SA, Minohara Y, Ernst PB, Blaser MJ. 2002. Helicobacter pylori Does Not Require Lewis X or Lewis Y Expression To Colonize C3H/HeJ mice. Infect. Immun. 70:3073-3079.
352. Talaat AM, Hunter P, Johnston SA. 2000. Genome-directed primers for selective labeling of bacterial transcripts for DNA microarray analysis. Nat Biotechnol. 18:679-82.
353. Tanaka TS, Jaradat SA, Lim MK, Kargul GJ, Wang X, Grahovac MJ, Pantano S, Sano Y, Piao Y, Nagaraja R, Doi H, Wood WH, III, Becker KG, Ko MSH. 2000. Genome-wide expression profiling of mid-gestation placenta and embryo using a 15,000 mouse developmental cDNA microarray. PNAS. 97:9127-9132.
354. Taylor DN, Blaser MJ. 1991. The epidemiology of Helicobacter pylori infection. Epidemiol Rev. 13:42-59.
355. Taylor DN, Parsonnet J. 1995. Epidemiology and natural history of H. pylori infections. In: Blaser MJ, Smith PF, Ravdin J, Greenberg H, Guerrant RL (eds) Infections of the gastrointestinal tract. Raven Press, New York City:551-564
356. Tekamp-Olson P, Gallegos C, Bauer D, McClain J, Sherry B, Fabre M, van Deventer S, Cerami A. 1990. Cloning and characterization of cDNAs for murine macrophage inflammatory protein 2 and its human homologues. J. Exp. Med. 172:911-919.

357. Tesfaigzi Y, Fischer MJ, Daheshia M, Green FH, De Sanctis GT, Wilder JA. 2002. Bax is crucial for IFN-gamma-induced resolution of allergen-induced mucus cell metaplasia. J Immunol. 169:5919-25.
358. Testerman TL, McGee DJ, Mobley HLT. 2001. Adherence and Colonization. In: Mobley HLT, Mendz GL, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, D. C.:381-417
359. Thalmaier U, Lehn N, Pfeffer K, Stolte M, Vieth M, Schneider-Brachert W. 2002. Role of Tumor Necrosis Factor Alpha in Helicobacter pylori Gastritis in Tumor Necrosis Factor Receptor 1-Deficient Mice. Infect. Immun. 70:3149-3155.
360. Thoreson AC, Hamlet A, Celik J, Bystrom M, Nystrom S, Olbe L, Svennerholm AM. 2000. Differences in surface-exposed antigen expression between Helicobacter pylori strains isolated from duodenal ulcer patients and from asymptomatic subjects. J Clin Microbiol. 38:3436-41.
361. Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzegerald LM, Lee N, Adams MD, Venter JC. 1997. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 388:539-47.
362. Trinchieri G. 1995. Interleukin-12: a proinflammatory cytokine with immunoregulatory functions that bridge innate resistance and antigen-specific adaptive immunity. Annu Rev Immunol. 13:251-76.
363. Trougakos IP, Gonos ES. 2002. Clusterin/apolipoprotein J in human aging and cancer. Int J Biochem Cell Biol. 34:1430-48.
364. Tusher VG, Tibshirani R, Chu G. 2001. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA. 98:5116-21.
365. Valnes K, Huitfeldt HS, Brandtzaeg P. 1990. Relation between T cell number and epithelial HLA class II expression quantified by image analysis in normal and inflamed human gastric mucosa. Gut. 31:647-52.

366. Van de Craen M, Vandenabeele P, Declercq W, Van den Brande I, Van Loo G, Molemans F, Schotte P, Van Criekinge W, Beyaert R, Fiers W. 1997. Characterization of seven murine caspase family members. FEBS Lett. 403:61-9.
367. Van den Brink GR, Tytgat KM, Van der Hulst RW, Van der Loos CM, Einerhand AW, Buller HA, Dekker J. 2000. H. pylori colocalises with MUC5AC in the human stomach. Gut. 46:601-7.
368. van der Wouden EJ, Thijs JC, Kusters JG, van Zwet AA, Kleibeuker JH. 2001. Mechanism and clinical significance of metronidazole resistance in Helicobacter pylori. Scand J Gastroenterol Suppl. 10-4.
369. van Doorn LJ, Figueiredo C, Sanna R, Pena S, Midolo P, Ng EK, Atherton JC, Blaser MJ, Quint WG. 1998. Expanding allelic diversity of Helicobacter pylori vacA. J. Clin. Microbiol. 36:2597-2603.
370. van Doorn NE, Namavar F, Sparrius M, Stoof J, van Rees EP, van Doorn LJ, Vandenbroucke-Grauls CM. 1999. Helicobacter pylori-associated gastritis in mice is host and strain specific. Infect Immun. 67:3040-6.
371. van Vliet AH, Kuipers EJ, Waidner B, Davies BJ, de Vries N, Penn CW, Vandenbroucke-Grauls CM, Kist M, Bereswill S, Kusters JG. 2001. Nickel-responsive induction of urease expression in Helicobacter pylori is mediated at the transcriptional level. Infection & Immunity. 69:4891-7.
372. van Vliet AHM, Stoof J, Vlasblom R, Wainwright SA, Hughes NJ, Kelly DJ, Bereswill S, Bijlsma JJE, Hoogenboezem T, Vandenbroucke-Grauls C, Kist M, Kuipers EJ, Kusters JG. 2002. The role of the ferric uptake regulator (Fur) in regulation of Helicobacter pylori iron uptake. Helicobacter. 7:237-244.
373. Van Zanten SJ, Dixon MF, Lee A. 1999. The gastric transitional zones: neglected links between gastroduodenal pathology and helicobacter ecology. Gastroenterology. 116:1217-29.
374. Wada S, Matsuda M, Kikuchi M, Kodama T, Takei I, Ogawa S, Takahashi S, Shingaki M, Itoh T. 1994. Genome DNA analysis and genotyping of clinical isolates of Helicobacter pylori. Cytobios. 80:109-16.

375. Waidner B, Greiner S, Odenbreit S, Kavermann H, Velayudhan J, Stahler F, Guhl J, Bisse E, van Vliet AHM, Andrews SC, Kusters JG, Kelly DJ, Haas R, Kist M, Bereswill S. 2002. Essential role of ferritin Pfr in Helicobacter pylori iron metabolism and gastric colonization. Infection & Immunity. 70:3923-3929.
376. Wang J, Blanchard TG, Ernst PB. 2001. Host Inflammatory Response to Infection. In: Mobley HLT, Mendz G, Hazell SL (eds) Helicobacter pylori: Physiology and Genetics. ASM Press, Washington, D.C.:471-480
377. Wang J, Chi DS, Kalin GB, Sosinski C, Miller LE, Burja I, Thomas E. 2002. Helicobacter pylori infection and oncogene expressions in gastric carcinoma and its precursor lesions. Digestive Diseases & Sciences. 47:107-113.
378. Wang MC, Furuta T, Takashima M, Futami H, Shirai N, Hanai H, Kaneko E. 1999. Relation between interleukin-1 beta messenger RNA in gastric fundic mucosa and gastric juice pH in patients infected with Helicobacter pylori. Journal of Gastroenterology. 34:10-17.
379. Wang X, Willén R, Wadstrom T, Aleljung P. 1998. RAPD-PCR, histopathological and serological analysis of four mouse strains infected with multiple strains of Helicobacter pylori. Microbial Ecology in Health and Disease. 10:148-154.
380. Watanabe T, Tada M, Nagai H, Sasaki S, Nakao M. 1998. Helicobacter pylori infection induces gastric cancer in Mongolian gerbils. Gastroenterology. 115:642-648.
381. Webb GF, Blaser MJ. 2002. Dynamics of bacterial phenotype selection in a colonized host. PNAS. 99:3135-3140.
382. Weeks DL, Eskandari S, Scott DR, Sachs G. 2000. A H+-gated urea channel: the link between Helicobacter pylori urease and gastric colonization. Science. 287:482-485.
383. Wei Y, Lee JM, Richmond C, Blattner FR, Rafalski JA, LaRossa RA. 2001. High-density microarray-mediated gene expression profiling of Escherichia coli. J Bacteriol. 183:545-56.

384. Wilson M, DeRisi J, Kristensen HH, Imboden P, Rane S, Brown PO, Schoolnik GK. 1999. Exploring drug-induced alterations in gene expression in Mycobacterium tuberculosis by microarray hybridization. Proc Natl Acad Sci USA. 96:12833-8.
385. Wong KK, McClelland M. 1992. A BlnI restriction map of the Salmonella typhimurium LT2 genome. J Bacteriol. 174:1656-61.
386. Wood H, Feldman M. 1997. Helicobacter pylori and iron deficiency. JAMA. 277:1166-7.
387. Worku ML, Sidebotham RL, Walker MM, Keshavarz T, Karim QN. 1999. The relationship between Helicobacter pylori motility, morphology and phase of growth: implications for gastric colonization and pathology. Microbiology. 145:2803-2811.
388. Worst DJ, Maaskant J, Vandenbroucke-Grauls CM, Kusters JG. 1999. Multiple haem-utilization loci in Helicobacter pylori. Microbiology. 145:681-8.
389. Worst DJ, Otto BR, de Graaff J. 1995. Iron-repressible outer membrane proteins of Helicobacter pylori involved in heme uptake. Infect Immun. 63:4161-5.
390. Wotherspoon AC, Doglioni C, Diss TC, Pan L, Moschini A, de Boni M, Isaacson PG. 1993. Regression of primary low-grade B-cell gastric lymphoma of mucosa-associated lymphoid tissue type after eradication of Helicobacter pylori. Lancet. 342:575-7.
391. Wyatt JI, Rathbone BJ, Dixon MF, Heatley RV. 1987. Campylobacter pyloridis and acid-induced gastric metaplasia in the pathogenesis of duodenitis. J. Clin. Pathol. 40:841-848.
392. Xiang Z, Censini S, Bayeli PF, Telford JL, Figura N, Rappuoli R, Covacci A. 1995. Analysis of expression of CagA and VacA virulence factors in 43 strains of Helicobacter pylori reveals that clinical isolates can be divided into two major types and that CagA is not necessary for expression of the vacuolating cytotoxin. Infection & Immunity. 63:94-8.
393. Yamamoto S, Kaneko H, Konagaya T, Mori S, Kotera H, Hayakawa T, Yamaguchi C, Uruma M, Kusugami K, Mitsuma T. 2001. Interactions among gastric somatostatin, interleukin-8 and mucosal inflammation in Helicobacter pylori-positive peptic ulcer patients. Helicobacter. 6:136-45.
394. Yamazaki S, Yamakawa A, Ito Y, Ohtani M, Higashi H, Hatakeyama M, Azuma T. 2003. The CagA Protein of Helicobacter pylori Is Translocated into Epithelial Cells and Binds to SHP-2 in Human Gastric Mucosa. J Infect Dis. 187:334-7.
395. Ye RW, Tao W, Bedzyk L, Young T, Chen M, Li L. 2000. Global gene expression profiles of Bacillus subtilis grown under anaerobic conditions. J Bacteriol. 182:4458-65.
396. Yoshida K, Kobayashi K, Miwa Y, Kang CM, Matsunaga M, Yamaguchi H, Tojo S, Yamamoto M, Nishi R, Ogasawara N, Nakayama T, Fujita Y. 2001. Combined transcriptome and proteome analysis as a powerful approach to study genes under glucose repression in Bacillus subtilis. Nucleic Acids Res. 29:683-92.
397. Zhang J-G, Farley A, Nicholson SE, Willson TA, Zugaro LM, Simpson RJ, Moritz RL, Cary D, Richardson R, Hausmann G, Kile BJ, Kent SBH, Alexander WS, Metcalf D, Hilton DJ, Nicola NA, Baca M. 1999. The conserved SOCS box motif in suppressors of cytokine signaling binds to elongins B and C and may couple bound proteins to proteasomal degradation. PNAS. 96:2071-2076.