Imperial College London
Department of Life Sciences
Centre for Molecular Bacteriology and Infection

Investigating the Type IV Pili of Clostridium difficile and Clostridium sordellii

Edward Charles Couchman

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

July 2016

The research disclosed in this thesis has been conducted by the author. All work not performed by the author is appropriately referenced, acknowledged or attributed.

A number of figures initially published in academic journals have been re-used in this work. In all instances, these figures have been used with the permission of the copyright-holder, and where relevant the necessary licenses have been obtained. These licenses are presented in an annex at the end of this thesis.

The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or transmit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work.

Abstract

Type IV pili (T4P) are the only type of bacterial pili known to be produced by both Gram-negative and Gram-positive organisms. Though the main pilus shaft consists primarily of only one protein (the major pilin), T4P are unusual in their complexity, requiring multiple (10 or more) different protein components for assembly. Like most types of pili, T4P often function as virulence factors. In particular, T4P frequently operate as adhesins, enabling the bacteria on which they are present to stick to each other (for example, to form a biofilm) or to adhere directly to host cells. Many T4P systems are able to retract, in which case the T4P may mediate flagella-independent motility. Most research into T4P has historically been performed on Gram-negative organisms, with T4P-encoding genes only being identified in Gram-positive organisms more recently. In particular, all sequenced species of the genus Clostridium are known to encode T4P, but only minimal investigation of these systems has been performed to date. In this study, the T4P of Clostridium difficile were investigated. C. difficile is an important pathogen, being the leading cause of antibiotic-associated diarrhoea in the developed world and thus a considerable burden on Western healthcare systems. By investigating the T4P of this species it was hoped to further elucidate its mechanisms of pathogenicity. Data are presented demonstrating the control of T4P expression by cyclic-di-GMP, and identifying which genes are essential for T4P production in C. difficile. Additionally, a genomic analysis of the related pathogen Clostridium sordellii was performed, using the first high quality genome sequence produced for this species. Genes encoding T4P were identified, analysed and investigated. Furthermore, plasmids carrying the genes encoding the species’ key virulence factors (Lethal Toxin, TcsL, and in some cases Haemorrhagic Toxin, TcsH) were identified. These plasmids appear to be unstable, a fact with significant implications for diagnosis of C. sordellii disease.

Acknowledgements

Firstly I extend my thanks to my supervisor, Neil Fairweather, for accepting me into his lab, providing me with two fascinating projects, and for all his help and support throughout my PhD. I was always grateful for his calm and relaxed attitude to life, particularly while I was attempting to plan a wedding in the middle of my project! I would also like to thank him for all his help in putting this thesis together, and for his patience over the many months it took me to write it.

Thank you also to all those members and former members of the Fairweather lab who helped me throughout my time there. To Paula Salgado, for all her contributions to the pili project from up in Newcastle; to Marcin Dembek, for always knowing everything, particularly his incredible troubleshooting skills and amazing protocols; to Allie, Steph, Tom, Johann, Ana, Mandy and Jo, for all your help, suggestions and friendship over the years; and finally of course to Emma, my only lab-buddy for my final six months, for your company, and the entertainment you always seemed to provide!

I would also like to thank those from outside the Fairweather group who made significant contributions to my project. My thanks go to Hilary Browne and Matt Dunn of the Wellcome Trust Sanger Institute, for all their work in assembling, annotating and optically mapping the C. sordellii genomes. I would also like to thank all those who helped me with electron microscopy: Morgan Beeby, at Imperial; Gill Douce, Tony Buckley and Margaret Mullin at Glasgow University; and most especially Maria McCrossan at the UCL EM Unit at the Royal Free Hospital.

My thanks go to my housemates, with whom I lived for the first two years of my PhD: Ed, Nat, Jens and Rachel. Thanks for being awesome, and for being the perfect people to moan about science with!

Finally, thank you to my family: to my parents and step-parents for generally being brilliant, to my siblings for all the banter, and most of all to my lovely wife, Ros, for all her help and support, and of course for putting up with me spending almost every weekend for a year working on this thesis!

Contents

Chapter 1: Introduction
  1.1 Types of Pili
    1.1.1 Gram-Negative Pili
    1.1.2 Gram-Positive Pili
  1.2 Type IV Pili
    1.2.1 Type IV Pili Components
  1.3 Assembly of Type IV Pili
    1.3.1 Outside-In Model
    1.3.2 Mutagenesis Studies
  1.4 Related Structures
    1.4.1 Type II Secretion System
    1.4.2 The Archaellum
    1.4.3 Competence System
  1.5 Gram-Positive Type IV Pili
  1.6 Functions of Type IV Pili
    1.6.1 Adhesion to Host Cells
    1.6.2 Bacterial Aggregation
    1.6.3 Twitching Motility
    1.6.4 Other Functions
  1.7 Regulation of Type IV Pili Expression
  1.8 Clostridium difficile
    1.8.1 Sporulation and Germination
    1.8.2 Virulence and Infection
    1.8.3 Experimental Tools for C. difficile Research
    1.8.4 Type IV Pili of C. difficile
  1.9 Clostridium sordellii
    1.9.1 C. sordellii Disease
    1.9.2 C. sordellii Virulence
  1.10 Aims of the Project

Chapter 2: Materials and Methods
  2.1 Bacterial Strains and Growth Conditions
    2.1.1 Bacterial Strains Used
    2.1.2 Culturing Escherichia coli
    2.1.3 Culturing Clostridium difficile
    2.1.4 Culturing Clostridium sordellii
    2.1.5 Storage of Strains
    2.1.6 Production of Chemically Competent E. coli
    2.1.7 Transformation of E. coli
    2.1.8 Conjugative Transfer of Plasmid DNA into …
  2.2 Bioinformatics
    2.2.1 DNA Sequence Visualisation
    2.2.2 DNA Sequencing
    2.2.3 DNA/RNA, Gene and Protein Analysis
  2.3 DNA Manipulation
    2.3.1 Genomic DNA Extraction
    2.3.2 Purification of Plasmid DNA
    2.3.3 Polymerase Chain Reaction (PCR) Primers
    2.3.4 High Fidelity PCR
    2.3.5 Colony/Low Fidelity PCR
    2.3.6 Agarose Gel Electrophoresis
    2.3.7 Purification of PCR Products
    2.3.8 Restriction Digests
    2.3.9 Extraction of DNA from Agarose Gels
    2.3.10 DNA Ligation
    2.3.11 Gibson Assembly
    2.3.12 Sequencing of Plasmids
    2.3.13 ClosTron Mutagenesis
    2.3.14 Southern Blotting
    2.3.15 Allele Exchange Mutagenesis
    2.3.16 Vector Construction
  2.4 RNA Manipulation
    2.4.1 RNA Extraction
    2.4.2 cDNA Synthesis
    2.4.3 RT-PCR
    2.4.4 qRT-PCR
  2.5 Protein Manipulation
    2.5.1 Protein Expression in E. coli
    2.5.2 Protein Solubility Tests from E. coli
    2.5.3 Protein Purification from E. coli
    2.5.4 Protein Concentration Quantitation
    2.5.5 Protein Expression in C. difficile
    2.5.6 Protein Expression in C. difficile
    2.5.7 Tri-Chloroacetic Acid Precipitation of Protein
  2.6 SDS-PAGE and Immunoblotting
    2.6.1 Cell Lysate Preparation
    2.6.2 SDS-PAGE
    2.6.3 Coomassie Staining
    2.6.4 Western Blotting
  2.7 Microscopy
    2.7.1 Phase Contrast Microscopy
    2.7.2 Fluorescence Microscopy
    2.7.3 Transmission Electron Microscopy (TEM)
  2.8 Phenotypic Analyses
    2.8.1 Growth Curves
    2.8.2 Cell Morphology Comparisons
    2.8.3 Colony Morphology Comparisons
    2.8.4 Sporulation Assay
    2.8.5 Aggregation Assays
    2.8.6 Biofilm Assays
    2.8.7 Twitching Motility Assays
  2.9 Protein Interaction Studies
    2.9.1 Bacterial-Two-Hybrid
    2.9.2 Co-Purification

Chapter 3: Expression of Type IV Pili in C. difficile
  3.1 Introduction
    3.1.1 Medical Relevance of Type IV Pili
    3.1.2 C. difficile Type IV Pili
    3.1.3 Aims
  3.2 Results
    3.2.1 Analysis of the C. difficile Type IV Pili Gene Loci
    3.2.2 Conservation of Type IV Pili Genes Within C. difficile
    3.2.3 Anti-Pilin Antibodies
    3.2.4 Identifying Conditions for Type IV Pili Production
    3.2.5 The Effects of Cyclic-di-GMP on Growth of C. difficile
    3.2.6 Optimisation of Conditions for T4P Production
    3.2.7 Production of Type IV Pili by C. difficile 630
  3.3 Discussion

Chapter 4: Dissecting the Primary Type IV Pilus Gene Cluster
  4.1 Introduction
  4.2 Results
    4.2.1 Transcription of pilB1, as well as pilA1, is Up-Regulated in Response to Elevated Levels of Cyclic-di-GMP
    4.2.2 Induction of dccA Expression with a Gradient of Increasing Atc Concentrations does not Result in a Gradient of pilA1 Transcription
    4.2.3 A Transcriptional Terminator Exists Between pilA1 and pilB1
    4.2.4 An Operon Runs from pilB1 to prsA
    4.2.5 Transcription of the Secondary Type IV Pilus Gene Cluster Appears to be Down-Regulated by c-di-GMP
    4.2.6 The Minor Pilins PilV, PilU and PilK are Essential for Type IV Pilus Biogenesis
    4.2.7 Investigating the Function of PilU, PilV and PilK
    4.2.8 The Pre-Pilin Peptidase PilD1 Processes PilA1
    4.2.9 PilB1 is Essential for Type IV Pilus Biosynthesis from PilA1
    4.2.10 The Inner Membrane Proteins are Essential for Type IV Pilus Biosynthesis
    4.2.11 The mfd Gene is not Necessary for Type IV Pilus Biogenesis
  4.3 Discussion
    4.3.1 The Cdi2_4 Riboswitch
    4.3.2 The pilA1 Terminator
    4.3.3 The Primary T4P Operon
    4.3.4 Systematic Mutagenesis of the Primary Type IV Pilus Operon
    4.3.5 Functional Separation of the Two C. difficile Type IV Pilus Gene Clusters

Chapter 5: Functions of the Primary Type IV Pilus Cluster
  5.1 Introduction
  5.2 Results
    5.2.1 Effect of Type IV Pilus Expression on the Colony Morphology of Strain 630
    5.2.2 Twitching Motility in Strain 630
    5.2.3 Colony Morphology and Twitching Motility in Strain 630
    5.2.4 Role of Type IV Pili in R20291 Motility
    5.2.5 Type IV Pilus-Mediated Aggregation
    5.2.6 Type IV Pilus-Mediated Biofilm Formation
  5.3 Discussion

Chapter 6: Investigations of Clostridium sordellii Type IV Pili and its Genome
  6.1 Introduction
  6.2 Results
    6.2.1 The Strain Collection Divides into Four Clades
    6.2.2 The Type IV Pili Genes of Clostridium sordellii ATCC9714
    6.2.3 Conservation of Type IV Pili Genes Across C. sordellii
    6.2.4 Choosing a C. sordellii Strain for Type IV Pili Investigations
    6.2.5 Cross-Reactivity of W3025 Pilin Proteins with α-C. difficile Pilin Antibodies
    6.2.6 Does a Riboswitch also Regulate Expression of the C. sordellii Primary Type IV Pilus Cluster?
    6.2.7 Optimisation of Protein Expression in C. sordellii W3025
    6.2.8 Effect of Cyclic-di-GMP on Transcription of the Primary Type IV Pilus Locus
    6.2.9 A Minority of Strains from the Collection Encode Cytotoxins
    6.2.10 The LCC Genes are Localised Within a Pathogenicity Locus
    6.2.11 The LCC Genes are Located on a Plasmid
    6.2.12 Analysis of the ATCC9714 Genome
  6.3 Discussion
    6.3.1 Importance of the C. sordellii Genomic Analysis
    6.3.2 The Role of the LCCs in Disease
    6.3.3 Evolution of the LCC Genes
    6.3.4 The C. sordellii Type IV Pili Genes

Chapter 7: Discussion
  7.1 Future Work on the Primary Type IV Pilus Cluster
  7.2 Are C. difficile Type IV Pili “T4aP” or “T4bP”?
  7.3 C. difficile as a Model for Investigation of Gram-Positive Type IV Pili
  7.4 Conclusions in Relation to C. sordellii and Future Work

Bibliography

Appendices
  Appendix 1
  Appendix 2

Figures

Figure 1.1 Schematic Diagram Showing the Overall Structure of the Type 1 and P Chaperone-Usher Pili from UPEC
Figure 1.2 Model of the Streptococcus pyogenes M1 Pilus
Figure 1.3 Type IV Pili of Neisseria meningitidis
Figure 1.4 Structure of the Neisseria gonorrhoeae Major Pilin PilE
Figure 1.5 A Model of a C. difficile T4P
Figure 1.6 Structure of a PilT Hexamer from Aquifex aeolicus
Figure 1.7 Schematic Diagram Showing the Structure of a Gram-Negative Type IV Pilus
Figure 1.8 Schematic Diagram of a Hypothetical Structure of the Secretin/Alignment Complex
Figure 1.9 Structures of the T2SS and a Gram-Negative T4P
Figure 1.10 Comparison of the Structures of a Gram-Negative T4P and the Archaellum
Figure 1.11 The Gram-Positive Competence System
Figure 1.12 Type IV Pili Genes of Clostridium perfringens
Figure 1.13 TEM Images of C. difficile Spores
Figure 1.14 The C. difficile PaLoc
Figure 1.15 The C. difficile T4P Gene Clusters, as Reported in (Varga et al., 2006)
Figure 3.1 In Silico Analysis of the Product of the C. difficile 630 Primary T4P Cluster “pilM” Gene
Figure 3.2 The C. difficile 630 Primary T4P Gene Cluster
Figure 3.3 The Secondary T4P Gene Cluster from C. difficile 630
Figure 3.4 Purification of His-Tagged PilA2Δ1-33 from E. coli Rosetta (pECC38)
Figure 3.5 Western Blots Showing the Specificity of the Five α-C. difficile Pilin Antibodies
Figure 3.6 Immuno-Detection of Pilins Expressed Exogenously in C. difficile
Figure 3.7 PilA1 Production by C. difficile when Grown on Solid Media
Figure 3.8 C-di-GMP-Induced PilA1 Expression by C. difficile
Figure 3.9 Effect of C-di-GMP on C. difficile Growth
Figure 3.10 Effect of C-di-GMP on C. difficile Cell Morphology
Figure 3.11 PilA1 Production Following Atc-Induced Expression of DccA
Figure 3.12 Effect of Increasing Expression of DccA on Cell Morphologies of C. difficile 630
Figure 3.13 TEM Visualisation of C. difficile Type IV Pili
Figure 3.14 Identification of 630ΔpilA1 Mutant
Figure 3.15 Complementation of 630ΔpilA1 Mutant
Figure 3.16 EM Images Showing Complementation of 630ΔpilA1 Mutant
Figure 4.1 The C. difficile Primary T4P Gene Cluster
Figure 4.2 RNA Extraction and cDNA Synthesis from C. difficile 630 (pASF85) and 630 (pECC17)
Figure 4.3 C-di-GMP Regulation of Transcription of pilA1 and pilB1
Figure 4.4 Standard Curves and Melt Curves of the qPCR Reactions Analysed in Figure 4.3
Figure 4.5 Confirmation of Purity of RNA Extracted from C. difficile 630 (pECC17) Cultures
Figure 4.6 Confirmation of Successful cDNA Synthesis from the RNA Samples in Figure 4.4
Figure 4.7 RNA Extraction and cDNA Synthesis from C. difficile 630 (pMTL960) and 630 (pECC12)
Figure 4.8 Effect of Diguanylate Cyclase Expression on Expression of pilA1
Figure 4.9 Standard Curves and Melt Curves of the qPCR Reactions Analysed in Figure 4.8
Figure 4.10 Identification of a Transcription Terminator Between pilA1 and pilB1
Figure 4.11 Colony PCR Screening Showing Successful Generation of 630ΔpilA1Terminator Mutant
Figure 4.12 RNA Extraction and cDNA Synthesis from C. difficile 630ΔpilA1Terminator (pASF85) and (pECC17) Cultures

Figure 4.13 Effect of DccA Expression on Expression of pilA1 and pilB1 in 630ΔpilA1Terminator
Figure 4.14 RT-PCR to Investigate Co-Transcription of Genes in the Primary T4P Gene Cluster
Figure 4.15 Effect of DccA on Primary T4P Operon Expression Excluding pilA1
Figure 4.16 Standard Curves and Melt Curves of qPCR Reactions on Genes in the Primary T4P Operon
Figure 4.17 Effect of C-di-GMP Concentration on pilA2 Transcription
Figure 4.18 Standard Curve and Melt Curves for the pilA2 qPCR Reactions Analysed in Figure 4.17
Figure 4.19 Screening of Putative pilU Mutants, Derived from pECC59 Single Cross-Over Integrants
Figure 4.20 Screening of Putative pilV Mutants, Derived from pECC60 Single Cross-Over Integrants
Figure 4.21 Screening of Putative pilK Mutants, Derived from pECC65 Single Cross-Over Integrants
Figure 4.22 Testing of Minor Pilin Mutants for the Ability to Synthesise T4P
Figure 4.23 Complementation of Minor Pilin Mutants for the Ability to Synthesise T4P
Figure 4.24 Examples of B2H Plates from B2H Analysis of Interactions Between the C. difficile 630 Pilin Proteins
Figure 4.25 Attempted Co-Purification of the Soluble Domains of C. difficile 630 PilK, PilV and PilU
Figure 4.26 Screening of Putative pilD1 Mutants, Derived from pECC70 Single Cross-Over Integrants
Figure 4.27 Example Screen of pECC69-Derived Double Cross-Overs
Figure 4.28 Testing of 630ΔpilD1 for the Ability to Synthesise T4P
Figure 4.29 Complementation of the pilD1 Mutant for the Ability to Synthesise T4P
Figure 4.30 Screening of Putative pilB1 Mutants, Derived from pECC62 Single Cross-Over Integrants
Figure 4.31 Screening of Putative pilB2 Mutants, Derived from pECC63 Single Cross-Over Integrants
Figure 4.32 Screening of Putative pilT Mutants, Derived from pECC64 Single Cross-Over Integrants
Figure 4.33 Testing of T4P ATPase Mutants for the Ability to Synthesise T4P
Figure 4.34 Complementation of the pilB1 Mutant with both pilB1 and pilB2
Figure 4.35 Colony PCR Screening Showing Deletion of pilC1, pilMN and pilO from C. difficile 630
Figure 4.36 Testing of Inner Membrane Protein Mutants for the Ability to Synthesise T4P
Figure 4.37 Complementation of the Inner Membrane Protein Mutants with Genes from Both the Primary and Secondary T4P Gene Clusters
Figure 4.38 Testing of mfd Mutant for the Ability to Synthesise T4P
Figure 4.39 Two Representations of the Predicted Structure of the Cdi2_4 Riboswitch
Figure 5.1 Effect of dccA Expression on the Colony Morphology of C. difficile 630
Figure 5.2 “Twitching Motility” in C. difficile Strain 630
Figure 5.3 Effect of dccA Expression on the Colony Morphology of C. difficile R20291
Figure 5.4 “Twitching Motility” in C. difficile Strain R20291
Figure 5.5 Screening of Putative R20291 pilB1 ClosTron Mutants
Figure 5.6 Southern Blots of R20291 ClosTron Mutants
Figure 5.7 R20291 pilB1 Mutant is Unable to Produce T4P
Figure 5.8 Colony Morphology and “Twitching Motility” of R20291 pilB1 Mutant
Figure 5.9 Cyclic-di-GMP-Mediated Aggregation in C. difficile
Figure 6.1 Phylogenetic Tree Showing Maximum Likelihood Phylogeny of the 44 Strains of C. sordellii in our Collection
Figure 6.2 The T4P Gene Clusters of C. sordellii ATCC9714
Figure 6.3 Comparison of the Genomic Loci of the Primary T4P Gene Clusters in C. sordellii ATCC9714 and C. difficile 630
Figure 6.4 Comparison of the Genomic Loci of the Secondary T4P Gene Clusters in C. sordellii ATCC9714 and C. difficile 630
Figure 6.5 Western Blots for C. sordellii Pilins Using Antibodies Raised Against C. difficile Pilins
Figure 6.6 Alignment of the 300 bp Regions Upstream of pilA1 from C. difficile 630 and pilA1A from C. sordellii W3025
Figure 6.7 Alignment of the 300 bp Regions Upstream of pilA1 from C. difficile 630 and pilA1B from C. sordellii W3025
Figure 6.8 A Rho-Independent Terminator Between C. sordellii W3025 pilA1B and pilB1
Figure 6.9 Schematic Diagram Showing the C. sordellii Type IV Pilus Primary Gene Cluster
Figure 6.10 C. sordellii W3025 Growth Curve in BHIS Medium
Figure 6.11 Growth Curves of C. sordellii W3025 with Varying Concentrations of Atc
Figure 6.12 Coomassie Stain Showing Cell Lysates of C. sordellii W3025
Figure 6.13 DccA Expression in C. sordellii W3025 (pECC17) Induced with Atc
Figure 6.14 Growth Curve Showing the Effect of DccA Expression on Growth of C. sordellii W3025
Figure 6.15 Cell Morphologies of C. sordellii W3025 (pASF85) and (pECC17)
Figure 6.16 16S rRNA PCRs Demonstrating RNA Isolation and cDNA Production
Figure 6.17 Standard Curves and Melting Curves for qPCR Reactions Shown in Figure 6.18
Figure 6.18 The Effect of DGC Expression on Expression of Genes from the Primary T4P Cluster of C. sordellii W3025
Figure 6.19 PCR Screen of C. sordellii Strains for tcsL Gene
Figure 6.20 PCR Screen of C. sordellii Strains for Full-Length tcsH Gene
Figure 6.21 PCR Screen of C. sordellii Strains for Fragments of the tcsH Gene
Figure 6.22 Comparison of the PaLoc from C. difficile 630 and C. sordellii ATCC9714
Figure 6.23 Comparison of the PaLoc and Surrounding Region from the Four pCS1-Type Plasmids Identified in C. sordellii Strains in our Collection
Figure 6.24 Phylogenetic Tree as Shown in Figure 6.1 with Additional Annotations

Tables

Table 2.1 Strains of C. difficile Used in This Study
Table 2.2 Strains of C. sordellii Used in This Study
Table 2.3 Recipe for CDMM
Table 2.4 Antibodies Used in Western Blotting
Table 3.1 C. difficile Genomes Analysed for T4P Gene Content
Table 3.2 Conservation of all 9 (Putative) Pilin Proteins from C. difficile 630
Table 6.1 Conservation of the Putative Type IV Pilin Genes from C. sordellii ATCC9714
Table A1 Plasmids Used in This Study
Table A2 Primers Used in This Study

Abbreviations

ANOVA     Analysis of Variance
APS       Ammonium Persulphate
Atc       Anhydrotetracycline
BCA       Bicinchoninic Acid
BD        Becton Dickinson
BFP       Bundle-Forming Pili
BHI       Brain-Heart Infusion
BHIS      Supplemented Brain Heart Infusion
BLAST     Basic Local Alignment Search Tool
BSA       Bovine Serum Albumin
B2H       Bacterial-2-Hybrid
C-di-GMP  Cyclic-di-GMP
CBA       Columbia Blood Agar
CDMM      Clostridium difficile Minimal Medium
C.F.U.    Colony-Forming Units
CU        Chaperone-Usher
CDT       Clostridium difficile Transferase
ChiRP     Chitin-Regulated Type IV Pilus
CRE       Catabolite Response Element
CSPG4     Chondroitin Sulphate Proteoglycan 4
CWB2      Cell Wall Binding Repeat 2
DGC       Diguanylate Cyclase
dsDNA     Double-Stranded DNA
EIA       Enzyme Immuno-Assay
EPEC      Enteropathogenic Escherichia coli
FAA       Fastidious Anaerobe Agar
FAB       Fastidious Anaerobe Broth
FC        5-Fluorocytosine
GSP       General Secretory Pathway
HBS       HEPES-Buffered Saline
HRP       Horseradish Peroxidase
LB        Luria Bertani
LCC       Large Clostridial Cytotoxin
LSR       Lipolysis-Stimulated Lipoprotein Receptor
MCS       Multiple Cloning Site
MLST      Multi-Locus Sequence Typing
MMLV-RT   Moloney Murine Leukaemia Virus Reverse Transcriptase
MTG       Mitotracker Green
NAAT      Nucleic Acid Amplification Test
NEB       New England Biolabs
ORF       Open Reading Frame
PBS       Phosphate-Buffered Saline
PaLoc     Pathogenicity Locus
PGE2      Prostaglandin E2
PLG       Phase-Lock Gel
PVDF      Polyvinylidene Fluoride
PVRL3     Poliovirus Receptor-Like 3
qPCR      Quantitative Reverse Transcriptase PCR
RBS       Ribosome Binding Site
RT        Room Temperature
RT-PCR    Reverse Transcriptase PCR
SDS       Sodium Dodecyl Sulphate
SDS-PAGE  SDS-Polyacrylamide Gel Electrophoresis
SLH       S-Layer Homology
S/N       Supernatant
TCA       Tri-Chloroacetic Acid
TCP       Toxin Co-Regulated Pilus
TEM       Transmission Electron Microscopy
TEMED     Tetramethylethylenediamine
TGS       Tris-Glycine-SDS
Tris      Tris-(hydroxymethyl)aminomethane
TPP       Thiamine Pyrophosphate
T2SS      Type II Secretion System
T3SS      Type III Secretion System
T4P       Type IV Pili
T4aP      Type IVa Pili
T4bP      Type IVb Pili
UPEC      Uropathogenic Escherichia coli
WCL       Whole Cell Lysate
WT        Wild-Type

1. Introduction

Bacterial pili (or fimbriae) are long, hair-like structures which protrude from the surface of many bacterial species. Indeed, pilus (the singular of pili) is the Latin word for “hair”. There are various types of bacterial pili (described below), but certain features are common to all or most. Firstly, the main pilus structure (sometimes known as the rod or shaft) is a polymer comprising up to a thousand or more protein subunits, known as pilins (Proft and Baker, 2009). Secondly, pili generally mediate adhesion, both between bacteria (in the course of biofilm formation) and between bacterial and eukaryotic cells (during host colonisation) (Proft and Baker, 2009). Pili are vitally important for these functions as they allow bacteria to adhere to other cells (whether fellow bacteria or those of a host) without bringing the two cell membranes close together, which is problematic because the net negative charge of both membranes causes them to repel each other (Proft and Baker, 2009). For these reasons, pili frequently act as virulence factors for pathogenic species, and are therefore objects of major research interest. Bacterial pili were first identified in 1949 (Anderson, 1949), and over the next few years pili were identified in many bacterial species, including Escherichia coli, Klebsiella sp. and Salmonella sp. (Brinton, 1959). Since then, several types of pili have been discovered across various groups of species.

1.1 Types of Pili

1.1.1 Gram-Negative Pili

Two types of pili exclusive to Gram-negative bacteria are known: Chaperone-Usher pili and Curli pili. In both these types of pili, the pilin subunits which polymerise to form the shaft are non-covalently joined (Sauer et al., 2000).

Chaperone-Usher (CU) Pili

CU pili are relatively widespread across Gram-negative species, the best studied examples being the ‘Type 1 Pilus’ and ‘P Pilus’ of uropathogenic E. coli (UPEC) (Busch et al., 2015). All components of CU pili are first translocated through the inner bacterial membrane into the periplasm by the Sec machinery (Busch et al., 2015). These components include pilins, a ‘Chaperone’ protein and an ‘Usher’ protein. Once translocated, the usher protein is embedded (by the β-Barrel Assembly Machinery Complex) in the outer membrane, where it forms a β-barrel structure and functions as a platform for pilus assembly (Palomino et al., 2011). In Type 1 Pili, the usher protein is FimD; in P Pili it is PapC (see Figure 1.1).

Periplasmic chaperone proteins (FimC and PapD in Type 1 and P Pili respectively) bind to pilin proteins upon their arrival in the periplasm, promoting correct folding of the pilins and targeting them to the usher protein (Ng et al., 2004; Puorger et al., 2011). Being a β-barrel protein, the usher contains a central pore, through which pilins are translocated, whereupon each pilin is incorporated into the base of the growing pilus (Remaut et al., 2008). The pilins bind each other non-covalently by hydrophobic interactions. Specifically, each pilin subunit contains a groove, into which a β-strand of the following pilin is inserted, linking them together in a process known as ‘Donor Strand Complementation’ (Sauer et al., 1999). The major pilin of the Type 1 Pilus is FimA, and that of the P Pilus is PapA. These major pilins form the majority of the body of the pili, but the tips of the pili are formed by specific tip subunits (FimH and PapG in Type 1 and P Pili respectively), which contain lectin domains and function as adhesins (Busch et al., 2015). These tip adhesin subunits prime the usher proteins for pilus assembly, though the mechanism is not well understood. Immediately following the tip adhesin, other tip-specific subunits are incorporated prior to formation of the main pilus body by the major pilin (Busch et al., 2015) (see Figure 1.1). Once an appropriate length is reached, pilus biogenesis is terminated (how this length is determined or detected is unknown). In P Pili, termination is achieved by incorporation of the ‘termination pilin’ PapH (Verger et al., 2006); no equivalent subunit is known in Type 1 Pili, so how termination is achieved in that system is unclear.

Figure 1.1. Schematic Diagram Showing the Overall Structure of the Type 1 and P Chaperone-Usher Pili from UPEC. Green subunits are the tip adhesin pilins; yellow, red and purple are other minor pilins which constitute the tips of the pili. Light blue indicates the major pilins, dark blue the usher proteins (embedded in the cream-coloured outer membrane), and yellow the chaperones. Brown indicates a termination pilin. All subunits are appropriately labelled. P, N, C1 and C2 indicate domains of the usher protein. The details of these domains are not discussed here. From (Busch et al., 2015); image used with permission.

As indicated by the presence of the lectin adhesin tips, the primary function of CU pili is in adhesion to host cells. UPEC is a major cause of recurrent urinary tract infections, and its CU pili are important virulence factors: P Pili are able to bind epithelial kidney cells, and Type 1 Pili epithelial bladder and kidney cells (Korea et al., 2011). UPEC strains deficient in Type 1 or P Pili are significantly less virulent than strains producing functional pili of these types (Busch et al., 2015), demonstrating the important role that pili generally (and CU pili in particular) can play in pathogenesis.

Curli Pili

Curli pili are neither as widespread nor as well understood as CU pili. They consist of amyloid fibres which are found on various enteric bacteria (Proft and Baker, 2009). The pili are formed from a major pilin called ‘curlin’, or CsgA, which, following secretion across the double membrane, is polymerised via a ‘nucleation precipitation pathway’. This nucleation is mediated by the nucleator protein CsgB, and involves the precipitation of soluble CsgA into an insoluble form, which polymerises (Bian and Normark, 1997). Curli pili are able to mediate biofilm formation in E. coli and Salmonella spp. (Proft and Baker, 2009), while those of E. coli have also been shown to bind fibronectin (Bian and Normark, 1997), implicating curli pili in virulence.

1.1.2 Gram-Positive Pili

Only one type of pilus exclusive to Gram-positive bacteria is known. Unlike in the Gram-negative pili described above, the pilin subunits of Gram-positive pili are covalently bonded to one another (Kang and Baker, 2012). The best characterised example of a Gram-positive pilus is that of Streptococcus pyogenes M1. The Gram-positive pilins are initially secreted via the Sec machinery, whereupon they anchor themselves to the bacterial membrane via hydrophobic C-terminal domains (Kang and Baker, 2012). As for CU pili in Gram-negative species, both major pilins (which form the body of the pilus shaft) and minor pilins (which form the tip and base of each pilus) exist. In S. pyogenes M1, the major pilin is Spy0128 (Quigley et al., 2009), while the minor pilins are Cpa (which forms the tip of each pilus (Quigley et al., 2009)) and FctB (which forms the base (Linke et al., 2010)). Following secretion, pilins are incorporated into a growing pilus chain by a pilin-specific sortase enzyme (Spy0129 in S. pyogenes M1) (Smith et al., 2010), an extracellular, membrane-associated protein which catalyses the covalent linking together of pilin subunits. These linkages take the form of a peptide bond between the C-terminus of one subunit and a particular lysine side-chain of the next (Kang and Baker, 2012). Spy0129 catalyses formation of the entire pilus shaft, initiating with the attachment of a Spy0128 subunit to a Cpa subunit, followed by polymerisation of Spy0128 and finally attachment of FctB to the base of the shaft. FctB is then covalently attached to the peptidoglycan cell wall by a different sortase, the housekeeping SrtA (Smith et al., 2010). Thus, the ultimate structure of the paradigm Gram-positive pilus of S. pyogenes M1 has a tip of Cpa, following which is a chain of Spy0128 forming the majority of the shaft, and finally FctB at the base, covalently linked to the cell wall (Figure 1.2).

Figure 1.2. Model of the Streptococcus pyogenes M1 Pilus. The base of the pilus is formed from the minor pilin FctB, which is covalently linked to the peptidoglycan cell wall; the body of the pilus is formed from the major pilin Spy0128; the tip of the pilus is formed from the minor pilin Cpa. The yellow/orange domain of Cpa is an adhesin domain; the dotted oval represents an unstructured domain. Adapted from (Kang and Baker, 2012). Structure of FctB from (Linke et al., 2010); structure of Spy0128 from (Kang et al., 2007); structure of Cpa from (Pointon et al., 2010). Used with permission.

The minor pilins which form the tips of Gram-positive pili generally appear to function as adhesins (Kang and Baker, 2012). Certainly Cpa contains an adhesin domain (Pointon et al., 2010), as does the only other tip pilin whose structure has been determined (RrgA from Streptococcus pneumoniae) (Izoré et al., 2010). Both RrgA and Cpa are known to bind collagen (Kreikemeyer et al., 2005), while RrgA also binds laminin and fibronectin (Hilleringmann et al., 2008), suggestive of a role in virulence. Indeed, pilus-mediated attachment of S. pyogenes to human tonsil epithelial cells and keratinocytes has been demonstrated (Abbot et al., 2007), while pilus-deficient S. pyogenes has been shown to be defective in binding to human pharyngeal cells and in biofilm formation (Manetti et al., 2007). Piliated S. pneumoniae have been shown to be significantly more virulent than an apiliated strain, outcompeting such a strain in murine models of infection (Barocchi et al., 2006). This demonstrates the role of pili in virulence of certain Gram-positive species, emphasising the importance of pili in pathogenesis across the bacterial kingdom.

1.2 Type IV Pili

The types of pili described above are exclusively found in either Gram-positive or Gram-negative bacteria. Type IV Pili (T4P), uniquely, are found across both Gram-positive and Gram-negative species. They are, therefore, the most widespread type of bacterial pilus (Pelicic, 2008). T4P are characterised by being extremely thin (only about 5-8 nm wide) but very long (up to several micrometres), and by forming large bundles of filaments (Pelicic, 2008) (Figure 1.3). Furthermore, T4P can possess the unique ability to retract, allowing them to mediate a form of flagella-independent motility known as “twitching motility” (though not all T4P systems possess this function) (Burrows, 2005).

Figure 1.3. Type IV Pili of Neisseria meningitidis. These are visualised by EM after negative staining with phosphotungstic acid. Individual filaments are only ~6 nm wide, but as shown, these clump together to form bundles. These bundles are much thicker and stronger than individual filaments. Taken from (Berry and Pelicic, 2015). Used with permission.

Type IV Pili are also characterised by possessing a much more complex system of assembly (and disassembly) than other types of pili. While the types of pili described above have (as is apparent from the text and figures) minimal assembly systems comprising as few as two proteins, T4P assembly requires up to 12 or more proteins, depending on the specific species (Craig and Li, 2008).

1.2.1 Type IV Pili Components

Pilins

The primary component of a Type IV Pilus is (as for other types of pili) the major pilin. This is generically referred to as ‘PilA’. (This system of nomenclature is derived from Pseudomonas aeruginosa, one of the primary model systems for T4P research. Other systems of nomenclature exist, in particular one which is derived from Neisseria species, wherein the major pilin is referred to as PilE (Craig and Li, 2008). However, the Pseudomonas nomenclature is arguably more widespread and, most relevantly for this thesis, has been adopted as the nomenclature for Clostridial T4P (Melville and Craig, 2013); it is therefore used throughout this thesis. Where alternative nomenclatures are used, this will be clearly highlighted.)

Minor pilins are also present in T4P systems, and are incorporated into the pili (Giltner et al., 2010) where they are able to play a variety of roles. They are known as minor pilins because they are far less abundant within the pilus than is the relevant major pilin. Certain minor pilins in certain species are essential for T4P biogenesis, while others are not (Carbonnelle et al., 2005). The nomenclature of minor pilins is extremely variable between species.

All Type IV pilins share the same basic structure (an exemplary structure is shown in Figure 1.4), but vary significantly in size and sequence. Type IV pilins have to date been classed as either Type IVa or Type IVb. These correspond to two distinct classes of T4P (Type IVa Pili (T4aP) and Type IVb Pili (T4bP)), which are discussed in later sections. However, it is important to note that these designations apply only to Gram-negative systems. Type IVa pilins tend to have a total size of around 150 amino acids, while Type IVb pilins tend to be either significantly shorter than this (<100 amino acids) or significantly longer (around 200 amino acids) (Craig et al., 2004).
All Type IV pilins are initially synthesised as pre-pilins, with N-terminal Type III signal peptides (Giltner et al., 2012). Canonically, this signal peptide is positively charged and ends with a glycine residue (Pelicic, 2008). The signal peptides of Type IVa pilins tend to be short, generally only 5-7 amino acids in length, while those of Type IVb pilins are longer, being around 15-30 amino acids in length (Craig et al., 2004).
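The signal-peptide length ranges quoted above lend themselves to a simple first-pass check. The following is a minimal sketch only, not a validated classifier: the function name, the "unclassified" fallback, the crude net-charge estimate and the two toy leader sequences are all illustrative assumptions, and (as noted above) the Type IVa/IVb designations apply only to Gram-negative systems.

```python
def classify_signal_peptide(signal_peptide: str) -> dict:
    """Rough pilin-class guess from the Type III signal peptide alone.

    Uses only the length ranges quoted in the text:
    5-7 residues -> Type IVa-like, 15-30 residues -> Type IVb-like.
    """
    length = len(signal_peptide)
    if 5 <= length <= 7:
        pilin_class = "Type IVa-like"
    elif 15 <= length <= 30:
        pilin_class = "Type IVb-like"
    else:
        pilin_class = "unclassified"

    # Crude net charge (assumption): count Lys/Arg as +1 and Asp/Glu as -1.
    net_charge = sum(signal_peptide.count(aa) for aa in "KR") - sum(
        signal_peptide.count(aa) for aa in "DE"
    )
    return {
        "length": length,
        "pilin_class": pilin_class,
        "positively_charged": net_charge > 0,
        "ends_with_glycine": signal_peptide.endswith("G"),
    }


# Toy leader sequences, used purely for illustration.
short_leader = "MKAQKG"              # 6 residues
long_leader = "M" + "K" * 23 + "G"   # 25 residues

print(classify_signal_peptide(short_leader))
print(classify_signal_peptide(long_leader))
```

Leaders falling between the two ranges are deliberately left unclassified rather than forced into either class.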

Figure 1.4. Structure of the Neisseria gonorrhoeae Major Pilin PilE. PilE is a Type IVa pilin. The long N-terminal α-helix is coloured in cyan. The region of it labelled α1-N corresponds to the hydrophobic tail. The region labelled α1-C extends into the globular head domain, which is hydrophilic. This is connected via the purple αβ-loop to a β-sheet (grey). The most variable core domain of pilins is the blue D-region. The structure is taken from (Parge et al., 1995), and the figure adapted from (Giltner et al., 2012), used with permission.

Following the signal peptide is the N-terminal α-helix (also known as the tail domain). This tends to be around 30 amino acids in length and is hydrophobic in character (Craig et al., 2004). This N-terminal domain is the most homologous domain between pilins, with the C-terminal domain being significantly more variable (Burrows, 2012). In Type IVa pilins the first residue of the N-terminal α-helix is almost invariably phenylalanine; the residue is more variable in Type IVb pilins, but is generally one of methionine, leucine or valine (Craig et al., 2004). Canonically, the fifth residue in the N-terminal α-helix is glutamate (Pelicic, 2008). In conjunction with the signal peptide, the hydrophobic N-terminal helix targets pre-pilins to the Sec translocase, which inserts the pre-pilin into the cytoplasmic membrane (Francetic et al., 2007). The hydrophobic helix functions as a trans-membrane helix, with the signal sequence protruding from the cytoplasmic side of the membrane (Strom and Lory, 1987).

The C-terminal domain is also known as the head domain. It is globular, with a structure containing both α-helix and β-sheet features (Craig et al., 2004). The general ‘pilin fold’ is quite well conserved and distinctive. However, the loop regions which link the core structural features are extremely variable, enabling the wide diversity of functions seen across species (Giltner et al., 2012). The D-region contains a disulphide bond conserved across almost all Type IVa and Type IVb pilins, which is structurally important (Giltner et al., 2012).

Type IV pilins are assembled into pili through their hydrophobic tails. These bind together by hydrophobic forces, to form the central shaft of the pilus (Craig et al., 2004). The hydrophilic head domains face outwards into the extracellular milieu. Pilins are assembled helically into a pilus (Giltner et al., 2012), forming an overall spiral shape (Figure 1.5). An individual pilus can comprise several thousand subunits of the major pilin (Giltner et al., 2012).
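The first-residue and fifth-residue conventions described above for the mature N-terminal α-helix can be checked in a few lines. This is a hedged sketch rather than an established tool: the function name, the return format and the toy sequence are assumptions made for illustration, and real pilins may deviate from these canonical features.

```python
def check_mature_n_terminus(mature_pilin: str) -> dict:
    """Report canonical N-terminal features of a processed (mature) pilin.

    Rules taken from the text: residue 1 is almost always Phe in Type IVa
    pilins and usually Met, Leu or Val in Type IVb pilins; residue 5 is
    canonically Glu.
    """
    if not mature_pilin:
        raise ValueError("mature_pilin must contain at least one residue")

    first = mature_pilin[0]
    if first == "F":
        class_hint = "Type IVa-like"
    elif first in "MLV":
        class_hint = "Type IVb-like"
    else:
        class_hint = "non-canonical"

    has_glu5 = len(mature_pilin) >= 5 and mature_pilin[4] == "E"
    return {"first_residue": first, "class_hint": class_hint, "glu5": has_glu5}


# Toy hydrophobic N-terminus (illustrative only, not a real pilin).
print(check_mature_n_terminus("FTLIELMIVV"))
```

A first residue outside F/M/L/V is reported as non-canonical instead of being assigned to a class, since the text describes these residues as typical rather than universal.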

Figure 1.5. A Model of a C. difficile T4P. The pilus model runs base → tip from left → right, and consists of only one pilin type (the major pilin PilA1). The overall pilus structure is based on high-resolution electron micrographs of the Vibrio cholerae Toxin Co-Regulated Pilus (TCP), into which repeating units of both the crystal structure of the C. difficile PilA1 pilin (top), and a space-filling model of the structure (bottom), have been modelled. Clearly, the hydrophobic N-terminal helices of the pilin form the interior of the shaft, with the exterior formed by the C-terminal globular domain, which consists primarily of β-sheets. Taken from (Piepenbrink et al., 2015), used with permission.

Pre-Pilin Peptidase

Prior to being assembled into pili, pre-pilins must be processed by the pre-pilin peptidase, PilD (Strom and Lory, 1991). PilD proteins form a family of membranous aspartyl proteases (LaPointe and Taylor, 2000) which sit in the (inner) membrane, and, at the cytoplasmic face of the membrane, both cleave the N-terminal Type III signal peptide from pre-pilins and methylate the N-terminal residue of the resultant mature protein (Strom et al., 1993). These two activities occur at separate active sites of PilD, and mutation of these two sites separately has shown that, interestingly, while cleavage of the Type III signal peptide from pre-pilins is essential for pilus assembly, N-terminal methylation of the mature pilin is not (Pepe and Lory, 1998). The role of this methylation is therefore unclear, though multiple purposes for it have been proposed, such as contributing to pilin-pilin interactions within a pilus.

Assembly ATPase

Assembly of pilins into a pilus is driven by an assembly ATPase known as PilB (PilF in Neisseria) (Turner et al., 1993). PilB is a Walker Box-containing ATPase, which is cytoplasmic, though localised to the (inner) membrane (Tripathi and Taylor, 2007), and appears to form hexamers (Crowther et al., 2004). How ATP hydrolysis by PilB drives pilus biogenesis is unknown, as PilB is located in the cytoplasm while pilus biogenesis occurs in the periplasm/extracellularly (in Gram-negative and -positive species respectively). It is presumed to occur via conformational changes in PilB being transmitted through interactions with partner proteins such as PilC or PilM (described below) to drive pilin polymerisation (Burrows, 2012).

Retraction ATPase

As stated earlier, a unique property of T4P is that they are able to retract. This retraction is driven by depolymerisation of the pili, which is in turn driven by retraction ATPases (Merz et al., 2000). The major retraction ATPase is PilT, though certain species (such as Pseudomonas aeruginosa) possess a second retraction ATPase called PilU, while Neisseria meningitidis possesses three retraction ATPases (PilT, PilU and PilT2) (Brown et al., 2010). Conversely, some T4P (particularly T4bP) lack a retraction ATPase and are not able to retract at all (Burrows, 2012). Retraction ATPases belong to the same family as the PilB assembly ATPases, and similarly contain essential Walker boxes, are cytoplasmic but localised to the (inner) membrane and hexamerise (Satyshur et al., 2007). A crystal structure of a PilT oligomer from Aquifex aeolicus (a thermophilic bacterium) showed a symmetrical hexamer wherein the subunits interact via the N-terminal domain of each subunit with the C-terminal domain of the next (Satyshur et al., 2007) (Figure 1.6). How PilT drives disassembly of T4P is also unknown, but is presumed to occur via the reverse mechanism of PilB-mediated pilus assembly, whatever that may be.

The crystal structure of PilT shows a homohexamer, and it has been assumed that each individual traffic ATPase (i.e. assembly or retraction ATPase) forms a homohexamer which functions independently (e.g. PilB will form a PilB homohexamer, PilT will form a PilT homohexamer, and each of the PilB and PilT homohexamers will function independently of the other). However, recent bacterial two-hybrid studies of the interactions between N. meningitidis T4P-associated proteins have suggested that all three retraction ATPases in that species may interact with each other, and that certain of these may even interact with the PilF assembly ATPase (Georgiadou et al., 2012), raising the intriguing possibility that these traffic ATPases may form ‘hybrid’ hexamers, able to drive both assembly and disassembly of T4P, depending on which components are active or dominant at any one time. As yet, though, no evidence for these hybrids exists besides the referenced study.

Figure 1.6. Structure of a PilT Hexamer from Aquifex aeolicus. The quaternary structure is a homohexameric ring with 6-fold symmetry. One subunit is coloured in yellow to clarify the limits of each individual protein. Blue colouring indicates the N-terminal domain of each subunit; green, purple, red and grey the C-terminal domain. Bound ATP molecules are also shown. From (Satyshur et al., 2007), used with permission.

Platform Protein

Type IV pilus systems contain a platform protein known as PilC (PilG in Neisseria). PilC is located in the (inner) membrane, and comprises three trans-membrane helices with a topology such that the N-terminus is cytoplasmic and the C-terminus periplasmic/extracellular (Berry and Pelicic, 2015). Its role is unclear and may vary across species, but it appears (at least in some) to interact with PilB or PilT/U to mediate pilus extension or retraction.

Accessory Membrane Proteins

The T4P base contains three or four other membrane-associated proteins. PilM is a cytoplasmic, actin-like protein which interacts with PilN, a monotopic membrane protein (Karuppiah and Derrick, 2011). PilN has a short, cytoplasmic N-terminus and a much larger, C-terminal periplasmic (or, in Gram-positive organisms, extracellular) domain (Sampaleanu et al., 2009). PilN contains a conserved INLLP motif at its N-terminus, through which it interacts with PilM (Tammam et al., 2013). In some T4P systems, PilM and PilN are fused into a single protein, such as in the E. coli Bundle-Forming Pilus (in which the PilM-PilN fusion protein is known as BfpC). As mentioned above, PilM (or PilM-PilN fusion proteins where relevant) appears to interact with the assembly ATPase, and may therefore be involved in transmitting signals from it across the membrane (Yamagata et al., 2012).

PilO is another accessory membrane protein. It is a monotopic membrane protein similar to PilN, in that it has the same topology, consisting of a short cytoplasmic domain and a much larger periplasmic (or, in Gram-positive organisms, extracellular) domain (Sampaleanu et al., 2009). PilO and PilN interact through their periplasmic/extracellular domains to form heterodimers (Sampaleanu et al., 2009).

The fourth, and final, accessory membrane protein is PilP, which is a periplasmic inner membrane lipoprotein (Drake et al., 1997). PilP is not present in Gram-positive bacteria (Melville and Craig, 2013), which therefore contain only three membrane accessory proteins. The PilP N-terminal domain interacts with the PilNO complex to form a PilNOP heterotrimer (Tammam et al., 2011). Thus, in both Gram-negative and Gram-positive T4P systems, PilMNO(P) form a transmembrane complex.

Secretin

In Gram-negative bacteria, given that the pilus is formed from the inner membrane, the pilus must pass through the outer membrane to exit the cell. To allow this, a multimeric protein complex known as the secretin forms a pore through which the pilus can pass. Secretins are homomultimeric complexes common to many Gram-negative protein systems which span both the inner and outer membranes (for instance the Type III Secretion System (Hodgkinson et al., 2009)). In the case of T4P, the secretin is formed of a protein known as PilQ; secretin complexes tend to form symmetrical circles comprising between 12 and 15 PilQ monomers (the precise number being species dependent) (Lieberman et al., 2015). The secretin domains of PilQ monomers are formed from their C-terminal domains, which are presumed to form β-barrels in the outer membrane, the C-termini being extracellular (Lieberman et al., 2015). The N-termini are periplasmic and form four discrete domains: SS1, SS2, N0 and N1 (“SS” stands for “species specific”, as these two N-terminal domains vary most between species, with the more C-terminal N domains being more highly conserved) (Tammam et al., 2013). The N0 domain of PilQ (which is formed from the central region of the protein) has been shown to interact with PilP, linking the secretin to the PilMNOP transmembrane complex (Tammam et al., 2013). This explains why PilP is not found in Gram-positive organisms: if its role is to link the PilMNO complex to the secretin, this is redundant in Gram-positives, which lack an outer membrane and, therefore, a secretin.

To allow the pilus to pass through, the secretin must contain a wide pore (~8 nm wide at its widest point in the case of Thermus thermophilus (Gold et al., 2015)). Were this pore to remain permanently open, even in the absence of a pilus, this would destroy the integrity of the outer membrane. Secretins are, therefore, gated. Secretin complexes contain two gates: Gate 1 is at the outer membrane and Gate 2 is at the base of the secretin, at the centre of the periplasm. These are closed in the absence of a pilus, leaving a large, empty vestibule between them (Gold et al., 2015). To open these gates and allow the passage of pili, the secretin undergoes significant conformational changes, resulting, in T. thermophilus, in the formation of a pore ~7 nm wide all the way through the complex (Gold et al., 2015). (These precise widths will doubtless vary between species, as the pili of different species are of different widths.)

Secretin Pilotin

The final component of T4P is the secretin pilotin. Again, this is only found in Gram-negative bacteria, and is called PilF in P. aeruginosa or PilW in N. meningitidis. PilF is an outer membrane lipoprotein which promotes the transport of PilQ monomers across the periplasm to the outer membrane and the subsequent formation of the multimeric secretin complex (Koo et al., 2008). How PilF performs this role remains unclear.
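Because the Pseudomonas and Neisseria nomenclatures reuse some names for different proteins, a small lookup table can make the ambiguity explicit. The sketch below encodes only the name pairs stated in this section; the dictionary layout and the helper name `roles_for` are illustrative assumptions.

```python
# Name pairs taken from the component descriptions above.
NOMENCLATURE = {
    "major pilin":      {"Pseudomonas": "PilA", "Neisseria": "PilE"},
    "assembly ATPase":  {"Pseudomonas": "PilB", "Neisseria": "PilF"},
    "platform protein": {"Pseudomonas": "PilC", "Neisseria": "PilG"},
    "secretin pilotin": {"Pseudomonas": "PilF", "Neisseria": "PilW"},
}


def roles_for(protein_name: str) -> dict:
    """Return the role a given protein name denotes in each nomenclature."""
    roles = {}
    for role, names in NOMENCLATURE.items():
        for system, name in names.items():
            if name == protein_name:
                roles[system] = role
    return roles


# 'PilF' denotes a different component in each nomenclature.
print(roles_for("PilF"))
```

Looking up "PilF" returns the secretin pilotin under the Pseudomonas nomenclature but the assembly ATPase under the Neisseria nomenclature, which is why the nomenclature in use has to be stated.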

1.3 Assembly of Type IV Pili

The T4P apparatus has previously been described as containing four “interdependent” subcomplexes: the motor subcomplex, the alignment subcomplex, the outer membrane subcomplex and the pilus itself (Burrows, 2012; Roux et al., 2012). The motor subcomplex is proposed to comprise PilC and PilB, and where present PilT and any other retraction ATPases; the alignment subcomplex is proposed to comprise PilM, PilN, PilO and (in Gram-negative species) PilP; the outer membrane subcomplex is proposed to comprise PilQ and PilF (i.e. the secretin complex); and the pilus comprises the major pilin and various minor pilins (Burrows, 2012). The pre-pilin peptidase PilD is not specifically associated with any of these subcomplexes. A schematic diagram of an assembled Gram-negative Type IV Pilus is shown in Figure 1.7.

The alignment subcomplex, as stated above, is composed of PilMNOP. These form a tetramer with a 1:1:1:1 stoichiometry (Karuppiah and Derrick, 2011; Tammam et al., 2011). It also appears that PilP and PilQ interact with a 1:1 stoichiometry (Berry et al., 2012). Given that secretins appear to comprise (on average) about 12 PilQ monomers (see above), it seems that the alignment subcomplex comprises (on average) 12 PilMNOP tetramers, which presumably form a ring (or series of rings) in the periplasm, through which the pilus can extend (see Figure 1.8). It is assumed that gaps must exist between each PilMNOP tetramer, in order to enable pilins to pass through the ring of alignment complexes to be incorporated into the pilus shaft.
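The stoichiometric reasoning above is simple arithmetic, and can be sketched as follows. This assumes, as the text does, that the 1:1 PilP:PilQ and 1:1:1:1 PilM:PilN:PilO:PilP ratios hold throughout the complex; the function name and output keys are illustrative.

```python
def alignment_complex_counts(pilq_monomers: int) -> dict:
    """Infer alignment-subcomplex composition from the secretin size.

    Assumes 1:1 PilP:PilQ and 1:1:1:1 PilM:PilN:PilO:PilP, as quoted
    in the text.
    """
    pilp_copies = pilq_monomers      # one PilP per PilQ monomer
    tetramers = pilp_copies          # one PilP per PilMNOP tetramer
    return {
        "PilQ monomers": pilq_monomers,
        "PilMNOP tetramers": tetramers,
        "alignment proteins in total": 4 * tetramers,
    }


# Secretins are reported to contain between 12 and 15 PilQ monomers.
for n in range(12, 16):
    print(alignment_complex_counts(n))
```

For a 12-mer secretin this gives 12 tetramers, i.e. 48 alignment proteins in the proposed ring.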

Figure 1.7. Schematic Diagram Showing the Structure of a Gram-Negative Type IV Pilus. PilM-PilN can exist as a single membrane protein (such as EPEC BfpC) or as two separate proteins, cytoplasmic PilM and membranous PilN (such as in P. aeruginosa etc.). In this instance only one entity is shown as the example. Adapted from (Korotkov et al., 2012). Used with permission.

1.3.1 Outside-In Model

How the T4P apparatus is assembled is, however, unclear. Evidence from Myxococcus xanthus suggests that the apparatus is assembled from the outside in (i.e. beginning with the outer membrane sub-complex, then the alignment complex (first PilP, PilN and PilO, then PilM) and then the inner membrane sub-complex, before the pilus can be formed) (Friedrich et al., 2014). This is based on the discovery that T4P expression in M. xanthus is bipolar (i.e. T4P assembly is localised to both poles of each bacterium). The authors of the study found that while localisation of the outer membrane complex was independent of any other T4P components, localisation of PilNOP was dependent on the outer membrane complex, localisation of PilM was dependent on the outer membrane complex and PilNOP, etc. (Friedrich et al., 2014). The evidence for this “outside-in” model of assembly therefore seems strong. However, interaction and structural studies of the Thermus thermophilus alignment complex proteins PilMNO suggest that, of these proteins, PilMN associate first and then recruit PilO (Karuppiah et al., 2013). These findings appear somewhat contradictory, with the former suggesting that PilM is recruited to the assembly complex after PilO, and the latter suggesting that PilM is recruited to the assembly complex before PilO. It may be that the different methodologies used have produced misleading results (while the M. xanthus study was performed in vivo, the T. thermophilus study was performed using proteins expressed recombinantly in E. coli), or it may be that the route of T4P assembly varies between species.
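The localisation dependencies described above amount to a small dependency graph, and an assembly order consistent with them can be derived by topological sorting. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later); the dictionary names and the grouping of components into nodes are illustrative assumptions, and only the dependencies explicitly stated in the text are encoded.

```python
from graphlib import TopologicalSorter

# Each key depends on the components in its value set.
# M. xanthus in vivo localisation data, as summarised in the text.
M_XANTHUS_LOCALISATION = {
    "outer membrane complex": set(),
    "PilNOP": {"outer membrane complex"},
    "PilM": {"outer membrane complex", "PilNOP"},
}

# T. thermophilus in vitro interaction data: PilMN associate first,
# then recruit PilO.
T_THERMOPHILUS_RECRUITMENT = {
    "PilMN": set(),
    "PilO": {"PilMN"},
}


def assembly_order(dependencies: dict) -> list:
    """Return an order in which every component follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


print(assembly_order(M_XANTHUS_LOCALISATION))
print(assembly_order(T_THERMOPHILUS_RECRUITMENT))
```

The two orders place PilM and PilO in opposite sequence, which is the apparent contradiction discussed above.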

Figure 1.8. Schematic Diagram of a Hypothetical Structure of the Secretin/Alignment Complex. A. A view of the complex from above. PilMNOP tetramers surround the PilQ secretin pore, with the tetramers radiating outwards from the secretin pore. Gaps between the tetramers are clearly wide enough for PilA to pass through. B. A side-on view of the same, incorporating crystal structures of the proteins, where known. Srn = secretin. The colour scheme for the proteins is the same as in A, with the structure of PilM in aquamarine. Figure adapted from (Tammam et al., 2013). Used with permission.

Of further interest, the above-mentioned study of T. thermophilus proteins indicated that PilA4 (the T. thermophilus major pilin) is recruited to the PilMNO complex via interactions with PilN. It is hypothesised that in vivo PilA4 would be recruited to the assembly complex in this manner, and then assembled into the pilus (Karuppiah et al., 2013). The authors of the study speculate that this recruitment of PilA4 could also function to ‘prime’ pilus assembly. This, however, seems unlikely to be the case: evidence from both the P. aeruginosa T4P and from the related Type II Secretion System (discussed below) suggests that T4P assembly is in fact primed by minor pilins, which are believed to form a sub-complex which acts as the tip of the pilus (Korotkov and Hol, 2008; Nguyen et al., 2015).

1.3.2 Mutagenesis Studies

Important evidence regarding how T4P are assembled has also been derived from various mutagenesis studies. Initial studies identified the proteins essential for T4P biogenesis, showing, for instance, that 15 proteins are essential for production of the T4aP of Neisseria meningitidis: PilE (the major pilin), PilD (the pre-pilin peptidase), PilF (the assembly ATPase), PilG (the platform protein), PilM, PilN, PilO and PilP (the alignment complex proteins), PilQ (the secretin), PilW (the secretin pilotin), the minor pilins PilHIJK and the tip protein PilC1/C2 (Carbonnelle et al., 2005). Similarly, 12 proteins have been found to be required for biogenesis of the T4bP of Enteropathogenic E. coli (EPEC), known as Bundle-Forming Pili (BFP): the major pilin (BfpA/bundlin), the pre-pilin peptidase (BfpP), the assembly ATPase (BfpD), the platform protein (BfpE), the alignment complex proteins (BfpC, BfpL and BfpU), the secretin (BfpB), the secretin pilotin (BfpG) and the minor pilins BfpIJK (Anantha et al., 2000; Blank and Donnenberg, 2001; Ramer et al., 2002). In both systems, the retraction ATPase was found to be necessary only for T4P retraction, not initial formation (Carbonnelle et al., 2005; Ramer et al., 2002).

Further work in N. meningitidis resulted, however, in the interesting discovery that those proteins apparently essential for T4P biogenesis can be split into three classes. The first class contains only the secretin, PilQ. PilQ is not essential for synthesis of the pili, only for their extrusion through the outer membrane. Thus, in the absence of PilQ, N. meningitidis does not display pili on its surface; rather, completely synthesised pili become trapped in the periplasm (Carbonnelle et al., 2006). The other two classes are: (1) proteins required for T4P assembly and (2) proteins required to counteract the retraction force of PilT. This has been elucidated from studies on N. meningitidis double mutants, wherein in each strain both pilT and one other gene apparently essential for biogenesis are deleted. This study showed that in such double mutants wherein pilD, pilF, pilM, pilN, pilO, or pilP are deleted with pilT, the bacterial cells remain apiliated. Thus, even when no retraction force can be exerted on any pili, the cells are nevertheless unable to synthesise them, indicating that PilD, PilF, PilM, PilN, PilO and PilP (i.e. the pre-pilin peptidase, the assembly ATPase and the components of the alignment sub-complex) are essential for initial assembly of T4P in N. meningitidis (along, of course, with the major pilin PilE) (Carbonnelle et al., 2006).
However, in such double mutants wherein pilC1/C2, pilG, pilH, pilI, pilJ, pilK or pilW are deleted with pilT, the bacterial cells are piliated (Carbonnelle et al., 2006; Carbonnelle et al., 2005). This indicates that, though in a wild-type background these genes are essential for pilus biogenesis, they are not actually required for pilus assembly. Rather, they appear to function to somehow stabilise the pili and counteract the retractive action of PilT. (Interestingly, though N. meningitidis is able to assemble pili in the absence of the above-listed proteins in a ΔpilT background, in the majority of these strains the pili produced were completely non-functional. Only in pilG/pilT and pilH/pilT double mutants was any pilus functionality seen, and this was reduced in comparison to wild-type (Carbonnelle et al., 2006).)

Following publication of the above study, an equivalent double-mutation study was performed in P. aeruginosa. Somewhat surprisingly, the platform protein PilC was found to be essential for T4P assembly in P. aeruginosa (the N. meningitidis homologue PilG was found to be necessary only for pilus stabilisation); meanwhile, the components of the alignment sub-complex (PilMNOP), which were identified as essential for pilus assembly in N. meningitidis, were found to be required only for pilus stabilisation in P. aeruginosa (Takhar et al., 2013). These data appear to suggest that the process of pilus assembly differs somewhat between N. meningitidis and P. aeruginosa. If this is the case, attempts to develop a single model of pilus assembly are futile. It has thus been argued that one or both of these studies must have been flawed (Berry and Pelicic, 2015), as it seems unlikely that significant differences would exist in the assembly pathways of relatively closely related T4aP. However, both the above-referenced studies appear reliable and contain no obvious flaws, so it may be that the process of T4P assembly has diverged somewhat between species, and no single assembly pathway exists to be elucidated.
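The logic of the double-mutant experiments can be summarised as a simple decision rule: a gene whose deletion abolishes piliation even when pilT is also deleted is needed for assembly, whereas one whose deletion is rescued by loss of pilT is needed only to counteract retraction. The sketch below encodes the N. meningitidis phenotypes reported above; the dictionary, function name and class labels are illustrative assumptions, and PilQ (whose loss traps assembled pili in the periplasm) is deliberately left out.

```python
# Piliation of N. meningitidis strains lacking both pilT and the named gene,
# as summarised in the text (Carbonnelle et al., 2005; 2006).
PILIATED_WITHOUT_PILT = {
    "pilD": False, "pilF": False, "pilM": False,
    "pilN": False, "pilO": False, "pilP": False,
    "pilC1/C2": True, "pilG": True, "pilH": True,
    "pilI": True, "pilJ": True, "pilK": True, "pilW": True,
}


def classify_gene(gene: str) -> str:
    """Classify a biogenesis gene from its double-mutant phenotype."""
    if PILIATED_WITHOUT_PILT[gene]:
        # Pili form once retraction is removed, so the product is not
        # needed for assembly itself.
        return "counteracts retraction"
    # Still apiliated with no retraction force: needed for assembly.
    return "required for assembly"


for gene in PILIATED_WITHOUT_PILT:
    print(gene, "->", classify_gene(gene))
```

Applying the same rule to the P. aeruginosa data described above would move the platform protein into the assembly class and the alignment proteins into the stabilisation class, which is the discrepancy under discussion.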

1.4 Related Structures

Several T4P-like systems (sometimes known as Type IV Filaments (Berry and Pelicic, 2015)), which are presumed to have a common origin, are known throughout prokaryotic organisms. In particular, these include the Gram-positive Competence system, the archaeal flagellum (archaellum) and, most importantly, the Gram-negative Type II Secretion System (T2SS) (Berry and Pelicic, 2015).

1.4.1 Type II Secretion System

The T2SS is the main terminus of the General Secretory Pathway (GSP), whereby proteins exported across the Gram-negative inner membrane by the Sec or Tat translocases are exported, in their folded states, across the Gram-negative outer membrane, and are thus secreted from the cells (Nivaskumar and Francetic, 2014). T2SSs have been identified in hundreds of bacterial species across all classes of proteobacteria and in several other phyla, such as cyanobacteria (Nivaskumar and Francetic, 2014). These species include several important pathogens, in which the T2SS can be used to secrete virulence factors (Korotkov et al., 2012). As such, it is of great research interest and has been highly studied.

The T2SS has a structure extremely homologous to that of Gram-negative T4P (shown in Figure 1.9). Its components are almost identical to those of T4P, comprising an outer membrane secretin, an inner membrane core protein, an alignment complex consisting of PilMNOP homologues and an assembly ATPase. This ATPase drives the assembly of a pseudopilus, a structure which is very similar to a Type IV Pilus but is much shorter: while a Type IV Pilus can be up to several micrometres long, a T2SS Pseudopilus is thought only to be long enough to span the width of the bacterial periplasm (Giltner et al., 2012). This pseudopilus consists primarily of a major pseudopilin, known generally as GspG. (T2SS components generally have different specific nomenclatures in different species, but the ‘Gsp’ nomenclature can, for simplicity, be used as a general nomenclature, as is the case here (Korotkov et al., 2012).) Pseudopili also contain minor pseudopilins (GspHIJK) which, as in T4P, form the pseudopilus tip (Korotkov and Hol, 2008).

Two major models for how the T2SS functions to export proteins across the outer membrane have been proposed: the “piston model” and the “rotary ratchet/Archimedes Screw model” (Nivaskumar and Francetic, 2014). The pre-eminent of these is the piston model, which postulates that substrate proteins initially bind to some part of the T2SS complex within the cross-periplasmic channel formed by the alignment and secretin complexes. This presumably stimulates GspE-driven pseudopilus biosynthesis; as the pseudopilus grows it will act as a piston, the tip of the pseudopilus pushing the substrate through the secretin channel and eventually through the secretin pore and out of the cell (Korotkov et al., 2012). In this model, how the pseudopilus would retract and the T2SS reset is unanswered, given the absence of a retraction ATPase from the system (as evident from Figure 1.9, no PilT homologue is found in the T2SS). The Archimedes Screw model postulates that the pseudopilus forms only once in each T2SS complex. The complex would then “switch on”, with GspE driving rotation of the pseudopilus. The pseudopilus would thus act as a “helical treadmill” (i.e. an Archimedes Screw), lifting substrate proteins out of the cell (Nivaskumar and Francetic, 2014).
Though the functions of the T2SS and T4P are clearly very different (the T2SS being concerned with protein secretion, and T4P generally functioning in biofilm formation and/or host colonisation, as mentioned above and described in detail below), their close structural and operational relationship is extremely helpful in elucidating the mechanisms of both systems. A great deal of our knowledge of T4P is derived or extrapolated from the T2SS, and vice versa, as is evident from the references cited in this thesis.

1.4.2 The Archaellum

Archaeal flagella, like those of eubacteria, drive swimming motility. However, archaeal flagella show no similarity to eubacterial flagella: rather, they have been found to be clearly related to T4P (Shahapure et al., 2014). The archaellum has a structure and components similar to T4P, as shown in Figure 1.10. The majority of archaea have only a single membrane, meaning that the composition of the archaellum appears to be a little more similar to Gram-positive T4P than Gram-negative T4P, lacking a secretin and suchlike (Ghosh and Albers, 2011).

Figure 1.9. Structures of the T2SS (Left) and a Gram-Negative T4P (Right). The T4P structure is shown and described above in Figure 1.7. The similarity between the structures is clearly shown. Orange proteins are assembly ATPases (GspE is the T2SS homologue of PilB); purple proteins are inner membrane core proteins (GspF is the T2SS homologue of PilC); green proteins are alignment complex components (GspL is a T2SS homologue of PilM-PilN, GspO is the T2SS homologue of PilO, GspC is the T2SS homologue of PilP); GspD is the T2SS secretin (i.e. the PilQ homologue) and GspS is the T2SS secretin pilotin (i.e. the PilF homologue); GspG is the T2SS major pilin (i.e. equivalent to PilA); GspHIJK are minor pseudopilins, which form the pseudopilus tip (T4P minor pilins are believed to play an equivalent role, but these are not shown in the figure). Adapted from (Korotkov et al., 2012). Used with permission.

Archaellum-mediated swimming motility is driven by archaellum rotation (Shahapure et al., 2014) (as is flagellum-mediated eubacterial motility (Berg, 2003)). Interestingly, it has been shown that rotation of the archaellum is driven by FlaI, the assembly ATPase (Reindl et al., 2013). This lends credence to the above-mentioned Archimedes Screw model of the T2SS, as it demonstrates that an assembly ATPase can drive Type IV Filament rotation as well as its synthesis. Our knowledge of T4P has been of great help in research into the archaellum, allowing us to quickly advance our understanding of the system. As yet, discoveries relating to the archaellum have not contributed to our understanding of T4P, research into T4P being so much more advanced than that into the archaellum, though this may of course change in the future.


It is also fascinating to see a T4P-related system in a wholly different kingdom of life to that wherein T4P are found.

Figure 1.10. Comparison of the Structures of a Gram-Negative T4P (Left) and the Archaellum (Right). The FlaI archaellum assembly ATPase is homologous to the PilB ATPase, and the archaellum itself comprises polymerised subunits of FlaB, the archaellin, which is homologous to bacterial Type IV pilins. Archaellins must be processed by the pre-pilin peptidase-like protein FlaK (not shown) prior to assembly into the archaellum (Shahapure et al., 2014). The archaellum protein FlaJ is believed to be a homologue of the T4P scaffold protein PilC. The roles of FlaFGHX are less certain. Figure adapted from (Korotkov et al., 2012). Used with permission.

1.4.3 Competence System

The competence system is responsible for DNA uptake in naturally transformable Gram-positive bacteria (Korotkov et al., 2012). This system comprises a competence pseudopilus, which is related to a Type IV pilus (Johnston et al., 2014). A schematic diagram of the system is shown in Figure 1.11. The pseudopilus, which is formed from subunits known as ComGC, binds and enables the uptake of double-stranded DNA (Johnston et al., 2014). After binding the target DNA, the pseudopilus feeds it to the DNA receptor protein ComEA, whereupon one of the DNA strands is degraded by the recombinase RecA. The resultant single-stranded DNA is then fed into the cell through the transmembrane channel ComEC (Johnston et al., 2014). How DNA is fed to ComEA, and then transported into the cell, is unclear, given that no retraction ATPase is present in the system, which would of course easily allow the pseudopilus to pull the target DNA to the cell surface. ATP hydrolysis by the cytoplasmic ATPase ComFA is important in driving DNA uptake (Johnston et al., 2014). However, the proton-motive force also appears to play an important role in driving DNA transport, though how this force is translated to exert force on target DNA is unknown (Berry and Pelicic, 2015). A great deal thus remains to be discovered as to how exactly the Competence system works.

Figure 1.11. The Gram-Positive Competence System. The pink-coloured ComGC pseudopilus binds the dark blue-coloured double-stranded DNA (dsDNA). The ComGC pseudopilus is assembled by the ComGA ATPase (a PilB homologue), and sits on the ComGB scaffold protein (a PilC homologue). The dsDNA is bound by the DNA receptor ComEA, and one strand then degraded by the recombinase RecA (not shown). The resultant single-stranded DNA (ssDNA) is then fed through the ComEC membrane channel. Transport through ComEC is thought to be driven by ATP hydrolysis by the ComFA ATPase (Johnston et al., 2014). Figure adapted from (Korotkov et al., 2012). Used with permission.

1.5 Gram-Positive Type IV Pili

T4P were first identified in Gram-negative species, and for some time it was thought that they were exclusive to Gram-negative species (Melville and Craig, 2013). T4P-like surface structures were observed on the Gram-positive species Streptococcus sanguinis in the mid-1980s (Fives-Taylor and Thompson, 1985), and a few years later on Clostridium difficile (Borriello et al., 1990). However, they were not positively identified as T4P (which was of course harder prior to wide-scale genome sequencing and the bioinformatics tools available today), and the work went largely unnoticed.

In 2002, however, the production of T4P by a Gram-positive species was conclusively demonstrated. The cellulolytic, ruminal Gram-positive bacterium Ruminococcus albus was shown to use T4P to bind to cellulose (Rakotoarivonina et al., 2002). These Gram-positive T4P were similar in appearance to previously identified Gram-negative T4P, being very long (several µm), narrow (only ~4 nm) and flexible (Rakotoarivonina et al., 2002).

In 2006, pili were identified in the Clostridia. The myonecrotic bacterium Clostridium perfringens was found to produce functional T4P (Varga et al., 2006). This species appeared to have the same T4P assembly genes as Gram-negative species, excepting pilF, pilP and pilQ (Varga et al., 2006). The absence of these genes is not surprising. PilQ and PilF form the secretin through which a Type IV Pilus crosses the outer membrane of Gram-negative species. Gram-positive species of course lack this second membrane, rendering a secretin unnecessary, while given that the function of PilP appears to be to link the PilQ secretin to the PilMNO alignment complex, it would also appear to be obviously unnecessary in a Gram-positive organism. PilMNO, meanwhile, are hypothesised to form a channel through the Gram-positive cell wall, and thus remain essential. Interestingly, C. perfringens carries two T4P gene clusters on its genome: one cluster appears to contain all the genes necessary for T4P assembly; the second cluster encodes pilB, pilC, a pilin and an unknown gene. A pilT gene is present in a third locus elsewhere on the genome (Melville and Craig, 2013; Varga et al., 2006) (see Figure 1.12).

Figure 1.12. Type IV Pili Genes of Clostridium perfringens. The primary cluster (top) contains all genes thought to be necessary for assembly of T4P in Gram-positive species. The secondary cluster (middle) contains genes only for PilB, PilC, a pilin and an unknown protein. PilT is encoded by a gene at a third location on the genome with no other obviously related genes nearby (bottom). From (Melville and Craig, 2013), used with permission.

It was found that both PilA1 and PilA2 were incorporated into T4P produced by C. perfringens. Interestingly though, it was found that PilC1 (which, as shown in the above image, is localised in a different cluster to PilA1 and PilA2), appeared to be essential for assembly of PilA1/2 pili (Varga et al., 2006). Furthermore, PilT (which as described above is generally required only for T4P retraction in Gram-negative species, and is actually antagonistic to T4P assembly) was also found to be apparently essential for assembly of pili from PilA1/2 (Varga et al., 2006). This suggested that, firstly, the mechanism of T4P synthesis may differ somewhat in Gram-positive species (or at least in some Clostridia) to that in Gram-negative species. It also suggested the possibility of some very interesting interplay between the separate T4P gene clusters in C. perfringens.

Similar T4P gene clusters were also identified in Clostridium difficile and Clostridium beijerinckii, suggesting that the T4P may function the same in these species as in C. perfringens. In Clostridium tetani and Clostridium botulinum by contrast, the species appear to encode only one complete set of T4P genes, though these genes are scattered around the genomes in at least three separate loci (Varga et al., 2006). This was all that was known of the fundamental biology of Gram-positive T4P at the time of initiation of the studies for this PhD, i.e. not a great deal.

1.6 Functions of Type IV Pili

Having addressed the structure and assembly of T4P, I will now turn to their functions. T4P are of distinct scientific/medical interest as they frequently act as virulence factors, mediating adhesion of bacterial cells to host cells, and/or to each other during biofilm formation. They are also able to mediate bacterial adhesion to surfaces. As mentioned above, T4P also mediate a flagella-independent form of motility known as twitching motility, wherein the T4P are used effectively as grappling hooks, their retraction dragging the cells across a surface. In a handful of other species, T4P have been found to play unique and unexpected roles. All this is described below.

1.6.1 Adhesion to Host Cells

The T4P of multiple species have been implicated in virulence by direct attachment to host cells, enabling host colonisation by pathogenic bacteria. This is generally enabled by adhesins at the pilus tips which specifically bind to host cells. N. meningitidis encodes PilC proteins, of which there are two isoforms, PilC1 and PilC2 (Morand et al., 2009). These proteins act as adhesins located at the tip of the T4P (Rudel et al., 1995) which both mediate direct attachment of T4P to human epithelial and endothelial cells (Morand et al., 2009). PilC1 in particular has also been shown to affect signalling within the bound human cells, reducing cellular motility, loosening cellular attachments to the substratum and causing the formation of aggregates of infected cells (Morand et al., 2009). Thus N. meningitidis T4P mediate bacterial attachment to human cells and play a direct role in toxicity, also.

T4P also appear to play an important role in host colonisation by P. aeruginosa. P. aeruginosa T4P have been found to directly bind carbohydrates (N-glycans) found on the surface of human respiratory tract epithelial cells (Bucior et al., 2012); T4P have thus been shown to mediate P. aeruginosa binding to the apical surface of polarised epithelial cells, where these N-glycans are localised (Bucior et al., 2012), while P. aeruginosa lacking PilA has been found to display defective binding to respiratory tract epithelial cells (Hambrook et al., 2004). Interestingly, though T4P are not required for binding of P. aeruginosa to Caco-2 cells (intestinal epithelial cells), they are required for P. aeruginosa invasion of a Caco-2 cell layer, while retractile T4P are required for epithelial invasion by P. aeruginosa (i.e. a pilT mutant, able to assemble but not retract T4P, was not able to invade an epithelial cell layer) (Hayashi et al., 2015; Heiniger et al., 2010). This shows that T4P are required for more than simply adhering to host cells by P. aeruginosa. P. aeruginosa invasion of human cell layers has been shown to be mediated by the Type III Secretion System (T3SS) (Okuda et al., 2010), and T3SS function (i.e. the injection of T3SS effector proteins into target cells by adherent bacterial cells) has been found to be reliant upon T4P, even when T4P have not been required for initial adherence to the host cells (Hayashi et al., 2015). This has been interpreted to mean that retraction of P. aeruginosa T4P which are bound to the surface of the host cell is required in order to pull adherent bacterial cells into close contact with the host cell, to enable T3SS function and injection of T3SS effectors into the target host cell. Presumably, without pilus retraction the bacterial cells are unable to get close enough to their bound host cells to reach them with their T3SS, giving P. aeruginosa T4P a fascinating double function in both initial adherence to and invasion of host epithelial cell layers.

This function in the binding of host cells may be a quite common function of T4P in pathogens, being postulated to be a function of the T4bP of EPEC in addition to the T4aP described above (Roux et al., 2012).

1.6.2 Bacterial Aggregation

Probably an even more common function of T4P is in driving the aggregation of bacteria and thus the formation of microcolonies and biofilms. In this function, the T4P mediate direct contacts between bacterial cells, allowing them to stick together. Microcolony formation is frequently an essential first stage in the successful colonisation of a host organism by a pathogenic bacterium, while biofilm formation tends to play an important role in the initiation of a chronic infection, protecting the biofilm inhabitants from the host immune system and, in the modern era, from antibiotics.

T4P have been implicated in bacterial aggregation in many species, including those described above wherein the T4P have been implicated in direct host adhesion. For instance, in Neisseria meningitidis, the minor pilin PilX has been shown to mediate bacterial aggregation (Hélaine et al., 2005). N. meningitidis, when expressing wild-type (WT) T4P (which contain PilX), forms large aggregates, which are not seen in the absence of PilX. This aggregation was found to be essential for adherence of N. meningitidis to human cells, although PilX was shown to play no direct role in adhering to them, demonstrating that this PilX-mediated aggregation was essential for adherence (Hélaine et al., 2005).


The T4P of P. aeruginosa have been shown to be important for both microcolony and biofilm formation in the species (Burrows, 2012). Biofilms are essential for chronic P. aeruginosa infections (Høiby et al., 2010), further demonstrating the importance of T4P in P. aeruginosa virulence. Fully functional T4P have also been shown to be necessary for the formation of biofilms with proper architecture and structure. In the absence of the retraction ATPase PilT, P. aeruginosa biofilms are not formed properly, indicating that to form biofilms with the correct architecture the bacteria must be able to use their T4P to pull themselves around into the correct positions (Conrad et al., 2011).

The T4bP BFP of EPEC and the Toxin Coregulated Pilus (TCP) of Vibrio cholerae have also been shown to play important roles in bacterial aggregation. Both BFP and TCP are essential for microcolony formation in their respective species (Roux et al., 2012), meaning that both types of pili are important virulence factors.

However, bacterial aggregation is not always important in virulence. Type IV pili have been found to play a role in biofilm formation by the Gram-positive pathogen C. perfringens. In this species T4P-deficient mutants were found to be as much as 80 % less efficient at biofilm synthesis than the WT (Varga et al., 2008). In a mouse model of C. perfringens infection the animals infected with these mutants were found to display significantly different symptoms compared to those infected with the WT, which may be due to the associated reduction in biofilm formation. However, no overall reduction in virulence was seen, with the mutant strains proving no less lethal than the WT (Varga et al., 2006) (though this may not be surprising as the disease model used is essentially toxin-mediated).

1.6.3 Twitching Motility

As described above, T4P are able to mediate a unique flagella-independent form of motility known as twitching motility. In twitching motility, bacteria extend their T4P, which adhere to the surface. Once the T4P are adhered to the surface, they are retracted, causing the bacterial cells to be dragged across the surface to the spot where their T4P have adhered to it (Burrows, 2012). Many species employ their T4P in this way, but not all T4P are able to mediate this form of motility. As discussed above, T4P are required to enable bacteria to move around a biofilm and ensure it forms with the correct architecture (Conrad et al., 2011). This movement is twitching motility. However, twitching motility is not limited to use in biofilms. It is also used by bacteria to move around surfaces, potentially to explore and to move towards attractants (Burrows, 2012).


Twitching motility has been demonstrated in a broad range of bacterial species, including proteobacteria (Burrows, 2012), cyanobacteria (Khayatan et al., 2015) and Gram-positives (Varga et al., 2006).

1.6.4 Other Functions

Though the above three functions may be the most widespread and/or most medically relevant functions of T4P, several others are also known, demonstrating the incredible versatility of T4P.

In certain naturally transformable Gram-negative species (particularly Neisseria) T4P mediate DNA binding and uptake. Neisseria T4P comprise a minor pilin called ComP which binds DNA at species-specific DNA Uptake Sequences (Berry et al., 2013; Cehovin et al., 2013). Once bound by T4P, DNA is taken up into the bacterial cell by PilT-mediated pilus retraction (Cehovin et al., 2013). T4P similarly mediate DNA uptake in V. cholerae, where a recently discovered chitin-regulated T4P (ChiRP, separate to the previously-discussed TCP) is responsible for natural competence (Matthey and Blokesch, 2016). It is as yet unknown though exactly how the ChiRP binds DNA, and whether this is mediated by a minor pilin as in Neisseria.

T4P in some species appear to play a role in cell signalling. For instance, P. aeruginosa uses its T4P to sense surface attachment. When T4P attach to a surface and are retracted, this generates tension in the T4P. It has been shown that in P. aeruginosa, when a T4P is retracted under tension, this tension is somehow detected by the chemosensory protein PilJ, which in turn activates the Chp signalling cascade (Persat et al., 2015). This results firstly in a positive feedback loop, with the Chp cascade increasing T4P assembly and retraction, and also activates the Vfr transcription factor, which drives the transcription of various virulence factors, including the Type II and Type III Secretion Systems. Binding of T4P to a surface thus directly up-regulates virulence of P. aeruginosa (Persat et al., 2015). Likewise, the T4P of Myxococcus xanthus are believed to function as extracellular sensors, which are able to detect the presence of other nearby cells. When such cells are identified, the T4P somehow signal back to M. xanthus and activate production of fibril exopolysaccharides (Black et al., 2006). How exactly this sensory activity of T4P functions is as yet unknown.

Discussed above is the function of T4P in bacterial aggregation. Bdellovibrio bacteriovorus also uses T4P to attach to other bacterial cells, but for an altogether different purpose to the species discussed above. B. bacteriovorus is a predatory bacterium, which effectively eats target Gram-negative bacteria. B. bacteriovorus creates a small hole in these bacteria, enters them and digests them from the inside using various secreted enzymes (Lambert et al., 2009). The first stage of this predation cycle is the attachment of a B. bacteriovorus bacterium to a target, prey bacterium. B. bacteriovorus has been found to use T4P to perform this attachment, which is essential for predation; in the absence of T4P B. bacteriovorus is unable to prey on other bacteria (Evans et al., 2007).

Geobacter sulfurreducens, an Fe(III) reducer, uses T4P to contact and reduce Fe(III) oxides. These T4P are used as nanowires which conduct electrons to the target electron acceptors, and in their absence G. sulfurreducens is unable to reduce such targets (Reguera et al., 2005).

The above examples demonstrate the incredible variety of functions T4P have been found to play across the bacterial kingdom, and it may well be that other, previously unimagined functions remain to be discovered.

1.7 Regulation of Type IV Pili Expression

Many species do not constitutively express their T4P, rather regulating their expression so they are only produced at certain times/under certain conditions. Certain mechanisms have been discovered by which this regulation is implemented. P. aeruginosa for instance uses three types of mechanism to regulate the expression of its T4P: chemotactic control, small molecule signalling and two-component signalling (Leighton et al., 2015).

Chemotactic control is via the Chp system. The extracellular sensor of this system is PilJ, a methyl-accepting chemotaxis protein (Kearns et al., 2001). As discussed above, PilJ senses tension in T4P, but may well also sense extracellular conditions via the binding of specific ligands, particularly phosphatidylethanolamine (Kearns et al., 2001). Activation of PilJ is presumed to lead to a phosphorylation cascade which ultimately leads to activation of the adenylate cyclase CyaB (Leighton et al., 2015). This leads to synthesis of cyclic AMP that binds and activates the transcription factor Vfr, which activates transcription of various virulence genes, including the T4P alignment sub-complex and minor pilin genes (Wolfgang et al., 2003).

The two-component systems PilR-PilS and AlgR-FimS are involved in regulation of T4P in P. aeruginosa (Burrows, 2012). The PilR-PilS system regulates expression of the major pilin gene, pilA (Burrows, 2012), while AlgR-FimS operate another level of control of expression of the minor pilin operon (Belete et al., 2008). With regard to the PilR-PilS system, PilS is a sensor kinase and PilR a response regulator. PilS is activated by a currently unknown signal, inducing its autophosphorylation followed by phosphorylation of PilR. Phosphorylated PilR then activates transcription of pilA (Mikkelsen et al., 2011). With regard to the AlgR-FimS system, FimS is the sensor kinase and AlgR the response regulator. Exactly how this system operates to control minor pilin expression is uncertain (Leighton et al., 2015).

The small molecule regulating P. aeruginosa T4P is cyclic-di-GMP (c-di-GMP). Again, exactly how this works is not well understood, but the c-di-GMP-binding protein FimX is important for T4P formation, with mutants lacking FimX having much reduced numbers of T4P on their surfaces (Huang et al., 2003; Qi et al., 2011). C-di-GMP has been found to regulate T4P expression in other organisms, too. In Myxococcus xanthus c-di-GMP has been found to down-regulate T4P expression/production (Skotnicka et al., 2015). Increased c-di-GMP levels down-regulate the transcription of pilA in M. xanthus, but neither how this reduction is achieved, nor how c-di-GMP levels in M. xanthus are regulated, is known (Skotnicka et al., 2015).

The regulation of T4bP tends to be simpler than that of T4aP, as unlike T4aP, T4bP genes tend all to be found in one operon, rather than spread around the genome in multiple clusters (Roux et al., 2012). The T4bP of EPEC and V. cholerae (the BFP and TCP respectively) are both regulated by AraC-type transcription regulators. For BFP this regulation is ultimately based on the growth phase of the bacteria and the temperature and calcium and ammonium concentrations of the environment (Puente et al., 1996). Under appropriate conditions transcription of the transcriptional regulator perA is activated. PerA then activates transcription of the BFP operon (Tobe et al., 1996). In V. cholerae, the AraC-type transcriptional regulator ToxT is primarily responsible for regulating TCP expression.
Similarly to PerA in EPEC, production of ToxT is induced by various environmental signals, including temperature and bile constituents (Weber and Klose, 2011). Again, ToxT, once produced, activates transcription of the TCP genes (Weber and Klose, 2011). ToxT production is regulated by a combination of a signalling cascade, a two-component system and c-di-GMP concentrations (ToxT production being induced by low levels of c-di-GMP) (Roux et al., 2012).

In addition to the above-described regulatory methods, others are known and still others await discovery, but these give a good flavour of how T4P expression tends to be regulated by Gram-negative bacteria.

Only one method of regulation of T4P expression in Gram-positive bacteria had been elucidated at the time this PhD project was begun: that of Clostridium perfringens. In C. perfringens T4P expression is subject to carbon catabolite repression through the regulatory protein CcpA (Mendez et al., 2008). When grown in the presence of rich carbon sources (e.g. glucose), the protein HPr-Ser is phosphorylated and binds CcpA to form a complex which binds Catabolite Response Elements (CREs). By binding CREs, CcpA is able to either up- or down-regulate the downstream gene. CcpA down-regulates expression of both the primary T4P operon in C. perfringens, and of pilT, which is encoded elsewhere on the genome, thus inhibiting T4P expression (Mendez et al., 2008).

1.8 Clostridium difficile

One of the Clostridial species earlier identified as encoding T4P genes on its chromosome was C. difficile (Varga et al., 2006). C. difficile is a Gram-positive, spore-forming obligate anaerobe. It is of significant medical and research interest, being the main cause of nosocomial, antibiotic-associated diarrhoea (Kuipers and Surawicz, 2008). Though the number of cases per year has been falling significantly over the past several years in the UK (Cole, 2013), across continental Europe and the USA no such reduction has been seen, with global case numbers increasing (Lessa et al., 2012). Indeed, it is estimated that there are over 450,000 cases of C. difficile infection per year in the USA alone, contributing to 29,000 deaths (of which it is the direct cause of half) (Lessa et al., 2015) and costing up to $4.8 billion per year (CDC-Press-Release, 2015). Both the medical and economic imperatives of combatting this disease are therefore apparent.

Young, healthy individuals are not generally considered to be at risk of C. difficile infection; the elderly or infirm, however, are at risk, and comprise the majority of patients (Heinlen and Ballard, 2010). C. difficile disease is associated with a broad spectrum of symptoms, from mild diarrhoea to potentially fatal pseudomembranous colitis and toxic megacolon. Severity of disease is believed to arise from a combination of the status of the patient (age, state of immune system etc.) and virulence of the infecting strain (Heinlen and Ballard, 2010).

The species C. difficile comprises a vast array of strains. These are grouped into related strains based on a characteristic known as the ribotype. This is based on polymorphisms in the 16S-23S rRNA intergenic region, with diagnostic PCR resulting in specific and unique DNA banding patterns visible on agarose gel electrophoresis (Bidet et al., 1999). The significant increases seen in C. difficile infections globally are associated with the emergence in the middle of the last decade of hypervirulent strains of the bacterium of ribotype 027. These strains are more virulent than those previously known (virulence is discussed below), being associated with increased morbidity and mortality, and increased disease incidence in previously low-risk patient groups (Heinlen and Ballard, 2010). It appears that it is these hypervirulent strains which have brought C. difficile to widespread public attention, though the proportion of infections by 027 strains has (in the UK at least) shrunk over the last few years (Hope, 2015), showing that with proper infection control even these fearsome strains can be contained.
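The US burden figures quoted in this section can be put on a per-case basis with some simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and because the case count is quoted as "over 450,000" and the cost as "up to $4.8 billion", the derived per-case values should be read as approximate upper bounds rather than reported statistics.

```python
# Figures as quoted in the text for the USA (Lessa et al., 2015; CDC press release, 2015).
CASES_PER_YEAR = 450_000      # quoted as "over 450,000", so treated here as a lower bound
ASSOCIATED_DEATHS = 29_000    # deaths to which C. difficile infection contributed
DIRECT_FRACTION = 0.5         # C. difficile was the direct cause of half of these
MAX_COST_USD = 4.8e9          # quoted as "up to $4.8 billion" per year


def burden_summary(cases, deaths, direct_fraction, cost_usd):
    """Derive per-case figures from the headline numbers (illustrative helper)."""
    direct_deaths = deaths * direct_fraction
    return {
        "associated_mortality_pct": 100 * deaths / cases,
        "direct_deaths": direct_deaths,
        "direct_mortality_pct": 100 * direct_deaths / cases,
        "max_cost_per_case_usd": cost_usd / cases,
    }


summary = burden_summary(CASES_PER_YEAR, ASSOCIATED_DEATHS, DIRECT_FRACTION, MAX_COST_USD)
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

On these assumptions, roughly 6 % of cases are associated with a death, roughly 3 % of cases result in a death directly caused by the infection, and the cost is at most around $10,700 per case.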

1.8.1 Sporulation and Germination

As mentioned above, C. difficile is, like all Clostridia, a spore-forming bacterium. These spores are of vital relevance to C. difficile disease, as they are the infectious agent: strains deficient in sporulation are unable to persist in the environment or transfer from one animal to another, rendering them non-infective (Deakin et al., 2012). C. difficile spores are metabolically inactive, rendering them insensitive to oxygen (Paredes-Sabja et al., 2014) (as mentioned above, C. difficile is an obligate anaerobe, meaning that metabolically active, vegetative C. difficile cells cannot survive in the presence of oxygen). Spores are also insensitive to antibiotics, host immune systems and bleach-free disinfectants/cleaning products (Paredes-Sabja et al., 2014). Thus, while C. difficile vegetative cells are unable to survive in the environment, C. difficile spores are hard to eradicate from any setting, meaning that where they are present they are often picked up and unintentionally ingested by surrounding animals, including humans. As discussed below, this can be the first stage in C. difficile infection (Heinlen and Ballard, 2010).

C. difficile sporulation is regulated by the master regulator Spo0A, a transcription factor which regulates transcription of sporulation genes (Underwood et al., 2009). Under conditions not conducive to vegetative cell growth (e.g. nutrient shortages and other stress factors), Spo0A is activated by phosphorylation, in turn activating the sporulation pathway, resulting in the formation of endospores from vegetative cells (Underwood et al., 2009). As shown in the transmission electron microscopy (TEM) images of Figure 1.13, C. difficile spores have a basic structure of a central core surrounded by a cortex, which in turn is surrounded by a coat, which forms the exterior of the spore (Permpoonpattana et al., 2013).


Figure 1.13. TEM Images of C. difficile Spores. The main spore components are labelled. Scale bar is 200 nm. Adapted from (Permpoonpattana et al., 2013). Used with permission.

Germination of C. difficile spores is induced by glycine in combination with cholate-derived primary bile salts, particularly taurocholate (Sorg and Sonenshein, 2008). Primary bile salts are one of the main components of bile, which is produced by the liver, stored by the gall bladder and released into the intestine during digestion, to aid uptake of fat and cholesterol. Ingested C. difficile spores will thus encounter bile salts in the intestine. This induction of germination by bile salts is mediated by their binding to the germination receptor CspC (Francis et al., 2013). A second, glycine-binding germination receptor is presumed to exist, but has not yet been identified. Spore germination also requires a neutral pH of between 6.5 and 7.5 and a physiological temperature of approx. 37 °C (Wheeldon et al., 2008), and appears to be enhanced by attachment to a semi-solid surface (Sorg and Sonenshein, 2008). All these requirements may be met in the neutralised region of the intestine, thus providing optimal conditions for C. difficile spore germination.
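The germination requirements described above can be summarised as a simple checklist. The following is a minimal sketch, not a biological model: the class, function and constant names are hypothetical, the tolerance around 37 °C is an assumed value (the text gives only "approx. 37 °C"), and the enhancing effect of a semi-solid surface is deliberately left out because it is not described as a requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GutConditions:
    """Local conditions encountered by an ingested spore (illustrative fields)."""
    ph: float
    temperature_c: float
    taurocholate_present: bool  # cholate-derived primary bile salt
    glycine_present: bool       # co-germinant


PH_RANGE = (6.5, 7.5)    # neutral pH window quoted in the text
TARGET_TEMP_C = 37.0     # "approx. 37 degrees C" in the text
TEMP_TOLERANCE_C = 2.0   # ASSUMPTION: tolerance chosen purely for illustration


def germination_permissive(c: GutConditions) -> bool:
    """True only if every requirement named in the text is met."""
    ph_ok = PH_RANGE[0] <= c.ph <= PH_RANGE[1]
    temp_ok = abs(c.temperature_c - TARGET_TEMP_C) <= TEMP_TOLERANCE_C
    return ph_ok and temp_ok and c.taurocholate_present and c.glycine_present


# Hypothetical example environments.
neutral_intestine = GutConditions(ph=7.0, temperature_c=37.0,
                                  taurocholate_present=True, glycine_present=True)
acidic_stomach = GutConditions(ph=2.0, temperature_c=37.0,
                               taurocholate_present=False, glycine_present=True)

print(germination_permissive(neutral_intestine))
print(germination_permissive(acidic_stomach))
```

Treating the requirements as a strict conjunction mirrors the way the text presents them, with bile salt and glycine acting in combination and pH and temperature as additional conditions.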

1.8.2 Virulence and Infection

Toxins

The primary virulence factors of C. difficile are the Large Clostridial Cytotoxins (LCCs) TcdA and TcdB, encoded by the tcdA and tcdB genes (Just and Gerhard, 2004). These genes are located within a ‘Pathogenicity Locus’ (‘PaLoc’) (Hammond and Johnson, 1995). Toxinogenic strains of C. difficile can carry both tcdA and tcdB, just tcdA or just tcdB. The majority of toxinogenic strains are tcdA+/tcdB+ or tcdA-/tcdB+; only a handful of tcdA+/tcdB- strains have ever been isolated (Monot et al., 2015). Non-toxinogenic strains, which lack the entire PaLoc, are also relatively common and are generally considered avirulent (Natarajan et al., 2013). Also encoded within the PaLoc are the alternative sigma factor TcdR, which is required for transcription of tcdA and tcdB (Mani and Dupuy, 2001), the anti-sigma factor TcdC, which is believed to down-regulate expression of tcdR (Dupuy et al., 2008), and TcdE, a holin-like protein through which TcdA and TcdB are secreted (Govind and Dupuy, 2012). A schematic diagram of the PaLoc is shown below in Figure 1.14.

Figure 1.14. The C. difficile PaLoc. The PaLoc represented here is that found in a TcdA+/TcdB+ strain. The genes are ordered as shown, with tcdC encoded on the opposite strand to the other genes in the locus.

TcdA and TcdB function by glucosylating small GTPases of the Ras and Rho families in the host cells in the vicinity of the infection (generally intestinal epithelial cells), thus inactivating the target GTPases, leading to actin condensation, cell rounding and, eventually, cell death (Voth and Ballard, 2005). Both TcdA and TcdB are able, alone, to carry out this function, meaning that any strain carrying one or other toxin has the potential for virulence, though TcdB appears to be significantly more toxic than TcdA (Kuehne et al., 2014). Both TcdA and TcdB enter host cells by receptor-mediated endocytosis, though they recognise different receptor proteins (Chaves-Olarte et al., 1997). Two TcdB-receptor proteins have recently been identified, which may function in a dual-receptor model. These proteins are Chondroitin Sulphate Proteoglycan 4 (CSPG4) and Poliovirus Receptor-Like 3 (PVRL3) (LaFrance et al., 2015; Yuan et al., 2015). It is suggested that of these two proteins, PVRL3 may be of more relevance in vivo, as intestinal epithelial cells do not appear to express CSPG4 on their cell surface (LaFrance et al., 2015). There are also indications that an as-yet-undefined third TcdB receptor may exist (LaFrance et al., 2015). The TcdA receptor(s) also remains to be discovered. Hypervirulent strains of C. difficile have been found to express TcdA and TcdB more highly than less virulent strains, contributing to their increased virulence (Heinlen and Ballard, 2010).

Certain strains of C. difficile, including hypervirulent ones, also encode a third exotoxin, a binary toxin known as Clostridium difficile transferase (CDT) (McDonald et al., 2005; Perelle et al., 1997). CDT is an actin-ADP-ribosylase; ribosylation of actin blocks actin polymerisation and promotes actin depolymerisation (Aktories and Wegner, 1992). The ultimate function of CDT appears to be to promote bacterial adherence to colonic membranes, as exposure of colon cells to CDT results in microtubule rearrangement, which drives the formation of long cellular protrusions (Schwan et al., 2009). These protrusions wrap around C. difficile cells, holding them against the membrane and helping them to adhere to it (Schwan et al., 2009). At high concentrations, CDT exposure leads to cell death (Papatheodorou et al., 2011).


The mechanism of CDT is relatively well understood. Like all binary toxins, CDT consists of two protein components, A and B (CDTa and CDTb). CDTb is a membrane-receptor binding component, which binds the membrane protein Lipolysis-Stimulated Lipoprotein Receptor (LSR) (Papatheodorou et al., 2011). Once CDTb is bound to LSR, CDTa binds CDTb and the complex is taken up into the cell by endocytosis. The lowering of the endosome pH drives insertion of CDTb into the endosomal membrane, where it forms a pore through which CDTa passes into the cell, and its ribosylase function is activated (Gerding et al., 2014). However, the importance of CDT to disease is less well understood. There is some evidence that its presence can enhance the virulence of a strain, causing increased disease severity in patients (Goldenberg and French, 2011). Unlike TcdA or TcdB though, which alone are able to confer virulence on C. difficile, CDT cannot, indicating it can only ever play a “supporting role” to the LCCs (Kuehne et al., 2014).
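The relationship between toxin gene content and potential for virulence described in this section can be expressed as a small data structure. This is a sketch under stated assumptions: the class and function names are hypothetical, and the rule encoded is only the one given in the text, namely that either LCC alone is sufficient for potential virulence while CDT alone is not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToxinGenotype:
    """Toxin gene content of a strain (illustrative representation)."""
    tcdA: bool
    tcdB: bool
    cdt: bool = False  # binary toxin CDT, carried by certain strains only


def label(g: ToxinGenotype) -> str:
    """Format the LCC genotype in the tcdA+/tcdB- style used in the text."""
    a = "+" if g.tcdA else "-"
    b = "+" if g.tcdB else "-"
    return f"tcdA{a}/tcdB{b}"


def potentially_virulent(g: ToxinGenotype) -> bool:
    """Either LCC alone suffices; CDT alone does not (Kuehne et al., 2014)."""
    return g.tcdA or g.tcdB


# Hypothetical example genotypes, not named strains.
both_lccs = ToxinGenotype(tcdA=True, tcdB=True, cdt=True)
b_only = ToxinGenotype(tcdA=False, tcdB=True)
cdt_only = ToxinGenotype(tcdA=False, tcdB=False, cdt=True)

for g in (both_lccs, b_only, cdt_only):
    print(label(g), "CDT+" if g.cdt else "CDT-", potentially_virulent(g))
```

The CDT flag is carried but does not enter the rule, reflecting its description as playing only a supporting role to the LCCs.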

Infectivity Mechanisms

C. difficile has long been known to be associated with antibiotic treatment (Bartlett et al., 1978). The exact basis for this association has, however, only recently begun to be unravelled. It is well known that antibiotics affect the composition of the gut microbiota. In a healthy individual, the gut microbiota is diverse, numbering at least 150 different species (Qin et al., 2010). The bacteria in the gut microbiota are also extremely abundant, comprising up to an estimated 10¹¹ bacterial cells/ml of contents (Walker and Lawley, 2013). In a healthy individual, the species composition of the gut microbiota seems to remain relatively stable over time, though the relative abundance of any particular species may alter periodically due to changes in, for example, host diet (Walker and Lawley, 2013). There is no particular species make-up of a healthy gut microbiota, with an estimated 1150 or so species in total being associated with healthy human microbiotas, each individual merely having a selection of those (Qin et al., 2010). However, generally, obligate anaerobes predominate while facultative anaerobes comprise a minority of the microbiota; in particular, Bacteroidetes and Firmicutes species form the majority of the microbiota (Walker and Lawley, 2013).

In many disease conditions, the composition of the gut microbiota is disrupted, resulting in “dysbiosis”. Various diseases, such as gastrointestinal infections, can induce dysbiosis, and antibiotic treatment can do the same (Walker and Lawley, 2013). Dysbiosis is generally characterised by a reduction of bacterial diversity in the microbiota, while the relative abundances of the remaining species are often severely altered (Walker and Lawley, 2013). This is the result when an individual is treated with antibiotics (particularly broad-spectrum antibiotics), as non-resistant species are eradicated by the drug. The number of species present overall in the microbiota can be severely reduced, leading to increased availability of nutrients in the gut and opening niches for colonisation of the gut by species not generally present in a healthy microbiome, e.g. Salmonella or C. difficile. Broad-spectrum antibiotic treatment resulting in dysbiosis can therefore lead to infection by opportunistic pathogens such as these (Pham and Lawley, 2014). It is becoming ever clearer that a healthy microbiota is of vital importance for resistance to gut infections, and indeed for more general immune system function (Clarke, 2014; Pham and Lawley, 2014).

With specific reference to C. difficile, it has been found that antibiotic-associated dysbiosis enables C. difficile infection primarily through its effect on the production of antimicrobial peptides. Antimicrobial peptides, particularly α-defensins, are known to play important roles in defence against C. difficile. Human α-defensins have been found to display potent activity against vegetative C. difficile (Furci et al., 2015), and also to neutralise TcdB (Giesemann et al., 2008). Studies in mice have found that antibiotic-induced dysbiosis reduces α-defensin production by intestinal Paneth cells. The intestinal microflora (particularly Lactobacilli) appear to induce Toll-like receptor signalling pathways, which activate α-defensin production (Menendez et al., 2013). Disruption of the intestinal microflora thus reduces α-defensin production, while restoration of the microflora restores α-defensin production (Menendez et al., 2013). Thus it appears that antibiotic treatment which disrupts the gut microflora (so particularly treatment with broad-spectrum antibiotics) causes disruption to one of the immune system’s key defences against C. difficile (i.e. α-defensins). Presumably, when C. difficile spores germinate in a healthy gut, α-defensins help rapidly destroy the resultant vegetative cells, and also neutralise any TcdB produced prior to eradication of the infection, which is thus asymptomatic. In individuals deficient in α-defensin production (primarily those suffering dysbiosis following antibiotic treatment but also those with more general immune insufficiencies), when C. difficile spores germinate in the gut, the defences against the resultant vegetative cells are severely compromised, allowing them to colonise the gut and cause disease. This is likely to be an important mechanism by which antibiotic treatment causes susceptibility to C. difficile infection, but other effects are also likely to be important. For instance, antibiotic treatment has been found to increase the concentration of bile acids in the mouse gut, which is likely to drive increased germination of C. difficile spores, making colonisation and infection more likely (Theriot et al., 2014).


Adhesion

It is presumed that, once germinated, C. difficile must adhere to the host intestinal epithelium in order to establish infection. As yet, however, only one protein has been assigned a role as an adhesin: the lipoprotein CD0873 from strain 630 (the main laboratory strain of C. difficile) (Kovacs-Simon et al., 2014). The loss of this protein resulted in a deficiency in bacterial adhesion to the Caco-2 intestinal epithelial cell line. It is presumed that other adhesins are also present in the species, but none have as yet been identified.

C. difficile has an S-layer, formed of the protein SlpA, and a unique set of cell wall-binding proteins characterised by the presence of three Cell Wall Binding Repeat 2 (CWB2) domains (Pfam 04122) (Fagan et al., 2011), which, via their CWB2 domains, directly and non-covalently bind the C. difficile cell wall (Willing et al., 2015). Several of these proteins have been hypothesised to be adhesins but, as yet at least, there is no evidence of this. C. difficile also encodes a functional sortase enzyme, which has been demonstrated to covalently attach substrate proteins to the cell wall (Peltier et al., 2015). However, no evidence was found to suggest the sortase (or its substrates) plays a role in adhesion. C. difficile also, as mentioned above, encodes Type IV Pili. Given the above-mentioned widespread function of T4P in colonisation and adhesion, these may play a role in adhesion.

Diagnosis and Treatment

Diagnosis of C. difficile infection is challenging. The most commonly used test is a Nucleic Acid Amplification Test (NAAT) wherein the presence of the tcdA and tcdB genes is detected by PCR, though Enzyme Immuno-Assays (EIAs) wherein TcdA and TcdB are directly detected are also used (Gerding et al., 2016). The NAAT is quick and accurate, but can be over-sensitive, detecting C. difficile in patients with low-level colonisation whose symptoms have a different cause, resulting in false positives. The EIA on the other hand is very specific, but is slow and lacks sensitivity (Gerding et al., 2016). Research and development of improved tests is therefore ongoing.

Treatment of patients diagnosed as infected with C. difficile is currently by antibiotic. Metronidazole is the first-line treatment in milder cases, with vancomycin used in more severe cases (Mullane, 2014). However, these treatments are problematic. Firstly, resistance to metronidazole is increasing, while there is a desire to limit the use of vancomycin in an attempt to combat the spread of vancomycin-resistant Enterococci. Furthermore, as discussed above, antibiotic treatment is the main cause of C. difficile infection. Treatment of C. difficile with antibiotics, though generally initially successful, thus results in recurrent disease in a significant proportion of patients (Heinlen and Ballard, 2010). There is therefore a need for new treatments.

One route of research and development is antibiotics which will target C. difficile more specifically than those previously available. By targeting C. difficile specifically, the intestinal microflora is otherwise left relatively undamaged, thus significantly reducing the chances of C. difficile recurrence. A newly developed example of such an antibiotic is fidaxomicin, which shows significantly better results than vancomycin. However, the cost of fidaxomicin vastly exceeds that of vancomycin, meaning that it has been relatively little utilised (Mullane, 2014). A second route of research is into non-antibiotic treatments aimed at repopulating an infected gut with a healthy microbiome (as described above), to re-activate the immune system to defeat the C. difficile colonisation, and to out-compete C. difficile for its colonisation niche in the gut (Adamu and Lawley, 2013; Lawley et al., 2012). One such avenue of interest is faecal transplants, wherein the faeces of a healthy volunteer donor is infused into the intestine of a suffering patient. Faecal transplants appear to be effective and to have few side-effects, offering good prospects as a possible mainstream treatment (Gough et al., 2011). Probiotic treatments have also been tested, with promising results (Petrof et al., 2013). Prevention of C. difficile has, however, seen less progress; no vaccine or prophylactic is currently available, with preventative strategies therefore currently focusing on prevention of C. difficile spread by good hygiene and careful use of antibiotics in at-risk populations.

1.8.3 Experimental Tools for C. difficile Research

For a long time, research into C. difficile genetics was stymied by a lack of tools. Recent years have, however, seen several advances in the development of tools for use in research into C. difficile. Vectors now exist for both constitutive and regulated gene expression in the species: the C. difficile-derived cwp2 promoter (which natively drives expression of the C. difficile cell wall-associated protein Cwp2) is used for constitutive gene expression (Emerson et al., 2009); a tetracycline-inducible system comprising the divergent tet and tetR promoters, previously optimised for use in Staphylococcus aureus, but adapted for use in C. difficile, is used for regulated gene expression (Fagan and Fairweather, 2011). In this inducible system, the tetracycline analogue anhydrotetracycline is used to induce gene expression rather than tetracycline itself, on account of the toxicity of tetracycline to C. difficile (Fagan and Fairweather, 2011). Ptet and PtetR have overlapping tet operator sequences, meaning that expression from the promoters is co-regulated. Ptet is used to drive expression of the protein of interest, while PtetR is used to drive expression of TetR, a negative regulator of transcription from Ptet. Thus induction of expression from the promoters activates a negative feedback loop, ensuring tight control of gene expression from the system (Fagan and Fairweather, 2011).

In addition to plasmid-based gene expression, mutagenesis systems have been developed for use in C. difficile. The first system developed was the “ClosTron” system (Heap et al., 2010; Heap et al., 2007). This is a system of insertional mutagenesis, whereby a targetable Group II intron from the ltrB gene of the Gram-positive bacterium Lactococcus lactis is used to interrupt target genes (Heap et al., 2007). The system is functional in all known Clostridial species. Though this mutagenesis system was a great improvement on the previous situation, where there was no way to generate targeted mutants of C. difficile, there were several problems with it. Firstly, the insertion of the intron disrupts transcription through a target gene, meaning that if the gene is in an operon, polar effects are commonly seen. Secondly, the intron contains an erythromycin resistance marker (coincidentally, erm). This means that ClosTron mutagenesis of strains already resistant to erythromycin and its derivatives is effectively impossible; such strains include 630, a common laboratory strain. Furthermore, while generation of strains containing two or more ClosTron mutations is possible via marker recycling (Heap et al., 2010), the process is nevertheless challenging and labour intensive. Finally, strains carrying the ClosTron erythromycin resistance marker have been found to have problematic changes to antibiotic resistance in in vivo studies (unpublished data, Fairweather group).

It was therefore a major breakthrough when two separate methods for the generation of “clean” mutants in C. difficile were developed in recent years. These methods employ allele exchange to cut target genes from the bacterial chromosome. The first of these methods uses the cytosine deaminase (codA) gene of E. coli as a counter-selection marker (Cartman et al., 2012); the second method employs the pyrE gene from Clostridium sporogenes as a counter-selection marker (Ng et al., 2013). This latter method requires an initial pyrE deletion mutant, generated by allele-coupled exchange, in which second or further mutations of target genes are generated using the heterologous pyrE gene for counter-selection. Once the desired second or further mutation is made, the native pyrE gene is repaired, leaving a clean mutant (Ng et al., 2013).

Finally, in addition to these targeted mutagenesis systems, mariner-based random transposon mutagenesis systems have also been developed for C. difficile, allowing the generation of huge mutant libraries (Cartman and Minton, 2010; Dembek et al., 2015). Thus most, if not all, techniques necessary for modern genetics-based research are now available for use in C. difficile.

1.8.4 Type IV Pili of C. difficile

As described above, genes apparently encoding Type IV Pili were discovered in C. difficile in 2006 (Varga et al., 2006). As in C. perfringens (described above), two gene clusters were found. However, neither of these appeared to contain all genes necessary for T4P biogenesis. The primary cluster appeared to contain all genes necessary for T4P biogenesis except pilN; the much smaller secondary cluster contains only four T4P genes: a pilin, PilB, PilC and PilM. Schematic diagrams showing the reported constitution of these gene clusters are shown in Figure 1.15.

As shown in Figure 1.15, neither cluster was reported to contain a pilN gene, posing an interesting question as to how the T4P might be formed. The primary cluster was reported to contain two pre-pilin peptidase genes, posing the question as to whether both are functional, and if so whether they therefore have separate substrates/functions. The secondary cluster on the other hand was reported to contain no pre-pilin peptidase gene, and must presumably therefore share one from the primary cluster. The secondary cluster was also reported to lack a retraction ATPase, and to encode only one pilin and only one accessory membrane protein (pilM). It is therefore possible that other genes encoded in the primary cluster are shared by the secondary cluster.

As mentioned above, the production by C. difficile of T4P-like appendages had previously been reported (Borriello et al., 1990), so the demonstration that such genes were present in C. difficile should not have been unexpected. In 2009, it was then reported that C. difficile produces T4P in vivo (Goulding et al., 2009). Immuno-EM demonstrated the presence of pili on cells of strain 630 from a section of gut in an infected hamster. These pili could be stained with antibodies against the pilin protein encoded by gene CD630_3507 (Goulding et al., 2009). This is one of the pilins encoded by the primary T4P gene cluster, apparently demonstrating that the gene cluster does produce T4P and suggesting that the pili produced may play a role in infection.

Following this report, however, nothing further had been published, and nothing more was known, about the T4P of C. difficile at the initiation of this PhD project. This introductory chapter therefore sets out the level of knowledge of these pili at the start of this project. During the course of this project, certain important academic papers concerning the T4P of C. difficile were published, and these are discussed at length in the discussion sections of results Chapters 3, 4 & 5, and in Chapter 7, the final discussion. These papers are very important to our present level of knowledge of the T4P of C. difficile, but are not helpful in setting the background to this project.
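The comparison of the two reported gene clusters in this section amounts to a set comparison, which can be sketched as follows. The names below are hypothetical, and two assumptions are made loudly: the component list contains only those components named in this section (it is not a complete T4P gene inventory), and the primary cluster is taken to contain every named component except pilN, as the text implies.

```python
# Components named in this section only; NOT a complete list of T4P genes.
NAMED_COMPONENTS = {
    "pilin", "pilB", "pilC", "pilM", "pilN",
    "pre-pilin peptidase", "retraction ATPase",
}

# Cluster contents as reported by Varga et al. (2006), per the text.
# ASSUMPTION: the primary cluster holds every named component except pilN.
PRIMARY = NAMED_COMPONENTS - {"pilN"}
SECONDARY = {"pilin", "pilB", "pilC", "pilM"}


def missing(cluster):
    """Named components not reported in the given cluster."""
    return sorted(NAMED_COMPONENTS - cluster)


def could_borrow(recipient, donor):
    """Components the recipient lacks that the donor cluster could supply."""
    return sorted((NAMED_COMPONENTS - recipient) & donor)


absent_from_both = sorted(NAMED_COMPONENTS - (PRIMARY | SECONDARY))

print("missing from primary:", missing(PRIMARY))
print("missing from secondary:", missing(SECONDARY))
print("secondary could borrow from primary:", could_borrow(SECONDARY, PRIMARY))
print("absent from both:", absent_from_both)
```

Under these assumptions, the only named component absent from both clusters is pilN, while the pre-pilin peptidase and retraction ATPase are candidates for sharing between clusters, which matches the questions raised in the text.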


Figure 1.15. The C. difficile T4P Gene Clusters, as Reported in (Varga et al., 2006). A is the primary gene cluster, B the secondary gene cluster. The key shows which genes are indicated by the colours (pilins were not identified as major or minor).

1.9 Clostridium sordellii

Clostridium sordellii is a little-known member of the genus Clostridium. C. sordellii is generally a soil-dwelling bacterium, but is known to colonise the gut of a small percentage of humans and animals, and has also been identified in the vaginal microflora of a small number of women (Aldape et al., 2006). The species was first identified in the 1920s (Hall and Scott, 1927), and has now been shown by modern genomics to be one of the closest relations of C. difficile (Elsayed and Zhang, 2004). Many strains of C. sordellii are avirulent, but virulent strains are able to cause a variety of diseases in man and animals.

1.9.1 C. sordellii Disease

C. sordellii has in the past been best known as the cause of veterinary disease. It is considered an important pathogen in various livestock, including cattle, horses and sheep (Lewis and Naylor, 1998; Manteca et al., 2001; Unger-Torroledo et al., 2010). In these species, C. sordellii infections are associated with sudden animal death, characterised by a pathology of lesions of haemorrhagic enteritis in the gut (though such pathology is not diagnostic of C. sordellii infection, often alternatively/also being caused by C. perfringens) (Manteca et al., 2001).

While C. sordellii was primarily associated with veterinary disease for many years, it was also well known that it was able to cause various severe diseases in humans. Such diseases include , endocarditis and septic arthritis, though such cases are rare (Barnes and Leedom, 1987; File et al., 1977; Gredlein et al., 2000). C. sordellii has recently, however, become best known for causing soft-tissue infections leading to myonecrosis and, frequently, toxic shock syndrome, which in turn causes multi-organ failure and death (Fischer et al., 2005). While a significant proportion of such infections occur in intravenous drug users, the majority occur in otherwise healthy individuals following traumatic injury, surgery or (in women) gynaecological events (childbirth, miscarriage, abortion). It is thought that these gynaecological events actually increase an individual’s susceptibility to C. sordellii infection: mouse experiments have shown that generally, uterine defence against C. sordellii is mediated by macrophages, which bind and phagocytose the bacteria using the Class A scavenger receptor ‘Macrophage Receptor with Collagenous Structure’ (MARCO) (Thelen et al., 2010). During pregnancy, including around childbirth, increased levels of the hormone Prostaglandin E2 (PGE2) are produced in the female reproductive tract. PGE2 is a known regulator of the immune system, and has been found to reduce the phagocytosis activity of macrophages, including the phagocytosis of C. sordellii, thus likely rendering women more susceptible to C. sordellii infections during and immediately after pregnancy (Rogers et al., 2014). Such an effect on the immune system has also been seen with the abortion-inducing drug (and PGE2 analogue) misoprostol, which was shown to impair innate immunity to C. sordellii in the reproductive tracts of female mice (Aronoff et al., 2008). An association between misoprostol use and C. sordellii infection has been observed in women (Fischer et al., 2005), indicating that this impairment of the innate immune system is indeed important in allowing the establishment of C. sordellii infection.

C. sordellii soft tissue infections are utterly devastating: they are initially hard to diagnose, as early symptoms tend to be broad and non-specific (lethargy, nausea, dizziness). Infections then tend to progress rapidly, so that by the time more specific symptoms occur (such as severe tachycardia, refractory hypotension and a “leukemoid reaction” characterised by significant leukocytosis), it is often too late to save the patient (Aldape et al., 2006). Indeed, C. sordellii infections often progress so rapidly that diagnosis is post-mortem (Stevens et al., 2012).


Following a diagnosis of C. sordellii, treatment is immediate surgical removal of necrotic tissue by debridement/amputation (dependent on the location of the infection), followed by intensive antibiotic therapy and general life-support (Aldape et al., 2006). However, although C. sordellii generally appears to be susceptible to a wide range of antibiotics, treatment is often unsuccessful (presumably due to the speed with which the infection takes hold) (Aldape et al., 2006). There is therefore a clear need for improved diagnostics and treatments to be developed.

1.9.2 C. sordellii Virulence

Strains of C. sordellii can carry a veritable arsenal of exotoxins encoded on their genomes. The most potent of these toxins are Lethal Toxin (TcsL) and Haemorrhagic Toxin (TcsH) (Aronoff, 2013). TcsL and TcsH are both members of the LCC family, and are closely related to the C. difficile toxins TcdB and TcdA respectively (Just and Gerhard, 2004; Martinez and Wilkins, 1992). The toxins share the same mechanism of action, and levels of sequence identity between the two pairs of toxins are very similar (TcsH/TcdA have ~77 % amino acid identity, TcsL/TcdB ~76 % identity), but while TcsH and TcdA are equally potent (with lethal doses in mice of ~75 ng), TcsL has evolved to become one of the most potent cytotoxins known, with a lethal dose in mice of only ~5 ng (in comparison to 50 ng for TcdB) (Martinez and Wilkins, 1992).

In addition to these LCCs, C. sordellii strains can also produce multiple other putative virulence factors, including the neuraminidase NanS, the collagenase ColA, the phospholipase C Csp (standing for Clostridium sordellii phospholipase C) and the cholesterol-dependent cytolysin Sordellilysin (Sdl) (Voth et al., 2006). However, based on the high potency of TcsH and (in particular) TcsL, it is assumed that these LCCs are the key virulence factors, with strains carrying them being described as toxigenic, and those lacking them as non-toxigenic (Aronoff, 2013). This conclusion is supported by the fact that in a mouse model of C. sordellii infection, the TcsL+/TcsH- strain ATCC9714 is highly lethal, while a mutant thereof lacking TcsL is avirulent, indicating that in this strain at least, TcsL is an essential virulence factor (Carter et al., 2011). In terms of human infection, it is believed that TcsL production is required for the development of C. sordellii toxic shock syndrome (Aronoff, 2013). However, there is an imperfect correlation between the presence of toxic shock syndrome in a patient and the presence of the tcsL gene in the infecting strain (Bouvet et al., 2015), leading to speculation that the LCC genes can be lost.


The importance of the non-LCC toxins to virulence has not been investigated, though NanS has been identified as an important stimulant of the characteristic leukemoid reaction described above (Aldape et al., 2007). The role of TcsH in infection by TcsH+ strains has not been investigated either. It is therefore important for our understanding of the pathogenesis of C. sordellii infection, particularly the deadly toxic shock syndrome, either to confirm that tcsL is prone to loss from strains, or to identify other important virulence factors. One major hindrance to research into C. sordellii has been the absence of a high quality genome sequence of the species, rendering the use of modern genetics techniques highly challenging, if not impossible. Such a sequence was therefore required as a matter of urgency to drive forward our knowledge and understanding of the species.

1.10 Aims of the Project

The first set of aims was in relation to C. difficile. As described above, C. difficile encodes T4P and apparently expresses them during infection. The first objective therefore was to identify in vitro conditions under which C. difficile expresses its T4P, and possibly to discover how they are regulated. The second objective was to identify the role(s) of C. difficile T4P: do they, for instance, mediate biofilm formation? Are they used for twitching motility? The third objective was to gain some level of understanding of the T4P. For instance, which genes are required for T4P biosynthesis? Such investigations have been carried out extensively in Gram-negative species, as described above, but no such investigations had been performed in a Gram-positive organism. It has recently been discussed that a Gram-positive model of T4P would be highly useful in driving forward T4P research, due to their inherently “simpler” nature (i.e. the presence of only one membrane) (Pelicic, 2008). It would therefore be useful to identify the essential genes in C. difficile to help build a model of T4P biogenesis in a Gram-positive organism. Other questions to attempt to answer include the roles of the two separate pre-pilin peptidases, and whether the two separate T4P gene clusters function (largely) separately or co-operatively.

The second set of aims was in relation to C. sordellii. At the initiation of this project my group was in the process of sequencing a number of C. sordellii strains. The first objective was to assemble as far as possible a high-quality genome sequence and analyse it. The second objective was to identify any putative Type IV Pili genes and perform an analysis of them.


Given that the genome was to be assembled, any particular points of interest identified along the way were to be investigated.


Chapter 2: Materials and Methods

Throughout, all chemicals were obtained from Sigma-Aldrich unless otherwise stated.

2.1 Bacterial Strains and Growth Conditions

2.1.1 Bacterial Strains Used

Commercially acquired, chemically competent E. coli NEB5α (New England Biolabs (NEB), USA) were used as cloning hosts during plasmid construction. E. coli CA434 was used as a conjugation donor strain to transfer plasmids into C. difficile and C. sordellii. CA434 cells are derivatives of E. coli strain HB101 carrying the conjugative plasmid R702, which confers resistance to kanamycin. They also encode Dam methylase, which methylates plasmid DNA, protecting it from cleavage by C. difficile restriction enzymes (Purdy et al., 2002). E. coli BL21 (DE3) and Rosetta (DE3) were both obtained from Novagen (USA) and used for protein expression. BL21 (DE3) is a lysogen of bacteriophage λDE3, and therefore carries a chromosomal copy of the T7 RNA polymerase gene. Rosetta (DE3) is a derivative of BL21 (DE3) which carries the pRARE plasmid, which confers chloramphenicol resistance and encodes tRNAs for six codons rarely found in E. coli. This can enhance expression of foreign genes in the species (Novagen, 2004). E. coli DHM1, a strain lacking endogenous adenylate cyclase (Karimova et al., 2005), was used for bacterial-two-hybrid experiments.

C. difficile strain 630 is a ribotype 012, toxinotype tcdA+/tcdB+/cdt– strain kindly provided to the Fairweather group by Dr Peter Mullany of the Eastman Dental Institute, University College London. C. difficile strain R20291 is a ribotype 027, toxinotype tcdA+/tcdB+/cdt+ strain kindly provided to the Fairweather group by Professor Brendan Wren, London School of Hygiene and Tropical Medicine (LSHTM). C. sordellii strain ATCC9714 is a toxinotype tcsL+/tcsH– strain obtained from the American Type Culture Collection (ATCC), USA and C. sordellii W3025 is a toxinotype tcsL–/tcsH– strain kindly provided by Dr Liljana Petrovska, Animal and Plant Health Agency (formerly Veterinary Laboratories Agency), Addlestone, UK. Complete lists of the C. difficile and C. sordellii strains used are provided in Tables 2.1 and 2.2 respectively.

Throughout, ‘Δ’ in a strain name indicates a deletion mutant obtained by homologous recombination, while ‘::erm’ indicates an insertional mutant obtained using the ClosTron system (Heap et al., 2010). For example, 630ΔpilB1 indicates a deletion mutant of pilB1 in C. difficile strain 630, while R20291 pilB1::erm indicates a ClosTron-generated insertional mutant of pilB1 in C. difficile strain R20291.
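The naming convention above is regular enough to be read mechanically. The following is a minimal illustrative sketch only, not part of the original methods; the function name and the returned field names are assumptions introduced here for illustration.

```python
import re


def parse_strain_name(name: str) -> dict:
    """Split a strain name into parent strain, target gene and mutation type.

    Hypothetical helper: follows only the two conventions stated in the text
    ('Δ' for deletions, '::erm' for ClosTron insertions).
    """
    # Insertional mutants are written "<parent> <gene>::erm", e.g. "R20291 pilB1::erm".
    match = re.fullmatch(r"(?P<parent>\S+)\s+(?P<gene>\w+)::erm", name)
    if match:
        return {"parent": match["parent"], "gene": match["gene"],
                "type": "ClosTron insertion"}
    # Deletion mutants are written "<parent>Δ<gene>", e.g. "630ΔpilB1".
    match = re.fullmatch(r"(?P<parent>[^Δ\s]+)Δ(?P<gene>\w+)", name)
    if match:
        return {"parent": match["parent"], "gene": match["gene"],
                "type": "deletion"}
    # Anything else is treated as an unmodified strain.
    return {"parent": name, "gene": None, "type": "wild type"}


print(parse_strain_name("630ΔpilB1"))
print(parse_strain_name("R20291 pilB1::erm"))
```

The insertion pattern is tested first so that a name such as 630Δerm mfd::erm is read as an insertion in the 630Δerm background rather than as a deletion.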


Strain Relevant Characteristics Source/Reference

630 Wild-Type Strain (Mullany et al., 1990)

R20291 Hypervirulent Wild-Type Strain (Stabler et al., 2009)

630ΔpilA1 pilA1 deleted from 630 using codA technique This Work

630ΔpilB1 pilB1 deleted from 630 using codA technique This Work

630ΔpilB2 pilB2 deleted from 630 using codA technique This Work

630ΔpilC1 pilC1 deleted from 630 using codA technique This Work

630ΔpilD1 pilD1 deleted from 630 using codA technique This Work

630ΔpilK pilK deleted from 630 using codA technique This Work

630ΔpilMN pilMN deleted from 630 using codA technique This Work

630ΔpilO pilO deleted from 630 using codA technique This Work

630ΔpilT pilT deleted from 630 using codA technique This Work

630ΔpilU pilU deleted from 630 using codA technique This Work

630ΔpilV pilV deleted from 630 using codA technique This Work

630ΔpilA1Terminator pilA1Terminator deleted from 630 using codA technique This Work

630Δerm mfd::erm Insertional mutant of mfd in 630Δerm made using ClosTron technique Dr S. Willing

R20291 pilB1::erm R20291 with pilB1 insertionally inactivated This Work

Table 2.1. Strains of C. difficile Used in This Study. Dr S. Willing is a former member of the Fairweather group.

2.1.2 Culturing Escherichia coli

E. coli strains were routinely cultured in Luria-Bertani (LB) broth (peptone 10 g/L, yeast extract 5 g/L, NaCl 10 g/L; Merck KGaA, Germany) at 37 °C with shaking at 200 rpm (unless otherwise indicated) and on LB agar (as above with 12 g/L agar; Merck KGaA), stationary at 37 °C unless otherwise indicated. Where indicated, strains were grown in TY broth (tryptose (Becton Dickinson (BD), USA) 30 g/L, yeast extract (BD) 20 g/L) under the conditions indicated. Where appropriate, media were supplemented with kanamycin (50 μg/ml), carbenicillin (50 μg/ml) or chloramphenicol (15 μg/ml when culturing E. coli CA434 carrying pMTL960-derived plasmids, 30 μg/ml when culturing E. coli Rosetta (DE3) or E. coli BL21 (DE3) carrying pACYCDuet-1).


Strain Source Toxinotype Associated Infection or Pathology

R32977 ARU (UK) tcsL–/tcsH– Knee Replacement

R32921 ARU (UK) tcsL–/tcsH– Death During Pregnancy

R32668 ARU (UK) tcsL–/tcsH– Wound Infection

R32462 ARU (UK) tcsL–/tcsH– Endo-Cervical Discharge

R31809 ARU (UK) tcsL–/tcsH– Traumatic Knee Injury

R30684 ARU (UK) tcsL–/tcsH– Calf Abscess

R29426 ARU (UK) tcsL–/tcsH– Sudden Death

R28058 ARU (UK) tcsL–/tcsH– Crushed Hand

R27882 ARU (UK) tcsL–/tcsH– Knee Amputation

R26833 ARU (UK) tcsL–/tcsH– Blood Culture from Diabetic

W2967 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W10 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W2922 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W2945 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W2946 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W2948 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W2975 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W3025 APHA (UK) tcsL–/tcsH– Veterinary Isolate

W3026 APHA (UK) tcsL–/tcsH– Veterinary Isolate

JGS444 ISU (USA) tcsL+/tcsH– Myonecrosis, Bovine

JGS445 ISU (USA) tcsL+/tcsH– Myonecrosis, Bovine

JGS6382 (Walk et al., 2011) ISU (USA) tcsL+/tcsH+ Myonecrosis, Bovine

JGS6956 ISU (USA) tcsL–/tcsH– Veterinary Isolate

JGS6961 ISU (USA) tcsL–/tcsH– Veterinary Isolate

ATCC9714 (Hall and Scott, 1927) ATCC (USA) tcsL+/tcsH– Oedema

DA108 (Hao et al., 2010) UMich (USA) tcsL–/tcsH– Post-Partum Endometritis

SSCC33589 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC42239 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC26591 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC37615 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC18838 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC18392 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC35109 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC33587 UWA (AUS) tcsL–/tcsH– Blood Isolate

SSCC32135 UWA (AUS) tcsL–/tcsH– Blood Isolate

UMC1 (Voth et al., 2006) OU (USA) tcsL–/tcsH– Allograft Isolate

UMC2 (Voth et al., 2006) OU (USA) tcsL–/tcsH– Allograft Isolate

UMC164 (Voth et al., 2006) OU (USA) tcsL–/tcsH– Allograft Isolate

UMC178 (Voth et al., 2006) OU (USA) tcsL–/tcsH– Allograft Isolate

UMC4401 (Voth et al., 2006) OU (USA) tcsL–/tcsH– Allograft Isolate

UMC4404 (Voth et al., 2006) OU (USA) tcsL–/tcsH– Allograft Isolate

E204 MU (AUS) tcsL–/tcsH– Clinical Isolate

R15892 ARU (UK) tcsL–/tcsH– Clinical Isolate

JGS6364 ISU (USA) tcsL+/tcsH– Myonecrosis, Bovine

Table 2.2. Strains of C. sordellii Used in This Study. ARU = Anaerobe Reference Unit, Cardiff, UK; APHA = Animal and Plant Health Agency, Addlestone, UK (formerly Veterinary Laboratories Agency (VLA)); ISU = Iowa State University, Ames, IA, USA; ATCC = American Type Culture Collection, Manassas, VA, USA; UMich = University of Michigan, Ann Arbor, MI, USA; UWA = University of Western Australia, Crawley, WA, Australia; OU = University of Oklahoma, Oklahoma City, OK, USA; MU = Monash University, Melbourne, VIC, Australia. The associated pathology or infection is given where known and the infection host was human unless otherwise stated.

2.1.3 Culturing Clostridium difficile

C. difficile was routinely cultured in TY broth and on BHIS agar (Brain-Heart Infusion (BHI – Oxoid, UK) 37 g/L, yeast extract 5 g/L, L-Cysteine 1 g/L, agar (BD) 15 g/L) at 37 °C without shaking, under anaerobic conditions. These conditions were maintained using an anaerobic cabinet (Don Whitley Scientific Limited, UK) which contained an atmosphere of 80 % N2, 10 % CO2, 10 % H2. Where indicated, C. difficile was also grown at room temperature in an anaerobic jar (Thermo Fisher Scientific, USA) containing an anaerobic atmosphere generated using an Anaerogen™ sachet (Thermo Fisher Scientific). Media were supplemented where appropriate with thiamphenicol (15 μg/ml), cycloserine (250 μg/ml), cefoxitin (81μg/ml) and lincomycin (20 μg/ml).

Where indicated, C. difficile was also cultured in BHIS broth (BHI 37 g/L, yeast extract 5 g/L, L-Cysteine 1 g/L), SMC broth (Bacto Peptone (BD) 90 g/L, Proteose Peptone 5 g/L, Tris-(hydroxymethyl)aminomethane (henceforth “Tris”, Thermo Fisher Scientific) 1.5 g/L, (NH4)2SO4 1 g/L), BHI broth (BHI 37 g/L), Fastidious Anaerobe Broth (FAB – LabM Limited, UK; 29.7 g/L) and TGY broth (Tryptic Soy Broth (BD) 30 g/L, yeast extract 10 g/L, L-Cysteine 1 g/L, Glucose 2 % w/v). Also where indicated, C. difficile was cultured on BHI agar (BHI broth with 15 g/L agar), SMC agar (SMC broth with 15 g/L agar), TY agar (TY broth with 15 g/L agar), Blood agar (Blood Agar Base (Oxoid) 40 g/L, defibrinated horse blood (Oxoid) 70 ml/L), Columbia agar (Columbia Blood Agar Base (Oxoid) 39 g/L, agar 5 g/L, defibrinated horse blood 50 ml/L), Fastidious Anaerobe Agar (FAA – BioConnections, UK, 45.6 g/L, Agar 3 g/L, defibrinated horse blood 70 ml/L) and Brazier’s agar (C.C.E.Y. Agar (BioConnections) 48 g/L, agar 3 g/L, egg yolk emulsion (Southern Group Laboratory, UK) 40 ml/L, defibrinated horse blood 10 ml/L).

2.1.4 Culturing Clostridium sordellii

C. sordellii was routinely cultured in BHIS broth, at 37 °C without shaking under anaerobic conditions maintained using an anaerobic cabinet, and on BHIS agar containing an additional 5 g/L agar (total of 20 g/L agar), under identical conditions.

2.1.5 Storage of Strains

A culture of the desired strain was grown overnight in 5 ml of medium (LB for E. coli, BHIS for C. difficile or C. sordellii). 1 ml of overnight culture was mixed with 400 μl of 70 % glycerol (sterilised through a 0.2 µm filter and pre-reduced if necessary), vortexed and stored at -80 °C.

2.1.6 Production of Chemically Competent E. coli

Except for NEB5α cells, which were commercially acquired, chemically competent cells of all strains were produced from stocks. An overnight culture of the strain in 5 ml LB broth was sub-cultured into 25 ml fresh LB broth in a 250 ml baffled flask to a resultant OD600 of 0.1. The culture was incubated at 37 °C with shaking until an OD600 of ~0.6 was reached, whereupon cells were harvested by centrifugation at 4000 g for 10 mins at 4 °C. Pellets were resuspended in 12.5 ml ice-cold 50 mM CaCl2, 15 % (v/v) glycerol and incubated on ice for 20 mins. Cells were harvested again by centrifugation at 4000 g for 10 mins at 4 °C and the pellet resuspended in 1.25 ml ice-cold 50 mM CaCl2, 15 % (v/v) glycerol. The resuspension was then aliquoted and frozen at -80 °C.
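The sub-culturing step above amounts to a simple dilution calculation. The sketch below is illustrative only: it assumes OD600 is proportional to cell density over the range used, and the overnight OD600 value in the example is hypothetical, as no such value is given in the text.

```python
def inoculum_volume_ml(overnight_od600: float,
                       target_od600: float = 0.1,
                       fresh_broth_ml: float = 25.0) -> float:
    """Volume of overnight culture to add to fresh broth to reach the target OD600.

    Solves v * OD_overnight = OD_target * (fresh_broth_ml + v) for v,
    i.e. the added volume is counted in the final volume.
    """
    if overnight_od600 <= target_od600:
        raise ValueError("overnight culture must be denser than the target OD600")
    return target_od600 * fresh_broth_ml / (overnight_od600 - target_od600)


# Hypothetical overnight culture at OD600 = 2.6, diluted into 25 ml fresh LB.
print(round(inoculum_volume_ml(2.6), 2), "ml")
```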

2.1.7 Transformation of E. coli

Chemically competent E. coli cells were transformed by heat-shock. 10-25 μl of competent cells were thawed on ice. 0.5 μl (approx. 40-200 ng) intact plasmid DNA or 2 μl ligation/Gibson assembly reaction mixture was added to the cells, which were then incubated on ice for 30 mins. Cells were then heat shocked for 30 sec at 42 °C then returned to ice for 5 mins. 150 μl SOC Outgrowth Medium (NEB) was then added to the cells, which were then incubated at 37 °C for 1 hr with shaking at 200 rpm, following which 100 μl of the transformation mixture was spread onto LB-agar plates supplemented with appropriate antibiotic.

2.1.8 Conjugative Transfer of Plasmid DNA into Clostridia

1 ml was taken from a 5 ml overnight LB culture of E. coli CA434 containing the desired vector and harvested by centrifugation for 2 min at 1700 g. The supernatant was discarded and the pellet transferred to the anaerobic cabinet. The pellet was then gently resuspended in 200 μl of a Clostridial culture grown overnight in BHIS broth. 20 μl aliquots were then spotted onto non-selective BHIS agar plates. Plates were incubated at 37 °C for 8 hrs (C. difficile 630 and derivatives and C. sordellii strains) or 24 hrs (C. difficile R20291 and derivatives) before cells were scraped off the plate in 500 μl pre-reduced phosphate-buffered saline (PBS; 137 mM NaCl, 11.9 mM phosphates, 2.7 mM KCl; Thermo Fisher Scientific) and spread onto selective BHIS plates containing thiamphenicol for plasmid selection and cycloserine to inhibit E. coli growth. Transconjugants were re-streaked twice onto selective BHIS to ensure purity.

2.2 Bioinformatics

2.2.1 DNA Sequence Visualisation

C. difficile genome sequences were obtained from the Wellcome Trust Sanger Institute website (www.sanger.ac.uk) or directly from the Wellcome Trust Sanger Institute, while C. sordellii sequences were obtained personally from Mr Hilary Browne, Wellcome Trust Sanger Institute, UK, as described in the text. Genome sequences were visualised using Artemis Genome Browser (available free of charge from the Wellcome Trust Sanger Institute website), while plasmid and DNA fragment maps were constructed and visualised using ‘Geneious’ (Biomatters Ltd., New Zealand).


2.2.2 DNA Sequencing

All DNA sequencing (except genome sequencing) was performed by GATC Biotech (Germany). Sequences were aligned using Geneious; sequencing chromatograms were visualised using ‘Chromas Lite’ (Technelysium, Australia).

2.2.3 DNA/RNA, Gene and Protein Analysis Alignment searches for nucleotide and protein sequences were performed using the Basic Local Alignment Search Tool (BLAST, http://blast.ncbi.nlm.nih.gov/Blast.cgi), while alignments for nucleotide and protein sequences were performed using Geneious or Clustal Omega (Sievers et al., 2011). Where alignments were performed using Clustal Omega, the alignments were viewed using MView (Brown et al., 1998). Basic properties of proteins were calculated using ProtParam (http://web.expasy.org/protparam/) (Gasteiger et al., 2003); protein signal sequences and signal sequence cleavage sites were predicted using SignalP (http://www.cbs.dtu.dk/services/SignalP/) and Phobius (http://phobius.sbc.su.se/) (Käll et al., 2007; Petersen et al., 2011); protein trans-membrane helix predictions were performed using TMHMM (Krogh et al., 2001) (http://www.cbs.dtu.dk/services/TMHMM/); protein structures were predicted using Phyre2 (http://www.sbg.bio.ic.ac.uk/phyre2) (Kelley and Sternberg, 2009); DNA/RNA secondary structure and properties were predicted using Integrated DNA Technologies’ (USA) OligoAnalyzer (http://eu.idtdna.com/calc/analyzer).

2.3 DNA Manipulation

2.3.1 Genomic DNA Extraction

Genomic DNA (gDNA) was extracted from C. difficile and C. sordellii by phenol/chloroform extraction. Strains were grown overnight in 25 ml BHIS broth. Cells were harvested by centrifugation at 4000 g for 10 mins then washed with 1 ml TE buffer (10 mM Tris-HCl, 1 mM Na2EDTA, pH 8).

C. difficile cells were then lysed by resuspension in 500 μl lysis buffer (200 mM NaCl, 50 mM EDTA, 20 mM Tris-HCl, pH 8) followed by the addition of 5 μl of 2 mg/ml CD27L (final concentration 20 μg/ml), a phage-derived endolysin with N-acetylmuramoyl-L-amidase activity which is able to degrade the C. difficile cell wall (Mayer et al., 2008). CD27L was purified and provided by Drs H.A. Shaw, A. Dale and J. Peltier (all former members of the Fairweather Group). The resuspension was then incubated for 1 hr at 37 °C, following which 20 μl 20 mg/ml Pronase (Roche, Switzerland) was added (final concentration 0.8 mg/ml) and the resuspension incubated for 1 hr at 55 °C. 200 μl 10 % N-lauroylsarcosine solution was then added (final concentration 3.8 % (v/v)) and the suspension incubated for 1 hr at 37 °C, after which 500 μl nuclease-free water (Life Technologies, USA) was added followed by RNase A to a final concentration of 0.2 mg/ml and the suspension incubated for 1 further hr at 37 °C.

C. sordellii cells were lysed by initially freezing for 1 hr at -80 °C, followed by immediate resuspension in 400 μl lysis buffer. Lysozyme was then added to a resultant concentration of 2 mg/ml and RNase A to 0.2 mg/ml and the resuspension incubated for 2 hrs at 37 °C. Proteinase K was then added to a resultant concentration of 0.5 mg/ml and SDS to 1 % (w/v) and the suspension incubated at 50 °C for 1 hr.

Cell lysates were then combined with an equal volume of chloroform:isoamyl alcohol (24:1 mixture) and mixed by inversion, then transferred to a phase-lock gel (PLG) tube (VWR, USA). The mixture was then centrifuged at 4 000 g for 10 mins, and the upper phase then transferred to a fresh PLG tube. This process was repeated twice, and the upper phase resulting from the final centrifugation transferred to a fresh tube (non-PLG) and DNA precipitated from it by the addition of 2.5 volumes ice-cold 100 % ethanol (VWR) followed by incubation overnight at -20 °C. DNA was then harvested by centrifugation at 17 900 g for 5 mins at 4 °C, washed with 500 μl 70 % ethanol then centrifuged again at 17 900 g for 5 mins at 4 °C. The supernatant was removed and the pellet left to dry at room temperature for approx. 10 mins, then resuspended in 50 μl nuclease-free water. DNA concentrations were measured using a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific).
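The final concentrations quoted for the enzyme additions follow from a standard dilution calculation. As a minimal sketch (the function name is an assumption made here, and the running volume is taken simply as the sum of the additions stated in the text), the CD27L and Pronase figures can be checked as follows:

```python
def final_concentration(stock_conc: float,
                        stock_volume_ul: float,
                        starting_volume_ul: float) -> float:
    """Concentration after adding a stock to an existing volume (C1*V1 = C2*V2).

    The result has the same units as stock_conc.
    """
    total_volume_ul = starting_volume_ul + stock_volume_ul
    return stock_conc * stock_volume_ul / total_volume_ul


# CD27L: 5 ul of 2 mg/ml (2000 ug/ml) added to 500 ul lysis buffer.
cd27l_ug_per_ml = final_concentration(2000.0, 5.0, 500.0)

# Pronase: 20 ul of 20 mg/ml added to the resulting 505 ul.
pronase_mg_per_ml = final_concentration(20.0, 20.0, 505.0)

print(f"CD27L   ~{cd27l_ug_per_ml:.1f} ug/ml")
print(f"Pronase ~{pronase_mg_per_ml:.2f} mg/ml")
```

Both values round to the figures given in the text (20 μg/ml and 0.8 mg/ml respectively).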

2.3.2 Purification of Plasmid DNA

E. coli cultures were grown overnight in LB broth. Cells were harvested by centrifugation at 5000 g for 12 mins. Plasmid DNA was extracted and purified using a QIAprep Spin Miniprep Kit (Qiagen, Netherlands) according to manufacturer’s instructions, using a microcentrifuge and eluting plasmid DNA in nuclease-free water. DNA concentrations were measured using a NanoDrop 1000 Spectrophotometer. All plasmids used in this study are listed in Table A1, in Appendix 1 at the end of this thesis.

2.3.3 Polymerase Chain Reaction (PCR) Primers

All primers used in this study were synthesised by Sigma-Aldrich, and are listed in Table A2, in Appendix 1 at the end of this thesis. All primers were used at a resultant concentration of 300 nM, except when used in quantitative reverse-transcriptase PCR (qPCR) wherein primers were used at a resultant concentration of 500 nM. Annealing temperatures used for all PCR reactions (except qPCR) were approximately 5 °C lower than the lowest primer melting temperature (Tm) in the reaction, as calculated by the Eurofins Genomics ‘Oligo Property Scan’ accessible through www.mwg-biotech.com.
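The annealing temperature rule described above can be expressed in a few lines. This is an illustrative sketch only: the Tm values in the example are hypothetical, and in practice they would be taken from the Tm calculator named in the text.

```python
def annealing_temperature_c(primer_tms_c, offset_c: float = 5.0) -> float:
    """Annealing temperature: the lowest primer Tm in the reaction minus an offset."""
    if not primer_tms_c:
        raise ValueError("at least one primer Tm is required")
    return min(primer_tms_c) - offset_c


# Hypothetical primer pair with Tm values of 58.4 and 61.0 degrees C.
print(annealing_temperature_c([58.4, 61.0]))
```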

2.3.4 High Fidelity PCR

High fidelity PCRs were performed using KOD Hot Start DNA polymerase (Merck KGaA). Reactions were performed according to manufacturer’s instructions in a total volume of 50 μl. Reaction mixtures comprised: 5 μl 10x enzyme buffer, 200 μM dNTPs, 1.5 mM MgSO4, 2.5 μl DMSO, 300 nM primers, 200 ng gDNA template or 5 ng plasmid DNA template, 1 μl KOD Hot Start DNA polymerase (2 μl polymerase was used when amplifying targets of over 3 kb) and the remainder nuclease-free water. Thermo-cycler conditions were: 94 °C for 2 mins (initial denaturation), then 30 cycles comprising: 94 °C for 15 sec, primer Tm minus 5 °C for 30 sec, 68 °C for 1 min/kb predicted product size (when amplifying from gDNA) or 72 °C for 30 sec/kb predicted product size (when amplifying from plasmid DNA), ending with a hold at 4 °C.
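The high fidelity cycling conditions depend on both the template type and the predicted product size. A minimal sketch of how such a programme could be written down is given below; the function name and the dictionary layout are assumptions made for illustration and do not correspond to any thermo-cycler software.

```python
def kod_cycling_programme(product_kb: float, lowest_tm_c: float, template: str) -> dict:
    """Build the high fidelity PCR programme described in the text.

    Each step is (name, temperature in degrees C, duration in seconds).
    """
    if template == "gDNA":
        extension_temp_c, extension_s = 68, 60 * product_kb   # 1 min/kb
    elif template == "plasmid":
        extension_temp_c, extension_s = 72, 30 * product_kb   # 30 sec/kb
    else:
        raise ValueError("template must be 'gDNA' or 'plasmid'")
    return {
        "initial_denaturation": ("denature", 94, 120),
        "cycles": 30,
        "cycle_steps": [
            ("denature", 94, 15),
            ("anneal", lowest_tm_c - 5, 30),
            ("extend", extension_temp_c, extension_s),
        ],
        "hold_c": 4,
    }


# Hypothetical 2 kb target amplified from gDNA, lowest primer Tm of 60 degrees C.
for step in kod_cycling_programme(2.0, 60, "gDNA")["cycle_steps"]:
    print(step)
```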

2.3.5 Colony/Low Fidelity PCR

Colony PCRs were performed on E. coli NEB5α colonies to screen plasmids and on C. difficile colonies to screen for mutants. Template DNA was prepared from E. coli by resuspending a colony in 50 μl nuclease-free water and heating at 100 °C for 10 mins. Template DNA was prepared from C. difficile by resuspending a colony in 50 μl 5 % Chelex 100 (Bio-Rad, USA), heating at 100 °C for 10 mins and centrifuging at 17 000 g for 5 mins. The resultant supernatant was used as template. Low fidelity PCR was performed to screen purified gDNA. All colony and low fidelity PCR reactions were performed in 20 μl volume. Colony PCRs were performed using either REDTaq DNA Polymerase (Sigma-Aldrich) (for predicted product sizes of < 1 kb), Taq DNA Polymerase from Thermus aquaticus (Sigma-Aldrich) (for predicted product sizes of 1 kb ≤ x < 3 kb) or KOD Hot Start DNA Polymerase (for predicted product sizes ≥ 3 kb). All screening PCRs were performed using Taq DNA Polymerase from Thermus aquaticus. All PCR reactions were performed according to polymerase manufacturers’ instructions.

REDTaq DNA Polymerase reaction mixtures comprised 10 μl REDTaq ReadyMix, 300 nM primers, 2 μl template DNA and the remainder nuclease-free water. Taq DNA Polymerase reaction mixtures comprised 2 μl 10x enzyme buffer, 2.5 mM MgCl2, 200 μM dNTPs, 300 nM primers, 2 μl template DNA, 0.2 μl Taq DNA polymerase and the remainder nuclease-free water. Thermo-cycler conditions for both REDTaq and Taq polymerase were: 94 °C for 2 mins (initial denaturation), then 30 cycles comprising: 94 °C for 30 sec, primer Tm minus 5 °C for 30 sec, 72 °C for 1 min/kb predicted product size.

KOD Hot Start DNA Polymerase reaction mixtures comprised 2 μl 10x enzyme buffer, 200 μM dNTPs, 1.5 mM MgSO4, 1 μl DMSO, 300 nM primers, 1.5 μl template DNA, 0.8 μl KOD Hot Start DNA Polymerase and the remainder nuclease-free water. Thermo-cycler conditions were: 94 °C for 2 mins (initial denaturation), then 30 cycles comprising: 94 °C for 15 sec, primer Tm minus 5 °C for 30 sec, 68 °C for 1 min/kb predicted product size.
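The choice of polymerase for colony PCR is a simple function of the predicted product size. The sketch below restates the size thresholds given in the text; the function name is an assumption introduced for illustration.

```python
def colony_pcr_polymerase(product_kb: float) -> str:
    """Select the colony PCR polymerase from the predicted product size (kb)."""
    if product_kb <= 0:
        raise ValueError("product size must be positive")
    if product_kb < 1:
        return "REDTaq DNA Polymerase"          # products under 1 kb
    if product_kb < 3:
        return "Taq DNA Polymerase"             # 1 kb up to (not including) 3 kb
    return "KOD Hot Start DNA Polymerase"       # 3 kb and above


for size_kb in (0.6, 1.0, 2.9, 3.0):
    print(size_kb, "kb ->", colony_pcr_polymerase(size_kb))
```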

2.3.6 Agarose Gel Electrophoresis All PCR products were analysed by electrophoresis through agarose gels. Gels were made of between 1 % and 4 % ultra-pure agarose (Life Technologies), dependent on product size, dissolved in TAE buffer (40 mM Tris-Acetate, 2 mM Na2EDTA, pH 8.3; National Diagnostics, USA). SYBR Safe (Life Technologies) was added to gels in order to visualise DNA. Samples were mixed with 6x Gel Loading Dye (either purple or blue, NEB) prior to loading onto gels. 10 μl PCR product, mixed with 2 μl Gel Loading Dye, was loaded. Electrophoresis was performed in TAE buffer at 100 V for between 20 and 30 mins, dependent on the agarose percentage of the gel. To estimate product size either 1 kb DNA Ladder (NEB) or 100 bp DNA Ladder (NEB) (dependent on expected product size and percentage of agarose in gel) was loaded onto each gel alongside samples. Following electrophoresis gels were visualised using a Safe Imager Blue Light Transilluminator (Life Technologies) and photographed using an InGenius system (Syngene, UK).

2.3.7 Purification of PCR Products

Where necessary, PCR products were purified using a QIAquick PCR Purification Kit (Qiagen) according to the manufacturer’s instructions, using a microcentrifuge and eluting DNA in nuclease-free water. If it was necessary to degrade template DNA following PCR amplification from a plasmid (i.e. during mutagenesis by inverse PCR) the purified products were mixed with DpnI (NEB) and appropriate buffer in a total volume of 40 μl, then incubated for 2 hr at 37 °C. Products were then purified again from the reaction mix using a QIAquick PCR Purification Kit. DNA concentrations were measured using a NanoDrop 1000 Spectrophotometer.

2.3.8 Restriction Digests

All digests were performed with restriction enzymes obtained from NEB. Where available, high-fidelity (HF) versions were used. Digests were carried out according to the manufacturer’s instructions; where necessary, the NEB Double Digest Finder (https://www.neb.com/tools-and-resources/interactive-tools/double-digest-finder) was used to identify optimal conditions. Digests were performed in 20 μl volume, with 500 ng substrate DNA made up to 20 μl with nuclease-free water. Digests were incubated at 37 °C for 1 hr then analysed by agarose gel electrophoresis.

2.3.9 Extraction of DNA from Agarose Gels

Where necessary, desired DNA-containing sections were extracted from agarose gels using a scalpel. DNA was then extracted from the agarose gel sections using a QIAquick Gel Extraction Kit (Qiagen) according to the manufacturer’s instructions, using a microcentrifuge and eluting DNA in nuclease-free water. Isopropanol was obtained from VWR. DNA concentrations were measured using a NanoDrop 1000 Spectrophotometer.

2.3.10 DNA Ligation

Quick-Stick Ligase (Bioline, UK) was used for DNA ligations, according to manufacturer’s instructions. For ligations using sticky ends (during cloning), linearised vector DNA was mixed with purified insert at a molar ratio of 1:3, with a maximum of 100 ng total DNA. Quick-Stick Ligase and buffer were added and the reaction mix made up to 20 μl using nuclease-free water. The reaction was incubated at room temperature for 5 mins, and 2 μl of mixture then used to transform NEB5α competent cells. For ligations using blunt ends (during plasmid mutagenesis by inverse PCR) 50 ng DNA was mixed with ligase buffer (initially without ligase) and 1 μl T4 Polynucleotide Kinase (PNK) (NEB), made up to a total volume of 19 μl with nuclease-free water, and incubated at 37 °C for 30 mins. The mixture was then cooled to room temperature and 1 μl Quick-Stick Ligase added, and the mixture incubated at room temperature for 15 mins. 2 μl of mixture was then used to transform NEB5α competent cells.
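For the sticky-end ligations, the DNA masses that satisfy both the molar ratio and the cap on total DNA can be worked out from the fragment lengths. The following is a sketch under stated assumptions: the 1:3 ratio is read as vector:insert, both fragments are double-stranded so that moles are proportional to mass divided by length, and the fragment sizes in the example are hypothetical.

```python
def ligation_masses_ng(vector_bp: int, insert_bp: int,
                       insert_per_vector: float = 3.0,
                       total_ng: float = 100.0):
    """Return (vector_ng, insert_ng) giving the requested molar ratio at total_ng.

    Moles are proportional to mass / length, so the insert:vector mass ratio
    equals insert_per_vector * insert_bp / vector_bp.
    """
    mass_ratio = insert_per_vector * insert_bp / vector_bp
    vector_ng = total_ng / (1.0 + mass_ratio)
    insert_ng = total_ng - vector_ng
    return vector_ng, insert_ng


# Hypothetical 6 kb vector and 1 kb insert.
vector_ng, insert_ng = ligation_masses_ng(6000, 1000)
print(f"vector {vector_ng:.1f} ng, insert {insert_ng:.1f} ng")
```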

2.3.11 Gibson Assembly

Assembly of plasmids from DNA fragments was performed using Gibson Assembly Master Mix (NEB). Primers used to produce said DNA fragments by PCR were designed using NEBuilder (http://nebuilder.neb.com/). Gibson Assembly Master Mix was used according to manufacturer’s instructions. Linearised vector and insert fragments were mixed in a 1:3 molar ratio, using 10 ng vector and amounts of insert fragments as calculated. Combined DNA was then added to 5 μl Gibson Assembly Master Mix, and nuclease-free water added to make a total volume of 10 μl. This mixture was incubated at 50 °C for 30 mins. 2 μl of mixture was then used to transform NEB5α.


2.3.12 Sequencing of Plasmids

To confirm correct construction, all plasmid constructs were sequenced by GATC using appropriate sequencing primers unless the construct was made by direct sub-cloning of a region of DNA from one established plasmid into another. In this case sequencing was not performed.

2.3.13 ClosTron Mutagenesis

Insertional mutants were generated in C. difficile R20291 using the ClosTron system (Heap et al., 2010; Heap et al., 2007). The Group II Intron used for ClosTron mutagenesis was re-targeted to genes of interest using clostron.com to identify possible insertion sites and the required intron sequence for each target. The intron was thus re-targeted and cloned into pMTL007C-E5 by DNA 2.0 (USA). Plasmids were transformed into E. coli CA434 (Section 2.1.7) then conjugated into C. difficile (Section 2.1.8). Transconjugants were streaked onto Brazier’s agar containing lincomycin. Plasmid loss from lincomycin-resistant colonies was confirmed by testing thiamphenicol sensitivity. Thiamphenicol-sensitive colonies were screened by colony PCR (Section 2.3.5) to confirm intron insertion into the target gene. Putative ClosTron mutants of pilB1 were screened with primers NF2096/2097.

2.3.14 Southern Blotting

Southern blotting was used to confirm the presence of only one Group II Intron in the genomes of mutants generated using the ClosTron system (Section 2.3.13). gDNA was extracted from strains-of-interest (Section 2.3.1). 5 μg gDNA was digested for 6 hrs using, separately, NdeI and XmnI restriction enzymes each in a total volume of 30 μl. Digests then underwent electrophoresis on a 1 % agarose gel overnight (approx. 15 hrs) at 20 V. The gel was then imaged next to a ruler for sizing of bands, then incubated in Denaturing Buffer (1.5 M NaCl, 0.5 M NaOH) for 45 mins at room temperature with rocking. The gel was then rinsed in distilled water, incubated in Neutralising Buffer (1.5 M NaCl, 1 M Tris, pH 7.4) for 45 mins at room temperature with rocking, then rinsed again in distilled water. The gel was then placed in a capillary blotting stack to transfer DNA from the gel onto Biodyne B nylon membrane (Pall, USA). The base of the stack contained approx. 500 ml 20x SSC solution (3 M NaCl, 0.3 M Trisodium Citrate Dihydrate). Four pieces of Whatman Paper (GE Life Sciences, UK) wet in 20x SSC were placed on the base of the stack, beneath the gel. The bottom two pieces were partially submerged in 20x SSC. The nylon membrane was placed on top of the gel, and two pieces of Whatman paper wet in 10x SSC on top of the membrane. Tissue paper was placed upon the Whatman, and a weight of approx. 500 g on top of that, and the stack left to blot overnight.

Following blotting, DNA was cross-linked to the membrane using a UV Stratalinker 1800 (Stratagene, USA) then pre-hybridised by incubating at 55 °C with 10 ml pre-heated hybridisation solution (Amersham AlkPhos Direct Hybridisation Buffer (GE Healthcare, UK) containing 0.5 M NaCl and 4 % (w/v) AlkPhos Direct Blocking Reagent (GE Healthcare)). The DNA probe was then labelled using the AlkPhos Direct Labelling and Detection System (GE Healthcare) (the probe was made by PCR amplification from the relevant ClosTron plasmid using primers NF1597/1598). 10 μl of 10 ng/μl probe was denatured by heating for 5 mins at 99 °C then cooled on ice for 5 mins. 10 μl AlkPhos Direct reaction buffer, 2 μl AlkPhos Direct labelling reagent and 10 μl 1x crosslinker solution were mixed with the probe, and the mixture incubated at 37 °C for 30 mins then added to the pre-hybridised membrane. Membrane and probe were incubated overnight at 55 °C with rotation. The buffer was removed from the membrane and the membrane washed twice for 10 mins at 55 °C with rotation in 50 ml pre-warmed wash buffer 1 (2 M urea, 0.1 % (w/v) SDS, 150 mM NaCl, 0.2 % (w/v) AlkPhos Direct Blocking Reagent, 1 mM MgCl2, 50 mM Na2HPO4), then twice for 10 mins at room temperature with wash buffer 2 (50 mM Tris, 100 mM NaCl, 2 mM MgCl2). 5 ml 1x CDP-Star detection reagent (Roche), diluted in wash buffer 2, was applied to the membrane and incubated at room temperature for 5 mins. The blot was then visualised using a LAS-3000 Imager (Fujifilm, Japan).

2.3.15 Allele Exchange Mutagenesis

Deletion mutants were generated in C. difficile 630 using the codA-linked method of allele exchange (Cartman et al., 2012). Plasmids were designed containing adjacent ‘homologous arms’ of approx. 1 kb and constructed using Gibson Assembly (Section 2.3.11). These plasmids were based on the codA allele exchange plasmid pMTL-SC7315 (Cartman et al., 2012). Constructed plasmids were transformed into E. coli CA434 then conjugated into C. difficile 630. Single cross-overs were selected based on their increased speed of growth (single-crossover colonies are noticeably larger than those of non-integrants) and re-streaked until they were pure (based on a uniform colony size – PCR screening was not performed at this point). This normally took 2-3 restreaks. Pure single cross-overs were then incubated for 96 hrs on non-selective BHIS agar, then resuspended in PBS and serially diluted. Neat resuspension and 10⁻¹ and 10⁻² dilutions were then plated onto adapted C. difficile minimal medium (CDMM, see Table 2.3) (Cartman and Minton, 2010) supplemented with 50 μg/ml 5-fluorocytosine (FC). FC-resistant colonies were re-streaked onto BHIS agar, both non-selective and thiamphenicol-supplemented. Thiamphenicol-sensitive colonies, which had lost the pMTL-SC7315-based allele exchange plasmid, were screened by colony PCR (Section 2.3.5) using primers which bound the chromosome either side of the region to be deleted. Mutants were identified as having smaller resultant PCR products than the wild-type. Mutants were confirmed by Sanger sequencing of said PCR products.

Putative pilA1 mutants were screened with primers NF3091/3092; putative pilA1Terminator mutants were screened with primers NF3128/3129; putative pilB1 mutants were screened with primers NF3130/3131; putative pilC1 mutants were screened with primers NF3193/3194; putative pilMN mutants were screened with primers NF3195/3196; putative pilO mutants were screened with primers NF3197/3198; putative pilV mutants were screened with primers NF3124/3125; putative pilU mutants were screened with primers NF3093/3094; putative pilK mutants were screened with primers NF3191/3192; putative pilT mutants were screened with primers NF2793/2794; putative pilD1 mutants were screened with primers NF3201/3202; putative pilD2 mutants were screened with primers NF3199/3200; and putative pilB2 mutants were screened with primers NF3132/3133.

In all instances except for the pilA1Terminator mutant, the PCR product used for Sanger sequencing was produced using the same primers as were used for double cross-over screening.

In the case of the pilA1Terminator mutant, the product used for Sanger sequencing was obtained by amplification using primers NF3126/3127.

2.3.16 Vector Construction

Construction of Vectors for PilA2 Purification

Throughout this section, C. difficile genes/DNA fragments were amplified from C. difficile strain 630 gDNA. Plasmid pRPF230 was constructed by Dr R. Fagan. The pilA2 gene from strain 630 was amplified using primers NF1990/NF1991 and the resultant product cloned into pET28a using NcoI and XhoI, yielding pRPF230. (Throughout this thesis, where pET28a-derived constructs were synthesised, desired constructs were identified by colony PCR screening of multiple clones using primers NF34/35 prior to Sanger sequencing.) Plasmid pRPF230 comprised the pilA2 gene encoded with a C-terminal His-tag.

Component                   Conc. in Stock Soln. (mg/ml)   Final Conc. in CDMM (mg/ml)

Amino Acids (5x)
  Casamino Acids            50                             10
  L-Tryptophan              2.5                            0.5
  L-Cysteine                2.5                            0.5

Salts (10x)
  Na2HPO4                   50                             5
  NaHCO3                    50                             5
  KH2PO4                    9                              0.9
  NaCl                      9                              0.9

Glucose (20x)
  D-Glucose                 200                            10

Trace Salts (50x)
  (NH4)2SO4                 2                              0.04
  CaCl2.2H2O                1.3                            0.026
  MgCl2.6H2O                1                              0.02
  MnCl2.4H2O                0.5                            0.01
  CoCl2.6H2O                0.05                           0.001

Iron (1000x)
  FeSO4.7H2O                4                              0.004

Vitamins (1000x)
  D-Biotin                  1                              0.001
  Calcium-D-Pantothenate    1                              0.001
  Pyridoxine                0.1                            0.0001

Table 2.3. Recipe for CDMM.
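The relationship between the stock and final concentrations in Table 2.3 is a simple dilution by the stock strength. The following is a minimal sketch of that arithmetic; the function names and the choice of a 1 litre batch are illustrative assumptions and not part of the original protocol.

```python
# Stock concentration (mg/ml) and stock strength (Nx) for a few
# components, copied from Table 2.3.
CDMM_STOCKS = {
    "Casamino Acids": (50.0, 5),
    "Na2HPO4": (50.0, 10),
    "D-Glucose": (200.0, 20),
    "(NH4)2SO4": (2.0, 50),
    "FeSO4.7H2O": (4.0, 1000),
    "D-Biotin": (1.0, 1000),
}


def final_conc(stock_mg_per_ml: float, strength: int) -> float:
    """Final concentration (mg/ml) after diluting an Nx stock to 1x."""
    return stock_mg_per_ml / strength


def stock_volume_ml(strength: int, final_volume_ml: float) -> float:
    """Volume of Nx stock (ml) needed for a given final volume of medium."""
    return final_volume_ml / strength


if __name__ == "__main__":
    batch_ml = 1000.0  # hypothetical batch size
    for name, (stock, strength) in CDMM_STOCKS.items():
        print(f"{name}: {final_conc(stock, strength):g} mg/ml final, "
              f"{stock_volume_ml(strength, batch_ml):g} ml of {strength}x stock")
```

Each computed final concentration should match the right-hand column of the table, which is a quick way to check a recipe for transcription errors.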

The region of pRPF230 encoding the C-terminal His-tag was removed by inverse PCR using primers NF2583/NF2584 and the resultant linear backbone re-ligated, yielding plasmid pECC19. During this inverse PCR reaction a stop codon was inserted at the end of the pilA2 gene and immediately downstream of the stop codon an XhoI restriction site was inserted. DNA sequence attaching an N-terminal His-tag onto pilA2 in pECC19 was then inserted by a second round of inverse PCR using primers NF2585/NF2586, yielding pECC23. The N-terminal His-tag coding-sequence was inserted immediately after the initiating ATG start codon, and also resulted in the shifting of the NcoI restriction site from the site of the initiating ATG start codon to the end of the N-terminal His-tag coding-sequence. The overall effect of these two rounds of inverse PCR was to have shifted the His-tag from the C-terminus to the N-terminus of the encoded protein, and to allow the cloning of any gene into the new vector to encode an N-terminally His-tagged protein. Plasmid pECC38 was produced by amplification of the 5’ truncated pilA2 gene (see Section 3.2.3) using primers NF2742/2743 and cloning of the resultant PCR product into pECC23 using restriction enzymes NcoI/XhoI.

Construction of Vectors for C. difficile Pilin Expression in E. coli

For full-length pilin expression, primers NF1996/1997 were used to amplify pilA1, NF1986/1987 to amplify pilU, NF1994/1995 to amplify pilV and NF1998/1999 to amplify pilW. Each of these genes was amplified without its 5’ terminus, such that it encoded a pilin lacking its N-terminal signal peptide. The resultant PCR products were then cloned into pET28a using NcoI/XhoI, yielding pRPF227, pRPF232, pRPF228 and pRPF226 respectively. These plasmids were synthesised by Dr. R. Fagan. Primers NF2992/2993 were used to amplify pilK, NF2994/2995 to amplify pilJ, NF3110/2997 to amplify pilX and NF3111/2999 to amplify pilA3. Similarly, each of these genes was amplified without its 5’ terminus so that it encoded a pilin lacking its N-terminal signal sequence. The resultant PCR products were then cloned into pECC23 using NcoI/XhoI, yielding pECC56, pECC55, pECC53 and pECC54 respectively. For truncated pilin expression, the soluble domain-encoding region of pilK was amplified using primers NF3245/3246 and cloned into pECC38 using NcoI/XhoI, yielding pECC86, which encodes PilK with an N-terminal His-tag. The soluble domain-encoding regions of pilU and pilV were cloned into the two multiple cloning sites (MCSs) of pACYCDuet-1. The soluble domain-encoding region of pilU was amplified using primers NF3279/3281, and that of pilV using primers NF3282/3283. The pilV gene fragment was cloned into the second MCS of pACYCDuet-1 using NdeI/XhoI, yielding plasmid pECC94. (Here and throughout, clones into the second MCS of pACYCDuet-1 were screened by colony PCR using primers NF1121/1122.) Plasmid pECC94 was found to contain an insertion between the MCS2 ribosome binding site and the gene start site. This was removed by inverse PCR using primers NF3292/3293, yielding plasmid pECC103. The pilU gene fragment was then cloned into the first MCS of pECC103 using NcoI/SalI, yielding plasmid pECC106. (Here and throughout, clones into the first MCS of pACYCDuet-1 were screened by colony PCR using primers NF944/945.) Neither of the PilU and PilV fragments encoded by pECC106 was tagged.

Construction of Vectors for C. sordellii Pilin Expression in E. coli

Primers NF3247/NF3248 and NF3249/NF3250 were used to amplify pilA1A and pilA1B respectively, which were then cloned into pET28a using NcoI and XhoI, yielding pECC80 and pECC81 respectively. It was not possible to clone pilU into pET28a in the same way as pilA1A and pilA1B, as the pilU gene contained an NcoI restriction site. Therefore, after amplification using primers NF3251/3252, pilU was cloned into pET28a using BamHI and XhoI, yielding plasmid pECC87. The BamHI site in pET28a is over 100 bp downstream of the ribosome binding site (RBS), whereas the translation start site within the NcoI restriction site is only 7 bp downstream of the RBS. Excess DNA between the RBS and the pilU translation start site was therefore removed by inverse PCR using primers NF3253/3254. This procedure removed 98 bp from between the RBS and the translation start site, leaving only 11 bp, and yielded plasmid pECC93. All three resultant pilin-encoding plasmids (pECC80, pECC81 and pECC93) encoded pilins with C-terminal His-tags. All three pilin genes (pilA1A, pilA1B and pilU) were amplified from C. sordellii W3025 gDNA.

Construction of Vectors for Diguanylate Cyclase Expression

The diguanylate cyclase (DGC) gene dccA was cloned into pRPF144. The dccA gene was amplified using primers NF2126/2199, and the product cloned into pRPF144 using SacI/BamHI, yielding pECC12, which encodes dccA under the control of the constitutive cwp2 promoter. Primer NF2199 encodes a C-terminal His-tag, and contains an XhoI restriction site between the end of dccA and the His-tag, meaning that pECC12 can easily be used as a vector backbone for the construction of vectors encoding His-tagged proteins. The dccA gene was then sub-cloned from pECC12 into pRPF185 using SacI/BamHI, yielding pECC17. Similarly to pECC12, pECC17 contains an XhoI restriction site between the end of dccA and its C-terminal His-tag, meaning that pECC17 can easily be used as a vector backbone for the construction of vectors encoding His-tagged proteins.

Construction of Vectors for Pilin Expression in C. difficile

For inducible pilin expression, pilA1, pilV, pilU, pilW and pilA2 were cloned into pRPF185. Plasmid pRPF185 is a C. difficile expression vector from which genes may be expressed from the inducible tet promoter, with expression induced using the non-antibiotic tetracycline analogue anhydrotetracycline (Atc) (Fagan and Fairweather, 2011). Primers NF1525/1519 were used to amplify pilA1, NF1526/1521 to amplify pilV, NF1708/1709 to amplify pilU, NF3189/3190 to amplify pilW and NF2654/2655 to amplify pilA2. The resultant PCR products were then cloned into plasmid pRPF185 using SacI/BamHI, yielding plasmids pECC34, pECC15, pECC33, pECC28 and pECC24 respectively. (Throughout this thesis, where pRPF185-derived constructs were synthesised, desired constructs were identified by colony PCR screening of multiple clones using primers NF1323/794 prior to Sanger sequencing.) For constitutive pilin expression, pilU and pilA2 were cloned into plasmid pRPF144. Plasmid pRPF144 is a C. difficile expression vector from which genes may be expressed from the constitutive cwp2 promoter (Fagan and Fairweather, 2011). The pilU and pilA2 genes were sub-cloned from pECC33 and pECC24 respectively into the pRPF144 backbone using SacI/BamHI, yielding pECC29 and pECC31 respectively. (Throughout this thesis, where pRPF144-derived constructs were synthesised, desired constructs were identified by colony PCR screening of multiple clones using primers NF793/794 prior to Sanger sequencing.) To express pilA2 with a His-tag, primers NF2654/2656 were used to amplify pilA2, which was then cloned with SacI/XhoI into plasmids pECC12 and pECC17, yielding pECC32 (Pcwp2) and pECC25 (Ptet) respectively, both of which encode pilA2 with a C-terminal His-tag.

Construction of a Ptet Control Vector

The Ptet control vector pASF85 was constructed by Dr A. Fivian-Hughes. This was achieved by restriction digest of pRPF185 using SacI/BamHI to remove the gusA gene, followed by Klenow treatment to create blunt-ended fragments which were ligated together, yielding the empty vector pASF85.

Construction of Plasmids for Use in Allele Exchange Mutagenesis of C. difficile 630

All plasmids for allele-exchange mutagenesis were constructed by (i) amplification of approximately 1 kb regions up- and downstream of the deletion target; (ii) linearisation of the pseudo-suicide vector pMTL-SC7215 by inverse PCR using primers NF2214/2215; and (iii) assembly of the amplified regions into the linearised vector by Gibson assembly. (Here and throughout, pMTL-SC7215 assembly clones were screened by colony PCR using primers NF2169/2170.) The primer pairs used to amplify the upstream and downstream regions for each deletion target, and the plasmid yielded by assembly of these regions into the linearised vector, were as follows: pilA1 (upstream NF3002/3003, downstream NF3000/3001; pECC58); pilA1Terminator (upstream NF3078/3079, downstream NF3080/3081; pECC61); Cdi2_4 (upstream NF3176/3177, downstream NF3178/3179; pECC71); pilV (upstream NF3006/3007, downstream NF3004/3005; pECC59); pilU (upstream NF3010/3011, downstream NF3008/3009; pECC60); pilK (upstream NF3014/3015, downstream NF3012/3013; pECC65); pilD1 (upstream NF3156/3157, downstream NF3158/3159; pECC70); pilD2 (upstream NF3152/3153, downstream NF3154/3155; pECC69); pilB1 (upstream NF3086/3087, downstream NF3088/3089; pECC62); pilB2 (upstream NF3082/3083, downstream NF3084/3085; pECC63); pilT (upstream NF3106/3107, downstream NF3108/3109; pECC64); pilC1 (upstream NF3112/3113, downstream NF3114/3115; pECC66); pilMN (upstream NF3116/3117, downstream NF3118/3119; pECC67); and pilO (upstream NF3120/3121, downstream NF3122/3123; pECC68).

Construction of Dual Expression Vectors for C. difficile

To produce a dual expression vector from which dccA and a second gene could be inducibly expressed in C. difficile, a second cloning site was inserted into plasmid pECC17, between the end of dccA and the slpA transcriptional terminator. The site was inserted using two rounds of inverse PCR: the first round used primers NF3180/3181 to amplify pECC17, inserting half a Strep-tag and an SphI restriction site, yielding pECC72; the second round used primers NF3223/3224 to amplify pECC72, inserting the second half of the Strep-tag and a SalI site, yielding pECC76. Plasmid pECC76 contains dccA with a 5’ SacI site and 3’ BamHI site, a SalI site downstream of the BamHI site, followed by a Strep-tag sequence, then an SphI site and finally the slpA transcriptional terminator.

To construct each dccA dual expression vector, the second gene was amplified and cloned into the second cloning site of pECC76 using BamHI/SphI. (Here and throughout, constructs made by cloning into the second cloning site of pECC76 were screened using primers NF3286/3287.) The primer pairs used and the plasmids yielded were as follows: pilA1 (NF3298/3299; pECC109); pilV (NF3263/3264; pECC104); pilU (NF3265/3266; pECC99); pilK (NF3351/3352; pECC132); pilD1 (NF3276/3277; pECC96); pilB1 (NF3339/3340; pECC128); pilB2 (NF3342/3343; pECC129); pilC1 (NF3273/3274; pECC95); pilMN (NF3270/3271; pECC97); pilO (NF3267/3268; pECC98); pilC2 (NF3345/3346; pECC130); and pilM (NF3348/3349; pECC131).

To produce a dual expression vector from which dccA and a second gene would be constitutively expressed in C. difficile, the cwp2 promoter was sub-cloned from plasmid pECC12 into pECC109 using KpnI/SacI, yielding plasmid pECC127. Plasmid pECC127 constitutively co-expresses dccA and pilA1.

Construction of Vectors for Bacterial-2-Hybrid Assays

The soluble domain-encoding regions of pilV, pilU, pilK, pilA1, pilJ and pilW were amplified using primers NF3235/3236, NF3237/3238, NF3239/3240, NF3241/3242, NF3259/3260 and NF3261/3262, respectively. Each of these gene fragments was then cloned into both pKT25 and pUT18C (Karimova et al., 2001) using BamHI/KpnI, yielding plasmids encoding fusion proteins with N-terminal adenylate cyclase subunits and C-terminal pilin subunits. The plasmids yielded by cloning into pKT25 and pUT18C, respectively, were pECC77 and pECC83 for pilA1; pECC78 and pECC84 for pilU; pECC79 and pECC85 for pilV; pECC82 and pECC88 for pilK; pECC89 and pECC90 for pilJ; and pECC92 and pECC91 for pilW.

Construction of Vectors for Pre-Pilin Peptidase/Truncated Pilin Co-Expression in E. coli

A C-terminal Strep-tag encoding sequence was inserted into the second MCS of pECC103 by inverse PCR with primers NF3296/3297, yielding plasmid pECC108. Highly truncated pilA1, pilV, pilU and pilK genes were amplified with primers NF3331/3332, NF3333/3334, NF3335/3336 and NF3337/3338, respectively, and cloned into the second MCS of pECC108 using NdeI/XhoI. The cloning of truncated pilV yielded plasmid pECC115, of pilU yielded pECC116, of pilK pECC117 and of pilA1 pECC118. The pilD1 and pilD2 genes were then amplified using primers NF3326/3327 and NF3328/3330, respectively. The pilD1 gene was cloned into the first cloning site of pECC115, pECC116, pECC117 and pECC118 with NcoI/SacI, yielding pECC119, pECC120, pECC121 and pECC122 respectively; the pilD2 gene was similarly cloned, yielding plasmids pECC123, pECC124, pECC125 and pECC126 respectively.

2.4 RNA Manipulation

2.4.1 RNA Extraction

RNA was extracted from cultures of C. difficile and C. sordellii using the FastRNA Pro Blue kit (MP Biomedicals, USA). Throughout, surfaces were treated with RNase Zap (Life Technologies) to prevent RNase contamination. Cultures ready for harvest were first incubated for 5 mins with 2 culture volumes of RNA Protect Bacteria Reagent (Qiagen), then harvested by centrifugation at 4000 g for 10 mins at 4 °C. Supernatants were discarded and pellets resuspended in 1 ml FastRNA Pro Solution, and the resuspensions transferred to FastRNA Lysing Matrix B tubes. These were processed in a FastPrep-24 machine (MP Biomedicals) for 40 secs at 6 m/s, then cooled on ice. Once cooled, samples were centrifuged at 15 700 g for 10 mins at 4 °C. Supernatants were transferred into RNase-free microcentrifuge tubes (Life Technologies) then incubated at room temperature for 5 mins. 300 μl chloroform was then added to each supernatant, and the mixtures vortexed for 10 sec then centrifuged at 15 700 g for 15 mins at 4 °C. The upper phase was then transferred to a fresh RNase-free microcentrifuge tube containing 500 μl absolute ethanol, mixed, then incubated overnight at -20 °C to precipitate the nucleic acids. After approx. 18 hrs samples were centrifuged at 15 700 g for 15 mins at 4 °C. The supernatants were discarded and pellets washed in 70 % ethanol (made with nuclease-free water), then centrifuged again at 15 700 g for 5 mins at 4 °C. Supernatants were removed and pellets air-dried at room temperature for 10 mins, then resuspended in 45 μl nuclease-free water.

Samples were then DNase treated using the Turbo DNase kit (Life Technologies). To each sample 5 μl 10x reaction buffer was added, then 1 μl Turbo DNase, and the sample incubated at 37 °C for 30 mins. A further 1 μl Turbo DNase was then added to each sample, and a further 30 min incubation at 37 °C performed. 5 μl inactivation reagent was then added to each sample, mixed and incubated at room temperature for 5 mins. Samples were then centrifuged at 10 000 g for 2 mins at 4 °C and the supernatant transferred to a fresh RNase-free tube.

RNA samples were tested for contaminating DNA by high-fidelity PCR (Section 2.3.4; KOD DNA polymerase was used due to its superior amplification ability over Taq DNA polymerase) using 1 μl RNA sample in a reaction with either primers NF408 and NF409 (for C. difficile RNA), which amplify C. difficile 16S rRNA genes, or primers NF3294 and NF3295 (for C. sordellii RNA), which amplify C. sordellii 16S rRNA genes. The reaction was performed in a total volume of 20 μl (volumes of all constituent parts were reduced proportionately from those used for a 50 μl reaction). If a PCR product was seen, indicating contamination, treatment with Turbo DNase was performed again. If no PCR product was seen, indicating an absence of contaminating DNA, the RNA concentration was measured using a NanoDrop 1000 Spectrophotometer.

2.4.2 cDNA Synthesis

RNA was reverse transcribed using the RETROscript Reverse Transcription kit (Life Technologies) according to the manufacturer’s instructions. Briefly, 2.5 μg RNA was mixed with 5 μl RETROscript random decamers, and the mixture made up to a total volume of 30 μl with nuclease-free water. The mixture was heated at 85 °C for 3 mins then cooled on ice. 5 μl 10x RETROscript buffer, 10 μl 2.5 mM RETROscript dNTPs, 2.5 μl 10 U/μl RETROscript RNase inhibitor and 2.5 μl 100 U/μl RETROscript Moloney Murine Leukaemia Virus Reverse Transcriptase (MMLV-RT) were added and the mixture then incubated at 44 °C for 1 hr, then heated at 92 °C for 10 mins to inactivate the MMLV-RT. Successful cDNA synthesis was confirmed by high-fidelity PCR (Section 2.3.4) using 1 μl cDNA as template in a reaction using either primers NF408 and NF409 or NF3494 and NF3495 to amplify 16S rRNA, in a total reaction volume of 20 μl. The presence of a PCR product indicated that cDNA synthesis had been successful.
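As a consistency check on the reverse transcription mix described above, the component volumes can be summed and the final concentration of each reagent derived. This is a minimal sketch; the dictionary keys and function names are illustrative, and only the volumes and stock concentrations come from the text.

```python
# Component volumes (in μl) of the reverse transcription reaction,
# as listed in the protocol text.
RT_COMPONENTS_UL = {
    "RNA + random decamers + water": 30.0,
    "10x RT buffer": 5.0,
    "2.5 mM dNTPs": 10.0,
    "RNase inhibitor (10 U/ul)": 2.5,
    "MMLV-RT (100 U/ul)": 2.5,
}


def total_volume_ul(components: dict) -> float:
    """Sum of all component volumes."""
    return sum(components.values())


def final_conc(stock_conc: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration after dilution, in the same units as stock_conc."""
    return stock_conc * volume_ul / total_ul


if __name__ == "__main__":
    total = total_volume_ul(RT_COMPONENTS_UL)
    print(f"Total reaction volume: {total:g} ul")
    print(f"Buffer strength: {final_conc(10.0, 5.0, total):g}x")
    print(f"dNTPs: {final_conc(2.5, 10.0, total):g} mM")
    print(f"RNA: {2500.0 / total:g} ng/ul")  # 2.5 ug of input RNA
```

Under these figures the buffer comes out at 1x in a 50 μl reaction, which is the expected sanity check for a 10x buffer.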

2.4.3 Reverse Transcriptase PCR (RT-PCR)

Semi-quantitative RT-PCR was performed using the high-fidelity PCR protocol (Section 2.3.4), using 1 μl cDNA per PCR reaction, in a total volume of 20 μl. 30 reaction cycles were used because by this point product could reliably be seen after agarose gel electrophoresis (Section 2.3.6) but the reaction was not saturated, meaning it was possible to qualitatively identify differences in the amount of PCR product.

RT-PCR was used to amplify intergenic regions (Section 4.2.4). Primers NF2618/2619 amplify a region upstream of the riboswitch Cdi2_4, and the 5’ end of the riboswitch. Primers NF2503/2504 amplify the 3’ end of the riboswitch and the 5’ end of pilA1. All the following primer pairs were designed such that the 5’ primer would bind the 3’ end of the upstream gene, and the 3’ primer the 5’ end of the downstream gene, so that a product would be amplified comprising the ends of both genes and the intergenic region between them. Primers NF3095/2506 amplify from pilA1 to pilB1, NF2507/3096 amplify from pilB1 to pilC1, NF2509/2510 amplify from pilC1 to pilMN, NF2511/3097 amplify from pilMN to pilO, NF2513/2514 amplify from pilO to pilV, NF3098/2516 amplify from pilV to pilU, NF3099/2518 amplify from pilU to pilK, NF2519/2520 amplify from pilK to pilT, NF2521/2522 amplify from pilT to pilD1, NF2523/2524 amplify from pilD1 to pilD2, NF2525/2526 amplify from pilD2 to pth, NF2527/2528 amplify from pth to mfd, NF2529/2530 amplify from mfd to prsA, NF2531/2532 amplify from prsA to spoVT.

2.4.4 qRT-PCR

Primers for qRT-PCR (qPCR) were designed using Primer Express software (Life Technologies). Reactions were performed in MicroAmp Fast Optical 96-Well Reaction Plates (Life Technologies) covered in MicroAmp Optical Adhesive Film (Life Technologies). Reactions were performed using SYBR Green (Life Technologies) as a fluorescent reporter, on a 7500 Fast Real-Time PCR System (Life Technologies) using 7500 Software v2.3 (Life Technologies). Reactions were performed in a total volume of 15 μl, comprising 7.5 μl 2x SYBR Green, 5 μl sample and primers to a final concentration of 500 nM each, made up to 15 μl with nuclease-free water. After loading, the MicroAmp plate was centrifuged for 1 min at 300 g to force all reaction mixtures to the base of their wells. Reactions were then performed using the default 7500 software reaction settings (i.e. initial holding stage: 50 °C for 2 mins; second holding stage: 95 °C for 10 mins; then 40 2-step cycles comprising 95 °C for 15 sec, then 60 °C for 1 min. Following this a melt curve stage was performed, wherein the samples were heated to 95 °C for 15 secs, cooled and held at 60 °C for 1 min, then heated slowly to 95 °C, held there for 30 secs then cooled finally back to 60 °C for 15 secs). Standard curves were produced for all primer sets, using appropriate gDNA. Curves comprised 7 gDNA amounts, the highest being 50 ng and the others formed by 4-fold serial dilutions (i.e. 12.5 ng, 3.125 ng, 781 pg, 195 pg, 49 pg and 12 pg). cDNA was diluted prior to addition to reaction mixtures. Appropriate cDNA dilutions were identified empirically for each primer set. No-template controls were performed for each reaction, in which nuclease-free water was added instead of DNA. All reactions were performed in triplicate, while for experimental reactions 3 technical replicates were performed of 3 biological replicates to produce each dataset. Significance of results was determined using Student’s t-tests and, where indicated, one-way Analysis of Variance testing (one-way ANOVA).

For qPCR analysis of C. difficile gene expression, primers NF1696/1697 were used for 16S rRNA analysis; primers NF1912/1913 for pilA1; primers NF3031/3033 for pilB1; primers NF3308/3309 for pilC1; primers NF3310/3311 for pilMN; primers NF3312/3313 for pilO; primers NF1910/1911 for pilV; primers NF1908/1909 for pilU; primers NF3314/3315 for pilK; primers NF1906/1907 for pilT; primers NF3316/3317 for pilD1; primers NF3318/3319 for pilD2; primers NF3320/3321 for pth; primers NF3322/3323 for mfd; primers NF3324/3325 for prsA; and NF1914/1915 for pilA2. For 16S rRNA expression analysis, cDNA was diluted 1:20 000; for pilA1 and pilB1 expression analysis, cDNA was diluted as indicated in the text; for pilA2 expression analysis, cDNA was diluted 1:2; for analysis of the expression of all other genes, cDNA was diluted 1:20.

For qPCR analysis of C. sordellii gene expression, primers NF1696/1697 were used for 16S rRNA analysis; primers NF3288/3289 for pilA1A; primers NF3290/3291 for pilA1B; and primers NF3365/3366 for pilB1. For 16S rRNA expression analysis, cDNA was diluted 1:20 000; for pilA1A expression analysis, cDNA was diluted 1:2; for pilA1B expression analysis, cDNA was diluted 1:80; and for pilB1 expression analysis, cDNA was diluted 1:50.
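The 7-point, 4-fold gDNA standard series described above, and the way a standard curve built from it is commonly summarised, can be sketched as follows. The Ct values used here are synthetic (generated for an ideal reaction, not measured data), and the efficiency formula E = 10^(-1/slope) - 1 is the conventional one rather than anything stated in this thesis; the function names are illustrative.

```python
import math


def dilution_series(top_ng: float, fold: float, points: int) -> list:
    """gDNA amounts (ng) for a serial dilution starting at top_ng."""
    return [top_ng / fold ** i for i in range(points)]


def fit_line(xs: list, ys: list) -> tuple:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency(slope: float) -> float:
    """Amplification efficiency from the slope of Ct vs log10(amount)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    amounts = dilution_series(50.0, 4.0, 7)  # 50 ng down to ~12 pg
    # Synthetic Ct values: an ideal reaction gains 2 cycles per 4-fold dilution.
    cts = [20.0 + 2.0 * i for i in range(len(amounts))]
    slope, intercept = fit_line([math.log10(a) for a in amounts], cts)
    print([round(a, 4) for a in amounts])
    print(f"slope = {slope:.3f}, efficiency = {efficiency(slope):.1%}")
```

With real data the fitted slope would deviate from the ideal value of about -3.32, and the derived efficiency gives a quick indication of whether a primer set is behaving acceptably.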

2.5 Protein Manipulation

2.5.1 Protein Expression in E. coli

For simple protein expression in E. coli (i.e. when purification was not intended), E. coli Rosetta (DE3) or BL21 (DE3), carrying a derivative of plasmid pET28a or pACYCDuet-1, was grown overnight in LB broth. Following overnight growth this was sub-cultured into 25 ml fresh medium in a 250 ml flask to a resultant OD600 of 0.1. This culture was grown at 37 °C with shaking at 200 rpm until reaching logarithmic growth phase (log-phase; OD600 of approx. 0.6 for E. coli), whereupon expression of target genes was induced by the addition of 0.5 mM IPTG. Cultures were then grown for 4 further hours at 37 °C with shaking at 200 rpm, then harvested by centrifugation for 12 mins at 5000 g. Pellets were washed with 1 ml PBS and transferred to micro-centrifuge tubes, then centrifuged again, for 2 mins at 17 000 g, and the supernatant removed. Pellets were then processed and analysed by SDS-PAGE followed by Coomassie staining/Western blotting, as described later in this chapter.
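Sub-culturing to a defined starting OD600 is a C1V1 = C2V2 calculation. The sketch below assumes that OD600 scales linearly with cell density and treats the final volume as the 25 ml of fresh medium; the overnight OD600 of 2.5 is a made-up illustrative value, and the function name is hypothetical.

```python
def inoculum_volume_ml(overnight_od: float, target_od: float,
                       final_volume_ml: float) -> float:
    """Volume of overnight culture needed to reach target_od (C1*V1 = C2*V2)."""
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target")
    return target_od * final_volume_ml / overnight_od


if __name__ == "__main__":
    # Hypothetical overnight OD600 of 2.5, diluted to 0.1 in 25 ml.
    volume = inoculum_volume_ml(2.5, 0.1, 25.0)
    print(f"Inoculate with {volume:.2f} ml overnight culture")
```

In practice dense overnight cultures are usually diluted before reading so that the measurement falls in the linear range of the spectrophotometer.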

2.5.2 Protein Solubility Tests from E. coli

E. coli Rosetta (DE3) or BL21 (DE3) carrying derivatives of plasmids pET28a and/or pACYCDuet-1 encoding recombinant C. difficile proteins were grown overnight in 5 ml of either LB or TY broth at 37 °C, with shaking at 200 rpm. Following overnight growth, these were sub-cultured into 25 ml fresh medium in a 250 ml flask to a resultant OD600 of 0.1, and grown at 37 °C with shaking at 200 rpm until reaching log-phase, whereupon expression of target genes was induced with addition of 0.5 mM IPTG. Cultures were then grown either for 4 further hrs at 37 °C with shaking at 200 rpm or overnight at 18 °C or 20 °C with shaking at 140 rpm. Cultures were then harvested by centrifugation for 12 mins at 5000 g. Pellets were washed in 10 ml PBS and the supernatant removed. Pellets were then lysed using BugBuster (Merck KGaA) according to the manufacturer’s instructions. 10x BugBuster Master Mix was diluted in PBS. DNaseI (12 μg/ml) and lysozyme (0.5 mg/ml) were also added to the mixture. Pellets were weighed and resuspended in the resultant 1x BugBuster mix. 5 ml 1x BugBuster mix was used per gram of pellet. The resuspension was incubated for 30 mins at room temperature with rotation, then centrifuged for 20 mins at 16 000 g at 4 °C to separate the soluble and insoluble fractions of the cell lysate. The supernatant (soluble fraction) was removed from the pellet (insoluble fraction), and the pellet then resuspended in 1x BugBuster mix, using the same volume as initially used for the lysis. Supernatant and pellet were then analysed by SDS-PAGE followed by Coomassie staining/Western blot, as described later in the chapter.
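The lysis mix described above scales with pellet weight, so the amounts of each component follow directly from the figures given (5 ml of 1x mix per gram of pellet, 12 μg/ml DNaseI, 0.5 mg/ml lysozyme). The following sketch only restates that arithmetic; the function name, dictionary keys and the 0.4 g example pellet are illustrative assumptions.

```python
def lysis_mix(pellet_g: float, ml_per_g: float = 5.0,
              dnase_ug_per_ml: float = 12.0,
              lysozyme_mg_per_ml: float = 0.5) -> dict:
    """Amounts needed to lyse a pellet of the given wet weight."""
    volume_ml = pellet_g * ml_per_g
    return {
        "mix_volume_ml": volume_ml,             # total 1x BugBuster mix
        "bugbuster_10x_ml": volume_ml / 10.0,   # 10x stock diluted to 1x
        "dnase_ug": volume_ml * dnase_ug_per_ml,
        "lysozyme_mg": volume_ml * lysozyme_mg_per_ml,
    }


if __name__ == "__main__":
    # Hypothetical 0.4 g pellet.
    for key, value in lysis_mix(0.4).items():
        print(f"{key}: {value:g}")
```

The same volume is then reused to resuspend the insoluble pellet, so that equal gel loadings of the two fractions represent equal proportions of the original lysate.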

2.5.3 Protein Purification from E. coli

E. coli Rosetta (DE3) or BL21 (DE3) carrying derivatives of plasmid pET28a, encoding a hexahistidine-tagged (His-tagged) recombinant C. difficile protein, were grown overnight in 5 ml of either LB or TY broth at 37 °C, with shaking at 200 rpm. Following overnight growth, these were sub-cultured into 250 ml LB or TY broth (as appropriate) in a 2.5 L flask to a resultant OD600 of 0.1, and grown at 37 °C with shaking at 200 rpm until reaching log-phase, whereupon expression of target genes was induced with addition of 0.5 mM IPTG. Cultures were further grown under conditions known to produce soluble target protein (Section 2.5.2; conditions identified for individual proteins are described in later chapters). Following culture growth, cells were harvested by centrifugation for 15 mins at 5000 g. Pellets were washed in 20 ml PBS and the supernatant removed.

Pellets were lysed using BugBuster (Merck KGaA) according to the manufacturer’s instructions. 10x BugBuster Master Mix was diluted in His-tag wash buffer (50 mM Tris-HCl, 300 mM NaCl, 20 mM imidazole, pH 7.4). DNaseI (12 μg/ml) and lysozyme (0.5 mg/ml) were added to the mix and the pellet lysed, and soluble and insoluble fractions separated, as above. The soluble fraction was filtered through a 0.45 μm filter, and affinity chromatography performed to purify the His-tagged protein. Affinity chromatography was performed using an ÄktaPrime Plus (GE Healthcare) with a 1 ml HisTrap FF Ni-sepharose column (GE Healthcare). The column was prepared by washing with 10 volumes of His-tag wash buffer. The soluble fraction of the lysate was then applied to the column, which was then washed with 20 further volumes of His-tag wash buffer. Bound protein was then eluted by gradually increasing the imidazole concentration to 300 mM over the course of 20 further column volumes of buffer. Desired fractions (those containing protein) were identified by their absorbance at 280 nm, and analysed by SDS-PAGE followed by Coomassie staining and/or immunoblotting (see below). Imidazole was then removed from purified protein by desalting. This was performed using an ÄktaPurifier UPC 10 (GE Healthcare) via Unicorn 5.20 software (GE Healthcare) with a HiPrep 26/10 Desalting Column (GE Healthcare). Over 3 column volumes, protein was transferred into HEPES-Buffered Saline (HBS) buffer (150 mM NaCl, 10 mM HEPES, pH 7.5). Again, desired fractions were identified and analysed by SDS-PAGE followed by Coomassie staining and/or immunoblotting. Fractions containing purified target protein were pooled, then concentrated by centrifugation at 4000 g in an Amicon centrifugal filter with an appropriate molecular weight cut-off.
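To relate an eluted fraction to the imidazole concentration at which it left the column, the elution gradient can be modelled as below. This sketch rests on two assumptions not stated in the text: that the gradient is linear, and that it starts from the 20 mM imidazole of the His-tag wash buffer. The function name is hypothetical.

```python
def imidazole_mM(column_volumes_elapsed, start_mM=20.0, end_mM=300.0, gradient_cv=20.0):
    """Imidazole concentration after a given number of column volumes of an
    (assumed) linear gradient from start_mM to end_mM over gradient_cv volumes."""
    # Clamp so that values before or after the gradient return the end points.
    fraction = min(max(column_volumes_elapsed / gradient_cv, 0.0), 1.0)
    return start_mM + (end_mM - start_mM) * fraction


# Concentration at the start, midpoint and end of the gradient.
for cv in (0, 10, 20):
    print(cv, imidazole_mM(cv))
```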

2.5.4 Protein Concentration Quantitation

Protein concentrations were quantified by bicinchoninic acid (BCA) assay (Life Technologies). Bovine Serum Albumin (BSA) was resuspended in HBS to a concentration of 1 mg/ml, and a series of 2-fold serial dilutions in HBS produced to create a standard curve. Protein sample was similarly serially diluted. BCA solutions were used according to the manufacturer’s instructions. Reagent B was diluted 1:50 in Reagent A, and 180 μl of the resultant BCA mixture added to the wells of a 96-well plate. 20 μl of protein sample was added to the BCA mixture and mixed. All samples were analysed in triplicate. The plate was then incubated for 30 mins at 37 °C and the absorbance at 595 nm then read using a Benchmark Microplate Reader (Bio-Rad), using HBS as a blank. Protein concentrations were then calculated using the standard curve.
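The standard-curve calculation can be sketched as follows. This is a minimal illustration, assuming a straight-line fit is adequate over the range of standards used (real BCA curves may deviate from linearity); the function names are hypothetical and all absorbance values are synthetic.

```python
def two_fold_series(top_mg_per_ml, n_points):
    """Concentrations of a 2-fold serial dilution starting at top_mg_per_ml."""
    return [top_mg_per_ml / 2 ** i for i in range(n_points)]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_mg_per_ml(absorbance, slope, intercept, dilution_factor=1.0):
    """Read a blank-corrected absorbance off the standard curve, then correct
    for how far the sample was diluted before the assay."""
    return (absorbance - intercept) / slope * dilution_factor


# BSA standards: 1 mg/ml diluted 2-fold five times (six points in total).
standards = two_fold_series(1.0, 6)
# Synthetic, perfectly linear readings used purely for illustration.
readings = [0.8 * c + 0.05 for c in standards]
slope, intercept = fit_line(standards, readings)

# Mean of a hypothetical triplicate for a sample assayed at a 1/4 dilution.
triplicate = [0.44, 0.45, 0.46]
mean_abs = sum(triplicate) / len(triplicate)
print(round(concentration_mg_per_ml(mean_abs, slope, intercept, 4.0), 3))
```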

2.5.5 Protein Expression in C. difficile

Genes of interest were cloned into pMTL960-based vectors, and expressed under the control of either the constitutive cwp2 promoter (Emerson et al., 2009) or the inducible tet promoter (Fagan and Fairweather, 2011). When expressing genes in C. difficile on solid medium, strains expressing them under the control of the cwp2 promoter were streaked out onto agar, incubated for 24 hrs at 37 °C then suspended in 1 ml PBS using an L-shaped spreader. Suspended cells were then centrifuged for 4 mins at 17 900 g, and the supernatant discarded. Pellets were then resuspended in 1 ml PBS in order to measure the OD600 of the harvested cells. This suspension was then centrifuged once more for 4 mins at 17 900 g, and the supernatant discarded. Strains expressing genes under the control of the tet promoter were treated identically, save that the agar onto which they were streaked was supplemented with anhydrotetracycline (Atc) at concentrations as defined in the text. When expressing genes in C. difficile in liquid medium, strains expressing them under the control of the cwp2 promoter were inoculated into TY broth and grown overnight. The following morning they were sub-cultured into fresh TY broth to a resultant OD600 of 0.05, incubated for 6 hrs then harvested by centrifugation at 17 900 g for 4 mins. Pellets were washed in 1 ml PBS then harvested again by centrifugation at 17 900 g. Strains expressing genes under the control of the tet promoter were inoculated into TY broth and grown overnight, sub-cultured into fresh TY broth to a resultant OD600 of 0.05 and incubated until the fresh cultures had reached log phase (an OD600 of between 0.3 and 0.4). At this point Atc was added at various concentrations (as defined in the text) and the culture incubated for 3 further hrs, whereupon the culture was harvested as described above.

2.5.6 Protein Expression in C. sordellii

Protein expression in C. sordellii was performed from the tet promoter, as described in the text. Protein expression in C. sordellii was only attempted during growth in liquid medium, and C. sordellii liquid cultures were harvested identically to C. difficile liquid cultures (as described in Section 2.5.5).

2.5.7 Tri-Chloroacetic Acid Precipitation of Protein

Proteins were isolated from culture supernatants by precipitation using tri-chloroacetic acid (TCA). C. difficile culture supernatants (obtained during harvesting of C. difficile cultures, Section 2.5.5) were mixed 9:1 with 100 % TCA, giving a final TCA concentration of 10 %, vortexed briefly to mix, then incubated on ice for 30 mins. These mixtures were then centrifuged at 25 000 g for 10 mins at 4 °C, supernatants discarded and the pellets resuspended in 1 ml ice-cold 90 % acetone. These suspensions were vortexed for 15 mins at room temperature. These two steps (centrifugation followed by vortexing) were repeated, and the suspensions then centrifuged one final time (again at 25 000 g for 10 mins at 4 °C). The supernatants were discarded and the pellets left to dry. Pellets were then resuspended in PBS to an effective OD600 of 20.

2.6 SDS-PAGE and Immunoblotting

2.6.1 Cell Lysate Preparation

The procedure used for lysing E. coli cells when needing to separate soluble and insoluble fractions of lysate, or to purify protein therefrom, was described in Sections 2.5.2 and 2.5.3. When lysing E. coli cells simply to analyse their lysate by Coomassie stain or Western blot, BugBuster was not used; rather, pellets were resuspended in 1x Laemmli buffer to an effective OD600 of 10 and boiled for 10 mins. Laemmli buffer was made as a 2x stock (150 mM Tris-HCl, pH 6.8, 30 % (v/v) glycerol, 1.5 % (w/v) sodium dodecyl sulphate (SDS), 15 % (v/v) β-mercaptoethanol, 2 µg/ml bromophenol blue), and diluted with water to make 1x buffer. C. difficile cells were lysed by enhanced freeze-thaw. Cells were frozen overnight at -20 °C then resuspended in PBS to an effective OD600 of 20. DNase I (12 µg/ml) and CD27L (20 µg/ml) were added to the suspension, which was then incubated at 37 °C for 1 hr. Both His- and Strep-tagged versions of CD27L had been purified, so when analysing expression of a tagged protein by the cells it was necessary to ensure that differently tagged CD27L was used for lysis. C. sordellii cells were lysed as described in the text.

2.6.2 SDS-PAGE

Protein sample analysis was performed using SDS-Polyacrylamide Gel Electrophoresis (SDS-PAGE). SDS-PAGE gels were made comprising a resolving and a stacking gel. Resolving gels contained: 375 mM Tris-HCl, pH 8.8, 15 % (w/v) acrylamide, 0.1 % (w/v) SDS, 0.1 % (v/v) Tetramethylethylenediamine (TEMED), 0.05 % (w/v) Ammonium Persulphate (APS).

Stacking gels contained: 125 mM Tris-HCl, pH 6.8, 5 % (w/v) acrylamide, 0.1 % (w/v) SDS, 0.1 % (v/v) TEMED, 0.05 % (w/v) APS. Occasionally gels with resolving gels containing different acrylamide percentages were used. These were identical to those containing 15 % acrylamide save for the alteration in acrylamide content, and where used these gels are identified in the text with their acrylamide content defined. Prior to loading onto polyacrylamide gels, samples were mixed 1:1 with 2x Laemmli buffer. 10 µl sample was added to each well. Where the sample was a whole cell lysate or supernatant, this corresponded to 0.1 OD600 units. Electrophoresis was performed at 200 V in Tris-Glycine-SDS (TGS) running buffer (25 mM Tris, 192 mM glycine, 0.1 % SDS; National Diagnostics). 10 µl Unstained Protein Marker, Broad Range (2-212 kDa) (NEB) was loaded onto every gel alongside the samples for band sizing.
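The figure of 0.1 OD600 units per well follows from the sample preparation described above, as the short check below illustrates. It assumes the common convention that one OD600 unit is the quantity of cells in 1 ml of culture at an OD600 of 1, and that samples start at an effective OD600 of 20 before mixing 1:1 with 2x Laemmli buffer; the function name is hypothetical.

```python
def od600_units_loaded(sample_od600, load_volume_ul, laemmli_dilution=2.0):
    """OD600 units per well: effective OD600 after mixing with Laemmli buffer,
    multiplied by the loaded volume in ml (1 unit = 1 ml at OD600 of 1)."""
    od_after_mixing = sample_od600 / laemmli_dilution
    return od_after_mixing * load_volume_ul / 1000.0


# Lysate at an effective OD600 of 20, mixed 1:1 with 2x buffer, 10 µl loaded.
print(od600_units_loaded(20.0, 10.0))
```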

2.6.3 Coomassie Staining

Following SDS-PAGE, gels were incubated at room temperature for approx. 1 hr in Coomassie stain (45 % (v/v) ethanol, 10 % (v/v) acetic acid, 0.25 % (w/v) Coomassie Brilliant Blue R-250). Coomassie stain was then poured away and the gel washed with water. The gel was then incubated at room temperature in Coomassie destain (45 % (v/v) ethanol, 10 % (v/v) acetic acid) until the gel was fully destained. The Coomassie destain was regularly changed to increase the speed of the destaining process. Once destained, all Coomassie destain was poured away, the gel washed again in water and then rehydrated by brief incubation in 10 % acetic acid. Once the gel had regained its original size the dilute acetic acid was poured away and the gel washed again in water and imaged by scanning on an Epson Perfection V700 Photo scanner (Epson, Japan) using Silverfast 8 software (LaserSoft Imaging AG, Germany).

2.6.4 Western Blotting

Western blotting was performed using Immobilon-P hydrophobic Polyvinylidene Fluoride (PVDF) membrane (Merck KGaA) following the manufacturer’s three-buffer transfer protocol. All steps were performed at room temperature and all incubation steps on a bench-top rocker. Following SDS-PAGE, gels were incubated for 15 mins in Cathode Buffer (25 mM Tris-HCl, pH 9.4, 40 mM glycine, 10 % (v/v) ethanol). PVDF membrane was initially activated by briefly wetting in absolute ethanol, then washed by incubation for 2 mins in water. The PVDF membrane was then transferred to Anode II buffer (25 mM Tris-HCl, pH 10.4, 10 % (v/v) ethanol) and incubated therein for 5 mins.

Following the incubations the protein transfer was performed using a Trans Blot SD Semi-Dry Transfer Cell (Bio-Rad). The transfer stack was formed from the anode up. At the base, 2 pieces of Whatman 3MM Chr Chromatography paper (Thermo Fisher Scientific) soaked in Anode I buffer (0.3 M Tris-HCl, pH 10.4, 10 % (v/v) ethanol) were placed upon the anode. A third piece of Whatman 3MM paper, soaked in Anode II buffer, was placed upon the first two, followed by the PVDF membrane, then the gel and finally three more pieces of Whatman 3MM paper, these soaked in Cathode buffer. Air bubbles were removed from the stack by rolling a stripette over the stack, applying firm pressure all the while. The transfer was then performed by electroblotting at 15 V for 15 mins. Proteins were visualised on the membrane to confirm successful transfer by incubating the membrane in Ponceau stain (1 % (v/v) acetic acid, 0.5 % (w/v) Ponceau S) for 10 mins followed by washing with water to remove background staining. The membrane was again briefly wetted in absolute ethanol, then left to dry for approx. 30 mins. Primary antibody was then applied to the membrane in a solution of 3 % milk (Merck KGaA) and 0.01 % Tween-20 in PBS. Dilution factors of antibodies used are provided in Table 2.4. The membrane was incubated in primary antibody for 45 mins, then washed 3 times in PBS and incubated in secondary antibody (also in 3 % milk and 0.01 % Tween-20 in PBS) for 30 mins. The membrane was then washed 3 more times in PBS. The blot was then developed using SuperSignal West Pico Chemiluminescent Substrate (Life Technologies) and visualised using a LAS-3000 Imager.

2.7 Microscopy

2.7.1 Phase Contrast Microscopy

General samples were prepared according to one of two protocols: one which included sample fixation in formaldehyde, and one which did not. Samples from liquid cultures were harvested by centrifugation at 17 900 g for 4 mins. Samples from plates were resuspended in 1 ml PBS using an L-shaped spreader, then harvested by centrifugation at 17 900 g. All pellets were then washed in PBS and re-harvested by centrifugation. In the case that samples were to be examined without fixation, pellets were resuspended in 1 ml PBS, and dilutions of the resuspension made (1/3 and 1/9 dilutions). 10 μl of each dilution was spotted onto microscope slides (Super Premium Microscope Slides, VWR) and left to dry. In the case that samples were fixed prior to examination, following washing in PBS pellets were resuspended in 4 % formaldehyde in PBS, and incubated with rotation for 15 mins at room temperature. The pellet was then harvested again by centrifugation, then resuspended in 20 mM NH4Cl and incubated with rotation for a further 15 mins at room temperature, to quench residual formaldehyde. The cells were then harvested again by centrifugation, washed in 1 ml water, re-harvested by centrifugation and the pellet finally resuspended in 1 ml water. Again, this resuspension was diluted 1/3 and 1/9, 10 μl of each dilution spotted onto microscope slides and left to dry.

Antibody                           Raised In   Working Dilution    Source

Primary Antibodies
α-PilA1 (CD3513; strain 630)       Rabbit      1:2000/1:50 000     Eurogentec (Belgium)
α-PilU (CD3507; strain 630)        Rabbit      1:2000/1:50 000     Eurogentec
α-PilV (CD3508; strain 630)        Rabbit      1:2000/1:50 000     Eurogentec
α-PilW (CD2305; strain 630)        Rabbit      1:1000/1:20 000     Eurogentec
α-PilA2 (CD3155; strain R20291)    Mouse       1:50 000            Royal Holloway, University of London (UK)
α-Strep-tag II                     Mouse       1:2000              VWR
α-His-tag-HRP                      Mouse       1:2000/1:10 000     Sigma

Secondary Antibodies
α-Rabbit-HRP                       Goat        1:2000              Dako (Denmark)
α-Mouse-HRP                        Rabbit      1:2000              Dako

Table 2.4. Antibodies Used in Western Blotting. Where two working dilutions are given, the first is for use with C. difficile-derived samples, the second with E. coli-derived samples. Commercially acquired primary antibodies are monoclonal. All others (primary and secondary) are polyclonal. HRP indicates antibody is conjugated to horseradish peroxidase.

In the specific case of checking sporulation during a sporulation assay (see below), a plastic 1 μl loop-full of cells was picked off a plate and resuspended in 100 μl water. This was diluted 1/5, and 10 μl of each dilution was spotted onto microscope slides and left to dry. In all cases, 10 μl water was spotted onto each sample spot and a cover slip placed on the slide, and the slide then examined using an Eclipse E600 microscope (Nikon, Japan) with a Plan Fluor 100x lens (Nikon). Images were taken using a Retiga 2000R camera (QImaging, Canada) via QCapture Pro software (QImaging), which was also used for the addition of scale bars to images.

2.7.2 Fluorescence Microscopy

Strains were grown overnight in BHIS medium. Between 0.2 and 1 ml of culture was then harvested by centrifugation (17 900 g for 4 mins). The pellet was washed in 1 ml PBS then re-harvested by centrifugation. The pellet was then resuspended in 1 ml PBS containing 1 μg/ml MitoTracker Green FM (MTG, Life Technologies) to stain cell membranes. 15 μl of the resuspension was spotted onto ‘Polysine’ polylysine slides (Thermo Fisher Scientific) and immediately visualised using an Eclipse E600 microscope with a Plan Fluor 100x lens. Images were taken using a Retiga 2000R camera via QCapture Pro software, which was also used for the addition of scale bars to images.

2.7.3 Transmission Electron Microscopy (TEM)

All TEM was kindly performed by Dr Maria McCrossan, Royal Free Hospital. 1 ml samples from liquid cultures were harvested by centrifugation at 500 g for 2 mins (the minimum level of centrifugation found to pellet a 1 ml sample). These were then resuspended in 4 % formaldehyde, 1.5 % glutaraldehyde in PBS to fix. 10 μl fixed sample was spotted onto Pioloform-coated 400 mesh copper grids then negatively stained with 0.3 % phosphotungstic acid and examined under a Jeol 1200EX Transmission Electron Microscope.

2.8 Phenotypic Analyses

2.8.1 Growth Curves

Overnight cultures of C. difficile or C. sordellii in either BHIS or TY broth were sub-cultured into fresh medium to a resultant OD600 of 0.05. These cultures were incubated for several hours, and the culture OD600 measured either hourly (for C. difficile) or half-hourly (for C. sordellii).

2.8.2 Cell Morphology Comparisons

Strains were grown as described in the text, either in liquid or on solid medium, harvested and examined by phase contrast microscopy (Section 2.7.1) and fluorescence microscopy (Section 2.7.2) to investigate their cell morphology.

2.8.3 Colony Morphology Comparisons

Strains were grown in TY broth overnight, then sub-cultured into fresh medium and incubated for 6 hrs. Serial 10-fold dilutions in PBS were then made of the cultures, and 100 μl of the 10⁻⁵ to 10⁻⁷ dilutions spread onto TY agar. These were incubated for 4 days then imaged by scanning on an Epson Perfection V700 Photo scanner using Silverfast 8 software, or photographed.

2.8.4 Sporulation Assay

Strains were first re-streaked onto BHIS agar supplemented with 0.1 % sodium taurocholate, to ensure any spores previously present in the strain stock germinated into vegetative cells. 24 hrs later strains were inoculated into TGY medium and grown overnight. The following morning these were sub-cultured into SMC medium to a resultant OD600 of 0.05. This was grown to an OD600 of approx. 0.5 then sub-cultured again into fresh SMC medium to a resultant OD600 of 0.05. This was again grown to an OD600 of approx. 0.5, then 100 μl of culture spread onto SMC plates. These were incubated for 7 days. On the 3rd, 5th and 7th days sporulation was checked by phase contrast microscopy (Section 2.7.1). On the 7th day cells were scraped off plates into 1 ml PBS. Serial 10-fold dilutions of the resuspension were made, down to a dilution of 10⁻⁷. The serial dilutions were then spotted onto BHIS agar supplemented with 0.1 % sodium taurocholate, 25 μl of each dilution onto each plate. The initial, undiluted, resuspension was then mixed 1:1 with 100 % ethanol for 1 hr, to kill vegetative cells. Serial 10-fold dilutions were again made (taking account of the fact that the initial resuspension was now diluted 1/2) and these spotted onto BHIS agar supplemented with 0.1 % sodium taurocholate identically to the initial resuspension. Plates were incubated for 24 hrs to allow colony growth, and colonies then counted. Colony counts were multiplied to obtain total c.f.u./ml (colony-forming units/ml) and ethanol-resistant c.f.u./ml of each resuspension. Sporulation efficiency was calculated as: 100 % x (ethanol-resistant c.f.u./ml)/(total c.f.u./ml). For each strain analysed, three technical replicates of four biological replicates were performed. Significance of results was analysed using a Student’s t-test.
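The back-calculation from colony counts to sporulation efficiency can be sketched as follows. This is an illustrative sketch rather than the exact procedure used: the function names are hypothetical, the colony counts are invented, and the 1/2 dilution introduced by the ethanol is handled here as an explicit pre-dilution factor.

```python
def cfu_per_ml(colonies, dilution, spot_volume_ul=25.0, pre_dilution=1.0):
    """c.f.u./ml of the original resuspension from a colony count.

    dilution:     serial dilution that was spotted, e.g. 1e-5
    pre_dilution: any dilution made before the serial series
                  (0.5 for the 1:1 ethanol-treated resuspension)
    """
    # Volume of undiluted resuspension effectively placed on the plate, in ml.
    plated_ml = dilution * pre_dilution * spot_volume_ul / 1000.0
    return colonies / plated_ml


def sporulation_efficiency_percent(total_cfu_per_ml, ethanol_resistant_cfu_per_ml):
    """100 % x (ethanol-resistant c.f.u./ml) / (total c.f.u./ml)."""
    return 100.0 * ethanol_resistant_cfu_per_ml / total_cfu_per_ml


# Hypothetical counts: 30 colonies from the untreated 1e-5 spot,
# 15 colonies from the ethanol-treated 1e-4 spot.
total = cfu_per_ml(30, 1e-5)
resistant = cfu_per_ml(15, 1e-4, pre_dilution=0.5)
print(f"{total:.2e} {resistant:.2e} {sporulation_efficiency_percent(total, resistant):.1f} %")
```

In practice the efficiency would be computed per replicate and the replicate values then compared between strains.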

2.8.5 Aggregation Assays

Strains were inoculated into 10 ml BHIS medium and grown overnight (approx. 15 hrs). The apparent OD600 of each culture was then measured. The culture was then vortexed vigorously to disrupt any aggregation at the base of the culture tube, and the OD600 of the culture measured again. Percentage aggregation was calculated as 100 % x (initial OD600)/(post-vortex OD600).

2.8.6 Biofilm Assays

The protocol of Ðapa et al. (2013) was used to investigate biofilm formation. 24-well, tissue culture-treated plates were reduced in the anaerobic cabinet for 48 hrs. 5 ml cultures of C. difficile in BHIS were grown overnight. 1 ml BHIS supplemented with 0.1 M glucose was added to each well of the 24-well plates, and the overnight cultures were then sub-cultured into the 24-well plates to an OD600 of 0.05. Lids were secured to the plates using parafilm to prevent evaporation of the liquid medium. Plates were then incubated for 1, 3 or 5 days. Following this incubation period, BHIS was removed from each well. The wells were then gently washed with 1 ml sterile PBS then incubated in the anaerobic chamber for 30 mins in 1 ml sterile 0.2 % crystal violet to stain any biofilm. The crystal violet was then removed and the biofilm washed once more in 1 ml sterile PBS. Plates were then photographed, and 1 ml absolute ethanol added to each well to extract the dye from the biofilms. Plates were incubated with ethanol for 30 mins at room temperature under aerobic conditions. The absorbance of the ethanol at 570 nm was then measured to quantify the amount of dye extracted from the biofilm, and therefore the relative biofilm mass.

2.8.7 Twitching Motility Assays

C. difficile strains were grown overnight, diluted in PBS to a resultant OD600 of 0.3, then spotted onto plates in 5 µl aliquots. These plates were incubated for 4 days then imaged, either by scanning with an Epson Perfection V700 Photo scanner using Silverfast 8 software or by photographing. Alternatively, strains were stab-inoculated using a cocktail stick into 1.5 % BHIS agar plates. Plates were incubated for 4 days, then the agar was removed and the plates imaged by scanning on an Epson Perfection V700 Photo scanner using Silverfast 8 software. To confirm the extent of the spread of cells across the petri dish, the plates were stained with 1 % crystal violet for 30 mins. Crystal violet was then removed and the plates washed three times with PBS, then imaged again by scanning.

2.9 Protein Interaction Studies

2.9.1 Bacterial-Two-Hybrid

Bacterial-two-hybrid (B2H) assays were performed broadly as described by Karimova et al. (1998). Genes of interest were cloned into the B2H vectors pKT25 and pUT18C, which allow expression of fusion proteins with the products of the genes of interest at the C-terminus. Plasmid constructs were transformed into E. coli DHM1 cells. Transformants were grown overnight in LB medium. 2 μl overnight culture was then spotted onto LB-agar plates supplemented with 0.5 mM IPTG and 40 μg/ml X-Gal. These plates were then incubated at 30 °C for 24 hrs in the dark, then imaged by scanning on an Epson Perfection V700 Photo scanner using Silverfast 8 software. DHM1 cells transformed with the plasmids pKT25-Zip and pUT18C-Zip were used as a positive control. All experiments were carried out in triplicate (3 biological replicates).

2.9.2 Co-Purification

Protein purification was performed identically to that described in Section 2.5.3, save that multiple proteins were co-expressed in E. coli BL21 (DE3) from derivatives of both plasmids pET28a and pACYCDuet-1, encoding recombinant C. difficile proteins.

Chapter 3: Expression of Type IV Pili in C. difficile

3.1 Introduction

Type IV pili (T4P) have been objects of interest in Gram-negative bacterial species since the 1970s, when they were first discovered in P. aeruginosa (Bradley, 1972). In the following decades T4P were identified in multiple other Gram-negative species, including many human pathogens (Mattick, 2002). On the surface of a cell, T4P appear as very long, very thin, hair-like structures, which often form bundles. Uniquely among pilus types, T4P are able to retract.

3.1.1 Medical Relevance of Type IV Pili

T4P are an important area of interest in microbiological research, due to their importance in the pathogenesis of several human pathogens. T4P are able to promote virulence in species by enabling host colonisation, both by directly binding to host tissue and by mediating biofilm formation (Pelicic, 2008). For instance, T4P are essential virulence factors in both P. aeruginosa and N. meningitidis; P. aeruginosa T4P bind directly to human pneumocytes, and their loss results in a 90 % loss of bacterial adherence to the cells (Hahn, 1997), while in N. meningitidis T4P mediate adhesion to host tissue both by direct binding of the bacteria to human cells, via the minor pilins PilC1 and PilC2 (Morand et al., 2009), and by bacterial aggregation via the minor pilin PilX (Hélaine et al., 2005). T4bP, such as the bundle-forming pili of Enteropathogenic E. coli (EPEC) and the toxin-co-regulated pilus of V. cholerae, also frequently act as virulence factors (Bieber et al., 1998; Taylor et al., 1987). A greater understanding of T4P therefore improves our understanding of the virulence mechanisms of various important pathogens, and may eventually, it is hoped, enable the development of new vaccines or medicines targeting these structures.

3.1.2 C. difficile Type IV Pili

The C. difficile T4P genes were identified as being localised in two clusters/operons, shown schematically in Figure 1.15. The primary operon appeared to have every gene required for assembly of functional T4P, except for pilN. The secondary operon, on the other hand, is much smaller, containing only genes encoding a pilin and pilB, pilC and pilM (Varga et al., 2006). Pili apparently produced from the primary operon have been observed by EM on the surface of cells present in the gut of an infected hamster; these pili were immunolabelled with antibodies against the pilin of the primary cluster encoded by gene 3507 in strain 630 (Goulding et al., 2009). This shows that T4P are produced by C. difficile during infection, suggesting a possible role in disease. However, beyond what is described here, nothing more was known of C. difficile T4P at the initiation of this PhD project.

3.1.3 Aims

This chapter details the initial work done to lay the groundwork for this project. The C. difficile T4P genes were analysed in silico, and experiments were performed to identify conditions under which C. difficile would express its T4P in vitro. It was initially hoped that T4P expression from both the primary and secondary clusters could be identified, but this was eventually limited to expression from the primary operon. The major constituent of C. difficile T4P expressed from the primary T4P gene cluster was also identified.

3.2 Results

3.2.1 Analysis of the C. difficile 630 Type IV Pili Gene Loci

The structures of the C. difficile primary and secondary T4P gene loci as first described in 2006 (Varga et al., 2006) are shown in Figure 1.15. However, initial analysis of the gene clusters in strain 630 showed that the primary T4P locus had been described incorrectly. The initial description of the primary operon shows that it starts with a pilin gene (CD630_3513), followed by assembly ATPase and platform protein genes. This is correct. The CD630_3513 pilin gene was predicted by others to constitute the major pilin, based on its position at the start of the gene cluster, and so named pilA1 (Maldarelli et al., 2014). The assembly ATPase and platform protein genes were named pilB1 and pilC1 respectively, the “1” in each of their names indicating their locations within the primary T4P gene cluster (Maldarelli et al., 2014). The fourth gene in the cluster was first described as a pilM gene. However, this gene is much larger than pilM genes found in either the Gram-positive species C. perfringens or the T4aP of the Gram-negative species P. aeruginosa or N. meningitidis (1713 bp, encoding a 510 amino acid protein, compared to 1155 bp/384 aa in C. perfringens Strain 13, 1065 bp/354 aa in P. aeruginosa Strain PA01 or 1116 bp/371 aa in N. meningitidis Strain ATCC 13091). Analysis of the sequence of the protein product of this supposed pilM gene using TMHMM suggested that it contains a transmembrane helix (Figure 3.1A). PilM proteins in T4aP systems are known to be cytoplasmically located (Martin et al., 1995), where they form part of the alignment complex with PilNOP via the binding of PilM to the cytoplasmic N-terminal domain of the monotopic membrane protein PilN (Ayers et al., 2009; Karuppiah and Derrick, 2011). No independent pilN gene could be found within the primary T4P operon in C. difficile, nor anywhere else in its genome. In the related T2SS one large, monotopic membrane protein (GspL or equivalent) plays the role of PilM and PilN (Korotkov et al., 2012), while the same is found in T4bP: the monotopic membrane protein BfpC from the EPEC bundle-forming pilus is homologous to GspL family proteins (Yamagata et al., 2012). It therefore appears that the C. difficile gene initially annotated as pilM is in fact a GspL-like pilM-pilN fusion. Indeed, BLAST analysis of the gene product identifies both PilM and PilN family domains within the protein (PilM at the N-terminus and PilN at the C-terminus (Figure 3.1B)).

Figure 3.1. In Silico Analysis of the Product of the C. difficile 630 Primary T4P Cluster “pilM” Gene. A. TMHMM analysis, predicting the presence of a transmembrane helix mid-way through the protein, with the N-terminus of the product cytoplasmic and the C-terminus extracellular. B. BLAST analysis, showing the N-terminus is predicted to contain a PilM motif and the C-terminus a PilN motif. The PRK08581 domain is a predicted amidase domain.

Others have also identified the fact that this gene is a pilM-pilN fusion (Melville and Craig, 2013), and when nomenclature for the C. difficile T4P genes was recently formalised the gene was named pilMN, reflecting this nature (Maldarelli et al., 2014). BLAST analysis also suggested that the C-terminus might contain an amidase domain (see Figure 3.1). This is interesting, because to extend away from a bacterium, the T4P have to pass through the peptidoglycan cell wall. How this is enabled is unknown, but the presence of an amidase domain within the extracellular region of PilMN suggests it may be key in re-ordering the cell wall for passage of T4P. Following pilMN in the primary T4P gene cluster is a pilO gene, correctly annotated in the initial description of the cluster. Following pilO in the initial description is an unidentified gene followed by an individual pilin gene. This is incorrect; following pilO are two easily identifiable pilin genes, likely to encode minor pilins based on their location in the middle of the cluster. These have since been named pilV and pilU (Maldarelli et al., 2014). Following pilU is a large gene, which was the gene initially annotated as unknown, though it has since been identified as a GspK-family minor pilin gene (Melville and Craig, 2013), and named pilK (Maldarelli et al., 2014). Following pilK is a pilT gene followed by two pilD genes, named pilD1 and pilD2. Immediately downstream of pilD2 are three genes which appear to bear no relation to T4P: pth, mfd, and prsA. Encoded by pth is the peptidyl-tRNA hydrolase protein, which hydrolyses the ester bond between peptide and tRNA within peptidyl-tRNA molecules released from stalled ribosomes (Das and Varshney, 2006). pth has been found to be essential for survival of E. coli, probably because it is needed to prevent the build-up of peptidyl-tRNA in the cytosol, which would eventually sequester tRNA in an unusable form, leading to the depletion of free tRNA in the cell and thus the inhibition of protein synthesis (Das and Varshney, 2006). The following gene, mfd, encodes the Mutation Frequency Decline protein. Mfd is a transcription-repair coupling factor, which releases RNA polymerase complexes which have stalled during transcription at locations of DNA damage, and then recruits the DNA repair factor UvrA (Hanawalt and Spivak, 2008). Unlike pth, mfd is not essential for bacterial survival and its loss results in only a mildly increased rate of mutation (Hanawalt and Spivak, 2008).
The final gene associated with the cluster is prsA, which encodes the lipoprotein PrsA. PrsA acts as a chaperone, and catalyses the folding of exported proteins (Hyyryläinen et al., 2010). None of these three final genes associated with the primary T4P gene cluster are thought to play any role in T4P formation. Figure 3.2 shows the final structure of the primary T4P gene cluster in C. difficile 630, factoring in my analysis of genes and also including the other advances in gene annotation and nomenclature described in the text. Included in the cluster are all genes which look likely to be transcriptionally linked. The original description of the secondary T4P gene cluster in C. difficile (Varga et al., 2006) appears to be more accurate than that of the primary cluster. As described, the cluster contains a pilB gene followed by a pilC gene (since named pilB2 and pilC2, denoting their location in the secondary gene cluster (Maldarelli et al., 2014)). Next comes a pilin gene, hypothesised by others to encode a second major pilin, due to its presence as the only pilin gene within a separate gene cluster, and named pilA2 (Maldarelli et al., 2014). The initial description then ends with a pilM gene, now known simply as pilM (Maldarelli et al., 2014).

Figure 3.2. The C. difficile 630 Primary T4P Gene Cluster. Genes are colour-coded as described in the legend. Gene names are as formalised in (Maldarelli et al., 2014). Underhanging genes indicate overlap of open reading frames (ORFs) on the chromosome. The numbers at the beginning and end of the cluster indicate the gene numbers in the C. difficile 630 genome.

Though these are the only definite T4P-related genes in the secondary cluster, there do appear to be other transcriptionally-linked genes: upstream of pilB2, initiating the gene cluster, is a gene of unknown function (CD630_3297), and immediately downstream of pilM are two more genes of unknown function. All three of these unknown genes are predicted by TMHMM to encode monotopic membrane proteins. No homologues are identified by BLAST, though others have speculated that the final two genes in the cluster (CD630_3290 and CD630_3291) in fact encode accessory membrane proteins analogous to PilN and PilO. This is based purely on the fact that they are predicted to encode monotopic membrane proteins with topologies identical to PilN and PilO (Melville and Craig, 2013). This is an interesting possibility as it would mean that the secondary T4P gene cluster would encode a full complement of accessory membrane proteins. However, currently that suggestion is purely speculative, so in this work these genes remain annotated as genes of unknown function. The structure of the secondary T4P gene cluster is shown in Figure 3.3.


Figure 3.3. The Secondary T4P Gene Cluster from C. difficile 630. The genes are colour-coded using the same scheme as in Figure 3.2. Gene names are as formalised in (Maldarelli et al., 2014). The 3297 and pilB2 ORFs overlap and the arrowed numbers indicate the gene numbers in the C. difficile 630 genome.

These analyses show that the primary T4P gene cluster contains all the genes expected to be necessary for T4P biogenesis. The secondary T4P gene cluster indisputably does not, as it lacks at least a pre-pilin peptidase (pilD) gene. In all species tested, PilD is essential for processing of pre-pilins into mature pilins able to be incorporated into a growing pilus (Melville and Craig, 2013). Thus, if the secondary T4P gene cluster does indeed produce pili (which is not certain), processing of PilA2 would presumably be performed by an externally encoded PilD protein. The only pilD genes yet identified in the C. difficile genome are pilD1 and pilD2, encoded in the primary T4P gene cluster, suggesting that if the secondary T4P gene cluster does produce T4P, there is likely to be a level of crosstalk between the primary and secondary gene clusters. The primary gene cluster also encodes a putative PilT retraction ATPase. The secondary T4P gene cluster does not contain a pilT gene, and though not all T4P do retract, if the secondary T4P gene cluster does produce T4P, it is possible that the PilT protein could be used to drive retraction of pili produced from both gene clusters. Early in this study, four other putative type IV pilin-like genes within the C. difficile genome were identified by others (Melville and Craig, 2013). The first, CD630_2305 (now known as pilW (Maldarelli et al., 2014)) our group had already identified; the others (CD630_0755 (pilJ), CD630_1242 (pilX) and CD630_1245 (pilA3)) were not previously known to us. The pilW and pilJ genes are isolated on the chromosome, while pilX and pilA3 are located in an apparent four gene operon, separated by two genes of unknown function.

3.2.2 Conservation of Type IV Pili Genes Within C. difficile

Although the genomes of dozens of strains of C. difficile have been sequenced, the vast majority remain as unannotated and unassembled sequences. However, I was able to identify/access 11 other genome sequences (in addition to strain 630) which had been annotated (listed in Table 3.1), enabling relatively easy identification of individual genes (www.sanger.ac.uk/people/directory/lawley-trevor). The genomes were analysed to investigate whether the primary and secondary T4P gene clusters (and the 4 other putative pilin genes identified) are conserved between strains across the species. These strains are spread across 10 different ribotypes, and are broadly spread across the species. Complete and intact primary T4P gene clusters were identified in every genome analysed, indicating that the cluster is likely to play an important role in the species. Complete and intact secondary T4P gene clusters were found in all strains apart from CF5, wherein the final gene in the cluster (one of unknown function but speculated to be a pilO analogue) is truncated at the 3’ end, losing approximately the final third of the gene. If this gene is an essential component of the secondary T4P gene cluster, then strain CF5 is unlikely to produce a functional product from the cluster. CF5 is a ribotype 017 strain; the other 017 strain from the group (M68) contains an intact secondary T4P gene cluster, showing that the mutation is not found across ribotype 017 and is likely to be a strain-specific mutation. The high level of conservation of the secondary cluster across divergent strains of C. difficile demonstrates that it likely plays an important role in the species, but is clearly not essential.

All 12 genomes were found to contain the putative pilin genes pilJ, pilX and pilA3. Interestingly though, only 8 of them contain pilW, which is absent from strains CF5, M68, 305 and M120. Pilin proteins tend to be the most variable of the T4P components: their hydrophobic N-termini, which interact to form the central pilus filament, tend to be quite well conserved between species, but their globular C-terminal domains are much more variable (Craig et al., 2004). Partly this is due to the differing functions of pili between species, and natural genetic drift. However, immunological pressure may also play a part in driving pilin variation, even within species: because vast numbers of pilins can be exposed to host cells during infection, they often act as antigens, meaning pilin mutants can undergo positive selection for mutations which reduce their antigenicity (Blank et al., 2000). The pilin proteins from each strain were therefore aligned to analyse the variation of each protein between strains (Table 3.2). The alignments themselves are shown in Appendix 2.

Strain    Ribotype
Liv24     001
BI9       001
TL178     002
630       012
TL176     014
CF5       017
M68       017
305       023
196       027
R20291    027
M120      078
Liv22     106

Table 3.1. C. difficile Genomes Analysed for T4P Gene Content.
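The presence/absence tally described above can be written down as a small lookup. The following is only a minimal sketch: the strain names and the pilW absences are taken from the text and Table 3.1, whereas the data structure and the function name count_strains_with are illustrative assumptions and do not represent the method used in this study.

```python
# Strains analysed (names as listed in Table 3.1).
STRAINS = ["Liv24", "BI9", "TL178", "630", "TL176", "CF5",
           "M68", "305", "196", "R20291", "M120", "Liv22"]

# Strains reported in the text as lacking each isolated pilin gene.
# (Hypothetical layout; an empty set means the gene was found in every genome.)
ABSENT = {
    "pilJ": set(),
    "pilX": set(),
    "pilA3": set(),
    "pilW": {"CF5", "M68", "305", "M120"},
}


def count_strains_with(gene):
    """Number of analysed strains whose genome contains the given gene."""
    return sum(1 for strain in STRAINS if strain not in ABSENT[gene])


for gene in ABSENT:
    print(f"{gene}: {count_strains_with(gene)}/{len(STRAINS)} strains")
```

Keeping absences rather than presences means only the exceptions need to be recorded, which suits a gene set that is almost fully conserved.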


All 9 pilin proteins are well conserved between the 12 strains analysed (or 8 strains in the case of PilW), but PilA1 and PilW are clearly the least well conserved of the 9 proteins, being the only 2 with total amino acid identities of less than 80 % and also having the lowest mean pairwise identities of the 9. This supports the supposition, based on its position at the start of the primary T4P gene cluster, that pilA1 encodes the major pilin in C. difficile T4P: as the major pilin and therefore main constituent of the pili, PilA1 would be the pilin to which hosts have the highest level of exposure, likely generating the heaviest immune response and greatest positive selection pressure on antigenic epitopes. It is interesting that PilW is so poorly conserved across strains. This could indicate that it is highly antigenic during infection and is therefore also under positive selection. Probably more likely, however, given that it is found in only some pathogenic strains of C. difficile, is that it does not play a particularly important role in the function of C. difficile T4P and is therefore not under such strong purifying selection as the other pilins, allowing it to diversify more. The particularly strong conservation of PilA2 is also notable: as a putative major pilin of the secondary cluster one might also expect it to display greater levels of variation. The fact that it does not may indicate either that it is not very immunogenic, that it is not produced during infection of a host, or possibly that it is not in fact a major pilin at all.

Pilin Protein    Total Amino Acid Identity (%)    Average Pairwise Identity (%)
PilA1            78.6                             91.4
PilV             94.1                             98.6
PilU             93.1                             98.4
PilK             92.2                             98
PilA2            96.6                             99
PilA3            87.5                             96.9
PilX             88.3                             97.4
PilJ             92.5                             97.9
PilW             75.2                             85.9

Table 3.2. Conservation of all 9 (Putative) Pilin Proteins from C. difficile 630. The level of conservation of the 9 pilin proteins from C. difficile 630 between the 12 strains of C. difficile for which annotated genomes were available to analyse. ‘Total Amino Acid Identity (%)’ indicates the percentage of amino acids within each protein sequence shared by all strains. ‘Average Pairwise Identity (%)’ indicates the average percentage of amino acids from the putative pilin sequence shared between any two strains.
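The two measures defined in the legend of Table 3.2 can be illustrated with a short calculation on a toy alignment. This is a sketch under stated assumptions rather than the procedure used to generate the table: the function names are hypothetical, the three toy sequences are invented for illustration, and gap-containing columns are simply counted as non-identical, which may differ from the convention of the alignment software actually used.

```python
from itertools import combinations


def total_identity(aligned):
    """Percentage of alignment columns in which all sequences share one residue."""
    columns = list(zip(*aligned))
    identical = sum(1 for col in columns
                    if len(set(col)) == 1 and col[0] != "-")
    return 100.0 * identical / len(columns)


def pairwise_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences agree."""
    same = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * same / len(seq_a)


def mean_pairwise_identity(aligned):
    """Average of pairwise_identity over every pair of sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment of equal-length sequences (NOT real pilin sequences).
toy = ["MKLT", "MKLS", "MRLT"]
print(total_identity(toy), mean_pairwise_identity(toy))
```

Because a column counts towards total identity only when every strain agrees, the total identity can never exceed the mean pairwise identity, which matches the pattern seen for every protein in Table 3.2.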

Analysis of the conservation of all putative pilin genes identified by ourselves and others in C. difficile strain 630 has been described here. Work by others in strain R20291 has identified these 9 genes as encoding putative pilin proteins and found no others (Maldarelli et al., 2014), though of course other strains may encode other, as-yet unknown pilins.

3.2.3 Anti-Pilin Antibodies

A previous student in the Fairweather group had already raised antibodies against four pilins from C. difficile 630: PilA1, PilV, PilU and PilW. The globular C-terminal domains of these proteins were expressed as fusion proteins with maltose-binding protein (MBP). The fusion proteins were purified using an amylose column and cleaved using Factor Xa, the MBP and pilin fragments were separated by size exclusion chromatography, and the pilin fragments were used as antigens to raise α-pilin antibodies in rabbits. However, an antibody against PilA2 was also desired, and it was decided to purify PilA2 and raise antibodies against it. To purify PilA2, the 3’ fragment of pilA2 which encodes the C-terminal soluble domain was identified. TMHMM was used to predict the location of the TM helix, which was identified as spanning residues 11-33 inclusive of the complete pre-pilin sequence. Therefore the pilA2 gene excluding the first 99 bp (corresponding to the first 33 amino acids of PilA2) was cloned into pECC23, an intermediate vector (see Chapter 2), yielding plasmid pECC38, which encodes an N-terminally truncated pilA2 gene (pilA2Δ1-33) with an N-terminal His-tag. Plasmid pECC38 was transformed into E. coli Rosetta, and the solubility of PilA2Δ1-33 tested when expressed in LB broth at either 37 °C (for 4 hrs) or 18 °C (overnight). Following lysis, soluble and insoluble lysate fractions were analysed by SDS-PAGE followed by Western blotting using α-His-tag antibody. A significantly greater proportion of total PilA2Δ1-33 was soluble when expressed overnight at 18 °C than when expressed for 4 hrs at 37 °C (not shown).
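The relationship between the predicted TM helix boundary and the length of the 5’ region removed is simple codon arithmetic, sketched below. The helix end point (residue 33) is taken from the text; the function names and the toy coding sequence are hypothetical and serve only to illustrate the calculation, not to reproduce the cloning strategy.

```python
def bp_to_remove(last_residue_removed):
    """Length in bp of the 5' region encoding residues 1..last_residue_removed."""
    return 3 * last_residue_removed


def truncate_cds(cds, last_residue_removed):
    """Return the coding sequence with the first N codons removed."""
    return cds[bp_to_remove(last_residue_removed):]


TM_HELIX_END = 33  # last residue of the TMHMM-predicted TM helix (from the text)
print(bp_to_remove(TM_HELIX_END))

# Toy sequence for illustration only; this is NOT the real pilA2 sequence.
toy_cds = "ATG" + "AAA" * 40 + "TAA"
print(len(toy_cds), len(truncate_cds(toy_cds, TM_HELIX_END)))
```

Truncating on a codon boundary keeps the remaining sequence in frame with the vector-encoded N-terminal His-tag.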

Purification of PilA2 was then performed. PilA2Δ1-33 was expressed from E. coli Rosetta (pECC38) overnight at 18 °C. PilA2Δ1-33 was purified from the soluble fraction of the lysate using a Ni²⁺ column (Figure 3.4A). Fractions containing PilA2Δ1-33 were then pooled and desalted, a process which also has the effect of further purifying the protein in a mild form of size exclusion chromatography (Figure 3.4B). Fractions containing PilA2Δ1-33 were again pooled and this time the PilA2Δ1-33 within them concentrated by centrifugation using a centrifugal filter with a 3 kDa molecular weight cut-off. Following concentration, the protein was found to be completely pure (Figure 3.4C).

A total yield of 1.37 mg PilA2Δ1-33 protein was obtained, sufficient for use in raising antibodies. The protein solution was sent to the Cutting Group at Royal Holloway, University of London, who used it to raise α-PilA2 antibodies in mice.


The five α-pilin antibodies (α-PilA1, α-PilA2, α-PilV, α-PilU and α-PilW) were next tested against pilin proteins expressed in E. coli, both to test their specificity between the 9 C. difficile 630 pilin-like proteins, and also to identify an appropriate working dilution factor for their use against E. coli lysates. The five target pilins were expressed in E. coli Rosetta from pET28a-derived plasmids: PilA2 was expressed from pECC23, while PilA1, PilU, PilV and PilW were expressed from pRPF227, pRPF232, pRPF228 and pRPF226 respectively (plasmids constructed by Robert Fagan). 1:20 000 was identified as an appropriate working dilution for the α-PilW antibody and 1:50 000 as an appropriate working dilution for the other four (for use against E. coli lysates).

Figure 3.4. Purification of His-Tagged PilA2Δ1-33 from E. coli Rosetta (pECC38). A. Affinity chromatography of PilA2Δ1-33 using a Ni²⁺ column. B. Desalting of PilA2Δ1-33. C. Concentrated PilA2Δ1-33 after use of a centrifugal filter. Analysed using 18 % acrylamide SDS-PAGE.

To test the specificity of the five α-pilin antibodies, the other four C. difficile 630 pilin-like genes (pilX, pilA3, pilJ and pilK) were cloned into pET28a (yielding plasmids pECC53-56 respectively) and expressed in E. coli. Whole cell lysates of all nine E. coli Rosetta strains expressing C. difficile pilin-like proteins were analysed by SDS-PAGE followed by Western blotting with the five α-pilin antibodies (Figure 3.5).

Figure 3.5. Western Blots Showing the Specificity of the Five α-C. difficile Pilin Antibodies. The antibodies were tested against all 9 pilin-like proteins encoded by C. difficile 630 expressed in E. coli Rosetta. A. Blot with α-PilA1 antibody. B. Blot with α-PilV antibody. C. Blot with α-PilU antibody. D. Blot with α-PilA2 antibody. E. Blot with α-PilW antibody. The predicted molecular mass here of PilA1 is 18.4 kDa; of PilV is 21.8 kDa; of PilU is 20.1 kDa; of PilA2 is 12.7 kDa; of PilW is 18.5 kDa.

It was apparent that all five α-pilin antibodies were highly specific to the pilin they were raised against, with no cross-reactivity between any antibody and a non-target pilin protein. PilV and PilU both appear to have degraded slightly, as degradation products can be seen on the blots below the full-length products (indeed the majority of PilU appears to have degraded slightly). A small amount of laddering of PilA2 is seen, while a great deal of background is seen in the α-PilW blot. This is visible in every lane, indicating that it is cross-reactivity of the α-PilW antibody with several E. coli proteins. It is a possibility that when PilW was initially purified in order to raise antibody the purification was not entirely complete and that some E. coli proteins were not separated from PilW prior to its injection into rabbits. Having demonstrated the specificity of the α-pilin antibodies and identified working dilution factors for their use against E. coli lysates, it was next necessary to identify working dilution factors for their use against C. difficile lysates. (Previous experience from the Fairweather group has found that proteins are generally expressed at much lower levels in

C. difficile than when exogenously expressed in E. coli, meaning lower dilution factors are generally required of antibodies for use in Western blots against samples from C. difficile). The five pilin genes, pilA1, pilA2, pilU, pilV and pilW, were cloned into an inducible C. difficile expression vector (pRPF185, (Fagan and Fairweather, 2011)), placing them under the control of the tet promoter, whereby protein expression is induced using the non-antibiotic tetracycline analogue anhydrotetracycline (Atc). This yielded plasmids pECC34, pECC24, pECC15, pECC33 and pECC28, respectively. The plasmids were transformed into E. coli CA434, then conjugated into C. difficile 630. Protein expression was performed in liquid medium using 250 ng/ml Atc. Expression and working dilutions for antibodies were tested by Western blotting against the appropriate pilin proteins from both the cell lysates and supernatants of the cultures. Supernatants were analysed because T4P are extracellular, so if these pilins were being assembled into pili one would expect to see pilins in the culture supernatants. Proteins from the culture supernatant were extracted and concentrated by TCA precipitation. Expression of PilA1, PilV and PilW was immediately apparent and a dilution of 1:2000 was identified as optimal for use of α-PilA1 and α-PilV antibodies against C. difficile lysates, and 1:1000 for the α-PilW antibody (Figure 3.6A-C). The α-PilA1, α-PilV and α-PilW antibodies showed good specificity against their target pilins with minimal background cross-reactivity. In the α-PilA1 blot, a small PilA1 band is seen above the primary large one, which may represent pre-pilin, with the large band being mature PilA1. Expression of PilU and PilA2 was not immediately successful.
The next step was therefore to attempt to express them constitutively using the cwp2 promoter, which is thought to drive a higher level of gene expression than the maximal possible from the tet promoter (Fagan and Fairweather, 2011). Plasmids encoding pilU and pilA2 genes under the control of the cwp2 promoter (pECC29 and pECC31, which have the pRPF144 backbone (Fagan and Fairweather, 2011)) were transformed into E. coli CA434, then conjugated into C. difficile 630. Expression of PilU from the cwp2 promoter was successful (Figure 3.6D), though expression was very low. Significant background cross-reactivity was seen, but 1:2000 seemed to be an appropriate working dilution for the α-PilU antibody against C. difficile lysates. Expression of PilA2 from the cwp2 promoter appeared unsuccessful, as it was not possible to detect it using α-PilA2 antibodies. Indeed, multiple dilution factors of the α-PilA2 antibody were used trying to detect PilA2 expression from both the tet and cwp2 promoters, but even at high antibody concentrations PilA2 was not detected. This prompted a hypothesis

that PilA2 might for some reason adopt a different structure, or undergo post-translational modifications, in C. difficile compared to in E. coli, causing it not to be recognised by the α-PilA2 antibody. Therefore it was decided to express PilA2 with a C-terminal His-tag as an alternative epitope for detection. Plasmids encoding His-tagged PilA2 from both the cwp2 and tet promoters (pECC32 and pECC25, respectively) were also conjugated into C. difficile 630 and protein expression followed by Western blotting was attempted, but this time using both α-PilA2 and α-His-tag antibodies against the cell lysates and culture supernatants. Again, neither antibody was able to detect PilA2. Why this might be is unclear, but presumably PilA2 is not being properly expressed from the plasmids as the α-PilA2 antibody is easily able to detect PilA2 produced in E. coli, while the α-His-tag antibody is generally reliable at detecting His-tagged proteins produced in any species, including C. difficile.

3.2.4 Identifying Conditions for Type IV Pili Production

Having shown that the antibodies against pilins encoded within the primary T4P cluster (α-PilA1, α-PilV, α-PilU) work well, the next stage was to try to use them to identify conditions under which the primary T4P cluster is expressed and proteins are produced. Two approaches were taken to this end: the first was to try growing C. difficile in/on several different types of medium to identify which conditions (if any) prompted the production of T4P; the second was based on a prior observation that under conditions where levels of the second messenger cyclic-di-GMP (c-di-GMP) are raised, long unidentified strands were present on the surface of C. difficile 630, which could be T4P (Purcell et al., 2012). Therefore, the second approach was to artificially drive up levels of c-di-GMP in C. difficile cells and test whether this prompted T4P expression. For the first approach C. difficile 630 was grown in liquid and on solid media. Five different liquid media were tested: BHI, BHIS, TY, SMC and FAB (their exact compositions are defined in Section 2.1.3, but they range from relatively nutrient-poor medium to relatively nutrient-rich medium containing a variety of nutrients and sugars). C. difficile 630 was grown in these five media, and harvested at 4 different time points (5 hrs, 24 hrs, 72 hrs and 120 hrs, to investigate temporal effects on T4P expression). Cells were lysed and their lysates and TCA-treated supernatants analysed by Western blot to look for PilA1 expression. PilA1 was chosen because, being the predicted major pilin, its levels would be expected to be much higher than those of PilU or PilV. However, no PilA1 expression was apparent (not shown).


Figure 3.6. Immuno-Detection of Pilins Expressed Exogenously in C. difficile. Western blots demonstrating the function of α-pilin antibodies against target pilins expressed in C. difficile (except for the α-PilA2 antibody). WCL = Whole Cell Lysate; S/N = Supernatant. A. Blot with α-PilA1 antibody against lysates of C. difficile 630 expressing PilA1 from tet promoter. Pre-I = pre-induction with Atc; Post-I = post-induction with Atc. Predicted molecular mass of PilA1 = 18.2 kDa as pre-pilin, 17.2 kDa as mature pilin. B. Blot with α-PilV antibody against lysates of C. difficile 630 expressing PilV from tet promoter. Predicted molecular mass of PilV = 21.5 kDa as pre-pilin, 20.6 kDa as mature pilin. C. Blot with α-PilW antibody against lysates of C. difficile expressing PilW from tet promoter. Predicted molecular mass of PilW = 17.9 kDa as pre-pilin, 17.2 kDa as mature pilin. D. Blot with α-PilU antibody against lysates of C. difficile 630 (pMTL960) (the vector control – VC) and C. difficile 630 expressing PilU from the cwp2 promoter. Predicted molecular mass of PilU = 19.8 kDa as pre-pilin, 18.8 kDa as mature pilin.

To test PilA1 production on solid medium C. difficile 630 was streaked out onto 8 different agars: BHI, BHIS, TY, SMC, Blood Agar (BA), Fastidious Anaerobe Agar (FAA), Columbia Blood Agar (CBA) and Braziers (Brz). (Their exact compositions are described in Section 2.1.3, but again they represent a wide variety of media, though their agar content was uniform). Following streaking out, cultures were incubated for between 1 and 4 days under standard conditions before being harvested from the plate and lysed. Lysates were analysed by α-PilA1 Western blot, but again no PilA1 expression was apparent (not shown). In December 2013 (just over 1 yr after the commencement of this study) Piepenbrink et al. reported that C. difficile strain R20291 produced Type IV Pili containing PilA1 when grown on CBA for four days, and that T4P production was enhanced when, after 1 day’s growth under normal conditions (at 37 °C), the solid cultures were moved to room temperature (RT) (Piepenbrink et al., 2014). I therefore attempted to replicate this with strain 630. For completeness, and in case optimal T4P production conditions for 630 were different to those for R20291, the strain was streaked out onto all solid media tested previously. These were incubated for 24 hrs at 37 °C then for 1-3 days at room temperature (Figure 3.7). After 24 hrs at 37 °C followed by 24-48 hrs incubation at RT no PilA1 production was seen from any samples (Figure 3.7A-B). However, after 24 hrs at 37 °C followed by 72 hrs at RT small amounts of PilA1 were detected in the lysates of cultures grown on BHI, FAA and CBA agar (Figure 3.7C). C. difficile R20291 was similarly grown on all 8 solid media for four days with the shift to RT after 24 hrs, and those lysates identified as containing PilA1 (those from cultures grown on FAA and CBA, not shown) analysed by α-PilA1 Western blot alongside those lysates of strain 630 found to contain PilA1, in order to analyse relative amounts of PilA1 production in strains 630 and R20291 (Figure 3.7D). This showed that the greatest amount of T4P production was found in strain R20291 when grown on FAA. Having completed the investigation into conditions under which C. difficile produces PilA1 using the first approach outlined above, the second approach was also attempted, in which the effect of the second messenger cyclic-di-GMP would be tested.
For this purpose the diguanylate cyclase (DGC) gene dccA (Diguanylate Cyclase from Clostridium A) was cloned into pRPF144, yielding pECC12 from which DccA is expressed constitutively with a C-terminal His-tag. DccA has previously been shown to be an active DGC (Bordeleau et al., 2011), and has been identified as useful for experimental use as it contains no ‘I’ sites which allow post-translational regulation of the protein, meaning it is constitutively active once synthesised (Purcell et al., 2012). Hence it was chosen for this investigation. Plasmid pECC12 was conjugated into C. difficile 630. This strain was cultured in liquid TY medium then harvested, the cell pellet lysed and culture supernatant treated with TCA to isolate its protein content. Lysate and supernatant samples were then analysed by α-PilA1 Western blot, using samples of 630 (pMTL960) as negative control (Figure 3.8). This clearly shows that C. difficile 630 produces PilA1 in response to elevated levels of cyclic-di-GMP, and also that a large amount is exported into the supernatant, indicating that


T4P are being formed. The band seen in the lysate is of the size predicted for PilA1, and an identically sized band is also seen in the culture supernatant, but larger bands are also seen which are likely to reflect PilA1 multimers which have not been properly dissociated by SDS. For instance, the dominant band in the supernatant fraction, with an apparent molecular weight of approximately 50 kDa, is of an appropriate size for a PilA1 trimer (predicted molecular weight 51.5 kDa). To investigate whether production of the minor pilins PilU, PilV and PilW could also be detected when c-di-GMP levels are elevated in C. difficile 630, the samples analysed in Figure 3.8 were also probed using α-PilU, α-PilV and α-PilW antibodies. However, none of these were detected (not shown), suggesting that they are likely expressed at too low a level to be detected by Western blot. The fact that PilA1 is therefore apparently produced at vastly greater levels than these other three pilins suggests that it is indeed the major pilin comprising these T4P, and that PilU, PilV and PilW are likely only to be minor pilins.
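The trimer interpretation of the dominant supernatant band can be checked with a simple calculation. The sketch below assumes the predicted mature PilA1 mass of 17.2 kDa given in the legend of Figure 3.6 and treats the apparent band size as exactly 50 kDa; the function name closest_multimer and the upper limit on multimer order are hypothetical choices made for illustration.

```python
def closest_multimer(apparent_kda, monomer_kda, max_order=6):
    """Return (n, n * monomer_kda) for the multimer order n whose predicted
    mass lies closest to the apparent band size."""
    best = min(range(1, max_order + 1),
               key=lambda n: abs(n * monomer_kda - apparent_kda))
    return best, best * monomer_kda


MATURE_PILA1_KDA = 17.2    # predicted mature pilin mass (Figure 3.6 legend)
APPARENT_BAND_KDA = 50.0   # approximate size of the dominant supernatant band

order, mass = closest_multimer(APPARENT_BAND_KDA, MATURE_PILA1_KDA)
print(f"closest multimer: {order}-mer, predicted {mass:.1f} kDa")
```

Using the rounded monomer mass, a trimer comes to 3 × 17.2 = 51.6 kDa; the small difference from the 51.5 kDa quoted in the text presumably reflects rounding of the monomer mass. Apparent sizes on SDS-PAGE are approximate, so this supports the interpretation without proving it.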

Figure 3.7. PilA1 Production by C. difficile when Grown on Solid Media. Investigation by means of α-PilA1 Western blots of PilA1 production by C. difficile when grown on a variety of media for 2-4 days, being moved from incubation at 37 °C to RT after 24 hrs. A-C are C. difficile strain 630. A. 24 hrs at 37 °C, 24 hrs at RT. B. 24 hrs at 37 °C, 48 hrs at RT. C. 24 hrs at 37 °C, 72 hrs at RT. D. Comparison of amounts of PilA1 produced by strains 630 and R20291 after 24 hrs at 37 °C, 72 hrs at RT on media which promote PilA1 production.


Overall, these investigations had shown that PilA1 production by C. difficile can be induced by growth under particular conditions or by elevating the levels of c-di-GMP by exogenous expression of a DGC. It was initially considered preferable to investigate these T4P by growing C. difficile under relatively normal lab conditions rather than by expressing a DGC in the bacteria, to maintain physiological conditions at all times and prevent excessive levels of c-di-GMP causing other, unknown effects in the bacteria. However, it was clear that in strain 630 under the conditions we tested PilA1 production was at best very low and took 4-5 days to develop. In strain R20291, though the timescales were the same it did seem that noticeably higher levels of T4P production were achieved when the strain was grown on FAA.

Figure 3.8. C-di-GMP-Induced PilA1 Expression by C. difficile. α-PilA1 Western blot to show the expression of PilA1 by C. difficile 630 carrying plasmid pECC12, in comparison to the vector control (VC). PilA1 is found in the culture supernatant (S/N) as well as the cell lysate, indicating it is exported from the cell in the form of a Type IV Pilus.

Consideration was given to using strain R20291 as the primary research vehicle for this project, but this proved impossible: for unknown reasons conjugation frequencies into strain R20291 are much lower than into strain 630. Ordinarily this is not a great problem – a small number of transconjugants from an individual conjugation is no worse than a large number, given only one is needed, while conjugation into R20291 of plasmids which transfer from E. coli into C. difficile less easily (such as those used in the biogenesis of ClosTron mutants) can be facilitated by lengthening the period of time allowed for conjugation from 7-8 hrs to 24 hrs. However, the pseudo-suicide vectors designed for use in allele exchange mutagenesis of C. difficile are inherently very unstable and conjugate with extremely low

frequency even into C. difficile 630 (though the frequency was not calculated, an estimate would be less than 1 % of the frequency of conjugation of regular, pMTL960-derived plasmids, with only 2-3 transconjugants obtained on average per conjugation compared to several hundred). Though it is apparently possible to conjugate these pseudo-suicide vectors into R20291 (Cartman et al., 2012; Ng et al., 2013), I was not successful in my attempts to create clean deletion mutants in R20291, and so it was decided to work on strain 630 during this project, to enable the creation of mutants by allele-exchange. Since PilA1 expression is very low in all conditions tested for 630 growth, it was decided to study T4P by inducing their expression by exogenous expression of a DGC. This was also preferable as it is a much quicker process for T4P expression than growth on agar, taking only 24 hrs rather than 4 days.
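The rough estimate of relative conjugation frequency given above can be sanity-checked with a one-line calculation. Since no colony counts for regular plasmids are reported beyond "several hundred", the sketch below makes no assumption about that number and instead asks how many transconjugants the regular plasmids would need to yield for 2-3 colonies to fall below 1 %. The function name is hypothetical.

```python
def min_reference_colonies(test_colonies, max_percent):
    """Reference colony count above which test_colonies / reference
    falls below max_percent."""
    return test_colonies * 100.0 / max_percent


# 2-3 transconjugants per conjugation is taken from the text.
for colonies in (2, 3):
    threshold = min_reference_colonies(colonies, 1.0)
    print(f"{colonies} colonies is below 1 % of any count above {threshold:.0f}")
```

The resulting thresholds of 200 to 300 colonies are in line with the "several hundred" transconjugants described for pMTL960-derived plasmids, so the stated estimate is internally consistent.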

3.2.5 The Effects of Cyclic-di-GMP on Growth of C. difficile

It was immediately apparent that the conjugation of pECC12 into C. difficile 630 resulted in considerably fewer transconjugants than when similar pRPF144-derived plasmids were conjugated into the same strain. This suggested that the constitutive expression of a DGC in strain 630 could be having a disruptive effect on growth (which would not be surprising given the disruption doubtless being caused to several intra-cellular signalling pathways by the continual elevation of c-di-GMP levels). The first analysis performed to investigate the effect of constitutive expression of DccA on the growth of strain 630 was a growth curve, in which the growth of 630 carrying pECC12 was compared to the vector control 630 (pMTL960) (Figure 3.9). Clearly, constitutive expression of DccA has no effect on the growth of C. difficile 630 cultures, so next investigated was the effect of DccA expression on the morphology of individual cells. Cells from cultures of 630 (pECC12) and 630 (pMTL960) were analysed by phase-contrast microscopy, which showed an unexpected lengthening of the 630 (pECC12) cells (Figure 3.10A-B). While the control cells were, on average, between 5 and 10 μm long (Figure 3.10A), those expressing DccA were up to and over 100 μm long (Figure 3.10B; measuring such long cells with precision was challenging but many of them were clearly 20-30 times the length of the control cells). To investigate whether these long cells were filaments (extended single cells which have not undergone division) or chains (lengths of cells joined together due to a failure to complete division) they were stained with a membrane dye to show up any internal septa. Staining with three dyes was attempted: Nile Red (Sigma), FM464X (Life Technologies) and MitoTracker Green (MTG, Life


Technologies). MTG produced the best results, and after various concentrations were tested a concentration of 1 μg/ml was identified as optimal. Both the vector control cells and those expressing DccA were stained with MTG (Figure 3.10C-D). Cells expressing DccA (Figure 3.10D) clearly contain multiple septa, showing that they are chains and suggesting that elevated levels of c-di-GMP somehow inhibit the separation of daughter cells during cell division, possibly by disrupting the expression of peptidoglycan hydrolases.

[Figure 3.9 plot. Title: Effect of DccA on Growth of 630. y-axis: OD600 (0 to 1.8); x-axis: Growth Time (Hrs) (0 to 9); series: 630 (pMTL960) and 630 (pECC12).]

Figure 3.9. Effect of C-di-GMP on C. difficile Growth. Growth curve showing the effect on growth of constitutive DccA expression in C. difficile strain 630 (630 (pECC12)) in comparison to the vector control (630 (pMTL960)). Strains grown in triplicate in TY broth.
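A growth-curve comparison of this kind is often summarised as a specific growth rate or doubling time for each strain. The following is a minimal sketch of one common approach (a least-squares fit of ln(OD600) against time over the exponential phase); it is not necessarily the analysis used for Figure 3.9. The function names and the readings are hypothetical and purely illustrative; they are not data from this study.

```python
import math

def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) against time, in units of per hour."""
    logs = [math.log(od) for od in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx

def doubling_time(mu_per_h):
    """Doubling time in hours for a specific growth rate mu."""
    return math.log(2) / mu_per_h

# Illustrative exponential-phase readings only (NOT values from Figure 3.9):
# the OD600 doubles every hour.
example_times = [2.0, 3.0, 4.0, 5.0]
example_od = [0.1, 0.2, 0.4, 0.8]

mu = specific_growth_rate(example_times, example_od)
td = doubling_time(mu)
```

Applying the same fit to each replicate of 630 (pMTL960) and 630 (pECC12) would give per-strain doubling times that can then be compared directly.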

Finally, the effect of DccA expression on the sporulation of C. difficile 630 was tested. Sporulation assays were performed, with the vector control giving a sporulation efficiency of 61.7 ± 14.6 % and 630 (pECC12) giving a sporulation efficiency of 66.2 ± 15.6 %. These results are not significantly different from each other, and neither differs significantly from the previously reported average sporulation efficiency for strain 630 of 59.4 % (Dr M. Dembek, former Fairweather group member, thesis), suggesting that c-di-GMP levels do not affect sporulation/germination efficiency.
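The comparison of the two sporulation efficiencies can be illustrated with Welch's t statistic calculated from summary values. This is a sketch only: the statistical test actually used is not stated here, the ± values are assumed to be standard deviations, and the number of replicates (three per strain) is an assumption made for the example.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean2 - mean1) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

N_REPLICATES = 3  # ASSUMPTION: replicate number is not given in the text

# Means and spreads as quoted for 630 (pMTL960) and 630 (pECC12)
t_stat, dof = welch_t(61.7, 14.6, N_REPLICATES, 66.2, 15.6, N_REPLICATES)
```

Under these assumptions the t statistic is well below 1 on about 4 degrees of freedom, far short of the two-tailed 5 % critical value of roughly 2.8, which is consistent with the conclusion that the two efficiencies do not differ significantly.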


Figure 3.10. Effect of C-di-GMP on C. difficile Cell Morphology. Changes in cell morphology induced by constitutive expression of the DGC DccA in C. difficile 630. A&C. Vector control, C. difficile 630 (pMTL960); A = phase contrast image, C = fluorescence image with MTG staining of cell membranes. B&D. 630 (pECC12); B = phase contrast image, D = MTG fluorescence image. Red scale bars indicate 10 μm; white arrows indicate septa. Vector control cells all contain either no septum or one septum, depending on their stage of cell division. The 630 (pECC12) cell contains multiple septa, indicating a deficiency in cell division. The identity of the bright blobs within the 630 (pECC12) cell is unknown: some may be unusually bright septa, but others are not. Similar features have been seen before in C. difficile cells (Pereira et al., 2013), though not explained.

3.2.6 Optimisation of Conditions for T4P Production

Given that constitutive expression of DccA leads to defects in cell division, it was felt undesirable to use this technique for studying T4P if it could be avoided. To express DccA at a lower level, which would be less disruptive to the general cell biology of C. difficile, dccA was placed in a plasmid under the control of the inducible tet promoter (pECC17). Plasmid pECC17 was conjugated into C. difficile 630, as was a vector control, pASF85, which was constructed by Dr A. Fivian-Hughes. 630 (pECC17) and 630 (pASF85) were grown in TY broth and protein expression was induced with various concentrations of Atc: 5, 20, 50, 200 and 500 ng/ml. One culture was also left uninduced. These cultures were harvested, the resultant pellets lysed and their supernatants TCA-treated, then analysed by α-PilA1 Western blot (Figure 3.11). A minimal level of PilA1 production was seen in all samples (including vector control samples, Figure 3.11A) except the uninduced ones, suggesting that Atc itself might induce a minimal level of PilA1 production. At an Atc concentration of 20 ng/ml or above, vastly more PilA1 was seen in the 630 (pECC17) samples than in the 630 (pASF85) ones, showing that expression of DccA from the inducible tet promoter also drives PilA1 expression. Though this Western blot is not quantitative, it appeared that a maximal level of PilA1 production was reached with induction using 50 ng/ml Atc (Figure 3.11B).

Figure 3.11. PilA1 Production Following Atc-Induced Expression of DccA. α-PilA1 Western blots to show the expression of PilA1 by C. difficile 630 carrying plasmid pECC17 (B) in comparison to the vector control pASF85 (A). The amount of Atc added to each culture to induce protein expression is shown above the blots.

Cultures of 630 (pECC17) and 630 (pASF85) were grown again under identical conditions and the morphology of cells within the cultures was examined, to see whether induced expression of DccA (probably at a lower level overall and on a shorter timescale than constitutive expression) also caused defects in cell division and the associated lengthened cells (Figure 3.12). 630 (pASF85) cells were grown with the maximal 500 ng/ml Atc and were of a normal shape, showing that Atc alone had no effect on cell morphology (Figure 3.12A). 630 (pECC17) cells induced with up to 50 ng/ml Atc for 3 hrs as normal were also of a normal shape (Figure 3.12B-D), but in those cultures induced with 200 or 500 ng/ml Atc, cells with the lengthened morphology were visible (Figure 3.12E-F). This showed that, to maintain healthy cell division while investigating these T4P, expression of DccA should be driven from the inducible tet promoter rather than the constitutive cwp2 promoter, and that induction should be with no more than 50 ng/ml Atc.

Figure 3.12. Effect of Increasing Expression of DccA on Cell Morphologies of C. difficile 630. A. Vector control (630 (pASF85)) with 500 ng/ml Atc. B-F. 630 (pECC17) with DccA production induced with 5, 20, 50, 200 and 500 ng/ml Atc respectively. Red scale bars denote 10 μm; yellow arrows indicate lengthened cells.


3.2.7 Production of Type IV Pili by C. difficile 630

Having optimised expression levels of DccA to identify an expression level sufficient to stimulate apparently high levels of PilA1 production in C. difficile 630 (and therefore presumably T4P production) without disrupting cell division, visualisation of the T4P by electron microscopy was attempted. To protect the T4P from shearing during the harvesting of cells after growth in liquid medium, cells were harvested at the lowest speed and for the shortest time possible. This was identified as 500 g for 2 mins. In Purcell et al. (2012), cells were harvested for 5 mins at 700 g and appendages hypothesised to be T4P were seen on the surface of the cells. The centrifugation conditions I identified were gentler than these, so it seemed reasonable to expect T4P to remain largely intact throughout them. C. difficile 630 (pASF85) and 630 (pECC17) were grown in TY broth and protein expression was induced with 50 ng/ml Atc, as described above. After 3 hrs samples were harvested and fixed, then examined by TEM. This was kindly performed by Dr M. McCrossan of the University College London Electron Microscopy Unit, Royal Free Hospital. Flagella were clearly visible on cells carrying either plasmid (Figure 3.13). Expression of DccA from plasmid pECC17 should inhibit flagella synthesis, but after only 3 hrs of DccA expression insufficient time had passed for the flagella previously synthesised to be lost, or largely diluted out of the population by cell division. However, there did appear to be more flagella present on the 630 (pASF85) cells than on the 630 (pECC17) cells. Visible on many of the 630 (pECC17) cells (though not all) are much thinner appendages, protruding from the cell surface (Figure 3.13B). The flagella appear to range quite significantly in width, from 7-8 nm wide up to almost 30 nm wide. It is hard to believe that such variation in flagellar width genuinely occurs in C. difficile, so it is most likely down to the difficulty of accurately measuring them in micrographs. However, the thinner appendages seen only on the 630 (pECC17) cells are noticeably narrower than flagella, and are measured as being only 5 nm wide or less. These are believed to be T4P, supporting the assumption that the production of PilA1, and its detection in culture supernatants, is indicative of T4P production. To confirm that these appendages do indeed consist of PilA1, pilA1 was deleted from the C. difficile 630 chromosome using the codA-linked method of allele exchange, yielding C. difficile 630ΔpilA1. Screening confirming the deletion of pilA1 from the 630 chromosome is shown in Figure 3.14.


Figure 3.13. TEM Visualisation of C. difficile Type IV Pili. Transmission electron micrographs of 630 (pASF85) (A) and 630 (pECC17) (B). Flagella are seen on cells from both cultures, though more seem to be seen on cells of 630 (pASF85), but on 630 (pECC17) cells much thinner appendages are seen, which are presumed to be T4P. Red lines indicate 500 nm.

A plasmid was also designed to complement the pilA1 mutation. The currently available plasmids for protein expression in C. difficile contain only one cloning site, but in order to complement the mutant a dual expression vector was required – expression of pilA1 was required to complement the pilA1 mutation, and expression of dccA was also required in order to activate T4P formation (as seen in Figure 3.6A, expression of pilA1 alone is insufficient to initiate T4P formation. When pilA1 is expressed alone, PilA1 is almost entirely located within the cell, while when dccA is expressed PilA1 is produced and exported out of the cell into the culture supernatant, as seen in Figure 3.8). Therefore, a second cloning site was inserted into plasmid pECC17 between the end of dccA and the slpA transcriptional terminator, so that a gene inserted into the second cloning site would be co-transcribed with dccA. The site was inserted using two rounds of inverse PCR, yielding pECC76.

Figure 3.14. Identification of 630ΔpilA1 Mutant. PCR screening of thiamphenicol-sensitive double-crossovers after attempted deletion of pilA1 from C. difficile 630. A WT control shows the native size of the amplified region. The band in the right-hand lane is clearly smaller than the WT control, indicating successful deletion of the gene from the chromosome. The middle lane contains two bands, of the WT and mutant sizes, indicating that the colony is a mix of strains.

To generate a plasmid capable of complementing the 630ΔpilA1 mutant, pilA1 was cloned into the second cloning site of pECC76, yielding pECC109. Plasmids pECC17 and pECC109 were conjugated into 630ΔpilA1 and gene expression induced as previously. Cultures were harvested and cell lysates and culture supernatants analysed by α-PilA1 Western blot (Figure 3.15). This showed that when dccA was expressed alone in 630ΔpilA1 (from pECC17) no PilA1 was produced, but when dccA and pilA1 were co-expressed in 630ΔpilA1 (from pECC109) PilA1 was not only produced, but also exported into the supernatant, indicating the restoration of T4P production.

Figure 3.15. Complementation of 630ΔpilA1 Mutant. α-PilA1 Western blot showing production and export of PilA1 in C. difficile 630ΔpilA1 in comparison to wild-type C. difficile 630. When dccA is expressed from pECC17 in the WT, PilA1 is produced and is seen in the whole cell lysate (WCL), and is also exported into the culture supernatant (S/N). When dccA is expressed from pECC17 in 630ΔpilA1, no PilA1 is produced at all, but when dccA and pilA1 are co-expressed in the 630ΔpilA1 from the complementing plasmid pECC109, PilA1 is both produced and exported, demonstrating restoration of T4P production.

To confirm that the appendages seen in Figure 3.13B, which are produced by C. difficile in response to dccA expression, are indeed T4P composed of PilA1, TEM was performed as described earlier to examine samples of both 630ΔpilA1 (pECC17) and the complemented strain 630ΔpilA1 (pECC109) (Figure 3.16). Though flagella were seen on 630ΔpilA1 (pECC17) cells (Figure 3.16A), no narrower appendages were seen, which demonstrates that they are not produced when pilA1 is absent, proving that they are Type IV Pili. When the pilA1 mutation is complemented from plasmid pECC109, both flagella and pili are seen (in this case a bundle of them, Figure 3.16B), demonstrating the restoration of piliation upon the restoration of an intact pilA1 gene. Immuno-labelling and fluorescence identification of PilA1 was also attempted to confirm T4P production and make-up, but this did not prove successful. In particular, large amounts of unexplained background fluorescence made it difficult to identify PilA1-associated fluorescence, and cells which had been expressing DccA became extremely ‘sticky’ and hard to handle after formaldehyde fixation.

Figure 3.16. EM Images Showing Complementation of 630ΔpilA1 Mutant. TEM images of 630ΔpilA1 (pECC17) (A), in which the DGC DccA is expressed. The loss of pilA1 has resulted in the absence of the narrow appendages seen in Figure 3.13B, demonstrating that pilA1 is necessary for their production and proving they are pili. Images of 630ΔpilA1 (pECC109) (B), show that restoration of an intact pilA1 gene complements the mutation, restoring pilus formation. Red scale bars indicate 500 nm.

3.3 Discussion

Type IV pili genes are found in every strain of C. difficile, indicating that they play an important role in some aspect(s) of the species’ lifecycle. As mentioned above, these T4P had previously been shown to be expressed during infection of a hamster (Goulding et al., 2009), indicating they may function as virulence factors. In order to start to understand the functions of these T4P it was first necessary to identify conditions under which they are produced in vitro. The only hint at the initiation of this study as to what conditions might actually drive T4P production was the observation by others that under conditions of elevated levels of the second messenger cyclic-di-GMP, long appendages were visible on the surface of some C. difficile cells (Purcell et al., 2012). I confirmed that what was seen in these images were almost certainly T4P, as I demonstrated first by Western blot that the expression of the DGC DccA in C. difficile induced production of the presumed major pilin PilA1, and then demonstrated by EM that it also induced the formation of very long and thin appendages on the surface of C. difficile, presumed to be T4P. The identity of these appendages as T4P was confirmed by mutational studies which demonstrated that their formation was pilA1-dependent. T4P have previously been found generally to be between 5 and 8 nm wide (Pelicic, 2008), so if the measurements of the C. difficile T4P are correct they are narrower than the majority of T4P, coming in at 3-5 nm wide. It had been hoped that by raising an α-PilA2 antibody it might also be possible to identify conditions under which the secondary T4P gene cluster was expressed, but since it proved impossible to detect PilA2 production in C. difficile from a plasmid, detection of native PilA2 production was not attempted, and this aspect of the project was abandoned. As stated above, at the initiation of this project, only hints existed as to what might drive production of T4P by C. difficile. However, over the course of this project other groups have made significant and interesting progress in relation to this topic. Firstly, as mentioned in Section 3.2.4, Piepenbrink et al. (2014) demonstrated that C. difficile R20291 produces T4P when grown on Columbia blood agar, and that this production is enhanced when cultures are incubated for 3 days at room temperature after an initial 24 hr incubation at 37 °C. This enabled them to clearly demonstrate by Immunogold labelling that both PilA1 and PilJ are incorporated into the pili.
A later publication by the same research group demonstrated conclusively that, as hypothesised, PilA1 is not only essential for production of these T4P but is also the major pilin found within them. Quantitative immunoblotting showed that approximately 2000x more PilA1 is produced than PilJ, which is only a minor pilin (Piepenbrink et al., 2015). My results had also indicated that PilA1 was almost certainly the major pilin: though no α-PilJ antibody was available, α-PilV, α-PilU and α-PilW antibodies were available, in addition to α-PilA1. Under conditions of elevated cyclic-di-GMP only PilA1 was detectable (and at a high level at that); PilV, PilU and PilW were all undetected in conditions under which T4P were found to be produced, showing that PilA1 was produced in much larger amounts than any of the other pilins I was able to blot against. Work by other groups also identified the link between c-di-GMP levels and T4P production. It was first demonstrated that transcription of pilA1 is regulated by a riboswitch which activates pilA1 transcription upon binding of c-di-GMP (Soutourina et al., 2013) (for further analysis see Chapter 4), while a second group confirmed that elevated levels of c-di-GMP induce production of T4P (Bordeleau et al., 2015). While I show by TEM pilA1-dependent production of T4P in response to DGC expression in C. difficile, identification of T4P is confused by the concurrent existence of flagella on the cells. Bordeleau et al. (2015) avoided this problem in two ways. Firstly, the promoter they used to express dccA is an inducible promoter activated by the polycyclic peptide nisin, rather than the tet promoter which I used. This group found themselves able to express dccA for 12 hrs without any reported untoward effects on cell division. Though images are not shown, I found that even with as little as 20 ng/ml Atc added to induce dccA expression from the tet promoter, significant cell-chain formation occurred when cultures were left for a similar period. This could indicate that gene expression from the nisin-inducible cpr promoter is simply less efficient than from the tet promoter, resulting in lower levels of dccA expression and therefore of intracellular c-di-GMP. The overall effect was that after this 12 hr period of dccA induction with nisin, no flagella were visible on the surface of those cells expressing dccA, making the visualisation of T4P much easier. Secondly, they also visualised T4P production in a sigD mutant. SigD is required for flagella production, so in its absence no flagella are produced under any conditions, which further improved their ability to visualise T4P over mine (Bordeleau et al., 2015). In retrospect, using a non-flagellated mutant to visualise T4P would have been a wise move, in order to better distinguish between flagella and T4P. Regrettably, this was not done, though I was still able to demonstrate the production of T4P under conditions of elevated c-di-GMP.
I also showed here that elevated levels of c-di-GMP are required for T4P biogenesis (not just PilA1 production): when PilA1 was expressed exogenously from a plasmid, T4P were not produced, as shown by the build-up of PilA1 within the cells rather than its export into culture supernatant, which was seen when PilA1 was produced endogenously in response to elevated levels of c-di-GMP. This is investigated further in Chapter 4. Though several studies published before and during this project have investigated the varying effects of c-di-GMP in C. difficile (e.g. Bordeleau et al., 2015; McKee et al., 2013; Purcell et al., 2012; Soutourina et al., 2013), no-one had previously identified the effect of elevated c-di-GMP levels on cell morphology (i.e. the cells form long chains). Reasons for this are suggested above, but these effects demonstrate that c-di-GMP is likely to play a role in regulation of expression of (a) gene(s) involved in cell division. How this might occur is currently unknown, but Soutourina et al. showed that c-di-GMP controls the expression of various transcription factors and regulatory proteins, which could in turn affect expression of genes involved in cell division. In particular, since septa form between cells which then fail to separate, it is likely that these high levels of c-di-GMP are somehow inhibiting peptidoglycan hydrolase activity. Peptidoglycan hydrolases have been shown to be important in the completion of cell division after septum-formation in Bacillus subtilis, with long chains of cells forming in their absence, as is seen here (Smith et al., 2000). Interestingly, in the tooth decay-associated species Streptococcus mutans, deletion of a DGC and the associated reduction in c-di-GMP levels resulted in the formation of long chains of cells (Yan et al., 2010). This emphasises the hugely varied and sometimes opposing roles c-di-GMP is able to play in bacteria, apparently inhibiting proper cell division in C. difficile but being necessary for it in S. mutans. It is surprising that in both C. difficile and S. mutans, despite the observed aberrant cell division resulting from abnormal intracellular concentrations of c-di-GMP, the rate of growth of both species was unaffected. No further research into how c-di-GMP has this aberrant effect on C. difficile cell division was performed, being outside the central interests of the project, but it is hoped that publication of these data in Peltier et al. (2015) will drive others in the field to further investigate the roles played by c-di-GMP in the fundamental biology of C. difficile. C. difficile is not the first species identified in which T4P production is regulated by c-di-GMP. In P. aeruginosa, for example, high levels of intracellular c-di-GMP drive T4P biosynthesis (Jain et al., 2012), circumventing the need for the phosphodiesterase FimX, which is otherwise essential for T4P formation (Huang et al., 2003; Kazmierczak et al., 2006). In M. xanthus, on the other hand, increased levels of c-di-GMP inhibit T4P formation (Skotnicka et al., 2015), again showing the wide variety of, and sometimes opposing, roles c-di-GMP is able to play in bacteria. Exactly how c-di-GMP regulates expression of C. difficile T4P is further explored in Chapter 4.


Chapter 4: Dissecting the Primary Type IV Pilus Gene Cluster

4.1 Introduction

As discussed in the previous chapter, pili are produced from the primary C. difficile T4P gene cluster in response to increased levels of c-di-GMP. Mutagenesis studies in N. meningitidis and EPEC in particular have identified the genes required for T4P biogenesis in Gram-negative species: a major pilin, a pre-pilin peptidase, an assembly ATPase, a platform protein, the proteins which form the alignment complex, a secretin, a secretin pilotin and a number of minor pilins (Anantha et al., 2000; Blank and Donnenberg, 2001; Carbonnelle et al., 2005; Ramer et al., 2002). However, as previously discussed, the situation is not as simple as it may appear, as it seems that certain genes are truly essential for T4P biogenesis, while others appear to be required only to counteract the force of T4P retraction. Furthermore, the genes truly essential for T4P biogenesis appear to differ between species, complicating the search for an overall model of T4P biogenesis in Gram-negative species (Carbonnelle et al., 2006; Takhar et al., 2013). In light of this, it has been suggested that a Gram-positive model of T4P would be useful, as such a model would presumably be simpler (on account of Gram-positive species having only one membrane) (Pelicic, 2008). The C. difficile primary cluster appears to contain all the genes required for T4P biogenesis: it contains all the genes known to be essential for T4P synthesis in Gram-negatives except for the secretin (PilQ), the secretin pilotin PilF (PilW in N. meningitidis) and the lipoprotein PilP, each of which plays a role in Gram-negative bacteria which is unnecessary in Gram-positives, and each of which is absent from all Gram-positives known to encode T4P (Melville and Craig, 2013). Indeed, production of T4P from the primary C. difficile cluster has previously been identified by Immunogold EM (Goulding et al., 2009). A schematic diagram of the C. difficile primary T4P gene cluster is shown in Figure 4.1.
Several other Gram-positive species are known to encode T4P, including all sequenced Clostridial species (Varga et al., 2006). However, at the initiation of this project, the only Gram-positive species demonstrated to produce T4P aside from C. difficile were Clostridium perfringens (Varga et al., 2006), the ruminant bacterium Ruminococcus albus (Rakotoarivonina et al., 2002) and the gut bacterium Bifidobacterium breve, which has been specifically reported to produce T4bP (O'Connell Motherway et al., 2011), while
Streptococcus sanguinis had also been reported to produce T4P-like pili (Henriksen and Henrichsen, 1975), though their identity had not been confirmed. Out of all the above, at the initiation of this project, molecular characterisation had only been performed on the T4P of C. perfringens. Schematic diagrams showing the T4P gene loci of C. perfringens are shown in Figure 1.12. As described in Chapter 1, pilC1 from the secondary gene cluster had been surprisingly shown to be essential for biogenesis of T4P from pilins of the primary gene cluster (Varga et al., 2006). PilT had also been shown to be essential for T4P biogenesis (Varga et al., 2006), which was unexpected. C. perfringens was the first species to be discovered in which PilT was essential for T4P biogenesis, rather than just T4P retraction, and remains one of only two species known in which this is the case (the other being the Gram-negative pathogen Francisella tularensis (Chakraborty et al., 2008)). In light of the above facts, investigation of the biogenesis of the T4P encoded by the primary gene cluster of C. difficile seemed an interesting project. There were various questions to answer: For instance, what is the interplay between the primary and secondary T4P gene clusters? It would be of interest to establish the degree of interlinking between their products/functions. Is PilT required for T4P biogenesis or not? Given that pilT is found to be essential in C. perfringens, the only Gram-positive species whose T4P had been characterised prior to this project, it was of interest to see if this feature was common to Gram-positive (or at least Clostridial) T4P. Finally, which genes are required for T4P biogenesis in C. difficile, and could we develop a model for T4P biogenesis in C. difficile and other Gram-positive species? The first three of these questions are answered in this chapter, while the possibility of using C. difficile as a model is discussed later. 
In addition to these investigations, the regulation by c-di-GMP of T4P production from the primary cluster is also investigated. As mentioned in the previous chapter, in the course of this project it was shown by others that pilA1 expression is regulated by a riboswitch (Soutourina et al., 2013). Riboswitches are non-coding regions of RNA which bind small molecules (generally the RNA is mRNA, though riboswitches have also been identified in other RNA types, including anti-sense RNA (Mellin and Cossart, 2015)). The small molecule ligand binds at a site known as an aptamer, and, in the case of mRNAs, its binding regulates expression of the gene encoded by the mRNA molecule (Breaker, 2008). Multiple riboswitch ligands have been identified, including thiamine pyrophosphate (TPP), glucosamine-6-phosphate, lysine, glycine and c-di-GMP (Barrick and Breaker, 2007; Sudarsan et al., 2008). TPP-binding riboswitches have been found in archaea, fungi and plants, but all other riboswitch types appear to be exclusive to eubacteria, where they are found in both Gram-negative and Gram-positive species (Barrick and Breaker, 2007). Riboswitches in mRNA generally comprise two sections, known as the sensor domain and the expression platform. The sensor domain is the region which contains the aptamer, i.e. it is the region to which the ligand binds. The expression platform is the region of the riboswitch which regulates expression of the gene(s) downstream of the riboswitch. Ligand binding to the sensor domain induces a conformational change in the expression platform, modulating expression of the downstream coding regions (Serganov and Nudler, 2013). Various mechanisms exist by which riboswitches can function, but riboswitches can be broadly categorised into two classes: Type I and Type II riboswitches. Type I riboswitches form multi-helical junctions, while Type II riboswitches form pseudoknots (Serganov and Nudler, 2013). Riboswitches can regulate gene expression in a variety of ways: for instance, ligand binding can modulate the formation of rho-independent terminator and anti-terminator hairpins in the expression platform, determining whether the target gene is transcribed or whether transcription is prematurely terminated; ligand binding can alter the structure of a riboswitch to either sequester or expose the ribosome binding site; and ligand binding can induce or inhibit rho-dependent transcriptional termination (Serganov and Nudler, 2013). There does not appear to be any correlation between riboswitch ligand, riboswitch type and riboswitch mechanism of action (Serganov and Nudler, 2013). It does seem, however, that in Gram-positive bacterial species riboswitches generally function by modulating the formation of rho-independent terminator and anti-terminator hairpins in the expression platform, while in Gram-negative species riboswitches generally function by modulating access to ribosome binding sites (Breaker, 2008), though there are exceptions. C.
difficile is known to encode a large number of c-di-GMP turnover enzymes (strain 630, for example, encodes 15 active diguanylate cyclases (which synthesise c-di-GMP), 18 active phosphodiesterases (which break down c-di-GMP) and one protein which may have both functions, giving a total of 34 active c-di-GMP turnover enzymes (Bordeleau et al., 2011)). This is unusual for a Gram-positive species: while it is common for Gram-negative species to encode dozens of c-di-GMP turnover proteins, most Gram-positive species encode, at most, a handful, suggesting c-di-GMP plays a role in signalling in C. difficile which is unusually prominent for a Gram-positive organism. This supposition is supported by the identification of 16 c-di-GMP-binding riboswitches in the C. difficile 630 genome: 12 Type I riboswitches and 4 Type II riboswitches (Soutourina et al., 2013).


One of the Type I c-di-GMP riboswitches identified in C. difficile 630 is the CD1 riboswitch, which is located immediately upstream of the flagellar operon. This was the first C. difficile riboswitch to be identified and characterised, and was shown to have an ‘off’ function (i.e. c-di-GMP binding to the riboswitch turns off or downregulates gene expression) (Sudarsan et al., 2008), and flagella expression in C. difficile has indeed been demonstrated to be downregulated in response to elevated levels of c-di-GMP (Purcell et al., 2012). The exact mechanism by which the CD1 riboswitch functions remains uncertain, but c-di-GMP binding to CD1 has been found to terminate its transcription in an in vitro assay (Sudarsan et al., 2008).

Figure 4.1. The C. difficile Primary T4P Gene Cluster. A further annotated version of the gene cluster presented in Figure 3.2. The structure of the cluster is the same as shown previously, but the locations of the riboswitch upstream of pilA1 and of a transcriptional terminator (see Section 4.2.3) are presented, to aid comprehension of the chapter.

The riboswitch identified immediately upstream of pilA1 is a Type II riboswitch known as Cdi2_4. Northern blotting suggested that c-di-GMP binding to Cdi2_4 up-regulated expression of pilA1, and a 10-fold increase in transcription was estimated (Soutourina et al., 2013). However, this did not tell the whole story, as only one level of DGC expression was demonstrated, so how much c-di-GMP was needed to activate the Cdi2_4 riboswitch was not calculated. Interestingly, the pilA1 transcript appeared to be only 800 bp long, which suggested pilA1 was transcribed alone (Soutourina et al., 2013), rather than as part of an operon. I therefore intended to investigate how much c-di-GMP was required to activate the c-di-GMP riboswitch. I also intended to investigate whether the primary T4P gene cluster was co-transcribed, and whether the whole cluster was regulated by c-di-GMP or just pilA1. Finally, I wished to identify whether the secondary T4P gene cluster was also regulated by c-di-GMP. These experiments are detailed in this chapter.

4.2 Results

4.2.1 Transcription of pilB1, as well as pilA1, is Up-Regulated in Response to Elevated Levels of Cyclic-di-GMP

As demonstrated in the previous chapter, increasing the intra-cellular concentration of the second messenger c-di-GMP drives T4P formation in C. difficile 630. Concurrently, Soutourina et al. (2013) showed that transcription of pilA1 was governed by a Type-II c-di-GMP-binding riboswitch. However, no investigation of the effect of c-di-GMP on transcription of other genes in the primary T4P cluster was performed. Therefore, qPCR was performed to compare the effect of c-di-GMP on transcription of pilA1 and pilB1, which immediately follows pilA1 in the primary T4P cluster, to determine whether the reported increase in transcription in response to c-di-GMP was limited to pilA1, or whether transcription is also up-regulated downstream of pilA1.

RNA was extracted from C. difficile 630 (pASF85) and 630 (pECC17), after dccA expression had been induced by the addition of 50 ng/ml Atc for 3 hrs, as standard. Purity of the RNA was confirmed by PCR amplification of 16S rRNA from each sample, as described in Section 2.4.1 (Figure 4.2A); the absence of any PCR product demonstrated the absence of contaminating DNA. cDNA was then produced by reverse transcription of the RNA, and successful synthesis was confirmed by repeating the 16S rRNA PCR, on this occasion indicated by the presence of an appropriate PCR product (Figure 4.2B). The cDNA was then used for qPCR analysis of the levels of pilA1 and pilB1 transcription in the samples, normalised to the level of 16S rRNA transcription.

The qPCR analysis showed that elevated levels of c-di-GMP significantly up-regulated transcription of both pilA1 and pilB1 (Figure 4.3). Transcription of pilA1 was particularly up-regulated, increasing approx. 40-fold, with pilB1 transcription up-regulated only 3.6-fold. Furthermore, under control conditions, the level of pilA1 transcription was not significantly different to the level of pilB1 transcription, but under conditions of elevated levels of c-di-GMP pilA1 transcription was significantly (19-fold) higher than pilB1 transcription.
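The normalisation and fold-change arithmetic described above can be illustrated with a minimal sketch. It assumes that transcript quantities have already been read off the standard curves, and the input numbers are hypothetical values chosen only to make the arithmetic easy to follow; they are not data from this thesis. The function names are likewise illustrative.

```python
from statistics import mean


def percent_of_reference(target_qty, reference_qty):
    """Express a target transcript quantity as a percentage of the 16S rRNA quantity."""
    return 100.0 * target_qty / reference_qty


def fold_change(induced, control):
    """Ratio of mean normalised expression in induced versus control cultures."""
    return mean(induced) / mean(control)


# Hypothetical quantities (arbitrary units) for three biological replicates.
# These are NOT measurements from this thesis.
control_pilA1 = [0.8, 1.0, 1.2]
control_16s = [1000.0, 1000.0, 1000.0]
induced_pilA1 = [38.0, 40.0, 42.0]
induced_16s = [1000.0, 1000.0, 1000.0]

# Normalise each replicate to its own 16S rRNA quantity.
control_norm = [percent_of_reference(t, r) for t, r in zip(control_pilA1, control_16s)]
induced_norm = [percent_of_reference(t, r) for t, r in zip(induced_pilA1, induced_16s)]

print(f"control mean: {mean(control_norm):.3f} % of 16S rRNA")
print(f"induced mean: {mean(induced_norm):.3f} % of 16S rRNA")
print(f"fold change:  {fold_change(induced_norm, control_norm):.1f}")
```

Normalising each replicate to its own 16S rRNA quantity before averaging keeps differences in RNA input between samples from being mistaken for differences in transcription.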


Figure 4.2. RNA Extraction and cDNA Synthesis from C. difficile 630 (pASF85) and 630 (pECC17). A. Confirmation of purity of RNA extracted, by PCR amplification of 16S rRNA. The absence of a PCR product indicates an absence of contaminating DNA and therefore purity of the RNA. B. Confirmation of successful cDNA synthesis from the RNA samples in A. This confirmation was obtained by performing the same PCR reaction, with the presence of a PCR product indicating successful cDNA synthesis.

[Figure 4.3: bar chart. y-axis: Gene Expression as a Percentage of 16S rRNA; x-axis: Gene of Interest (pilA1, pilB1); series: pASF85, pECC17.]

Figure 4.3. C-di-GMP Regulation of Transcription of pilA1 and pilB1. Transcription of both pilA1 and pilB1 was up-regulated upon elevation of c-di-GMP induced by expression of DccA from pECC17, in comparison to a control strain carrying pASF85. Levels of pilA1 and pilB1 transcription are normalised to the level of 16S rRNA transcription. Error bars indicate one standard deviation either side of the mean. * indicates a significant difference relative to the (pASF85) control of P>0.95.

Melt curves of all qPCR reactions showed the primers used produce only one product, as required (Figure 4.4). Standard curves showed that the level of cDNA dilution for analysis of pilB1 transcription was good, though (at 1/20) slightly lower than ideal for analysis of pilA1 transcription (Figure 4.4). Therefore, cDNA was diluted 1/80 for every future qPCR analysis of pilA1 transcription.
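The use of a standard curve to judge whether a cDNA dilution is appropriate can be sketched as follows. This is a minimal illustration, assuming a ten-fold dilution series of standards and a simple least-squares fit of Ct against log10(quantity); the numbers are hypothetical and the helper names are not taken from any instrument software. One plausible reading of a dilution being "lower than ideal" is that the sample Ct values fall at or beyond the edge of the range covered by the standards, which is what `within_standard_range` checks.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a sample quantity from its Ct value using the fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def within_standard_range(ct, ct_values):
    """True if the sample Ct lies inside the Ct range spanned by the standards."""
    return min(ct_values) <= ct <= max(ct_values)


# Hypothetical ten-fold dilution series of standards (NOT data from this thesis).
standards_qty = [1e5, 1e4, 1e3, 1e2, 1e1]
standards_ct = [15.0, 18.3, 21.6, 24.9, 28.2]

slope, intercept = fit_standard_curve(standards_qty, standards_ct)
efficiency = 10 ** (-1 / slope) - 1  # amplification efficiency implied by the slope

print(f"slope={slope:.2f}, intercept={intercept:.2f}, efficiency={efficiency:.2%}")
print(f"sample at Ct 21.6 -> quantity {quantity_from_ct(21.6, slope, intercept):.0f}")
print(f"Ct 14.0 inside standard range? {within_standard_range(14.0, standards_ct)}")
```

A sample whose Ct lies outside the standards has to be quantified by extrapolation, which is one reason a different dilution might be preferred in later runs.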


Figure 4.4. Standard Curves (Top) and Melt Curves (Bottom) of the qPCR Reactions Analysed in Figure 4.3 (L-R: 16S rRNA, pilA1, pilB1). Standard curves plot sample Ct value (y-axis) against DNA quantity (x-axis); red squares indicate standards, blue squares experimental samples. Melt curves plot derivative reporter values (y-axis) against temperature (x-axis); red/orange (16S rRNA & pilA1) and blue (pilB1) curves represent standards, green (16S rRNA & pilA1) and purple/pink (pilB1) curves represent experimental samples.

4.2.2 Induction of dccA Expression with a Gradient of Increasing Atc Concentrations does not Result in a Gradient of pilA1 Transcription

In Section 3.2.6 it was shown that PilA1 production becomes identifiable by Western blot after induction of dccA expression from pECC17 with at least 20 ng/ml Atc, and that this level of PilA1 production appeared to be close to the maximal level of PilA1 production seen. To investigate whether this was reflected in the transcription of pilA1, and the effect on transcription of pilA1 of gradually increasing Atc concentration, RNA was harvested from cultures of C. difficile 630 (pECC17) induced with varying concentrations of Atc: one set of cultures was not induced (i.e. 0 ng/ml Atc), three other sets were induced with 5, 10 and 20 ng/ml Atc, and for one final set induction with 50 ng/ml Atc was repeated. Purity of the RNA extracted was confirmed as above, by PCR to amplify 16S rRNA (Figure 4.5).


Figure 4.5. Confirmation of Purity of RNA Extracted from C. difficile 630 (pECC17) Cultures by PCR Amplification of 16S rRNA. Cultures were either uninduced (No Atc) or induced with 5, 10, 20 or 50 ng/ml Atc. The absence of a PCR product indicates an absence of contaminating DNA and therefore purity of the RNA.

The RNA was then reverse transcribed to produce cDNA, successful synthesis being confirmed by repeating the PCR reaction to amplify 16S rRNA (Figure 4.6).

Figure 4.6. Confirmation of Successful cDNA Synthesis from the RNA Samples in Figure 4.5. This confirmation was obtained by performing the same PCR reaction, with the presence of a PCR product indicating successful cDNA synthesis.

In addition to the cDNA obtained from the above samples, in which dccA expression from plasmid pECC17 was induced with a variety of concentrations of Atc, RNA was also extracted from cultures of C. difficile 630 (pMTL960) and 630 (pECC12). 630 (pMTL960) was chosen for use as a further negative control, lacking any plasmid-encoded copy of dccA, while 630 (pECC12) was chosen in order to compare the level of transcription of pilA1 induced by constitutive expression of dccA from the cwp2 promoter with that obtained from the tet promoter. Purity of the RNA extracted from C. difficile 630 (pMTL960) and 630 (pECC12) was confirmed as above, by PCR to amplify 16S rRNA (Figure 4.7A). Pure RNA was then reverse transcribed to yield cDNA, with the success of the reverse transcription again confirmed by PCR to amplify 16S rRNA (Figure 4.7B).

Figure 4.7. RNA Extraction and cDNA Synthesis from C. difficile 630 (pMTL960) and 630 (pECC12). A. Confirmation of purity of RNA extracted by PCR amplification of 16S rRNA. B. Confirmation of successful cDNA synthesis from the RNA samples in A. This confirmation was obtained by performing the same PCR reaction.

The cDNA samples from Figure 4.6 and Figure 4.7B were then used in qPCR analysis of the expression levels of pilA1 and pilB1 (Figure 4.8). Again, levels of pilA1 expression were normalised to the level of 16S rRNA expression. One-way ANOVA showed significant variance in the levels of pilA1 expression in the samples analysed. Individual samples were then analysed by t-test to identify differences. A minimal level of pilA1 expression was seen in C. difficile 630 (pMTL960). The levels of pilA1 transcription in samples of 630 (pECC17) wherein dccA expression is either un-induced or induced with only 5 ng/ml Atc are not significantly different to each other; however, the level of pilA1 transcription in those samples induced with 5 ng/ml Atc is significantly higher than the level in samples of 630 (pMTL960) (the difference between pilA1 transcription levels in samples of 630 (pMTL960) and un-induced 630 (pECC17) is not significant, owing to the high variance between un-induced 630 (pECC17) samples). This suggests a slight level of ‘leakiness’ in the tet promoter. However, this ‘leakiness’ is insufficient to drive production of PilA1 protein (and therefore T4P) in cultures in which expression of dccA from pECC17 was not induced or was induced with only 5 ng/ml Atc, as seen in Section 3.2.6.


[Figure 4.8: bar chart. y-axis: Gene Expression Level as a Percentage of 16S rRNA; x-axis: Sample. Significance markers: *, ** and ‡.]

Figure 4.8. Effect of Diguanylate Cyclase Expression on Expression of pilA1. Graph showing the effect on pilA1 transcription of inducing expression of the diguanylate cyclase dccA with a gradient of increasing concentrations of Atc (shown in x-axis labels), and also comparing the effect on pilA1 transcription of constitutively expressing dccA from the cwp2 promoter with plasmid pECC12 rather than inducing its transcription with Atc from the tet promoter with plasmid pECC17. Levels of pilA1 transcription are normalised to the level of 16S rRNA. Error bars indicate one standard deviation either side of the mean. Assuming a basal level of pilA1 transcription is seen in samples from 630 (pMTL960), * indicates a level of pilA1 transcription significantly above this basal level, P>0.95, ** P>0.99. The level of pilA1 transcription in samples of 630 (pECC17) wherein expression is induced with 5 ng/ml Atc is known to be insufficient to drive production of PilA1 protein (Section 3.2.6); ‡ indicates a level of pilA1 transcription significantly above this level, P>0.95.

Thus, an increase in pilA1 transcription above the basal level seen in 630 (pMTL960) is not necessarily sufficient to result in PilA1 production. Having initially compared the level of pilA1 transcription in samples with the 630 (pMTL960) negative control (see Figure 4.8), levels of pilA1 transcription in more highly induced samples were further compared with the level of the 630 (pECC17) sample where dccA expression was induced with 5 ng/ml Atc. All three samples in which dccA expression was induced with a higher concentration of Atc (10, 20 and 50 ng/ml) contained significantly higher levels of pilA1 transcripts than the sample induced with 5 ng/ml Atc, as did the sample from 630 (pECC12), in which dccA is constitutively expressed. This suggests that somewhere between 5 and 10 ng/ml Atc is a concentration which drives a level of dccA expression which produces a critical intra-cellular concentration of c-di-GMP, sufficient to activate pilA1 transcription to a level sufficient to result in PilA1 production. For reasons described in the Discussion (Section 4.3) no further work was done to identify the c-di-GMP concentration required to activate PilA1 production.

Interestingly, there is no significant difference between the level of pilA1 transcription in any of these four samples (630 (pECC17) with dccA expression induced with 10, 20 and 50 ng/ml Atc and 630 (pECC12)). In particular, the mean levels of pilA1 transcription in the three 630 (pECC17) samples are extremely similar. This could indicate that they demonstrate a maximal level of pilA1 transcription obtainable when dccA is expressed from the tet promoter. However, this seems unlikely. Increasing the concentration of Atc used to induce dccA transcription beyond 50 ng/ml significantly affects cell morphology, as shown in Section 3.2.6. This indicates that increasing the concentration of Atc used to induce dccA transcription beyond 50 ng/ml does increase the level of DccA production, and in turn c-di-GMP production. The level of pilA1 transcription obtained using 50 ng/ml Atc to induce dccA expression seems not to be the maximum possible, as a higher level appears to be obtained when dccA is expressed from the constitutive cwp2 promoter. It is therefore unclear why the level of pilA1 transcription is not significantly higher when dccA expression is induced with 50 ng/ml Atc compared to when it is induced with 10 ng/ml Atc, though it could be that differences which do exist are masked by quite high standard deviations.

Standard curves and melt curves for these qPCR reactions showed that the reactions had worked well and had only one product, and that the cDNA dilutions used were appropriate (Figure 4.9).

4.2.3 A Transcriptional Terminator Exists Between pilA1 and pilB1

It is shown in Figure 4.3 that though expression of a DGC (and the resultant increase in intracellular c-di-GMP concentration) increases the level of pilB1 transcription, pilB1 is transcribed at a much lower level than pilA1, suggesting that transcription is terminated between the two genes.


Figure 4.9. Standard Curves (top) and Melt Curves (bottom) of the qPCR Reactions Analysed in Figure 4.8 (L: 16S rRNA; R: pilA1). Standard curves plot sample Ct value (y-axis) against DNA quantity (x-axis); red squares indicate standards, blue squares experimental samples. Melt curves plot derivative reporter values (y-axis) against temperature (x-axis). All melt curves are good, showing only one product, and both standard curves are good, showing appropriate levels of cDNA dilution for the reactions.

Indeed, visual analysis of the intergenic region between pilA1 and pilB1 revealed an apparent rho-independent terminator (Figure 4.10A). Secondary structure prediction confirmed the DNA sequence identified was likely to form a stem-loop structure (Figure 4.10B), and that its formation was highly energetically favourable, with a ΔG of formation of -29.71 kJ/mol.

Figure 4.10. Identification of a Transcriptional Terminator Between pilA1 and pilB1. A. The intergenic region between pilA1 and pilB1. The genes are on the antisense strand and therefore run from right to left; they are shown as labelled green arrows. The putative terminator, in the intergenic region, is shown in light blue and labelled. Taken from ‘Geneious’. B. Secondary structure prediction of the pilA1Terminator, as it would form within mRNA. It is predicted to form a stem-loop structure with a poly-U tail (as in a classic rho-independent transcriptional terminator). The predicted structure is taken from OligoAnalyzer 3.1. Red circles indicate G-C base pairings; blue circles indicate A-U base pairings.
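The kind of visual search described above, for a stem-loop followed by a poly-U tail, can be approximated with a very simple scan. The sketch below is not the method used in this work (the structure was predicted with OligoAnalyzer 3.1) and makes strong simplifying assumptions: it only finds perfectly complementary stems of a fixed length, it ignores G-U wobble pairs and thermodynamics, and the example sequence is hypothetical rather than the 36 bp pilA1Terminator.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string (A/U/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_terminator_candidates(rna, stem_len=6, min_loop=3, max_loop=8, min_u_run=4):
    """Return (start, stem, loop, tail) for perfect stem-loops followed by a U-run.

    Only exact Watson-Crick stems of length `stem_len` are considered; this is
    a deliberately naive stand-in for a real secondary-structure predictor.
    """
    hits = []
    for i in range(len(rna) - 2 * stem_len - min_loop + 1):
        left = rna[i:i + stem_len]
        for loop_len in range(min_loop, max_loop + 1):
            right_start = i + stem_len + loop_len
            right_end = right_start + stem_len
            if right_end > len(rna):
                break
            if rna[right_start:right_end] != reverse_complement(left):
                continue
            tail = rna[right_end:right_end + min_u_run]
            if tail == "U" * min_u_run:
                hits.append((i, left, rna[i + stem_len:right_start], tail))
    return hits


# Hypothetical RNA: GC-rich stem, 4 nt loop, then a run of U residues.
example = "AAGGCGCCGGAAACGGCGCUUUUUUAG"
for start, stem, loop, tail in find_terminator_candidates(example):
    print(f"start={start} stem={stem} loop={loop} tail={tail}")
```

A candidate found this way would still need a proper structure prediction, and ultimately an experimental test such as the deletion described in this section, before being treated as a functional terminator.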


To confirm that this putative terminator (named the pilA1Terminator) was terminating the pilA1 transcript, it was decided to delete the pilA1Terminator in the chromosome of strain 630 and investigate the effect of its deletion on the level of pilB1 transcription. This was performed by codA-linked allele exchange, as was used for the deletion of pilA1 in Section 3.2.7. Plasmid pECC61 was designed to delete the terminator, and was conjugated into C. difficile 630; single cross-over integrants were identified, and from them thiamphenicol-sensitive double cross-over strains were generated and identified. These strains were screened by amplification of a 209 bp fragment, including the 36 base-pair terminator deleted. PCR products were analysed by agarose gel electrophoresis, allowing the identification of apparent pilA1Terminator mutants (Figure 4.11). Apparent mutants were confirmed by Sanger sequencing, and the mutant generated was named 630ΔpilA1Terminator.

Figure 4.11. Colony PCR Screening Showing Successful Generation of 630ΔpilA1Terminator Mutant. PCR products were analysed by agarose gel electrophoresis on a 4 % agarose gel. The WT PCR product is 209 bp; the mutant PCR product is 173 bp.
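The band sizes used in this screen follow from simple subtraction, as the short sketch below checks. The 209 bp and 36 bp values are from the text; the classification helper and its 10 bp tolerance are hypothetical, included only to show how an observed band might be assigned when the two expected products differ by just 36 bp.

```python
WT_AMPLICON_BP = 209         # wild-type PCR product size, from the text
DELETED_TERMINATOR_BP = 36   # length of the deleted terminator, from the text


def mutant_amplicon_size(wt_size, deletion_size):
    """Expected PCR product size once the deleted region is removed."""
    return wt_size - deletion_size


def classify_band(observed_bp, wt_size, mutant_size, tolerance_bp=10):
    """Assign an observed band to WT or mutant (tolerance is an assumed value)."""
    if abs(observed_bp - wt_size) <= tolerance_bp:
        return "WT"
    if abs(observed_bp - mutant_size) <= tolerance_bp:
        return "mutant"
    return "unclear"


mutant_bp = mutant_amplicon_size(WT_AMPLICON_BP, DELETED_TERMINATOR_BP)
print(f"expected mutant product: {mutant_bp} bp")
print(classify_band(175, WT_AMPLICON_BP, mutant_bp))
```

The small size difference between the two products is consistent with the use of a high-percentage (4 %) agarose gel for this screen.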

Plasmids pASF85 and pECC17 were conjugated into 630ΔpilA1Terminator. RNA was extracted from samples of 630ΔpilA1Terminator (pASF85) and (pECC17). In both (pASF85) and (pECC17) cultures, 50 ng/ml Atc was added as inducer, and the purity of the extracted RNA confirmed (Figure 4.12A). The RNA was then reverse transcribed to produce cDNA, whose synthesis was confirmed (Figure 4.12B). This cDNA was then used for qPCR analysis of the expression of pilA1 and pilB1 (in this instance the cDNA was diluted 1/80 for analysis of pilB1 transcription as well as pilA1).


Figure 4.12. RNA Extraction and cDNA Synthesis from C. difficile 630ΔpilA1Terminator (pASF85) and (pECC17) Cultures. A. Confirmation by PCR amplification of 16S rRNA of purity of RNA extracted. B. Confirmation of successful cDNA synthesis from the RNA samples in A. This confirmation was obtained by performing the same PCR reaction.

This qPCR analysis showed that in the absence of the transcriptional terminator, the level of pilB1 transcription is at least equal to the level of pilA1 transcription (Figure 4.13), and indeed, in the (pASF85) control cultures, in which the intracellular concentration of c-di-GMP is low, the level of pilB1 transcription is significantly higher than that of pilA1 transcription. This strongly suggests that pilB1 transcription is driven by its own independent promoter, as if pilB1 was only co-transcribed with pilA1, it would be impossible for its level of transcription to be higher than that of pilA1.

[Figure 4.13: bar chart. y-axis: Gene Expression Level as a Percentage of 16S rRNA Expression Level; x-axis: Sample (pASF85, pECC17); series: pilA1, pilB1.]

Figure 4.13. Effect of DccA Expression on Expression of pilA1 and pilB1 in 630ΔpilA1Terminator. Graph showing the levels of pilA1 and pilB1 transcription in cultures of 630ΔpilA1Terminator carrying plasmids pASF85 and pECC17 following induction of expression with 50 ng/ml Atc. All levels of transcription normalised to level of 16S rRNA transcription. Error bars represent one standard deviation either side of the mean. *** indicates increased level of gene transcription in comparison to the same gene in the other sample, P>0.999; ‡ indicates higher level of gene transcription in comparison to the other gene in the same sample, P>0.95.


Analysis of the region upstream of pilB1 in strain 630, between the pilA1Terminator and the start of the pilB1 gene, using the internet-based programme Neural Network Promoter Prediction (http://www.fruitfly.org/seq_tools/promoter.html, (Reese, 2001)), identified two potential pilB1 promoters, either or both of which may be driving pilB1 expression. The first of these promoters has a predicted transcription start site 25 bp upstream of pilB1, and the second a predicted transcription start site only 13 bp upstream of pilB1. However, no further investigation of these putative promoters was performed.

In wild-type 630 transcription of both pilA1 and pilB1 is up-regulated by c-di-GMP, but pilA1 is much more highly up-regulated and the resultant level of pilA1 transcription is much higher than that of pilB1. In the 630ΔpilA1Terminator mutant, both pilA1 and pilB1 are again highly up-regulated by c-di-GMP. The level of pilA1 up-regulation is still considerably higher than pilB1 up-regulation (pilA1 is up-regulated by a factor of 19.2, pilB1 by a factor of 8.2; why the level of pilA1 up-regulation is so much lower in this mutant compared to the wild-type is unknown), but the resultant levels of pilA1 and pilB1 transcription are similar.

This shows that in the wild-type strain, the pilA1Terminator is functional, terminating at least the majority of pilA1 transcripts and preventing the co-transcription of pilB1. This leaves the question as to whether the c-di-GMP dependent increase in pilB1 transcription in wild-type 630 is a result of the increase in pilA1 transcription causing an increase in the number of pilA1 transcripts proceeding past the transcriptional terminator, or whether the pilB1 promoter(s) are also regulated by c-di-GMP. As previously discussed, a c-di-GMP-binding riboswitch (Cdi2_4) has been identified upstream of pilA1 (Soutourina et al., 2013). To investigate whether the c-di-GMP dependent up-regulation of pilB1 is independent of this riboswitch, a plasmid (pECC71) was constructed with which to delete the riboswitch. pECC71 was conjugated into strain 630 to attempt to delete the Cdi2_4 riboswitch. Unfortunately, this was unsuccessful. For unknown reasons it did not prove possible to obtain single cross-over integrants of pECC71, despite multiple attempts, so this experiment into the control of expression of pilB1 was not completed.

4.2.4 An Operon Runs from pilB1 to prsA

To investigate whether the genes of the primary T4P cluster are transcriptionally linked, primer pairs were designed to amplify the intergenic regions between all the genes in the cluster. The first pair amplify a region upstream of the riboswitch Cdi2_4, and the 5’ end of the riboswitch. The second pair amplify the 3’ end of the riboswitch and the 5’ end of pilA1. All other primer pairs were designed such that the 5’ primer would bind the 3’ end of the upstream gene, and the 3’ primer the 5’ end of the downstream gene, so that a product would be amplified comprising the ends of both genes and the intergenic region between them. Primer pairs were designed to amplify all intergenic regions in the putative operon, ending with a pair which amplify the region between prsA and spoVT. (The operon was predicted to terminate at prsA.) These primers were all tested with 630 gDNA to confirm their function (Figure 4.14).

Having confirmed that the primers work, cDNA derived from 630 (pASF85) and 630 (pECC17) cultures induced with 50 ng/ml Atc was used as template to investigate co-transcription of the primary T4P gene cluster by RT-PCR, and the effect of expression of DccA (Figure 4.14). At no point is the region upstream of the Cdi2_4 riboswitch transcribed, but it is apparent that transcription of Cdi2_4 running into pilA1 is hugely up-regulated when DccA is present. However, under conditions of neither low nor high c-di-GMP concentration were pilA1 and pilB1 co-transcribed. Under both conditions, though, it appears that an operon runs from pilB1 to prsA, with all genes co-transcribed. The operon clearly ends at prsA, with no co-transcription between it and spoVT. If there is indeed no co-transcription of pilA1 and pilB1 (as it appears) then this might suggest that the up-regulation of pilB1 transcription in response to increased c-di-GMP concentrations is independent of the Cdi2_4 riboswitch and is rather due to increased transcription from the pilB1 promoter(s). However, as described above, it was regrettably not possible to test this.

It appears, albeit qualitatively, that the entire operon is more highly transcribed when DccA is present (i.e. under conditions of high c-di-GMP concentration) than when it is not. This would be expected: given that transcription of the first gene in the operon (pilB1) is up-regulated by c-di-GMP, one would expect transcription of the entire operon to be similarly up-regulated. To test this quantitatively, qPCR was performed on all genes of the operon. These qPCR experiments were performed on the cDNA derived from 630 (pMTL960) and 630 (pECC12), as these strains seem to transcribe pilA1 at the lowest and highest levels respectively. It was therefore assumed that they might likewise show the lowest and highest levels of transcription of the pilB1 – prsA operon.
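The way operon boundaries are read off the junction RT-PCR results can be sketched as a simple grouping step. In the sketch below, the presence or absence of a product at each junction follows the description in the text; the gene order is an assumption taken from the x-axis of Figure 4.15, with pilB1 placed immediately after pilA1, and "upstream" is a placeholder label for the region 5’ of the riboswitch. RT-PCR of this kind is qualitative, so this only formalises the yes/no interpretation.

```python
def transcription_units(genes, junction_detected):
    """Group adjacent genes wherever the junction RT-PCR yielded a product.

    `junction_detected[i]` refers to the junction between genes[i] and genes[i + 1].
    """
    units = [[genes[0]]]
    for gene, linked in zip(genes[1:], junction_detected):
        if linked:
            units[-1].append(gene)
        else:
            units.append([gene])
    return units


# Gene order assumed from Figure 4.15; "upstream" is a placeholder label.
genes = ["upstream", "Cdi2_4", "pilA1", "pilB1", "pilC1", "pilMN", "pilO",
         "pilV", "pilU", "pilK", "pilT", "pilD1", "pilD2", "pth", "mfd",
         "prsA", "spoVT"]

# Junction results as described in the text:
# upstream/Cdi2_4: no product; Cdi2_4/pilA1: product; pilA1/pilB1: no product;
# every junction from pilB1 to prsA: product; prsA/spoVT: no product.
junction_detected = [False, True, False] + [True] * 12 + [False]

for unit in transcription_units(genes, junction_detected):
    print(" - ".join(unit))
```

With these inputs the grouping places Cdi2_4 with pilA1, and pilB1 through prsA in a single unit, matching the interpretation given for Figure 4.14.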


Figure 4.14. RT-PCR to Investigate Co-Transcription of Genes in the Primary T4P Gene Cluster. The top row shows testing by regular PCR of the primer pairs on gDNA to confirm that they all work. The second two rows show RT-PCR using cDNA from 630 (pASF85) and 630 (pECC17) as indicated, demonstrating that an operon runs from pilB1 to prsA, but that pilA1 does not appear to be co-transcribed with the operon.

Having shown that pilB1 expression is up-regulated by increased c-di-GMP concentrations, further qPCR was performed on the operon from pilC1 onwards. Analysis of the qPCR reactions (Figure 4.15) showed that the mean level of transcription of all genes in the primary T4P gene operon was higher in cultures of 630 (pECC12) than cultures of 630 (pMTL960) (i.e. when intracellular c-di-GMP concentrations were higher). However, the level of variance of gene transcription (in cultures of 630 (pECC12) particularly) was very high, meaning that only transcription of pilC1 and pilMN was statistically significantly higher in cultures of 630 (pECC12) than 630 (pMTL960). It is nevertheless likely that were more cDNA samples analysed (three were used in each of these analyses), more genes would be found to be statistically significantly up-regulated in 630 (pECC12) compared to 630 (pMTL960), as this would mean a higher value for degrees of freedom in the t-test analysis and therefore a lower t-value would be statistically significant. Melt curves and standard curves from all reactions showed that the primer pairs were good (producing only one product) and that the cDNA dilutions were also appropriate for all reactions (Figure 4.16).
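The point about degrees of freedom can be illustrated with a small sketch of a two-sample t-test. It assumes an equal-variance (pooled), two-tailed test at the 5 % level, which may not be the exact test used here, and the expression values are hypothetical rather than data from this thesis. The critical values are standard table values for 4 and 10 degrees of freedom.

```python
import math
from statistics import mean, variance

# Two-tailed 5 % critical values of Student's t (standard table values).
T_CRIT_TWO_TAILED_05 = {4: 2.776, 10: 2.228}


def pooled_t(control, treated):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_c, n_t = len(control), len(treated)
    df = n_c + n_t - 2
    pooled_var = ((n_c - 1) * variance(control) + (n_t - 1) * variance(treated)) / df
    std_err = math.sqrt(pooled_var * (1 / n_c + 1 / n_t))
    return (mean(treated) - mean(control)) / std_err, df


# Hypothetical normalised expression values; NOT data from this thesis.
control_3 = [0.02, 0.03, 0.04]
treated_3 = [0.05, 0.12, 0.19]
# The same values repeated, to mimic six replicates with a similar spread.
control_6 = control_3 * 2
treated_6 = treated_3 * 2

for control, treated in ((control_3, treated_3), (control_6, treated_6)):
    t, df = pooled_t(control, treated)
    significant = abs(t) > T_CRIT_TWO_TAILED_05[df]
    print(f"df={df}, t={t:.2f}, critical={T_CRIT_TWO_TAILED_05[df]}, significant={significant}")
```

In this illustration the same mean difference is not significant with three replicates per group but is with six. Two effects contribute: the critical t-value falls as the degrees of freedom rise, and the standard error of the difference shrinks as the number of replicates grows.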

[Figure 4.15: bar chart. y-axis: Gene Expression as a Percentage of 16S rRNA Expression; x-axis: Gene (pilC1, pilMN, pilO, pilV, pilU, pilK, pilT, pilD1, pilD2, pth, mfd, prsA); series: pMTL960, pECC12.]

Figure 4.15. Effect of DccA on Primary T4P Operon Expression Excluding pilA1. qPCR analysis of transcription of genes of the primary T4P operon from pilC1 to prsA, in cultures of C. difficile 630 with high intracellular c-di-GMP (pECC12), and control cultures with low intracellular c-di-GMP (pMTL960). Gene transcription levels normalised to level of 16S rRNA transcription. Error bars indicate one standard deviation either side of the mean. * indicates statistically significant increased level of gene transcription in comparison to that in the control sample, P>0.95.


Figure 4.16. Standard Curves (Top) and Melt Curves (Bottom) of qPCR Reactions on Genes in the Primary T4P Operon. Top rows (L-R): pilC1, pilMN, pilO, pilV, pilU, pilK; bottom rows (L-R): pilT, pilD1, pilD2, pth, mfd, prsA.


4.2.5 Transcription of the Secondary Type IV Pilus Gene Cluster Appears to be Down-Regulated by c-di-GMP

Having shown that transcription of the primary T4P operon is controlled by c-di-GMP, with increased c-di-GMP concentrations up-regulating transcription of pilA1 and pilB1, I was interested to see whether c-di-GMP concentration also affected the level of transcription of the secondary T4P gene cluster (a schematic diagram of the secondary T4P gene cluster is shown in Figure 3.3). As a representative gene from the cluster, pilA2 was chosen for investigation of its transcription. The level of pilA2 transcription was compared between cultures of 630 (pECC17) in which dccA transcription was induced with 50 ng/ml Atc, and cultures of 630 (pECC17) in which dccA transcription was not induced. Synthesis of these cDNA samples is shown in Figures 4.5 & 4.6.

The qPCR analysis appeared to show that pilA2 is significantly down-regulated when dccA expression is induced with 50 ng/ml Atc in cultures of 630 (pECC17), compared to cultures wherein dccA expression is not induced (Figure 4.17). The increased levels of c-di-GMP resulted in pilA2 transcription being down-regulated by an order of magnitude, with a level of statistical significance of more than 0.995.

However, this result should be treated with caution. Though the qPCR reactions analysed here produced good standard curves (Figure 4.18), the melt curves (Figure 4.18) show that while in the standard reactions only one PCR product is produced, in several of the experimental samples two products are produced. It is possible that this is due to the extremely high concentration of cDNA used in the reactions (due to the very low levels of pilA2 transcription in the samples, cDNA was diluted only 1/2 for qPCR analysis). The cDNA samples still contain the reagents used in the reverse transcription of the RNA (though the transcriptase itself has been heat inactivated), and it may well be that their high concentration disrupts the PCR reaction, resulting in the formation of the second product. This product may be a primer dimer. Either way, it is impossible to know how much this second product is contributing to the apparent level of transcription seen. Furthermore, the results of the qPCR reactions from the un-induced cultures are based on only two biological replicates (the third reaction set failed), while qPCR reactions attempted with several other samples (in particular those obtained from cultures of 630 (pASF85), 630 (pMTL960) and 630 (pECC12)) also failed completely, with no products detected. This is presumably due to the extremely low level of pilA2 transcription in all samples investigated. This suggests firstly that neither of the conditions attempted here corresponds to those under which the secondary T4P gene cluster is expressed, and secondly that the qPCR results for pilA2 transcription obtained here may not be reliable.

[Figure 4.17: bar chart. y-axis: Level of pilA2 Transcription as a % of the Level of 16S rRNA Transcription; x-axis: Concentration of Atc Used to Induce dccA Expression from Plasmid pECC17 (Un-Induced, 50 ng/ml).]

Figure 4.17. Effect of C-di-GMP Concentration on pilA2 Transcription. Graph showing results of qPCR reactions investigating the levels of pilA2 expression under conditions of high c-di-GMP concentration (samples wherein dccA expression is induced with 50 ng/ml Atc) in comparison to those under conditions of low c-di-GMP concentration (un-induced samples). Levels of pilA2 transcription are normalised to levels of 16S rRNA transcription. Error bars indicate one standard deviation either side of the mean. *** indicates a statistically significant difference in transcription relative to the un-induced control, P>0.995.

4.2.6 The Minor Pilins PilV, PilU and PilK are Essential for Type IV Pilus Biogenesis

The next step in investigating the production of T4P from the primary T4P gene cluster was to investigate which of the genes within it were essential for T4P formation. The first genes investigated were those encoding the minor pilins: pilV, pilU and pilK. Plasmids pECC59, pECC60 and pECC65 were designed to delete pilU, pilV and pilK respectively by codA-mediated allele exchange. These plasmids were individually conjugated into C. difficile 630, and as previously described single and then double cross-overs were isolated and then screened to identify mutants (Figures 4.19 – 4.21).


Figure 4.18. Standard Curve (left) and Melt Curves (right) for the pilA2 qPCR Reactions Analysed in Figure 4.17. The standard curve is good (samples coloured green indicate flags, herein due to the presence of two peaks in the melt curves). The red and yellow melt curves, obtained from the standard samples, are good, containing only one peak; the blue/green curves are obtained from the experimental samples – most of them contain to some degree a second peak, indicating the formation of a second PCR product meaning that the results obtained may not be entirely reliable.

The new mutant strains (630ΔpilU, 630ΔpilV and 630ΔpilK) were next tested for T4P biogenesis. Plasmid pECC17 was conjugated into each of them, cultures grown, dccA expression induced with 50 ng/ml Atc and cultures then harvested and protein extracted from culture supernatants by TCA precipitation. Cell lysates and supernatant preparations were then analysed by α-PilA1 Western blot (Figure 4.22). These showed that while PilA1 was produced by all three mutant strains (as the protein was present in the cell lysates) it was not exported (none being seen in the cell supernatants), suggesting that all three of these minor pilins were essential for T4P biogenesis. In comparison to the WT control there appeared to be significantly more PilA1 in the cell lysates in the three mutant strains, which could be due to build-up of PilA1 in the cells due to the cells’ inability to export it in the form of T4P.

Figure 4.19. Screening of Putative pilU Mutants, Derived from pECC59 Single Cross-Over Integrants. The Wild-Type (WT) control shows the size of the native PCR product. A smaller band (in terms of kb) is seen in the pilU mutant (630ΔpilU). The difference in size between the WT and mutant bands is particularly apparent in the mixed colony in the middle.


Figure 4.20 (R). Screening of Putative pilV Mutants, Derived from pECC60 Single Cross-Over Integrants. The Wild-Type (WT) control shows the size of the native PCR product. A smaller band (in terms of kb) is seen in the pilV mutant (630ΔpilV).

Figure 4.21 (L). Screening of Putative pilK Mutants, Derived from pECC65 Single Cross-Over Integrants. The WT control PCR failed here. However, the reaction worked for both the pilK mutant (630ΔpilK) and a WT-revertant, demonstrating the difference in band size (in terms of kb) between the WT and pilK mutant strains.

Figure 4.22 (L). Testing of Minor Pilin Mutants for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing the production of PilA1 by all three mutants. B. TCA preparations of culture supernatants, showing that unlike in the WT control, PilA1 is not exported from the cell by 630ΔpilU, 630ΔpilV or 630ΔpilK, indicating that these strains are unable to produce T4P and that PilU, PilV and PilK are therefore essential for T4P biogenesis.

To further confirm the specific essentiality of PilU, PilV and PilK for T4P biogenesis in C. difficile, the pilU, pilV and pilK genes were re-introduced on plasmids into their respective mutants to complement their deletions. The three genes were cloned into the second cloning site of pECC76, yielding plasmids pECC99, pECC104 and pECC132 respectively.


Plasmids pECC99, pECC104 and pECC132 were conjugated into 630ΔpilU, 630ΔpilV and 630ΔpilK respectively. Again, strains were cultured, dccA expression induced with 50 ng/ml Atc and cultures then harvested and protein extracted from culture supernatants by TCA precipitation. Cell lysates and supernatant preparations were again analysed by α-PilA1 Western blot (Figure 4.23). This showed that when the 630ΔpilU, ΔpilV and ΔpilK strains were complemented from these plasmids, PilA1 export was restored, confirming that all three minor pilins are individually essential to PilA1 export.

Figure 4.23. Complementation of Minor Pilin Mutants for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing the production of PilA1 by all three complemented strains. B. TCA preparations of culture supernatants, showing successful complementation of 630ΔpilU, 630ΔpilV and 630ΔpilK, as when the deleted genes are restored concomitant with dccA expression, PilA1 export is also restored.

4.2.7 Investigating the Function of PilU, PilV and PilK

In the closely related Type II Secretion System (T2SS), the minor pseudo-pilins GspI, GspJ and GspK are known to interact to trimerise into a structure which is believed to form the tip of the pseudopilus (Korotkov and Hol, 2008), and it has been suggested that the minor pilins of the T4P of Gram-negative bacteria (in particular P. aeruginosa) could play the same role, forming the tips of the pili (though possibly being incorporated into the main bodies of the pili as well) (Giltner et al., 2010). In C. difficile, there is evidence from immunogold labelling experiments that PilU is incorporated into T4P (Goulding et al., 2009). In C. difficile the pilVUK genes are located next to each other in the primary T4P operon, suggesting that the functions of their encoded proteins PilVUK might be linked. I wondered whether they might similarly trimerise to form the Type IV pilus tip. To investigate whether this might be the case, and/or whether PilVUK are distributed throughout the pili, I decided to try to identify which C. difficile pilins interact with each other.


The first experiments undertaken were Bacterial-2-Hybrid (B2H) experiments. GspJIK from the ETEC T2SS were found to trimerise through their soluble domains alone, without need of their hydrophobic helices (Korotkov and Hol, 2008). For simplicity and accuracy, pilin-interaction experiments were performed exclusively with the soluble domains of the pilins. Inclusion of the hydrophobic helices in the proteins used in these experiments would have made the proteins much harder to work with, as they would have had the effect of making the proteins membrane proteins. It is also probable that if pilin proteins had been expressed alone, without the T4P assembly machinery to ensure accuracy of pilus assembly, the pilins’ hydrophobic helices would have aggregated due to hydrophobic interactions, in such a way that was not representative of the interactions seen in properly synthesised pili (in that pilins which would not interact when assembled into pili might interact anyway if exogenously expressed in E. coli). In the B2H process, the adenylate cyclase deficient E. coli DHM1 strain is transformed with two plasmids encoding fusion proteins, each of which contains one domain of the B. pertussis adenylate cyclase protein. If the other parts of the fusion proteins (the proteins of interest) are able to interact with each other, this will bring together the two domains of the adenylate cyclase protein, resulting in the formation of an active adenylate cyclase. This synthesises cyclic AMP (cAMP), which induces expression of β-galactosidase. In the method used herein, the DHM1 strain is grown on LB agar supplemented with the chromogenic substrate X-Gal, which, when cleaved by β-galactosidase, produces a blue product. Therefore, strains of DHM1 expressing fusion proteins wherein the proteins of interest interact, produce β-galactosidase which cleaves the X-Gal and turns the colonies blue (Karimova et al., 1998). 
To perform the B2H experiments all pilin genes not located in other distinct clusters (i.e. all except pilA2, pilA3 and pilX) were cloned into the B2H vectors pKT25 and pUT18C. N-terminal TM helices in the pilin proteins were identified using TMHMM, allowing easy identification of the C-terminal soluble domains, and only these soluble domains were cloned. PCR products were cloned into pKT25 and pUT18C, yielding plasmids encoding fusion proteins with N-terminal adenylate cyclase subunits and C-terminal pilin subunits. Cloning of pilA1 into pKT25 and pUT18C yielded plasmids pECC77 and pECC83 respectively, cloning of pilU plasmids pECC78 and pECC84 respectively, cloning of pilV plasmids pECC79 and pECC85 respectively, cloning of pilK plasmids pECC82 and pECC88 respectively, cloning of pilJ plasmids pECC89 and pECC90 respectively and cloning of pilW plasmids pECC92 and pECC91 respectively.


These plasmids were all transformed into E. coli DHM1. They were co-transformed in every combination, i.e. every pKT25-derived plasmid co-transformed with every pUT18C-derived plasmid, and vice versa. In addition, as negative controls every pKT25-derived plasmid was co-transformed with empty pUT18C, and every pUT18C-derived plasmid co-transformed with empty pKT25. As a positive control pKT25-Zip and pUT18C-Zip were co-transformed. These plasmids contain the two adenylate cyclase fragments fused to the two alpha-helix monomers of a leucine zipper protein (the yeast GCN4 activator), and are established positive controls for the bacterial-2-hybrid system (Karimova et al., 2001). B2H experiments were performed with all plasmid pairs as described above. Though the positive controls worked very well, demonstrating that the method itself was working, not one experimental plasmid pair produced fusion proteins which interacted: aside from the positive controls, every colony remained colourless (Figure 4.24). This indicated that in the context of the C. difficile Type IV pilins, the soluble domains are most likely insufficient for interactions to occur between pilins, and that the insoluble α-helices are probably essential for these interactions to occur (unlike the situation in the T2SS pseudopilins discussed above).
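The co-transformation scheme described above can be sketched in a few lines of Python. This is an illustrative enumeration only, not part of the original method: the plasmid names are those given in the text, while the dictionary layout and the function name `co_transformations` are hypothetical.

```python
from itertools import product

# Plasmid names as given in the text: gene -> (pKT25 derivative, pUT18C derivative).
B2H_PLASMIDS = {
    "pilA1": ("pECC77", "pECC83"),
    "pilU": ("pECC78", "pECC84"),
    "pilV": ("pECC79", "pECC85"),
    "pilK": ("pECC82", "pECC88"),
    "pilJ": ("pECC89", "pECC90"),
    "pilW": ("pECC92", "pECC91"),
}


def co_transformations(plasmids):
    """Return (experimental, negative, positive) lists of plasmid pairs.

    Hypothetical helper: each pair is (pKT25-based plasmid, pUT18C-based plasmid).
    """
    t25 = {gene: pair[0] for gene, pair in plasmids.items()}
    t18 = {gene: pair[1] for gene, pair in plasmids.items()}
    # Every pKT25 derivative with every pUT18C derivative, including self-pairs.
    experimental = [(t25[a], t18[b]) for a, b in product(plasmids, repeat=2)]
    # Each fusion plasmid paired with the empty partner vector.
    negative = [(p, "pUT18C") for p in t25.values()]
    negative += [("pKT25", p) for p in t18.values()]
    # Leucine zipper positive control.
    positive = [("pKT25-Zip", "pUT18C-Zip")]
    return experimental, negative, positive


experimental, negative, positive = co_transformations(B2H_PLASMIDS)
```

With the six pilins tested this gives 36 experimental pairs, 12 empty-vector negative controls and one positive control.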

Figure 4.24. Examples of B2H Plates from B2H Analysis of Interactions Between the C. difficile 630 Pilin Proteins. Two representative examples of the B2H plates are shown rather than all several dozen. Both plates test PilK-PilV interactions; on the left PilV is expressed from pUT18C and PilK from pKT25 (pECC85 and pECC82 respectively), while the reverse is the case on the right (i.e. plasmids pECC79 and pECC88). Zip/zip indicates the positive control leucine zipper proteins; those pairs including pKT25 or pUT18C are negative controls including an empty vector; V/K and K/V are the experimental pairs containing both PilV- and PilK-fusion proteins. Only the positive controls are blue, indicating that only those fusion proteins have interacted and that the soluble domains of PilV and PilK do not.


To confirm that the soluble domains of PilU, PilV and PilK do not interact (PilU, PilV and PilK are the three minor pilins hypothesised to form a trimeric pilus tip), co-purification of the soluble domains of these minor pilins was attempted. To this end the soluble domain-encoding region of pilK was cloned into pET28a (pECC38), yielding pECC86, which encodes PilK with an N-terminal His-tag. The soluble domain-encoding regions of pilU and pilV were cloned into the two multiple cloning sites (MCSs) of pACYCDuet-1, yielding plasmid pECC106, which encodes pilU in its first MCS and pilV in its second. Neither the PilU nor the PilV fragment was tagged. Plasmids pECC86 and pECC106 were transformed into the protein expression strain E. coli BL21, and the solubility of the N-terminally truncated PilV, PilU and PilK proteins tested following expression under various conditions. Expression was tested in both LB and TY broth; in each broth, protein expression was induced with IPTG and cultures were then grown either for 4 hrs at 30 °C or overnight at 20 °C. The solubility profiles of truncated PilU and PilV were investigated by α-PilU and α-PilV Western blot respectively, and that of truncated PilK by α-His Western blot. All three proteins were largely soluble under all four expression conditions tested (data not shown). However, PilK was seen to degrade somewhat under all conditions, though least when it was expressed in TY broth overnight at 20 °C. Therefore, this condition was chosen for protein expression for co-purification experiments. Protein expression of truncated PilK, PilU and PilV from pECC86 and pECC106 was scaled up and co-purification attempted via the N-terminal His-tag of PilK. Analysis of the soluble fraction of cell lysate, purification flow through and purification fractions by SDS-PAGE followed by Coomassie staining and α-His Western blot showed successful purification of truncated PilK (and degradation products thereof) (Figure 4.25).
However, the Coomassie stain did not suggest successful co-purification of either truncated PilU or PilV. Analysis of the same samples by α-PilV or α-PilU Western blot confirmed that following PilK purification, both PilV and PilU were found exclusively within the flow through (Figure 4.25). This gave further evidence that the soluble domains alone of the minor pilins PilV, PilU and PilK do not trimerise, in contrast to the minor ETEC T2SS pseudopilins GspI, GspJ and GspK. However, this does not mean that PilK, PilU and PilV do not form the tip of the C. difficile Type IV Pilus – they could well still do, but require their N-terminal α-helices in order to interact.


4.2.8 The Pre-Pilin Peptidase PilD1 Processes PilA1

To further investigate which primary T4P operon genes are essential for T4P formation I created plasmids to delete the pre-pilin peptidase genes pilD1 and pilD2. Plasmids pECC70 and pECC69 were generated for targeted deletion of pilD1 and pilD2 respectively. These plasmids were independently conjugated into C. difficile 630. Single and double cross-overs derived from pECC70 transconjugants were identified and screened as previously described, resulting in the identification of a pilD1 mutant (630ΔpilD1, Figure 4.26).

Figure 4.25. Attempted Co-Purification of the Soluble Domains of C. difficile 630 PilK, PilV and PilU. SF = Soluble Fraction of cell lysate; FT = purification Flow Through. Top left: Coomassie stained SDS-PAGE gel analysing fractions from the purification. Along with the α-His blot (top right), this shows successful purification of truncated PilK (predicted molecular weight: 55.7 kDa). Clearly, from the α-His Western blot, there is a level of PilK degradation. However, the vast majority of the purified protein appears to be full-length (aside from the designed N-terminal truncation). The purification itself has clearly been successful in terms of PilK purification, though several non-specific proteins appear to have been co-purified, as shown by the presence of bands in the Coomassie stain which are not visible in the α-His blot. Bottom left and right are α-PilU and α-PilV Western blot analyses of the same samples in the top images. Truncated PilU has a predicted molecular weight of 16.2 kDa, truncated PilV of 17.9 kDa. Both are expressed, being visible in the soluble fraction of the lysate, but clearly neither has co-purified with PilK, both being found exclusively in the purification flow through.

Single and double cross-overs were also identified from pECC69 transconjugants. However, frustratingly, despite repeating the mutation protocol multiple times, and screening over 50 double cross-overs, no pilD2 mutants were identified (an example screen is shown in Figure 4.27). Given the statistical improbability of this occurrence (one would expect approximately 50 % WT-revertants and 50 % mutants, and ratios broadly along these lines were seen for all other attempts at mutant biogenesis), this would normally suggest that pilD2 was an essential gene. However, it seemed highly improbable that a pre-pilin peptidase could be an essential gene in C. difficile.
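As a rough check on the statistical argument above: if each double cross-over resolves independently to a mutant or a WT-revertant with probability 0.5 (the approximate ratio reported for the other deletions), the chance that all n screened clones are revertants is 0.5 to the power n. Independence between clones, and taking n = 50 as a lower bound for the "over 50" clones screened, are assumptions made here purely for illustration.

```latex
P(\text{all } n \text{ clones are WT-revertants}) = \left(\tfrac{1}{2}\right)^{n},
\qquad
P\big|_{n=50} = 2^{-50} \approx 8.9 \times 10^{-16}
```

Under these assumptions the observed outcome would be vanishingly unlikely by chance alone, which is why selection against the deletion is the more plausible explanation.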

Figure 4.26. Screening of Putative pilD1 Mutants, Derived from pECC70 Single Cross-Over Integrants. The WT control shows the size of the native PCR product. A smaller band (in terms of kb) is seen in the pilD1 mutant (630ΔpilD1).

Notably, though, immediately downstream of pilD2 is the pth gene. In E. coli, pth is an essential gene (Das and Varshney, 2006). A recent transposon mutagenesis study has identified what is expected to be the vast majority (if not all) of the essential genes in C. difficile strain R20291 (Dembek et al., 2015). This showed that pilD2 was not an essential gene in C. difficile R20291, with several pilD2 transposon mutants obtained in the study; however, pth was identified as an essential gene, with no pth transposon mutants obtained. This suggests that my attempts to delete pilD2 might have failed because the region targeted for deletion included a pth promoter. Using plasmid pECC69, only 40 bp directly upstream of pth was intended to be left intact. Bioinformatic analysis of the region 100 bp directly upstream of the pth gene suggests the presence of three strong transcriptional promoters in that region, all of which would have been disrupted in any pilD2 mutant generated using pECC69. Any future attempt to delete pilD2 should therefore leave considerably more of the region directly upstream of pth intact, so as to avoid disrupting any promoter present there.

Figure 4.27. Example Screen of pECC69-Derived Double Cross-Overs. These were generated in an attempt to generate a pilD2 mutant. Clearly, all 8 double cross-overs obtained in this procedure were WT-revertants, with no pilD2 mutants seen.


Plasmid pECC17 was conjugated into 630ΔpilD1. A culture was grown, dccA expression induced with 50 ng/ml Atc and the culture then harvested and protein extracted from the culture supernatant by TCA precipitation. The cell lysate and supernatant preparation were then analysed by α-PilA1 Western blot (Figure 4.28), which showed (i) that pilD1 was essential for PilA1 export and therefore T4P biogenesis, and (ii) that pilD1 was necessary for PilA1 maturation, with the PilA1 protein present in the 630ΔpilD1 lysate noticeably larger than that present in WT 630.

Figure 4.28. Testing of 630ΔpilD1 for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing PilA1 is produced by 630ΔpilD1, but is larger than in the WT lysate, indicating PilD1 is required for maturation of pre-PilA1. B. TCA preparations of culture supernatants, showing that unlike in the WT control, PilA1 is not exported from the cell by 630ΔpilD1, indicating that it is unable to produce T4P and that PilD1 is therefore essential for T4P biogenesis.

To complement the pilD1 mutant, pilD1 was cloned into the second insertion site of pECC76, yielding plasmid pECC96, which co-expresses dccA and pilD1. Plasmid pECC96 was conjugated into 630ΔpilD1, a culture grown, dccA and pilD1 expression induced with 50 ng/ml Atc and the culture harvested and the culture supernatant TCA-treated to precipitate the protein therein. The cell lysate and supernatant preparation were then analysed by α-PilA1 Western blot (Figure 4.29), which showed successful complementation of the pilD1 deletion, with PilA1 cleavage and export restored. This confirmed that PilD1 cleaves PilA1.

To investigate whether PilD1 also cleaved and matured PilU and PilV, plasmids pECC99 and pECC104, which co-express dccA with pilU and pilV respectively, were both conjugated independently into WT 630 and 630ΔpilD1. Cell lysates and supernatant preparations were prepared as above and analysed by α-PilU and α-PilV Western blots as appropriate. However, neither PilU nor PilV was detected by α-PilU or α-PilV antibodies in Western blots when expressed from pECC99 and pECC104 respectively in either strain (not shown). This is surprising given that both proteins are detectable when expressed alone in WT 630 (particularly PilV, see Figure 3.7), and both proteins are produced from their respective plasmids, as they are able to complement chromosomal mutations of the genes (Figure 4.23). This suggests that somehow expression of the primary T4P operon, as induced by DccA expression, regulates the amounts of PilU and PilV produced/present in the cell and prevents excessive amounts of either protein building up.

Figure 4.29. Complementation of the pilD1 Mutant for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing the production and cleavage of PilA1 by the complemented strain, as the PilA1 is now the same size as in the WT. B. TCA preparations of culture supernatants, showing export of PilA1 by complemented 630ΔpilD1. These results demonstrate successful complementation of the pilD1 mutant.

A series of experiments aimed at identifying any other pilins processed by PilD1 or PilD2 were carried out, but failed to yield conclusive results. Tagged, truncated versions of each of PilA1, PilV, PilU and PilK were co-expressed in E. coli with each of PilD1 and PilD2, in an attempt to identify by which pre-pilin peptidase each was processed. However, it proved impossible to detect these truncated pilins by Western blotting, and so these experiments were inconclusive.

4.2.9 PilB1 is Essential for Type IV Pilin Biosynthesis from PilA1

To investigate which of the ATPases located in the C. difficile 630 pilin gene clusters were required for assembly of T4P from PilA1, plasmids were synthesised for use in the deletion of pilB1 and pilT from the primary T4P gene cluster and of pilB2 from the secondary gene cluster. These were pECC62, pECC64 and pECC63 respectively. pECC62, pECC63 and pECC64 were individually conjugated into 630, single and double cross-overs from each set of transconjugants identified and the double cross-overs screened to identify mutants, resulting in the identification of the 630ΔpilB1, 630ΔpilB2 and 630ΔpilT mutants respectively (Figures 4.30-4.32).


Figure 4.30 (L). Screening of Putative pilB1 Mutants, Derived from pECC62 Single Cross-Over Integrants. The Wild-Type (WT) control shows the size of the native PCR product. A smaller band (in terms of kb) is seen in the pilB1 mutant (630ΔpilB1).

Figure 4.31 (R). Screening of Putative pilB2 Mutants, Derived from pECC63 Single Cross-Over Integrants. The Wild-Type (WT) control failed here, but a WT revertant shows the size of the native PCR product. A smaller band (in terms of kb) is seen in the pilB2 mutant (630ΔpilB2).

Figure 4.32 (L). Screening of Putative pilT Mutants, Derived from pECC64 Single Cross-Over Integrants. The Wild-Type (WT) control shows the size of the native PCR product. A smaller band (in terms of kb) is seen in the pilT mutant (630ΔpilT).

Plasmid pECC17 was conjugated into each of 630ΔpilB1, 630ΔpilB2 and 630ΔpilT. Cultures of the transconjugants were grown, dccA expression induced with 50 ng/ml Atc and the cultures then harvested and protein extracted from the culture supernatants by TCA precipitation. The cell lysates and supernatant preparations were then analysed by α-PilA1 Western blot (Figure 4.33).


Figure 4.33. Testing of T4P ATPase Mutants for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing the production of PilA1 by all three mutants. B. TCA preparations of culture supernatants, showing that PilA1 is not exported from the cell by 630ΔpilB1, indicating that this strain is unable to produce T4P and that PilB1 is therefore essential for T4P biogenesis. PilA1 is exported by 630ΔpilB2 and 630ΔpilT, indicating that PilB2 and PilT are not essential for biogenesis of T4P from PilA1.

As shown in Figure 4.33, PilB1 is essential for export of PilA1 and therefore synthesis of T4P from PilA1. However, neither PilB2 nor PilT was essential for PilA1 export. In the case of PilB2, this suggests that the secondary T4P cluster functions separately from the primary operon and that PilB2 therefore does not play a role in T4P biosynthesis from PilA1 (though it may well play such a role in T4P biosynthesis from e.g. PilA2). In the case of PilT, this suggests that PilT is only required for T4P retraction (not synthesis). In 630ΔpilT, there appears to be an increased amount of PilA1 in the culture supernatant and reduced amount in the cell lysate in comparison to the wild-type strain, which may indicate that the pilT mutant is hyper-piliated, such as is seen in Neisseria and P. aeruginosa pilT mutants (Carbonnelle et al., 2005; Comolli et al., 1999) (unlike in the C. perfringens pilT mutant which is unpiliated (Varga et al., 2006)).

To confirm the essential role of pilB1 in T4P synthesis the pilB1 mutant was complemented. The pilB1 gene was cloned into the second cloning site of pECC76, yielding plasmid pECC128. It was also of interest to attempt to complement the pilB1 mutation with pilB2, to further attempt to clarify whether there is redundancy of function between the genes of the two clusters or whether they serve different functions. Therefore, pilB2 was also cloned into the second cloning site of pECC76, yielding plasmid pECC129. Plasmids pECC128 and pECC129 were independently conjugated into 630ΔpilB1. Cultures were grown, gene expression from the plasmids induced with 50 ng/ml Atc, cultures harvested and protein extracted from the culture supernatants by TCA precipitation. The cell lysates and supernatant preparations were then analysed by α-PilA1 Western blot (Figure 4.34). This showed conclusively that while pilB1 was able to complement the pilB1 mutation, pilB2 was not, and that PilB2, despite being presumed to be a T4P assembly ATPase, is clearly not able to assemble PilA1 into T4P, at least not in combination with the primary T4P apparatus. This is presumably due to PilB2 being unable to interact with the other components of the primary T4P apparatus, on account of the significant differences in its sequence compared to PilB1.

Figure 4.34. Complementation of the pilB1 mutant with both pilB1 and pilB2. This α-PilA1 Western blot clarifies the essentiality of pilB1 for biosynthesis of PilA1 into T4P (indicated by the export of PilA1). A. Cell lysates, showing the production of PilA1 by the complemented strains. B. TCA preparations of culture supernatants, showing export of PilA1 by 630ΔpilB1 when complemented with pilB1, but not when complemented with pilB2. This suggests a clear delineation of functions of genes within the primary and secondary T4P gene clusters.

4.2.10 The Inner Membrane Proteins are Essential for Type IV Pilus Biosynthesis

The only remaining predicted T4P-related genes from the primary T4P operon not investigated regarding their necessity for PilA1 export were those which encode the inner membrane proteins, which comprise the inner membrane core protein PilC1 and the inner membrane accessory proteins PilMN and PilO. Plasmids pECC66, pECC67 and pECC68 were constructed for use in the deletion of pilC1, pilMN and pilO respectively. Plasmids pECC66, pECC67 and pECC68 were individually conjugated into C. difficile 630, single and double cross-overs isolated as described previously and double cross-overs then screened to identify mutants. This resulted in the identification of 630ΔpilC1, 630ΔpilMN and 630ΔpilO respectively (Figure 4.35).



Figure 4.35. Colony PCR Screening Showing Deletion of pilC1, pilMN and pilO from C. difficile 630. A. Generation of 630ΔpilC1 by deletion of pilC1 with pECC66. B. Generation of 630ΔpilMN by deletion of pilMN with pECC67. C. Generation of 630ΔpilO by deletion of pilO with pECC68. In each image the WT control shows the size of the WT band produced by each PCR reaction, while the mutant bands are of smaller size than the WT ones.

Plasmid pECC17 was conjugated into each of 630ΔpilC1, 630ΔpilMN and 630ΔpilO. Cultures of each strain were grown, dccA expression from pECC17 induced with 50 ng/ml Atc, cultures harvested and protein extracted from the culture supernatants by TCA precipitation. The cell lysates and supernatant preparations were then analysed by α-PilA1 Western blot (Figure 4.36). This showed that all of these inner membrane proteins are essential for T4P synthesis in C. difficile 630, since no export of PilA1 is seen in any of the mutants.

Figure 4.36. Testing of Inner Membrane Protein Mutants for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing the production of PilA1 by all three mutants. B. TCA preparations of culture supernatants, showing that PilA1 is not exported by any of the mutants, indicating that all of pilC1, pilMN and pilO are essential for biogenesis of T4P from PilA1.

To confirm the essential role of each of these genes in T4P biogenesis, each mutant was complemented. The pilO gene, the pilMN gene and the pilC1 gene were each cloned into the second cloning site of pECC76, yielding plasmids pECC98, pECC97 and pECC95 respectively. Additionally, it was decided to attempt once more to complement mutants with homologous genes from the secondary T4P cluster. Therefore pilC2 and pilM were cloned into the second cloning site of pECC76, yielding plasmids pECC130 and pECC131 respectively. Complementation with pilM was attempted despite its homology to pilMN being questionable, with PilM being predicted to be a cytoplasmic protein and PilMN to be a monotopic membrane protein. Plasmids pECC95 and pECC130 (encoding dccA with pilC1 and pilC2 respectively) were conjugated into 630ΔpilC1, plasmids pECC97 and pECC131 (encoding dccA with pilMN and pilM respectively) were conjugated into 630ΔpilMN and plasmid pECC98 (encoding dccA with pilO) was conjugated into 630ΔpilO. Cultures of each strain were grown, gene expression from the plasmids induced with 50 ng/ml Atc, cultures harvested and protein extracted from the culture supernatants by TCA precipitation. The cell lysates and supernatant preparations were then analysed by α-PilA1 Western blot (Figure 4.37). This showed that while pilC1, pilMN and pilO were each able to complement their respective mutants, pilC2 and pilM from the secondary T4P cluster were not able to complement the deletions of pilC1 and pilMN from the primary cluster. This provides yet further evidence that the two gene clusters have different functions and that there is no overlap between them.

4.2.11 The mfd Gene is not Necessary for Type IV Pilus Biogenesis

Mutants of all genes in the primary T4P gene cluster annotated as T4P-related were either made or attempted, as detailed above. However, as shown in Figure 4.14, the apparently unrelated genes pth, mfd and prsA are co-transcribed with the T4P genes. I was interested to see whether, despite their apparent unrelatedness, these genes played a role in T4P biogenesis. As previously described, the pth gene is an essential gene, so it was not possible to investigate its function; however, an mfd mutant was readily available in the lab, since a ClosTron mutant of mfd had recently been generated (albeit unintentionally) in 630Δerm by Dr Stephanie Willing, a former member of the Fairweather lab (Willing et al., 2015). Plasmid pECC17 was conjugated into this 630Δerm mfd::erm mutant. The strain was cultured, dccA expression from pECC17 induced with 50 ng/ml Atc, the culture then harvested and protein precipitated from the culture supernatant with TCA. The cell lysate and supernatant preparation were then analysed by α-PilA1 Western blot (Figure 4.38). This showed that mfd was not required for T4P biogenesis, and is likely therefore not to be involved in the biosynthesis or function of the T4P, as would have been predicted. The involvement of prsA was not investigated, though it seems very unlikely that it would play a role in T4P biosynthesis or function.

Figure 4.37. Complementation of the Inner Membrane Protein Mutants with Genes from Both the Primary and Secondary T4P Gene Clusters. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates showing production of PilA1 by all strains. B. Cell supernatants showing successful complementation of the pilC1, pilMN and pilO mutants when complemented with the deleted gene, but that complementation of pilC1 and pilMN mutants with the homologue of the deleted gene from the secondary T4P gene cluster fails, indicating the non-redundant nature of these genes.

Figure 4.38. Testing of mfd Mutant for the Ability to Synthesise T4P. T4P synthesis is indicated by the export of PilA1, as detected by α-PilA1 Western blots. A. Cell lysates, showing the production of PilA1 by the mutant. B. TCA preparations of culture supernatants, showing that PilA1 is exported by 630Δerm mfd::erm, and that mfd is therefore not essential for biogenesis of T4P from PilA1.


4.3 Discussion

4.3.1 The Cdi2_4 Riboswitch

As mentioned in Chapter 3, over the course of this PhD other groups have contributed significant advances to our understanding of C. difficile T4P. Of particular relevance to this chapter, (Bordeleau et al., 2015) elucidated the structure and mechanism of the Cdi2_4 riboswitch by which pilA1 transcription is regulated by c-di-GMP. The predicted structure of the riboswitch is shown in Figure 4.39. According to these predictions, in the absence of c-di-GMP binding, a rho-independent terminator forms in the pilA1 transcript approximately 120 bases upstream of pilA1. On c-di-GMP binding to the riboswitch aptamer, the P1 stem (see Figure 4.39) forms, sequestering the 5’ sequence of the terminator stem. The P1 stem is therefore designated an anti-terminator, inhibiting transcription termination and enabling transcriptional read-through into pilA1, thus activating its expression (Bordeleau et al., 2015).

The T50 for the Cdi2_4 riboswitch (i.e. the concentration of c-di-GMP which drove half of the maximum level of transcriptional read-through of the terminator region) was calculated as 70 ± 12 nM (Bordeleau et al., 2015), a concentration previously observed in C. difficile 630 without exogenous DGC expression, indicating it is physiologically relevant (Purcell et al., 2012). Given that the T50 for the Cdi2_4 riboswitch had already been calculated and published, the same experiments and calculations were not performed in this work. Several other C. difficile genes have been shown to be controlled by c-di-GMP-controlled riboswitches (Peltier et al., 2015; Purcell et al., 2012; Soutourina et al., 2013), but only one of these other riboswitches has been investigated at a molecular level: Cdi2_1, a Type II riboswitch which is located immediately upstream of the C. difficile 630 gene CD3246 (Soutourina et al., 2013). CD3246 is a gene of unknown function, but is annotated as a putative surface protein. C-di-GMP has been shown to bind to Cdi2_1 with a Kd of 200 pM (Lee et al., 2010), while the Type I riboswitch Vc2 from Vibrio cholerae has been shown to bind c-di-GMP with a Kd of 1 nM (Sudarsan et al., 2008). It is unknown, however, how the T50 of Cdi2_4 relates to its Kd, and therefore how strongly Cdi2_4 binds c-di-GMP compared to other similar riboswitches. Similarly, T50 values were not calculated for these other riboswitches, so these cannot be compared either.
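To make the definition of T50 concrete, the sketch below expresses read-through as a fraction of its maximum under an assumed Hill-type dose-response. The functional form, the Hill coefficient and the function name are illustrative assumptions and are not taken from (Bordeleau et al., 2015); only the 70 nM figure comes from the text.

```python
def fractional_readthrough(c_di_gmp_nM, t50_nM=70.0, hill=1.0):
    """Fraction of maximal read-through at a given c-di-GMP concentration.

    Assumes a Hill-type response (an illustrative assumption, not the
    published model); t50_nM defaults to the T50 quoted in the text.
    """
    if c_di_gmp_nM < 0:
        raise ValueError("concentration must be non-negative")
    if c_di_gmp_nM == 0:
        return 0.0
    ratio = (c_di_gmp_nM / t50_nM) ** hill
    return ratio / (1.0 + ratio)


# By definition, read-through is half-maximal at the T50 itself.
half_max = fractional_readthrough(70.0)
```

Whatever the true shape of the curve, the response is half-maximal at the T50 by definition; the Kd, by contrast, describes ligand binding rather than transcriptional output, which is why the two quantities cannot be compared directly.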



Figure 4.39. Two Representations of the Predicted Structure of the Cdi2_4 Riboswitch. In A, on the left, c-di-GMP is bound to the aptamer yielding an anti-terminator stem (P1). In the box on the right, the structure of a rho-independent terminator formed in the absence of c-di-GMP binding is shown. This comprises two helices, parts of which as shown on the left are sequestered in the anti-terminator structure when c-di-GMP is bound. In B an alternative representation of the same structure is shown, showing the secondary and tertiary mRNA structures of the riboswitch in a c-di-GMP-bound state. The P1 helix is shown in blue, the P2 in grey, P3 in purple and P4 in green. The bound c-di-GMP molecule is shown in red. Both images are taken from (Bordeleau et al., 2015). Used with permission.

4.3.2 The pilA1 Terminator

As demonstrated in this chapter, it is primarily pilA1 which is regulated by c-di-GMP and Cdi2_4, as a rho-independent transcriptional terminator between pilA1 and pilB1 prevents at least the vast majority of transcriptional read-through from pilA1. Though pilB1 expression is up-regulated by c-di-GMP, it is up-regulated approximately 10-fold less than pilA1, and co-transcription of pilA1 and pilB1 is not visible by RT-PCR. (Interestingly, while c-di-GMP binding to Cdi2_4 had previously been shown to up-regulate pilA1 10-fold (Soutourina et al., 2013), both Bordeleau et al. (2015) and I found pilA1 to be up-regulated approximately 40-fold.) It is thus clear that the transcriptional terminator terminates at least the majority of pilA1 mRNA transcripts. However, though Bordeleau et al. (2015) did identify the transcriptional terminator between pilA1 and pilB1, they found pilA1 and pilB1 to be co-transcribed. It is not clear why they and I obtained different results in this particular experiment, though my hypothesis that the two genes are not co-transcribed is supported by Soutourina et al. (2013), who showed that a pilA1 mRNA transcript is approximately 800 bp, which is approximately the total length predicted for a transcript initiated at the beginning of the riboswitch and terminated at the pilA1 terminator. It is also clear that pilB1 has a promoter independent of the riboswitch, as when the pilA1 terminator was deleted, allowing full read-through to pilB1, expression of pilB1 was significantly higher than expression of pilA1; in a true operon, in which transcription is exclusively initiated upstream of the first gene, it would of course be impossible for transcription of the second gene therein to be higher than that of the first. This was confirmed by Bordeleau et al. (2015), who identified a promoter immediately upstream of pilB1.

While it is possible that up-regulation of pilB1 expression is caused by transcriptional read-through past the pilA1 terminator, this is not supported by my results (though it may be by those of Bordeleau et al. (2015)). The only other way pilB1 expression could be up-regulated by c-di-GMP is by direct regulation of the pilB1 promoter by c-di-GMP. This is perfectly possible, and in an attempt to show conclusively which of these possibilities leads to pilB1 up-regulation I attempted to delete the Cdi2_4 riboswitch by codA-linked allele-exchange mutagenesis. Had this been successful it would have been straightforward to identify (broadly) which of these two mechanisms led to pilB1 up-regulation. However, disappointingly and rather bizarrely, the attempted mutagenesis failed, preventing me from performing this experiment. This is bizarre firstly because transcription from the riboswitch is not essential for C. difficile survival: under most laboratory conditions, including those under which mutagenesis was performed, the riboswitch is “switched off”. Furthermore, complete disruption of transcription from the riboswitch has been shown to be possible and without obvious negative consequences (by ClosTron mutagenesis of pilA1, which has been performed both by the Fairweather group (unpublished) and by Bordeleau et al. (2015)). Thus, it certainly seems that the riboswitch should not be essential for C. difficile survival. Secondly, the manner in which the mutagenesis failed is not what would be expected for an essential sequence.
Generally, attempted deletion of an essential gene/sequence fails at the double cross-over stage: single cross-overs can still be obtained, as the essential gene/sequence remains in the chromosome at this stage. Such a scenario was seen in my attempted deletion of pilB2, above. However, my deletion attempts failed at the single cross-over stage. Despite multiple conjugations of pECC71 (the plasmid designed to delete the riboswitch) into strain 630, and the recovery of large numbers of trans-conjugants, no single cross-overs were obtained, a result unique both in my own mutagenesis work and in that of other members of the Fairweather group. The homologous sequences present in pECC71 were certainly appropriate for Cdi2_4 deletion, so it remains unclear why this attempted mutagenesis failed in this manner.


4.3.3 The Primary T4P Operon

The operon which appears to start at pilB1 was shown to run past pilD2 (the final obviously T4P-related gene in the operon), to include pth, mfd and prsA as well. (Bordeleau et al. (2015) showed the operon to run to pilD2 but did not investigate whether it extended beyond that.) The functions of these three genes are described in Section 3.2.1. None of these three proteins have any obvious link to T4P (and indeed I demonstrated that mfd is not required for T4P biogenesis). Normally, genes which are co-transcribed in bacteria have linked functions (e.g. the T4P biogenesis genes which comprise the majority of this operon), so it is surprising that these apparently unrelated genes are co-transcribed with the T4P genes of the primary T4P operon (though as discussed above, it appears that pth at least also has its own promoter(s) located upstream of the gene, within pilD2). Interestingly, it is not only C. difficile in which these genes are linked to the T4P genes: as described in Chapter 6, the same is seen in the closely related species C. sordellii. Though it would be tempting to speculate that there is therefore a link between these three genes and the T4P, in the absence of any evidence linking their functions it seems more likely that their proximity is merely a result of chance.

4.3.4 Systematic Mutagenesis of the Primary Type IV Pilus Operon

The results of my mutagenesis experiments were interesting, but generally unsurprising. As expected, with the exception of pilT, all genes in the primary T4P operon which were successfully deleted proved essential for T4P biogenesis. PilT was not essential for T4P biogenesis, and its deletion may result in a hyper-piliated phenotype. This result corresponds with what has been seen for the vast majority of Gram-negative T4P, showing that essentiality of PilT is not characteristic of Gram-positive species, or even Clostridia, although it appears to be the case in C. perfringens (Varga et al., 2006). Unfortunately, there proved to be insufficient time in this project to produce double mutants in which pilT was deleted in addition to each of the other T4P biogenesis genes, so it was not possible to investigate whether T4P biogenesis genes in a Gram-positive organism could be split into those essential for T4P synthesis and those required to resist PilT-mediated retraction, and if so what the split would be. Such a study would be an obvious next stage in the investigation of these T4P.

PilD1 was shown to cleave PilA1, but disappointingly, for reasons speculated above, it proved impossible to delete pilD2. Any further attempts to delete pilD2 would thus require an alternative strategy, either deleting just the 5’ region of the gene so as to leave any pth promoters located in the 3’ region of the gene intact, or simply introducing a premature stop codon into the gene so that only highly truncated PilD2 is produced. Interestingly, it was previously predicted that PilD1 would cleave PilA1, while PilD2 (which has an unusual N-terminal truncation and lower sequence identity with other Clostridial PilD proteins than PilD1) was predicted to cleave PilU, PilV and PilK, all of which unusually possess a negatively charged glutamate (rather than a positively charged residue) at the -3 position from the cleavage site (Melville and Craig, 2013). My work has confirmed the first of these two predictions, while the second remains unconfirmed. The best way to finally confirm which minor pre-pilins are cleaved by which pre-pilin peptidase would likely be the deletion of both pilD1 and pilD2 to yield a strain lacking both peptidases, in which each minor pilin could be individually co-expressed with each PilD to identify which peptidase cleaves which pre-pilin. Even this approach might not be straightforward though, due firstly to the difficulty in obtaining a pilD2 mutant, but also to the difficulties in exogenously expressing certain of the minor pilins in C. difficile (Section 3.2.3).

In addition to pilD2, it would also be of interest to delete the minor pilin genes pilJ and pilW. The pilJ gene has been identified in every strain of C. difficile for which annotated sequences were available (Section 3.2.2). Its transcription is up-regulated by c-di-GMP (though only very slightly) (Bordeleau et al., 2015) and the PilJ protein has been shown to be incorporated into C. difficile T4P (Piepenbrink et al., 2014). It would therefore be of interest to delete this gene, to see whether pilJ is essential for biogenesis of the primary T4P, despite not being encoded in the main operon. It would also be of interest to delete pilW, though it seems considerably less likely that this gene is essential for T4P biogenesis given that it is absent from certain strains which retain the primary T4P operon (Section 3.2.2).
One of the intentions of this study was to be the first to perform a systematic mutagenesis study of a Gram-positive T4P system. Disappointingly, this did not turn out to be the case, as an extremely interesting study on the T4P of Streptococcus sanguinis was published in early 2016 (Gurung et al., 2016). These pili have been given the Neisseria T4P nomenclature. The T4P genes were all found in a 22 kb cluster (which may be an operon, though this was not investigated), and the species was found to require the expected genes for T4P biogenesis (i.e. pilT was not required; interestingly, nor were a number of genes of unknown function which are located within the pil operon) (Gurung et al., 2016). The S. sanguinis T4P were found to be unusual as the gene cluster contains two major pilin genes (pilE1 and pilE2). The two pilins were found to be approximately equally abundant in the T4P, though it was unclear whether separate pili consisting of either PilE1 or PilE2 were made, or whether the two pilins are assembled into pili consisting of a mixture of the two.


Either way, S. sanguinis was able to make functional pili from either one of the two if the other was knocked out (Gurung et al., 2016). However, this study also did not investigate whether the set of genes essential for T4P biogenesis differs between a ΔpilT background and a WT background, so this remains to be done in a Gram-positive species.

4.3.5 Functional Separation of the Two C. difficile Type IV Pilus Gene Clusters

The final point of interest from this chapter is that the primary and secondary C. difficile T4P loci appear to be functionally separate (though, as discussed earlier in the thesis, the secondary cluster presumably shares one or both of the PilD proteins encoded by the primary T4P operon, given that no pilD is present in the secondary T4P cluster, and may also share PilT). It has previously been suggested that PilB2, for instance, might display some functional redundancy with PilB1, meaning that neither protein would be entirely essential for any particular role (Melville and Craig, 2013; Purcell et al., 2015). This study demonstrates that this is not the case. Deletion of pilB1 completely inhibits PilA1 export from C. difficile, while deletion of pilB2 has no effect. Furthermore, PilA1 export could be restored in 630ΔpilB1 by complementation with pilB1, but not with pilB2. Equivalent results were obtained with the pilC genes, demonstrating that these genes play separate rather than redundant roles. This conclusion is further supported by the finding that transcription of pilA2 is significantly reduced by an increased intracellular concentration of c-di-GMP (Bordeleau et al. (2015) saw a similar reduction in pilA2 transcription in response to increased c-di-GMP, but in that instance the reduction was not statistically significant). This strongly suggests that expression of the secondary T4P gene cluster is down-regulated by c-di-GMP (it seems unlikely that the other genes in this T4P cluster are regulated differently to pilA2), which suggests that the primary and secondary T4P gene clusters are expressed under different conditions and that their roles are not therefore redundant. What role the primary T4P gene cluster might play in C. difficile is investigated in Chapter 5.


5. Functions of the Primary Type IV Pilus Cluster

5.1 Introduction

One of the roles for which T4P are best known is twitching motility. Twitching motility is a non-flagella-mediated form of bacterial motility, wherein T4P function somewhat like grappling hooks, extending from a bacterial cell, adhering to the surface whereupon the bacterium sits and then retracting, dragging the bacterium to the position at which the pilus is attached to the surface (Burrows, 2012). Not all T4P are able to mediate twitching motility, but such motility has been observed in a wide range of bacterial species, including proteobacteria (Burrows, 2012), cyanobacteria (Khayatan et al., 2015) and Gram-positive bacteria (Varga et al., 2006). Twitching motility occurs on semi-solid and solid surfaces; when considering agar plates, generally at least 1 to 1.5 % agar is required (below this level flagellated species are able to move by flagella-mediated swimming or swarming motility) (Burrows, 2012). However, bacteria are able to move via twitching motility on significantly more solid surfaces (e.g. Clostridium beijerinckii has been shown to move via T4P-mediated twitching (or gliding) motility on 4 % agar (Varga et al., 2006)). Twitching motility is so called because bacteria displaying the phenomenon appear to twitch, making short, sharp jerks as they travel (Burrows, 2012). Individual rod-shaped bacteria are able to move in this way in two ways: they can either move horizontally (such that their longer edge is in contact with the surface) or vertically (such that their shorter edge, i.e. a pole, is in contact with the surface). Vertical twitching motility is also known as walking motility (Conrad et al., 2011). Individual bacteria can switch between vertical and horizontal orientations. 
Vertically orientated bacteria have been found to change direction with much higher frequency than horizontally orientated ones, suggesting that walking motility may play an important role in the exploration of surfaces, which is believed to be the primary function of twitching motility (Conrad et al., 2011). In addition to surface exploration, twitching motility is also known to play an important role in biofilm formation (in P. aeruginosa at least). Bacteria can use twitching motility to move within a biofilm to produce the correct architecture, and strains of P. aeruginosa which lack T4P form biofilms with irregular structures (Klausen et al., 2003). In addition to the mediation of twitching motility for correct architecture formation, T4P play other important roles in biofilm formation. In this regard, T4P are able to mediate inter-bacterial interactions, enabling cells to stick together. This can be at the stage of initial microcolony formation, or more mature biofilm formation. For instance, in P. aeruginosa T4P play important roles in both these events (Burrows, 2012), and the T4b BFP of EPEC and TCP of V. cholerae are each essential for microcolony formation in their respective species (Roux et al., 2012). In each of these species microcolony/biofilm formation is important in virulence, meaning these T4P are important virulence factors. In Neisseria meningitidis, T4P do not mediate true biofilm formation, but do mediate the formation of bacterial aggregates via the minor pilin PilX, which is essential for host cell adhesion and therefore virulence in the species (Hélaine et al., 2005). The third common function of T4P is in direct bacterial adhesion to host cells. For instance, in N. meningitidis the pilus tip adhesins PilC1 and PilC2 mediate direct attachment of T4P to human epithelial and endothelial cells (Morand et al., 2009). Similarly, P. aeruginosa T4P are able to attach directly to N-glycans found on the surface of human respiratory tract epithelial cells, thus mediating bacterial binding to the apical surface of polarised epithelial cells, where these N-glycans are localised (Bucior et al., 2012). Furthermore, P. aeruginosa requires injection of toxins into host cells via its T3SS to initiate invasive disease (Okuda et al., 2010), and it has been shown that retraction of T4P is required to bring bound P. aeruginosa cells into close enough contact with host cells for the bacteria to be able to reach the target host cells with their T3SS (Hayashi et al., 2015). Thus there are three roles commonly played by T4P, while other more unusual roles are detailed in Section 1.6.4. C. perfringens and C. beijerinckii both display T4P-mediated twitching motility (Varga et al., 2006). C. difficile have generally been observed by the Fairweather group and others to display a somewhat irregular colony morphology (unpublished), leading us to speculate that C. 
difficile T4P might mediate twitching motility as well. We also speculated that C. difficile T4P might mediate biofilm formation, as C. perfringens T4P have been shown to do so (Varga et al., 2008), and C. difficile has been shown to produce biofilms (Dawson et al., 2012; Ðapa et al., 2013). Strain 630 takes about 3 days to reach a maximal level of biofilm formation, a process which is promoted by glucose and inhibited by NaCl (Ðapa et al., 2013). The contribution of T4P to both these processes in C. difficile was therefore investigated. In this regard, over-expression of dccA in C. difficile, which leads to increased intracellular levels of c-di-GMP, has been shown to be associated with aggregation of C. difficile (Purcell et al., 2012). However, at the time of initiation of this project it was not known what causes this aggregation. It is not immediately clear whether “aggregation” and “biofilm” are the same thing in terms of C. difficile, but these results suggest that c-di-GMP might promote biofilm formation. As detailed above, there are thus several possible lines of enquiry into the roles of the primary T4P of C. difficile.

5.2 Results

5.2.1 Effect of Type IV Pilus Expression on the Colony Morphology of Strain 630

The first putative function of the T4P encoded by the primary T4P cluster of C. difficile to be investigated was twitching motility. The first approach used to investigate twitching motility in strain 630 was to compare the morphologies of colonies of 630 (pECC12) (in which the diguanylate cyclase dccA, and therefore the primary T4P operon, is constitutively expressed) with those of colonies of the vector control (630 (pMTL960)). For this purpose, cultures of these C. difficile strains were grown, serially diluted 10-fold, then plated onto BHIS agar plates (1.5 % agar content). 100 µl of each dilution was plated per plate and the plates were incubated for four days. Comparison of these colonies showed that the expression of T4P had no effect on the morphology of the colonies (Figure 5.1A&B). Furthermore, colonies of 630ΔpilA1 (pECC12), which cannot produce T4P, also have an apparently identical morphology to those of WT 630 (Figure 5.1C), providing further evidence that expression of T4P does not alter the colony morphology of C. difficile 630.

Figure 5.1. Effect of dccA Expression on the Colony Morphology of C. difficile 630. A. 630 (pMTL960), the vector control; B. 630 (pECC12), wherein T4P are constitutively produced; C. 630ΔpilA1 (pECC12), which is incapable of synthesis of T4P due to the deletion of the major pilin gene. There is no definitive difference between the shapes of these colonies. Scale bar shows 1 cm.


5.2.2 Twitching Motility in Strain 630

Twitching motility was then investigated using the same technique as described for C. perfringens (Varga et al., 2006). Cultures were grown, then normalised to a predetermined optical density (OD600 of 0.3) and 5 µl spotted onto BHIS agar (1.5 % agar content). Plates were then incubated for 4 days, to allow growth and movement of the spotted bacteria. The same three strains were used for these investigations as for the colony morphology experiment above. Following incubation, it did appear that those strains containing plasmid pECC12 had a slightly different appearance to the strain carrying the vector control (pMTL960) (Figure 5.2). However, whereas one might expect that the induction of T4P expression via pECC12 would result in twitching motility activity causing spots of pECC12-carrying strains to be larger, they do in fact appear to be slightly smaller and less spread out, and there is no apparent difference in size between 630 (pECC12) and 630ΔpilA1 (pECC12). This indicates that the difference in size between the vector control strain and those carrying plasmid pECC12 is not due to induction of T4P expression in response to increased levels of c-di-GMP. Rather, the fact that the culture spots appear to spread out less when levels of c-di-GMP increase suggests that this might be due to suppression of flagella expression.

Figure 5.2. “Twitching Motility” in C. difficile Strain 630. A. Vector control (630 (pMTL960)). B. 630 (pECC12). C. 630ΔpilA1 (pECC12). Culture spots of the vector control appear to be larger than those of strains carrying plasmid pECC12. Culture spots of 630ΔpilA1 (pECC12) appear to be slightly more smoothly rounded than those of 630 (pECC12). Scale bar shows 1 cm.

Though there is no apparent difference in size between the culture spots of 630ΔpilA1 (pECC12) and 630 (pECC12), there does appear to be a difference in morphology between the two strains (albeit only slight), in that the spots of 630ΔpilA1 (pECC12) appear to be more smoothly rounded than those of 630 (pECC12). This could indicate that the T4P mediate a minimal level of twitching motility which disrupts the edge of the culture spots but no more.

5.2.3 Colony Morphology and Twitching Motility in Strain R20291

Plasmid pECC12 was also conjugated into the hypervirulent, ribotype 027 C. difficile strain R20291, whereupon it was immediately noticeable that the expression of dccA had a significant impact upon colony morphology (Figure 5.3). Clearly, expression of dccA from pECC12 results in a great deal of motility and the formation of fern-like fronds from the colonies.

Figure 5.3. Effect of dccA Expression on the Colony Morphology of C. difficile R20291. Left – vector control (R20291 (pMTL960)), 10⁻⁷ dilution; right – R20291 (pECC12), 10⁻⁸ dilution. Clearly, a remarkable spreading effect is seen in R20291 when dccA is expressed and c-di-GMP concentration is increased. Scale bar shows 1 cm.

As for strain 630 in Section 5.2.2, twitching motility in R20291 was then investigated. This is shown in Figure 5.4, and again the difference in morphology is clear to see.

5.2.4 Role of Type IV Pili in R20291 Motility

Given the effect of dccA expression (and therefore c-di-GMP concentrations) on motility and colony morphology in R20291, it seemed possible that R20291 employed T4P-mediated twitching motility in a manner not seen in strain 630. To investigate this, a T4P mutant was generated in R20291. As codA-mediated allele exchange mutagenesis did not prove possible in strain R20291, the older ClosTron method of insertional mutagenesis was employed (Heap et al., 2010; Heap et al., 2007). A plasmid (pECC01) was designed that would insert an intron into pilB1 in the sense orientation – i.e. in the same orientation as the gene.


Figure 5.4. “Twitching Motility” in C. difficile Strain R20291. Left – vector control (R20291 (pMTL960)); right – R20291 (pECC12). Clearly when dccA is expressed from plasmid pECC12 in R20291 it results in the spreading of the strain across the plate in an unusual and beautiful pattern.

pECC01 was conjugated into R20291, putative mutant clones were screened by colony PCR, and pilB1 mutants were identified (Figure 5.5).

Figure 5.5. Screening of Putative R20291 pilB1 ClosTron Mutants. Amplification from the WT chromosome is predicted to produce a band of approx. 200 bp; the ClosTron intron is approx. 2 kb in length, so amplification of this region of DNA wherein the ClosTron intron has inserted produces a band of approx. 2.2 kb. Clearly all clones screened above contain the insertional mutation.
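The band-size reasoning in the caption above can be sketched in a few lines of Python. This is only an illustration: the 200 bp and 2 kb figures are the approximate values quoted in the caption, while the function names and the 15 % size tolerance are hypothetical choices made here, not part of the screening protocol.

```python
# Approximate sizes quoted in the Figure 5.5 caption.
WT_AMPLICON_BP = 200   # band expected from the WT chromosome
INTRON_BP = 2000       # approximate length of the ClosTron intron


def expected_amplicon(has_insertion: bool) -> int:
    """Expected PCR product size across the pilB1 target site."""
    return WT_AMPLICON_BP + (INTRON_BP if has_insertion else 0)


def classify_band(observed_bp: float, tolerance: float = 0.15) -> str:
    """Classify an observed band as 'WT', 'insertion' or 'unexpected'.

    `tolerance` is a hypothetical fractional allowance for the
    imprecision of sizing a band against a gel ladder.
    """
    for label, inserted in (("WT", False), ("insertion", True)):
        expected = expected_amplicon(inserted)
        if abs(observed_bp - expected) <= tolerance * expected:
            return label
    return "unexpected"


if __name__ == "__main__":
    for band in (210, 2200, 1000):
        print(band, classify_band(band))
```

A clone giving a band near 2.2 kb is classified as carrying the insertion, one near 200 bp as WT, and anything else is flagged for closer inspection.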

Six clones were screened by Southern blot using restriction enzymes NdeI/XmnI and a probe which recognises the 3’ end of the ClosTron intron (Figure 5.6). This probe was produced by PCR from the pECC01 plasmid. The Southern blot is shown in Figure 5.6A. It was predicted that, for the intron inserted into pilB1, the intron-specific probe would anneal to an 800 bp DNA fragment following digestion with XmnI, and to a 2.6 kb fragment following digestion with NdeI. This is explained in Figure 5.6B. In Figure 5.6A only one band is seen in the DNA digested with XmnI, which is a fragment approx. 1 kb in size. This is assumed to be the predicted 800 bp fragment (a size difference of 200 bp is hard to resolve on the blot). In the DNA digested with NdeI a very strong band representing a fragment of approx. 2.5 kb is seen (which is assumed to be the predicted 2.6 kb fragment). Much weaker bands of approx. 4 and 6 kb are also seen in all but one of the clones. These could be bands derived from incomplete gDNA digestions, or indeed background – the picture is complicated by the presence of a faint band in the WT negative control. Either way, however, the vastly greater strength of the bands of the predicted size indicates that all clones contain only one intron insertion. Nonetheless, the first clone, which contains only one band following both NdeI and XmnI digestion, was chosen for use in further experimentation. This mutant is R20291 pilB1::erm.

Figure 5.6. Southern Blots of R20291 ClosTron Mutants. A. Southern blots of mutant gDNA using a probe specific to the ClosTron intron. Left – genomes digested with NdeI; right – genomes digested with XmnI. WT R20291 genomes were digested and blotted as a control. B. Schematic explaining Southern blot band sizes. Blue arrow represents pilB1 gene, the red hashed box the inserted ClosTron intron, the green line the probe. NdeI and XmnI restriction sites are marked and the distances between them indicated.

Plasmid pECC12 was conjugated into R20291 pilB1::erm and PilA1 export/T4P synthesis was investigated, showing that this strain was deficient in PilA1 export and therefore was non-piliated (Figure 5.7).


Figure 5.7. R20291 pilB1 Mutant is Unable to Produce T4P. PilA1 is synthesised and exported by the WT strain, being present both in the whole cell lysate (WCL) and culture supernatant (S/N), indicating the production of T4P; in the pilB1::erm mutant on the other hand, PilA1 is synthesised, and is present in a large amount in the WCL, but is not seen in the S/N, indicating it is not exported and meaning that this mutant is unable to synthesise T4P.

Having shown that R20291 pilB1::erm is unable to produce T4P, its colony morphology and ‘twitching motility’ phenotypes were investigated (Figure 5.8). These showed that there was no noticeable difference between the colony morphology and ‘motility’ phenotypes induced by c-di-GMP between the R20291 WT strain and the pilB1 mutant, with the fern-like fronds seen in the WT strains also present in the pilB1 mutant.

Figure 5.8. Colony Morphology and “Twitching Motility” of R20291 pilB1 Mutant. A. R20291 strain colony morphologies: left – WT R20291; right – R20291 pilB1::erm. Both strains were diluted 10⁻⁶. B. R20291 strains “twitching motility”: left – WT R20291; right – R20291 pilB1::erm. In both experiments there is no noticeable difference between the morphologies of the WT strains and the mutants, indicating that the formation of the frond-like structures, which is induced in R20291 by c-di-GMP, is not T4P-mediated. Scale bar shows 1 cm.

The results of the above experiments in both 630 and R20291 suggest that the T4P encoded by the primary T4P cluster of C. difficile do not mediate motility, and that the fronds produced by R20291 may not be indicative of twitching motility. Twitching motility investigations were also attempted using toothpick inoculation to the base of agar plates, to investigate whether the ultra-smooth plastic petri dish bases might be more amenable to T4P-mediated surface motility than an agar surface. However, this did not yield interpretable results, and did not in any way suggest twitching motility by the strains.

5.2.5 Type IV Pilus-Mediated Aggregation

It has previously been shown that increased concentrations of c-di-GMP, as a result of expression of dccA, cause aggregation of C. difficile 630 (Purcell et al., 2012). As this cell aggregation could be mediated by T4P, aggregation was tested in strains 630 and R20291 carrying plasmid pECC12 and thus constitutively expressing dccA. Strains were inoculated into liquid medium and grown overnight, and the following morning the OD600 of the medium at the top of the culture was measured. The cultures were then vortexed to disrupt aggregates and disperse the bacteria evenly throughout the medium, and the OD600 of the culture was measured a second time. The second reading was taken to be the “true” OD600, and based on the difference between the first and second OD600 readings the percentage of cells aggregated at the base of the cultures was calculated. Strain R20291 proved unsuitable for these experiments, but with strain 630 it proved possible to replicate the previously published results (Figure 5.9). In strain 630 (pMTL960), under conditions of low c-di-GMP concentration, 8.3 % aggregation was seen, while under conditions of high c-di-GMP concentration (630 (pECC12)), 97.7 % aggregation was seen. To test whether T4P mediate this aggregation, plasmid pECC12 was conjugated into 630ΔpilA1, 630ΔpilB1 and 630ΔpilB2 and aggregation assays were performed (Figure 5.9). Aggregation of both 630ΔpilA1 (pECC12) and 630ΔpilB1 (pECC12) was significantly reduced compared to the wild type (84.5 % aggregation of 630ΔpilA1 (pECC12) was seen, and 85.9 % of 630ΔpilB1 (pECC12)). This demonstrates that the T4P of the primary cluster of C. difficile play a role in c-di-GMP-induced aggregation of C. difficile. Aggregation of 630ΔpilB2 (pECC12), on the other hand, was not reduced in comparison to the wild type (being 99.9 %), indicating that the secondary T4P cluster does not play a role in this aggregation.
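The aggregation calculation described above can be sketched as follows. The text does not state the formula explicitly, so this assumes that percentage aggregation is the fraction of the vortexed ("true") OD600 that was missing from the reading taken at the top of the settled culture. The function name and the OD600 readings in the example are hypothetical and chosen only for illustration.

```python
def percent_aggregation(od_top: float, od_vortexed: float) -> float:
    """Percentage of cells sedimented at the base of the culture.

    od_top:      OD600 of medium taken from the top of the settled culture.
    od_vortexed: OD600 after vortexing, treated as the "true" OD600.

    Assumed formula: 100 * (od_vortexed - od_top) / od_vortexed.
    """
    if od_vortexed <= 0:
        raise ValueError("vortexed OD600 must be positive")
    return 100.0 * (od_vortexed - od_top) / od_vortexed


if __name__ == "__main__":
    # Hypothetical readings, not data from this study.
    print(round(percent_aggregation(od_top=0.046, od_vortexed=2.0), 1))
```

Under this assumption, a culture in which almost all cells have sedimented gives a value close to 100 %, while a culture that remains evenly suspended gives a value close to 0 %.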
To confirm the role of the T4P encoded by the primary gene cluster in c-di-GMP-induced aggregation, the 630ΔpilA1 mutant was complemented (using plasmid pECC127, a dual expression vector from which both dccA and pilA1 are constitutively expressed from the cwp2 promoter) (Figure 5.9). The aggregation of this complemented mutant was found to be significantly higher than that of the uncomplemented mutant, though not as high as the wild type, demonstrating a partial complementation of the mutation. It is probable that complementation was only partial because the level of pilA1 expression from pECC127 in the complemented strain will differ from the level expressed in the WT strain.

[Figure 5.9 chart: “Effect of pil Gene Mutations on C-di-GMP-Induced Cell Aggregation”. Y-axis: % aggregation. X-axis: strain (630 (pMTL960), 630 (pECC12), 630ΔpilA1 (pECC12), 630ΔpilA1 (pECC127), 630ΔpilB1 (pECC12), 630ΔpilB2 (pECC12)).]

Figure 5.9. Cyclic-di-GMP-Mediated Aggregation in C. difficile. Error bars indicate one standard deviation either side of the mean. * indicates a significant difference between the aggregation level of a strain and that of 630 (pECC12): * indicates P>0.95; *** indicates P>0.999. ‡ indicates a significant difference between strain 630ΔpilA1 and its complement: ‡‡ indicates P>0.99.

5.2.6 Type IV Pilus-Mediated Biofilm Formation

Techniques for the investigation of biofilm formation in C. difficile have previously been described (Dawson et al., 2012; Ðapa et al., 2013), and biofilm formation by C. difficile has recently been shown to be stimulated by c-di-GMP (Soutourina et al., 2013). However, despite multiple attempts with both strains 630 and R20291, it did not prove possible, following the referenced protocols, to obtain consistent or useful data in this regard. As described in Section 2.8.6, C. difficile strains were grown in 24-well plates and aggregates of cells did appear to form, but these never adhered strongly to the base of the well. It proved impossible to remove the culture medium, let alone wash the “biofilm”, without disrupting the cell aggregates. It therefore remains unclear whether C. difficile T4P mediate “biofilm” formation, or simply cell “aggregation”.


5.3 Discussion

My work here has demonstrated that the T4P encoded by the primary gene cluster of C. difficile mediate cell aggregation, suggesting that they mediate interbacterial interactions. As described above, bacterial aggregation is important for virulence in other bacterial pathogens, which at first glance suggests that these C. difficile T4P may function as virulence factors. Similar results regarding T4P-mediated aggregation were observed previously (Bordeleau et al., 2015). Interestingly, the levels of aggregation observed in that study were considerably lower than those seen here. The WT strain in which dccA was expressed showed approximately 90 % aggregation, while both pilA1 and pilB1 ClosTron mutants showed only between 40 and 45 % aggregation (in this study the WT strain and ΔpilA1 and ΔpilB1 mutants showed 98 %, 85 % and 86 % aggregation respectively). The differences in aggregation seen are probably due to expression of dccA being driven by a nisin-inducible promoter in the study of Bordeleau et al. (2015), compared to constitutive expression from the cwp2 promoter here, resulting in a higher level of dccA expression in this study.

Notably, even in the T4P-deficient strains, c-di-GMP drove significant bacterial aggregation in both studies. There are various possible reasons for this, including other adhesins mediating inter-bacterial adhesion; certainly this was the presumption of the Bordeleau study. However, I would suggest a rather more prosaic explanation: although the results are not shown here, in strain R20291 the only variant to measurably aggregate à la strain 630 was a fliC ClosTron mutant. This strain aggregated significantly even in the absence of dccA expression, suggesting that when C. difficile cannot swim in liquid medium its cells will sediment and form an aggregate at the base of the liquid.
I therefore suggest that at least the majority of the aggregation seen at high c-di-GMP levels in T4P-deficient strains is simply due to the effect of gravity on cells which cannot oppose it by swimming, because expression of their flagella is inhibited (Purcell et al., 2012). Notably, deletion of pilB2 did not reduce aggregation of C. difficile 630 in any way, showing that the T4P encoded by the secondary T4P gene cluster do not play a role in the observed aggregation. (This is not surprising, given that it appears that the secondary cluster will not even be expressed under these conditions (Section 4.2.5).)

Unfortunately, it did not prove possible to successfully investigate the effect of c-di-GMP on biofilm formation, and whether T4P mediate this in C. difficile. Performing biofilm experiments in the species appears very challenging. I am aware of numerous other individuals, both from within the Fairweather Group and from other C. difficile research groups, who have failed to obtain results from such experiments, and indeed Bordeleau et al. (2015), who demonstrated that T4P play a role in mediation of bacterial aggregation, stated that biofilm formation was “weak and variable”. It appeared to me and others in the Fairweather Group that the C. difficile aggregates present in wells during biofilm assays did not adhere to the base of the wells, and this also appeared to be the case for Dawson et al. (2012), who reported that biofilms formed by C. difficile in tissue culture flasks were easily detached from the base of the flasks. This resulted in huge difficulties for me in obtaining accurate results, as the amount of biofilm measured was determined more by the amount it proved possible to leave in a well during washing steps and suchlike than by the amount of biofilm initially present.

Interestingly, the same group who published Bordeleau et al. (2015) have since published data showing that c-di-GMP drives biofilm formation in both C. difficile strain 630 and R20291, and that the T4P encoded by the primary T4P gene cluster significantly contribute to biofilm formation (Purcell et al., 2015). The results are very plausible, as it is not surprising that c-di-GMP drives biofilm formation given that it drives bacterial aggregation, and given that T4P contribute to aggregation it is not surprising that they also contribute to biofilm formation. However, the method used for biofilm analysis by Purcell et al. (2015) was apparently identical to the methods used successfully by Ðapa et al. (2013) but unsuccessfully by the same individuals in Bordeleau et al. (2015). No explanation was given as to how the methodology was changed to obtain a definite phenotype and improve the experimental technique.
In light of the extremely variable success rates of biofilm assays performed according to the published methodologies, it seems to me that the C. difficile research community should perhaps treat the results obtained from such experiments with a degree of scepticism, until an improved and more reliable methodology is published.

It also appears clear from my results that the T4P from the primary T4P cluster do not mediate twitching motility in either strain 630 or R20291. It is unclear what is responsible for driving formation of the fern-like fronds observed in R20291 colonies expressing dccA, but clearly it is not T4P encoded by the primary T4P operon. It seems unlikely that flagella-mediated motility is responsible for the formation of the structures, given that c-di-GMP is known to down-regulate flagella expression (Purcell et al., 2012). Under the conditions in which dccA is expressed the flagella should not be expressed. Indeed, it was confirmed by non-quantitative RT-PCR that fliC was expressed in the R20291 (pMTL960) vector control, but not in R20291 (pECC12) in which dccA is constitutively expressed, when the two strains are grown in liquid medium (data not shown). Unfortunately, though an R20291 fliC ClosTron mutant was available, for an unknown reason this strain has a significant growth defect, meaning that its colony morphology and “twitching motility” phenotypes could not be meaningfully compared with those of the WT strain.

It is possible that T4P encoded by the secondary T4P gene cluster are responsible for the observed motility phenotypes. When grown in liquid medium, expression of the secondary T4P gene cluster appears to be down-regulated by c-di-GMP, but the mechanism whereby this occurs is completely unknown, and under all circumstances tested, expression of pilA2 is very low (Section 4.2.5). What induces pilA2 expression remains unknown, but it is possible that, on a solid surface on which twitching motility is possible, expression of the secondary T4P gene cluster is up-regulated, overcoming whatever inhibition is placed on its expression by c-di-GMP in liquid medium. Alternatively, another mechanism entirely may be responsible for the observed phenotype.

Purcell et al. (2015), in the same study which demonstrated the role of T4P from the primary gene cluster in C. difficile biofilm formation, also identified the R20291 colony phenotype resulting from increased c-di-GMP levels. They also generated a pilB1 ClosTron mutant in the strain, as I did. They found that this mutant did not display the motility phenotype displayed by the WT in response to c-di-GMP, and attributed this to loss of pilB1 and the associated T4P. However, my results categorically demonstrate that pilB1, and therefore the T4P encoded by the primary T4P operon, do not mediate motility. Both Purcell et al. (2015) and I generated a ClosTron mutant of pilB1 in C. difficile strain R20291, but obtained different results when examining them in motility assays.
I do not know why this is, but presumably the mutants were generated slightly differently, for instance in terms of where the intron was inserted into the gene and the orientation in which it was inserted, resulting in different polar effects.

As mentioned, the pilB1 mutant used by Purcell et al. (2015) is a ClosTron mutant, meaning that it was generated by insertion of a large intron into the target gene (pilB1) (Heap et al., 2010; Heap et al., 2007). Insertional mutagenesis technology, such as ClosTron, is known to disrupt transcription, and is not ideal for use on genes in operons, as polar effects often result from reduction or inhibition of transcription of downstream genes. The pilB1 gene is the first gene in an operon (Section 4.2.4), meaning that insertional mutagenesis of the gene is likely to have polar effects on downstream genes. It is extremely important that such mutants are complemented to demonstrate that resultant phenotypes are correctly attributed to either the disrupted gene or to polar effects. However, though this is explicitly acknowledged by Purcell et al. (2015), they did not (apparently) attempt to complement the pilB1 ClosTron mutant. This is extremely problematic. At least 3 genes downstream of pilB1 may be shared between the primary T4P gene cluster and the secondary T4P cluster and/or other putative T4P genes located elsewhere in the genome. Indeed, at least one of these genes seemingly must be shared, as no pre-pilin peptidase gene is present on the C. difficile chromosome outside of the primary T4P gene cluster, meaning it seems inevitable that at least one of the pre-pilin peptidases must also process pilins encoded externally to the primary cluster. It is also possible that pilT might be shared with, for instance, the secondary T4P cluster.

The hypothesis that the c-di-GMP-induced surface motility seen by Purcell et al. (2015) in R20291 is mediated by T4P from the primary cluster is therefore not supported by the data, and is shown to be wrong here. However, it certainly appears that a gene or genes present in the cluster are involved in this observed motility. Based on this I would hypothesise that the surface motility is mediated by T4P encoded by the secondary T4P gene cluster, enabled by pilD and/or pilT genes from the primary cluster. As described previously, this is more consistent with the results seen in C. perfringens (Varga et al., 2006). It would be interesting to see this hypothesis tested by mutagenesis of the second cluster in R20291.

The C. difficile T4P encoded by the primary cluster are not particularly unusual in being unable to mediate motility: several types of T4P are unable to do so (Burrows, 2012). It is notable that T4bP are much less commonly associated with motility than T4aP (Burrows, 2012), given that C. difficile T4P encoded by the primary T4P operon share certain characteristics with T4bP.
In particular, PilA1 has been found to have a structure which appears to be much more closely related to Gram-negative T4bP than T4aP (Piepenbrink et al., 2015), suggesting that these pili may be more T4bP-like than T4aP-like. That said, it is possibly unwise to attempt to place these Gram-positive T4P too firmly in a box with either class of Gram-negative T4P.

Though the above functional assays are all interesting, the most important question with regard to these pili is whether they play an important role in disease. I did not directly address this question, due to various factors including time and finances. However, an important contributing factor was a personal communication from the groups behind Piepenbrink et al. (2014) and Piepenbrink et al. (2015) that despite attempting multiple assays in both mice and hamsters with various T4P mutants, they were unable to detect any phenotype. This knowledge seemed to render any further in vivo study using T4P mutants much less interesting, and also less ethical. It is notable that the only other Clostridial species whose T4P have been investigated at molecular level (C. perfringens) was also found to remain fully virulent following the loss of its T4P (Varga et al., 2006). As discussed in previous chapters, it is debatable whether the toxin-mediated model of C. perfringens disease used to investigate the role of its T4P in virulence is appropriate for this purpose, as it involved injecting vegetative, toxigenic bacteria into mice and following the disease course. However, it is challenging to make the same argument for the murine model of C. difficile disease, in which spores are orally ingested by mice, and which ought to offer a more realistic model of clinical disease. It may thus be the case that T4P (or at least those encoded by the primary T4P gene cluster) are genuinely not important virulence factors in C. difficile. It may in fact be the case that Clostridia generally encode T4P for a non-virulence-associated purpose, a possibility supported by the fact that non-pathogenic Clostridia also encode complete sets of T4P, e.g. C. beijerinckii and C. acetobutylicum (Varga et al., 2006). If this is the case, though the T4P of C. difficile remain interesting from a purely biological point of view, they may be of little medical relevance.


Chapter 6: Investigations of Clostridium sordellii Type IV Pili and its Genome

6.1 Introduction

C. sordellii is one of the closest relatives of C. difficile (Elsayed and Zhang, 2004), the species of interest in the previous chapters of this thesis. First isolated almost 90 years ago (Hall and Scott, 1927), C. sordellii is commonly found in the soil. However, it has also been found to colonise the gut of a small percentage of humans (and animals), and has been identified in the vaginal microflora of a small percentage of women (Aldape et al., 2006). Many strains of C. sordellii are avirulent (i.e. non-pathogenic) (Aldape et al., 2006), but as described previously, virulent strains of the bacterium are well-known as highly lethal pathogens of humans and animals, though such infections are thankfully rare (Fischer et al., 2005; Lewis and Naylor, 1998; Unger-Torroledo et al., 2010).

The pathology of C. sordellii infections is quite poorly understood. C. sordellii is able to produce several putative exotoxins, the most potent of which are the Lethal Toxin (TcsL) and the Haemorrhagic Toxin (TcsH) (Aronoff, 2013), both of which are members of the Large Clostridial Cytotoxin (LCC) family (Aronoff, 2013; Just and Gerhard, 2004), and which are closely related to the Clostridium difficile toxins TcdB and TcdA respectively (Martinez and Wilkins, 1992). As previously described, it is believed that these two toxins are the key virulence factors in C. sordellii infections. Other putative virulence factors encoded by C. sordellii include Neuraminidase (nanS), Collagenase (colA), Phospholipase C (csp, for C. sordellii Phospholipase C) and the cholesterol-dependent cytolysin Sordellilysin (sdl). Csp is closely related to the Clostridium bifermentans phospholipase C Cbp (77.4 % amino acid identity), and is also related to the C. perfringens phospholipase C α-toxin (53.4 % identity) (Karasawa et al., 2003), while Sdl is related to the C.
perfringens toxin Perfringolysin O (Voth et al., 2006). However, few (if any) strains produce all the above-listed toxins (Voth et al., 2006), and the importance/contribution of most of these toxins to infection is generally unclear. Virulence studies in the tcsL+/tcsH- strain ATCC9714 support the essentiality of TcsL in virulence: ATCC9714 is rapidly lethal in a mouse model, while mutation of the tcsL gene results in a complete loss of virulence (Carter et al., 2011). However, no studies have been performed to examine the role of TcsH in pathogenesis in those strains which encode it, and few investigations of the other exotoxins have been performed. Furthermore, those studies which have been undertaken on toxins other than TcsL have not truly demonstrated their roles (or lack thereof) in disease. Recombinant Csp from strain ATCC9714 has been shown to be active and haemolytic, though much less so than C. perfringens α-toxin (its enzymatic and haemolytic activities are much more similar to the more closely related Cbp phospholipase), and, alone, to be non-toxic in mice (Karasawa et al., 2003). However, the contribution of Csp to disease caused by a virulent strain of C. sordellii has not been investigated. The neuraminidase NanS has been found to cause or contribute to the leukemoid reaction generally seen in C. sordellii infections (Aldape et al., 2007). The scale of leukemoid reaction seen in C. sordellii infection has been found to be the only prognostic indicator, with more severe leukemoid reactions associated with worse outcomes (Aldape et al., 2006). However, whether more severe leukemoid reactions directly cause more severe disease (or vice versa) is unknown, as the direct contribution of NanS to disease has not been investigated. The cholesterol-dependent cytolysin Sdl has been shown to be active and cytotoxic (Voth et al., 2006), but again, its role in disease has not been investigated.

It has been hypothesised that non-toxigenic strains can remain virulent, but much less so than toxigenic strains. Approximately two-thirds of reported cases of C. sordellii infections are associated with toxic-shock syndrome; the remainder are associated with less severe disease (Walk et al., 2011). The true proportion of C. sordellii infections causing toxic shock syndrome is likely to be lower still, as such cases are more spectacular and therefore interesting to clinicians, meaning they are more likely to be fully investigated and reported in the literature, producing a statistical bias. Several cases of C. sordellii infection resulting in less severe disease (not leading to toxic shock syndrome) have been found to be associated with non-toxigenic strains of C.
sordellii (Abdulla and Yee, 2000; Hao et al., 2010; Valour et al., 2010; Walk et al., 2011), supporting the above hypothesis.

Very little has been published on the fundamental biology of the species either, or on its interactions with a human (or animal) host (both in terms of host colonisation and its interactions with the immune system), meaning we have a poor understanding of almost every aspect of this species, which is often described as an emerging pathogen (Thelen et al., 2010; Voth et al., 2006). A greater understanding of this disease is therefore important, both for the protection of human health and also in agriculture. Though research into C. sordellii has increased in recent years, progress has been hindered by the absence of a high quality reference genome, which is essential for most modern, genetics-based research.

By chance, at the commencement of this study, my research group was involved in a project to sequence the genomes of multiple strains of C. sordellii. Forty-four strains had been collected from across the globe, with the intention of producing one high quality reference genome and also providing a wider bank of genomic information which would hopefully be representative of the species as a whole, together aiding and enabling further research into this species. As described earlier in this thesis, all Clostridia for which genome sequences are available have been found to encode Type IV pili (Varga et al., 2006). There was therefore a high level of expectation that the C. sordellii genome would reveal another set of Clostridial Type IV pili genes. As C. sordellii is one of the closest relatives of C. difficile, it seemed possible that its Type IV pili would be similar to those of C. difficile. This chapter describes my investigations into the C. sordellii Type IV pili, comparing them with those of C. difficile. It also deals with a broad genomic analysis of the species, undertaken to derive as much knowledge as possible from its genome sequence and to provide a strong base for further investigations of the species.

6.2 Results

A collection of 44 strains of C. sordellii was obtained, containing strains sourced from the UK, USA and Australia. By sourcing strains from across the globe it was hoped to derive a wider variety of samples which would provide a more complete picture of the global diversity of C. sordellii. The majority of strains were obtained from either clinical or veterinary cases of disease; a small number were isolated during the screening of potential allograft tissue from deceased individuals. These individuals had not suffered from C. sordellii infection, but had clearly been colonised by the species. Strains used are listed in Chapter 2 (Materials and Methods), Table 2.2. Where known, the infection or pathology associated with the strain is given. I extend my thanks to Professor J. Glenn Songer (University of Arizona, Tucson, USA), Dr Val Hall (Anaerobe Reference Unit, Cardiff, UK), Dr Liljana Petrovska (Animal and Plant Health Agency, Weybridge, UK), Professor Thomas Riley (University of Western Australia, Perth, Australia) and Dr Milena Awad (Monash University, Melbourne, Australia) for their provision of strains to the study.

Genomic DNA was extracted from all 44 strains (performed by me, Mr Mark Hession (a former member of the Fairweather research group) and Dr Milena Awad). Genome sequencing was then performed by Mr Hilary Browne (Wellcome Trust Sanger Institute, Hinxton, UK) by Illumina Hi-Seq. Sequences were assembled by Mr Hilary Browne using VelvetOptimiser (Victorian Bioinformatics Consortium), yielding genomic sequences in multiple contigs. The genome of the commonly used reference strain ATCC9714 was then improved, using optical mapping (Schwartz et al., 1993) to establish the order and relative orientation of contigs >50 kb. This was performed by Mr Matthew Dunn (Wellcome Trust Sanger Institute). Coding DNA sequences were identified in the genomes of all strains by Mr Hilary Browne using Prodigal (Hyatt et al., 2010), and the ATCC9714 genome then annotated (also by Mr Hilary Browne) using as references C. difficile strain 630, Clostridium botulinum Hall A (Proteolytic), C. botulinum E3 Strain Alaska E43 (non-proteolytic) and NT (in that order). This produced genome sequences which could be productively analysed using a simple genome browser, such as Artemis.

6.2.1 The Strain Collection Divides into Four Clades

The core gene sequences of the 44 strains from the collection were analysed and compared to form a phylogenetic tree (Figure 6.1). Using CD-HIT (Fu et al., 2012), 2712 genes were identified as common to all strains. Sequences were concatenated, a nucleotide alignment was created using Muscle (Edgar, 2004) and a maximum likelihood phylogeny was produced using FastTree version 2.1.3 (Price et al., 2010). BLASTN analysis showed C. difficile (specifically strain R20291) to be the closest relative of C. sordellii for which a genome sequence was available, so this strain was used as an out-group to root the tree. The core genome sequence of C. difficile R20291 proved to be too divergent from the C. sordellii genomes to root the tree based on core gene sequences, so the root of the tree was established using the C. difficile multi-locus sequence typing (MLST) genes (Griffiths et al., 2010). During tree assembly 1000 bootstrap replicates were performed, giving bootstrap support values of at least 89 % for all nodes. All phylogenetic tree creation was performed by Mr Hilary Browne with input from Professor Nick Thomson (Wellcome Trust Sanger Institute).
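The core-gene selection and concatenation steps described above can be illustrated with a short Python sketch. It is not the pipeline used in this study: it assumes that gene clustering has already been performed (for example by CD-HIT) and is available as a hypothetical mapping of cluster identifier to the sequence found in each strain, and it does not call CD-HIT, Muscle or FastTree. All names and the toy sequences are invented for illustration.

```python
from typing import Dict, List


def core_clusters(clusters: Dict[str, Dict[str, str]], strains: List[str]) -> List[str]:
    """Return ids of clusters that have a member in every strain (the core genes)."""
    wanted = set(strains)
    return sorted(cid for cid, members in clusters.items() if wanted <= set(members))


def concatenate_core(clusters: Dict[str, Dict[str, str]], strains: List[str]) -> Dict[str, str]:
    """Concatenate core gene sequences, in the same gene order, for each strain."""
    core = core_clusters(clusters, strains)
    return {s: "".join(clusters[cid][s] for cid in core) for s in strains}


# Hypothetical toy input: cluster id -> {strain: nucleotide sequence}.
toy = {
    "geneA": {"s1": "ATGA", "s2": "ATGC", "s3": "ATGA"},
    "geneB": {"s1": "GGC", "s2": "GGT"},  # missing from s3, so not a core gene
    "geneC": {"s1": "TTA", "s2": "TTA", "s3": "CTA"},
}
concatenated = concatenate_core(toy, ["s1", "s2", "s3"])
```

Sorting the cluster identifiers fixes the gene order so that every strain's concatenated sequence is built from the same genes in the same positions, which is what a subsequent alignment and tree-building step would require.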


Figure 6.1. Phylogenetic Tree Showing Maximum Likelihood Phylogeny of the 44 Strains of C. sordellii in our Collection. Green circles and squares denote UK strains of clinical and veterinary origin, respectively; red circles and squares denote US strains of clinical and veterinary origin, respectively; blue circles denote Australian strains of clinical origin. The root of the phylogenetic tree is C. difficile strain R20291. 1000 bootstrap replicates were run, resulting in bootstrap support values of greater than 89 % for all nodes. Scale bar denotes number of nucleotide changes per position. * indicates outlying strains not falling into any of the four clades.

Analysis of the phylogenetic tree showed that the 44 strains in our collection can be divided into four distinct clades, as indicated in Figure 6.1. Five strains from the collection appear to be outliers, not falling into any of the four clades. These may or may not be genuine outliers from the species, as it is of course possible that were more strains added to the tree the current outliers could fall into new, distinct clades. Interestingly, no clade originates from any specific continent, or is associated with strains isolated from any specific host; rather each clade contains a mixture of strains from multiple continents and comprising both clinical and veterinary isolates.

6.2.2 The Type IV Pili Genes of Clostridium sordellii ATCC9714

Following genome annotation the ATCC9714 genome was searched for T4P genes. Two putative T4P gene clusters were identified (Figure 6.2, A&C) which closely resembled the primary and secondary clusters of T4P genes found in C. difficile 630 (shown again in Figure 6.2, B&D for comparison). All genes within both clusters were, as far as possible, named after their C. difficile equivalents.

The C. sordellii ATCC9714 Primary Cluster

At first glance it is clear that the C. sordellii ATCC9714 and C. difficile 630 primary T4P clusters are similar. The first point to note is that the C. sordellii primary T4P cluster, like that of C. difficile, contains all genes presumed to be necessary for the biosynthesis of functional T4P. This is not the case in all Clostridial species. For example, in C. perfringens, C. tetani and C. botulinum (amongst others) the pilT gene is located apart on the chromosome from the pilin genes (Varga et al., 2006). The second thing to note is the extremely similar composition of both clusters, and the order of the genes within them. This includes highly distinctive features such as the pilM-pilN gene fusion and the two neighbouring pilD genes. Indeed, as shown in Figure 6.2 there is generally a high level of sequence identity between the gene products of the cluster of each species. Furthermore, the non-T4P-related genes pth, mfd and prsA are found at the 3’ end of the C. sordellii primary T4P gene cluster, just as in the C. difficile primary T4P operon. Indeed, the surrounding genes on either side of the cluster appear to be identical in C. sordellii and C. difficile, with the immediately upstream genes being prs and glmU and the immediately downstream genes being spoVT and spoVB in both species (Figure 6.3). This shows that the gene cluster is located at the same genomic locus in both species, indicating that the cluster was likely present in a common ancestor and has since diverged.

The major differences between the primary clusters in C. sordellii and C. difficile are within the pilin genes. The C. difficile primary cluster starts with one pilin gene (the major pilin pilA1 (Piepenbrink et al., 2015)), while that of C. sordellii begins with two adjacent pilin genes, which I have named pilA1A and pilA1B. BLAST analysis reveals that both are most closely related to PilA1 from C. difficile. They have 35 % sequence identity with each other, which is lower than the identity either displays towards 630 PilA1. It is probable that this additional pilin gene in C. sordellii is due to a gene duplication event, and that the two genes have since diverged.
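For readers unfamiliar with how a pairwise identity figure of this kind is obtained, the following Python sketch computes percent identity from two sequences that have already been aligned. It is an illustration only: the convention used (identical residues divided by the number of columns in which both sequences carry a residue) is an assumption and may differ from the convention of the tools used to generate the values quoted in this chapter, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Convention (an assumption): identical residues divided by the number of
    columns in which both sequences carry a residue; gapped columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("MKKF-T", "MRKFAT")
```

Because different programs count gapped columns differently, identity values are only directly comparable when calculated under the same convention.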


Figure 6.2. The T4P Gene Clusters of C. sordellii ATCC9714. The C. sordellii gene clusters are shown in A and C, while the equivalent C. difficile 630 gene clusters are shown again in B and D for ease of comparison. Genes are colour-coded according to function, as described in the key. Arrow lengths are indicative of relative gene lengths, but not directly proportional. Where a gene underhangs its neighbour(s), this indicates gene overlap in the genome. Numbers above the starting and ending genes in each cluster show the gene numbers of the indicated genes. Percentages given underneath C. sordellii ATCC9714 genes indicate the pairwise identity of their products with their C. difficile 630 homologue.

Differences also exist within the minor pilin genes. C. difficile encodes 3 minor pilins within its primary T4P gene cluster: pilU, pilV and pilK. As described earlier, PilK is much larger than an average pilin and contains a non-canonical signal peptide. C. sordellii also encodes three putative minor pilins within its primary T4P gene cluster. The predicted products of these three genes were analysed by BLAST: the first displayed 31 % identity to PilU from C. difficile, so the gene was named pilU. The predicted product of the second gene (ATCC9714_01161) is extremely large; at 700 amino acids in length it is much larger than a normal type IV pilin, and when analysed by BLAST displays no homology to any known protein. When analysed using PilFind (a web-based programme designed to identify Type IV pilins; Imam et al., 2011) it is not identified as a pilin. However, it is predicted by TMHMM to contain an N-terminal trans-membrane helix (characteristic of type IV pilins). Based on this fact, and its location within the primary type IV pilus gene cluster, I believe it to be a minor pilin and have annotated it as such in Figure 6.2. However, I have not named it. The product of the last of the three genes (ATCC9714_01171) is also larger than a normal type IV pilin (464 amino acids). Similarly, it is not identified as a type IV pilin by PilFind but is predicted by TMHMM to contain an N-terminal trans-membrane helix, and when analysed by BLAST its closest relation is PilV from C. difficile. These facts, together with its location in the C. sordellii primary T4P locus, give me confidence in annotating it as a minor pilin. Because of the significant difference in size between the product of ATCC9714_01171 and PilV I did not name the gene pilV, as I do not consider them homologous. Therefore ATCC9714_01161 and ATCC9714_01171 remain unnamed.
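The idea of screening a predicted protein for an N-terminal hydrophobic segment can be illustrated with a simple sliding-window hydropathy calculation. The sketch below is a crude stand-in and is neither TMHMM (which uses a hidden Markov model) nor PilFind: it uses the Kyte-Doolittle hydropathy scale, and the window size, threshold and N-terminal search length are assumed rule-of-thumb values chosen for illustration. The function name and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def has_n_terminal_hydrophobic_stretch(protein, window=19, threshold=1.6, search_len=60):
    """True if any window in the first search_len residues has mean hydropathy >= threshold.

    window, threshold and search_len are assumptions made for this sketch.
    """
    region = protein[:search_len].upper()
    for start in range(0, len(region) - window + 1):
        segment = region[start:start + window]
        # Unknown residues (e.g. X) are scored as 0.0.
        mean = sum(KD.get(aa, 0.0) for aa in segment) / window
        if mean >= threshold:
            return True
    return False


# Hypothetical sequences, for illustration only.
candidate = "MKK" + "L" * 20 + "DEKR" * 10
soluble = "DEKR" * 20
```

A positive result from a screen of this kind would only flag a sequence for closer inspection; it says nothing about whether the protein is a pilin.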

Figure 6.3. Comparison of the Genomic Loci of the Primary T4P Gene Clusters in C. sordellii ATCC9714 and C. difficile 630. Gene sizes are not to scale. Arrows indicate relative gene direction. Colours indicate homologous genes. The genes hbs from C. sordellii and hupA from C. difficile are homologous, and despite the difference in their annotation are both predicted to encode HU-family DNA-binding proteins.

The C. sordellii ATCC9714 Secondary Cluster

Again, the similarity between the C. sordellii ATCC9714 and C. difficile 630 secondary T4P clusters is apparent at first glance. The composition and order of genes within the cluster are similar, and the products of the gene cluster display high levels of sequence identity (see Figure 6.2). The genomic locus of the cluster appears to be similar in the two species, with the gene immediately upstream being a highly conserved ATPase (Figure 6.4), though the presence of a Major Facilitator Superfamily transporter gene in the upstream region in 630 (CD630_3299), which is absent from C. sordellii ATCC9714, demonstrates diversity in the locus.

The central differences between the genes within the clusters are as follows. Firstly, in C. difficile the cluster begins with a gene of unknown function (CD630_3297). This gene is not present in C. sordellii. Secondly, immediately downstream of pilM in C. sordellii ATCC9714 is a gene annotated as pilN. This gene lacks a homologue in C. difficile. As one would expect for a pilN gene, this is predicted by TMHMM to encode a monotopic transmembrane protein. Interestingly though, it is predicted to sit in the membrane with its small N-terminus extracellular, and its longer C-terminus in the cytoplasm. In Gram-negative species, PilN has the opposite topology, with its N-terminus located in the cytoplasm (Georgiadou et al., 2012). Additionally, Gram-negative PilN proteins contain a characteristic INLLP motif at their N-termini, through which they interact with PilM proteins (Tammam et al., 2013). This motif (and its reverse) is absent from the predicted C. sordellii PilN, suggesting that if this gene truly does encode a PilN protein, it likely functions significantly differently to those found in Gram-negative species. Finally, the two genes of unknown function immediately downstream of pilN in C. sordellii ATCC9714 are both absent from C. difficile 630, while the two genes of unknown function immediately downstream of pilM in 630 are both lacking from C. sordellii ATCC9714.
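A check for the INLLP motif and its reverse, as described above, amounts to a simple substring search near the N-terminus of the predicted protein. The following Python sketch shows one way this could be done; it is not necessarily the procedure used in this study, the 50-residue N-terminal search window is an assumption, and the function name and example sequences are hypothetical.

```python
def find_motif_or_reverse(protein: str, motif: str = "INLLP", n_terminal_len: int = 50):
    """Return sorted (position, matched_string) hits for motif or its reverse.

    Only the first n_terminal_len residues are searched (an assumed window);
    positions are 0-based offsets from the N-terminus.
    """
    region = protein[:n_terminal_len].upper()
    hits = []
    for pattern in (motif, motif[::-1]):
        start = region.find(pattern)
        while start != -1:
            hits.append((start, pattern))
            start = region.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical sequences, for illustration only.
with_motif = find_motif_or_reverse("MSINLLPEKR")
with_reverse = find_motif_or_reverse("MAAAPLLNIKK")
without_motif = find_motif_or_reverse("MKKAAA")
```

An exact-match search of this kind will miss degenerate variants of the motif, so an empty result indicates only that the canonical sequence is absent.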

Figure 6.4. Comparison of the Genomic Loci of the Secondary T4P Gene Clusters in C. sordellii ATCC9714 and C. difficile 630. Gene sizes are not to scale. Arrows indicate relative gene direction. Colours indicate homologous genes (N.B. colours on this figure do not indicate homology with colours on Figure 6.3). Genes in dotted arrows are absent from the other species.

Other Type IV Pilin Genes Encoded by C. sordellii ATCC9714

C. difficile 630 encodes four other type IV pilin genes within its genome – pilW, pilJ, pilX and pilA3. The pilW and pilJ genes are isolated in the C. difficile genome, while pilX and pilA3 are located together in a possible third T4P locus, with two other genes (CD630_1243 and CD630_1244), both of unknown function, sited between them, forming a four-gene cluster. PilJ is known to function as a minor pilin inserted into T4P formed primarily of PilA1, and has a unique structure containing a dual pilin fold (Piepenbrink et al., 2014). Nothing is known of PilW, PilX or PilA3.


Homologues of all four genes were sought in the C. sordellii ATCC9714 genome. A possible homologue of C. difficile 630 pilW was identified (33.9 % amino acid sequence identity). I have therefore named this gene pilW. No homologue of pilJ was found. Homologues were found of both pilA3 and pilX (35.6 % and 27.5 % identity respectively), and these genes have been so named in the ATCC9714 genome. These genes were found in a cluster resembling that containing pilX and pilA3 in 630, with the two genes between them (ATCC9714_26271 and ATCC9714_26261) bearing resemblance to the equivalently located genes CD630_1243 and CD630_1244 (47.1 % and 29.6 % identity respectively). This suggests that this four-gene cluster is somewhat conserved between the two species. However, the locations of the cluster within the two genomes are extremely different. In C. difficile the four-gene cluster looks to form a self-contained operon, with its neighbouring genes relatively distant from it on the chromosome. In C. sordellii ATCC9714, however, the pilA3/pilX cluster is located within what appears to be a larger operon, encoding the Shikimate pathway. To identify any other type IV pilin genes that lack a C. difficile homologue and were therefore not identified by BLAST searches, the amino acid sequences of all coding DNA sequences (CDSs) identified in the ATCC9714 genome were extracted and analysed using PilFind. No putative T4P genes were identified other than those already described.

6.2.3 Conservation of Type IV Pili Genes Across C. sordellii

To analyse the level of conservation of T4P genes identified in strain ATCC9714 across the species, the genomes of the other 43 strains sequenced alongside ATCC9714 were searched for the presence of the same genes. Both the primary and secondary gene clusters were found to be largely conserved across all strains, as were the pilX/pilA3 gene cluster and pilW. The products of all 8 putative pilin genes from the 44 strains sequenced were aligned and showed a high level of sequence conservation across the strain collection (Table 6.1; the alignments themselves are not presented due to their size). Though the genes are highly conserved, with a copy of each identified in every strain, closer examination suggests some of these copies may actually be non-functional. Strain R32668 contains an N-terminal truncation of PilA1A, likely rendering it non-functional as a pilin, while in strain JGS6956 the homologue of the putative oversized pilin encoded by ATCC9714_01171 is severely truncated at its C-terminus due to a premature stop codon mid-way through the gene. The isolated pilW gene is truncated in two strains: in R32668 a premature stop codon results in the encoding of a severely C-terminally truncated product, which is likely non-functional; in UMC1 a mutation in the translation start codon may render the gene untranslated, or, if it is translated, nevertheless non-functional. The pilA2, pilX and pilA3 genes are full-length in every strain.
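As an illustration only, the two conservation measures reported in Table 6.1 could be computed from a multiple sequence alignment along the following lines. This is a minimal Python sketch and not the software used in this work. The function names, the treatment of gap characters and the toy alignment are all assumptions made for the example.

```python
from itertools import combinations


def total_identity(aligned):
    """Percentage of alignment columns in which every sequence shares one residue.

    Assumption: columns containing a gap ('-') are never counted as identical.
    """
    length = len(aligned[0])
    identical = sum(
        1
        for column in zip(*aligned)
        if column[0] != "-" and all(residue == column[0] for residue in column)
    )
    return 100.0 * identical / length


def pairwise_identity(seq_a, seq_b):
    """Percentage of columns at which two aligned sequences carry the same residue.

    Assumption: columns where both sequences have a gap are ignored.
    """
    columns = [(x, y) for x, y in zip(seq_a, seq_b) if not (x == "-" and y == "-")]
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def average_pairwise_identity(aligned):
    """Mean of pairwise_identity over every pair of sequences in the alignment."""
    pairs = list(combinations(aligned, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment of four 10-residue sequences (not thesis data).
toy_alignment = [
    "MKKLNKKGFT",
    "MKKLNKKGFT",
    "MKRLNKKGFT",
    "MKKLNKKGYT",
]

print(total_identity(toy_alignment))
print(average_pairwise_identity(toy_alignment))
```

The sketch shows why the total identity is always the lower of the two figures: a single variant residue in one strain removes that column from the total identity, but only affects the pairwise comparisons involving that strain.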

Putative Protein    Total Amino Acid Identity (%)    Average Pairwise Identity (%)
PilA1A              92.7                             98.9
PilA1B              96.9                             99.4
PilU                91.1                             98.4
ATCC9714_01161      82.9                             97
ATCC9714_01171      77.8                             95.6
PilA2               90.2                             98.9
PilW                85.9                             98.2
PilA3               87.9                             98.5
PilX                86.6                             97.8

Table 6.1. Conservation of the Putative Type IV Pilin Genes from C. sordellii ATCC9714. The table shows conservation of the genes across all 44 C. sordellii strains sequenced. ‘Total Amino Acid Identity (%)’ indicates the percentage of amino acids within each protein sequence shared by all strains. ‘Average Pairwise Identity (%)’ indicates the average percentage of amino acids from the putative pilin sequence shared between any two strains.

6.2.4 Choosing a C. sordellii Strain for Type IV Pili Investigations

It was decided to proceed with basic investigations into the primary T4P cluster of C. sordellii. Given the high level of conservation of this cluster between strains it seemed likely that conclusions drawn from any strain with a fully intact cluster would be representative of the species as a whole. Previously, strain ATCC9714 had been used for most research into C. sordellii. However, for safety reasons it was decided to use a non-toxigenic strain for this work. Therefore, all strains in our possession were PCR screened for the presence of the tcsL and tcsH genes (this is described in detail later in the chapter), resulting in the identification of several non-toxigenic strains. It was known from previous work that it was possible to conjugate plasmids into strain ATCC9714 from E. coli CA434, using the same protocol as used for conjugations into C. difficile (Carter et al., 2011). Four non-toxigenic strains were chosen to test the ease of conjugating plasmids into them. The four chosen were W2922, W2946, W3025 and R26833, as they provided a relatively diverse selection of strains (W2922 is from clade 2, W2946 is from clade 3, R26833 is from clade 4 and W3025 is an outlier, not located within a currently annotated clade).
Conjugation of the plasmid pASF85 into each strain was then attempted. pASF85 was used because it is an empty vector which contains the tet promoter, known to be functional in C. sordellii (Sirigi Reddy et al., 2013). Of the four strains tested, conjugation only proved successful into strain W3025, from which a high number of transconjugants were obtained. Strain W3025 was therefore chosen for further investigations into the primary C. sordellii T4P cluster.

6.2.5 Cross-Reactivity of W3025 Pilin Proteins with α-C. difficile Pilin Antibodies

It would have proven useful to our investigations if the antibodies raised against C. difficile pilins cross-reacted with the pilin proteins produced by C. sordellii. Pilin genes from the primary T4P cluster with C. difficile homologues (i.e. pilA1A, pilA1B and pilU) were cloned into pET28a for expression in E. coli, yielding plasmids pECC80, pECC81 and pECC93, respectively. The pilins are encoded with C-terminal His-tags in these plasmids. Plasmids pECC80 (PilA1A), pECC81 (PilA1B) and pECC93 (PilU) were individually transformed into E. coli Rosetta, the proteins were expressed from them and the harvested culture pellets were lysed. Cell lysates were analysed by SDS-PAGE followed by Western blotting. Western blots were performed using α-His antibody (to confirm protein expression (Figure 6.5A)) and α-pilin antibodies (Figure 6.5B-D). Though C. difficile PilA1 is the closest homologue of C. sordellii PilA1A and PilA1B, and C. difficile PilU is the closest homologue of C. sordellii PilU, for completeness and in case of unexpected cross-reactivity all three of C. sordellii PilA1A, PilA1B and PilU were tested for cross-reactivity with the α-PilA1, α-PilV and α-PilU antibodies raised against C. difficile 630 proteins. As shown in Figure 6.5, all three proteins were expressed, as confirmed by the α-His Western blot. PilU is expressed at a lower level than the other two pilins. This could be due to too large a gap being left between the RBS and translation start site in plasmid pECC93 (there is an 11 bp gap between the two in pECC93, and only a 7 bp gap in pECC80 and pECC81, because it proved necessary to use different restriction enzymes to clone pilU into pET28a than to clone pilA1A and pilA1B, due to the presence of an NcoI restriction site in pilU). A smaller gap of 6 or 7 bp may well result in much higher levels of PilU production than currently seen.
The pilins do not run at their predicted molecular weights – PilA1A has a predicted molecular weight of 22.6 kDa, PilA1B of 18.8 kDa and PilU of 20.4 kDa, but PilA1B and PilU both appear from Figure 6.5A to be significantly larger than these predicted weights. The reason for this is unknown. However, regardless of the low level of PilU expression from plasmid pECC93, it is clear from these images that there is no cross-reactivity of the antibodies raised against the C. difficile pilins PilA1, PilU and PilV with the C. sordellii pilins PilA1A, PilA1B and PilU, despite the antibodies being used at 25 times their normal concentration (1:2000 instead of 1:50 000). There are some background bands seen in Figure 6.5B and 6.5D (blots using α-PilA1 and α-PilU antibodies respectively), but it is apparent that these are background as when compared to the α-His blot in 6.5A, the bands seen are the wrong size to be the C. sordellii Type IV pilins; furthermore, the bands are of the same molecular weight in each lane, demonstrating they are E. coli proteins from the cell lysates, as PilA1A, PilA1B and PilU are all different sizes.

Figure 6.5. Western Blots for C. sordellii Pilins Using Antibodies Raised Against C. difficile Pilins. PilA1A, PilA1B and PilU were expressed in E. coli Rosetta from pECC80, pECC81 and pECC93 respectively with C-terminal His-tags. An α-His blot was performed confirming the expression of each of the pilins (A). Western blots were also performed with α-PilA1, α-PilV and α-PilU (C. difficile) antibodies (B, C and D respectively). These showed that the antibodies raised against C. difficile pilins do not cross-react with C. sordellii pilins. Though bands are seen in blots B and D, these are background where the antibodies have cross-reacted with unknown E. coli proteins from the cell lysates.


6.2.6 Does a Riboswitch also Regulate Expression of the C. sordellii Primary Type IV Pilus Cluster?

As discussed in previous chapters, transcription of the primary T4P cluster in C. difficile is regulated by the concentration of cyclic-di-GMP, via a novel mechanism using the Type II riboswitch Cdi2_4 (Bordeleau et al., 2015). Due to the similarity between the C. difficile and C. sordellii primary T4P clusters we wondered whether the C. sordellii cluster might also be regulated by a cyclic-di-GMP-binding riboswitch. We started by investigating the regions of DNA upstream of pilA1A and pilA1B to see whether the riboswitch sequence is conserved between C. sordellii and C. difficile. The riboswitch in C. difficile 630 is predicted to be formed from a region of mRNA between 250 and 130 bp upstream of the pilA1 translation start site, and several regions predicted to play important roles in its structural integrity have been identified (Bordeleau et al., 2015), as shown in Figure 4.39B. To search for a homologous riboswitch in C. sordellii the 300 bp upstream of the C. difficile 630 pilA1 translation start site were aligned with the 300 bp upstream of the pilA1A and pilA1B translation start sites. The alignments were performed with the upstream sequences from C. sordellii W3025 (as chosen earlier), though comparison with the reference strain ATCC9714 shows these regions to be well conserved between C. sordellii strains.

Immediately noticeable is that the gap between prs and pilA1 in C. difficile 630 is approx. 1100 bp, while in C. sordellii W3025 the gap between pilA1A and prs is only 105 bp. This itself suggests that it is unlikely that there is a riboswitch between the two genes controlling pilA1A expression, simply due to a lack of space (in C. difficile 630 the riboswitch itself comprises over 120 bp (Bordeleau et al., 2015)). The alignment of the region upstream of the pilA1A gene from C. sordellii W3025 with the region upstream of pilA1 from C. difficile 630 supports this conclusion, with little homology seen between the two regions (Figure 6.6).

Interestingly, the gap between pilA1A and pilA1B in C. sordellii W3025 is rather larger than that between pilA1A and prs, at 198 bp. Importantly, this gap is long enough to contain a riboswitch homologous to that found upstream of pilA1 in C. difficile 630. Therefore the region upstream of pilA1B in C. sordellii W3025 was also aligned with the region upstream of pilA1 in C. difficile 630 (Figure 6.7). This showed that the region upstream of pilA1 in C. difficile 630 has rather higher homology to the region between pilA1A and pilA1B in C. sordellii W3025 than it does to the region between pilA1A and prs. Importantly, not only is the level of homology higher, but some regions of DNA predicted to play important structural roles within the riboswitch appear to be well conserved, suggesting this region might contain a riboswitch controlling expression of pilA1B.

Finally, the region between pilA1B and pilB1 from C. sordellii W3025 was aligned with the region between pilA1 and pilB1 from C. difficile 630. In C. difficile 630 this region contains a rho-independent transcriptional terminator, and the alignment shows that a rho-independent terminator appears to exist in this region in C. sordellii W3025 as well (Figure 6.8A&B). A schematic diagram showing the location of the riboswitch and the rho-independent terminator is shown in Figure 6.9.
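The comparison of upstream regions described above rests on pairwise sequence alignment. The tool and scoring scheme used for Figures 6.6 to 6.8 are not stated in this section, so the following is only a generic sketch of a global (Needleman-Wunsch) alignment followed by a percent identity calculation. The scoring values, function names and short test sequences are assumptions chosen for the example, and are not data from this work.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two input sequences padded with '-' to equal length.
    The default scores are illustrative assumptions only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical short sequences, used only to exercise the functions.
aln_a, aln_b = global_align("GATTACA", "GATCACA")
print(aln_a, aln_b, percent_identity(aln_a, aln_b))
```

For real 300 bp upstream regions an established alignment package would normally be preferred. The sketch is intended only to make explicit what "level of homology" means numerically when two intergenic regions are compared.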

6.2.7 Optimisation of Protein Expression in C. sordellii W3025

To test whether c-di-GMP regulates expression of the primary T4P cluster in C. sordellii it would be necessary to express a DGC in strain W3025 in order to drive up levels of intracellular c-di-GMP. DccA, the same DGC used for the purpose in C. difficile in earlier chapters, was chosen. However, we had never previously attempted to express a gene in C. sordellii, so the process required optimisation. As mentioned above, others have demonstrated gene expression from the tet promoter in C. sordellii ATCC9714 (Sirigi Reddy et al., 2013), using the Ptet expression system originally developed for use in C. difficile (Fagan and Fairweather, 2011). Herein the non-antibiotic tetracycline analogue anhydrotetracycline (Atc) is used to induce gene expression. For optimal levels of gene expression Atc should be added during the logarithmic phase of culture growth, so a growth curve was produced to examine the rate of growth of C. sordellii W3025 in BHIS broth (Figure 6.10). For this experiment W3025 (pASF85) was used, rather than the wild-type strain, to take into account the effect of being grown with thiamphenicol selection.


Figure 6.6. Alignment of the 300 bp Regions Upstream of pilA1 from C. difficile 630 and pilA1A from C. sordellii W3025. Segments of DNA which form parts of the anti-terminator shown in Figure 4.39B are annotated and colour coded as such. Regions predicted to form a transcriptional terminator in the absence of bound cyclic-di-GMP (Bordeleau et al., 2015) are annotated in black. The 3’ end of the prs gene is also annotated. The alignment shows relatively low overall sequence homology and a lack of segments homologous to those which form the riboswitch in C. difficile 630.


Figure 6.7. Alignment of the 300 bp Regions Upstream of pilA1 from C. difficile 630 and pilA1B from C. sordellii W3025. DNA segments are annotated and colour-coded as in Figure 6.6. The 3’ end of the pilA1A gene is also annotated. The alignment shows a closer homology between these 2 regions than those in Figure 6.6, and the conservation of several of the segments which form the riboswitch.


Figure 6.8. A Rho-Independent Terminator Between C. sordellii W3025 pilA1B and pilB1. A – Alignment of the intergenic regions between C. difficile 630 pilA1 and pilB1 and C. sordellii W3025 pilA1B and pilB1. A rho-independent transcriptional terminator is known to exist here in C. difficile 630 (see Chapter 4) and, though the sequence is not well conserved, a putative rho-independent terminator also appears to exist in this region in C. sordellii W3025, as annotated. B – Predicted structure of the rho-independent C. sordellii pilA1B terminator. The predicted structure shown was obtained using OligoAnalyzer 3.1 (http://eu.idtdna.com/calc/analyzer). This hairpin structure is predicted to have a free energy of formation (ΔG) of -49.96 kJ/mol.


Figure 6.9. Schematic Diagram Showing the C. sordellii Type IV Pilus Primary Gene Cluster. The genes are coloured as in Figure 6.2, with the triangle and stem-loop representing the riboswitch and rho-independent transcription terminator, as in Figure 4.1. The genes are not numbered in this figure, as the W3025 genome has not been assembled.

The growth curve shows that an OD600 value of about 0.5 corresponds to mid-logarithmic growth phase, and that Atc should therefore be added to cultures to induce protein expression when their OD600 value is approximately 0.5 (reached after about 1.5 hrs). It was also necessary to identify a suitable concentration of Atc to use to induce protein expression in C. sordellii W3025. In C. difficile concentrations of Atc as low as 20 ng/ml have been found to induce detectable levels of protein expression, with 500 ng/ml the highest concentration used (Fagan and Fairweather, 2011). In the study published during this work wherein gene expression from the tet promoter was demonstrated in C. sordellii, Atc was used at a concentration of 50 ng/ml (Sirigi Reddy et al., 2013). Though Atc is not an antibiotic, it has been found to be mildly toxic to C. difficile, with significant differences between strains in the toxic effects seen (Fairweather lab, unpublished data). For instance, strain R20291 has been found to be particularly sensitive to Atc. We therefore wished to investigate the effect on the growth of cultures of C. sordellii W3025 (pASF85) when varying concentrations of Atc were added. Atc concentrations of 50, 100, 150, 200 and 250 ng/ml were investigated. Interestingly, we found that the addition of Atc to W3025 (pASF85) cultures immediately upon sub-culturing inhibited any growth at all, even at the lowest Atc concentration used. Fortunately this was not the case when Atc was added to cultures during their logarithmic phase of growth, and growth curves were produced showing the effect on culture growth of each Atc concentration tested (Figure 6.11). As a control another growth curve for C. sordellii W3025 (pASF85) was also produced, on this occasion with the addition of an equal volume of 70 % ethanol when other cultures were induced with Atc (Atc solution is made up in 70 % ethanol).


[Plot: W3025 (pASF85) Growth Curve in BHIS Medium. y-axis: OD600 (logarithmic scale, 0.01 to 10); x-axis: Time (Hr), 0 to 7.]

Figure 6.10. C. sordellii W3025 Growth Curve in BHIS Medium. Readings were taken every half-hour. The OD600 readings are plotted using a logarithmic scale. Error bars indicate 1 standard deviation either side of the mean value.
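To make the treatment of growth curve data concrete, the following Python sketch shows one way in which replicate OD600 readings could be summarised (mean and standard deviation, as plotted in the error bars) and a growth rate estimated from the logarithmic phase. It is an illustrative sketch only. The readings are invented for the example and are not the data behind Figure 6.10, and the choice of a least-squares fit to ln(OD600) is an assumption rather than the method used in this work.

```python
import math
from statistics import mean, stdev


def summarise(replicate_readings):
    """Return (mean, sample standard deviation) for each time point."""
    return [(mean(r), stdev(r)) for r in replicate_readings]


def specific_growth_rate(times_hr, od_values):
    """Least-squares slope of ln(OD600) against time, in units of per hour."""
    logs = [math.log(od) for od in od_values]
    t_bar, y_bar = mean(times_hr), mean(logs)
    numerator = sum((t - t_bar) * (y - y_bar) for t, y in zip(times_hr, logs))
    denominator = sum((t - t_bar) ** 2 for t in times_hr)
    return numerator / denominator


def doubling_time(rate_per_hr):
    """Doubling time in hours for a given specific growth rate."""
    return math.log(2) / rate_per_hr


# Hypothetical readings (three replicates per time point), not thesis data.
times = [0.5, 1.0, 1.5, 2.0]
replicates = [
    [0.120, 0.125, 0.130],
    [0.240, 0.250, 0.260],
    [0.480, 0.500, 0.520],
    [0.960, 1.000, 1.040],
]

summary = summarise(replicates)
rate = specific_growth_rate(times, [m for m, _ in summary])
print(rate, doubling_time(rate))
```

Restricting the fit to time points that lie on the straight portion of the semi-logarithmic plot is what identifies the logarithmic phase, and hence a sensible OD600 at which to add an inducer.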

Atc clearly has an inhibitory effect on the growth of C. sordellii W3025, as all cultures to which Atc was added grew more slowly than the culture to which no Atc was added and also reached a lower peak of growth. Additionally, the rate and peak of growth were slower and lower respectively for every increase in Atc concentration added to a culture. This was with the exception of the comparison between the cultures supplemented with 50 and 100 ng/ml Atc, which peaked at a broadly similar level (though the culture supplemented with 100 ng/ml Atc reached that peak slightly later in its growth). These two cultures grew at very similar rates, with the difference between the OD600 values for the two culture sets statistically insignificant for the majority of the growth curve. This suggests that optimally a concentration of Atc no higher than 100 ng/ml should be used to induce protein expression in C. sordellii W3025, as above this level more significant growth defects are seen in cultures. The growth curves in Figure 6.11 also show that these cultures reach a maximum OD600 after about 4.5-5 hrs of total growth (i.e. about 3-3.5 hrs after addition of Atc). Cultures expressing proteins should therefore be harvested about 3 hrs after addition of Atc to attempt to catch them at their peak level of growth, before cells start to die, which appears to happen almost immediately after peak growth is reached, as shown by the reductions in culture OD600 values after they reach their peak (Figure 6.11).

[Plot: W3025 (pASF85) Growth Curves With Varying Concentrations of Atc. y-axis: OD600 (0 to 4.5); x-axis: Time (Hrs), 0 to 7. Series: 0, 50, 100, 150, 200 and 250 ng/ml Atc.]

Figure 6.11. Growth Curves of C. sordellii W3025 with Varying Concentrations of Atc. Error bars denote 1 standard deviation either side of the mean. Atc was added to each culture after about 1.5 hrs, when an OD600 of about 0.5 was reached, as indicated by the arrow.

To test whether 100 ng/ml was a suitable concentration of Atc to induce DccA expression from plasmid pECC17, expression from pECC17 was to be tested with various concentrations of Atc, and production of the His-tagged DccA probed by α-His Western blot of the cell lysates. First, a method for the lysis of C. sordellii cells was needed. A culture of C. sordellii W3025 was grown in BHIS medium to an OD600 of ~2. 1 ml aliquots of culture were taken and harvested by centrifugation. Four different methods were then used to attempt to lyse the cells. One aliquot was subjected to freeze-thaw (frozen at -80 °C for 30 mins then warmed at 37 °C for 30 mins, both then repeated), then resuspended in PBS; the other three were immediately resuspended in PBS, one then incubated at 37 °C for 2 hrs with 0.5 mg/ml lysozyme, another incubated at 37 °C for 2 hrs with 20 μg/ml CD27L, and the last simply incubated for 2 hrs at 37 °C in PBS, as a negative control. To confirm lysis, soluble and insoluble fractions of lysate were then separated by centrifugation (25 000 g for 10 mins) and the soluble fractions analysed by SDS-PAGE followed by Coomassie staining (Figure 6.12). This showed that C. sordellii lysed easily, with significant lysis seen even in the PBS only ‘negative control’. The most effective lysis, though, seemed to be obtained with lysozyme, which was therefore chosen as the lysis method.

Figure 6.12. Coomassie Stain Showing Cell Lysates of C. sordellii W3025. The method of lysis is shown above each lane.

The next step therefore was to conjugate pECC17 into C. sordellii W3025, which was then cultured in BHIS broth. After 1.5 hrs growth, production of DccA from pECC17 was induced with either 50, 100, 150, 200 or 250 ng/ml Atc, and an uninduced control was also grown. Cultures were harvested 3 hrs after induction and lysed with lysozyme, and lysates were then analysed by SDS-PAGE (using a 12 % acrylamide gel) followed by α-His-tag Western blot (Figure 6.13). This showed that induction of DccA expression was good with all concentrations of Atc attempted. It was decided to press ahead with using 100 ng/ml Atc to induce DccA expression. The first investigations were again into the effect of DccA expression on growth of C. sordellii cultures. A growth curve was performed comparing the growth of C. sordellii W3025 (pASF85) and W3025 (pECC17), both with addition of 100 ng/ml Atc (Figure 6.14). This showed that expression of DccA significantly slowed growth of W3025. This is not seen in C. difficile, suggesting that c-di-GMP regulates growth of C. sordellii very differently to that of C. difficile. Given the effect of c-di-GMP production on growth of C. sordellii we wondered whether it might affect cell morphology of the species in the same way as it does that of C. difficile (i.e. causing significant elongation of cells). Cultures of C. sordellii W3025 were grown in BHIS and induced with 200 ng/ml Atc for 3 hrs before harvesting and examination by phase contrast microscopy (these same conditions lead to the extreme lengthening of a proportion of cells of C. difficile 630 – Section 3.2.6). No lengthening of cells was seen – indeed, they were indistinguishable from the vector control cells (Figure 6.15), suggesting intracellular levels of c-di-GMP have no great effect on C. sordellii cell morphology.


Figure 6.13. DccA Expression in C. sordellii W3025 (pECC17) Induced with Atc. α-His-tag Western blot, demonstrating DccA expression when induced with varying concentrations of Atc. The control is an uninduced sample.

[Plot: Effect of DccA Production on Growth of C. sordellii W3025. y-axis: OD600 (0 to 4); x-axis: Time (Hrs), 0 to 7. Series: W3025 (pASF85), W3025 (pECC17).]

Figure 6.14. Growth Curve Showing the Effect of DccA Expression on Growth of C. sordellii W3025. Protein expression was induced at the point indicated by the arrow. Error bars indicate 1 standard deviation either side of the mean.


It is possible that the cells were simply not subjected to DccA expression for long enough for the phenotype to develop, or that the level of DccA expression was insufficient to affect the cell morphology on this timescale; however, these precise conditions were sufficient to observe such changes in C. difficile. More extreme conditions were considered, but not attempted: the negative effect on growth of C. sordellii of both DccA and high concentrations of Atc dissuaded me from using a higher concentration of Atc to induce DccA expression, while expression of DccA for much longer than 3 hrs was not considered practical due to the relatively rapid cell lysis of the vector control cells once their peak growth was reached (as visible in Figure 6.14). Despite these considerations, I believe this result to be an accurate reflection on the (non) effect of c-di-GMP on C. sordellii cell morphology.

Figure 6.15. Cell Morphologies of C. sordellii W3025 (pASF85) and (pECC17). A – the W3025 (pASF85) vector control; B – W3025 (pECC17) expressing the DGC DccA. Both strains were analysed after 3 hrs growth with induction of expression from the tet promoter with 200 ng/ml Atc. No difference was seen between the two cultures. Scale bar indicates 10 µm.

6.2.8 Effect of Cyclic-di-GMP on Transcription of the Primary Type IV Pilus Locus

To investigate whether c-di-GMP upregulates transcription of the genes in the primary T4P gene cluster, qPCR was used. Cultures of C. sordellii W3025 (pASF85) and W3025 (pECC17) were grown as above, and protein expression induced for 3 hrs with 100 ng/ml Atc. After 3 hrs the cultures were harvested and RNA extracted and purified. Purity of the RNA samples was confirmed by PCR (see Section 2.4.1); primers which amplify 16S rRNA were used for screening: if a PCR product was formed this indicated contamination of the sample with DNA, while the absence of a PCR product indicated purity of the RNA and an absence of contaminating DNA. RNA samples obtained from these C. sordellii cultures were shown to be pure (Figure 6.16A), and then used as templates for cDNA synthesis. The success of cDNA synthesis was confirmed identically to the testing of RNA purity (PCR to amplify 16S rRNA), except here the presence of a PCR product would demonstrate the success of cDNA synthesis while its absence would indicate failure. Success of cDNA synthesis from the C. sordellii W3025 RNA samples was confirmed (Figure 6.16B).

Figure 6.16. 16S rRNA PCRs Demonstrating RNA Isolation and cDNA Production. A shows RNA extracted from C. sordellii W3025 cultures – the lack of a product indicates RNA purity from DNA. B shows cDNA synthesised from the RNA samples in A – the presence of products indicates successful cDNA synthesis from each sample. PCRs were performed to attempt to amplify from 16S rRNA genes; gDNA controls indicate the expected size of the product.

The cDNA was used for qPCR analysis of the expression of genes pilA1A, pilA1B and pilB1. The level of 16S rRNA expression was calculated for use as a baseline value. Primers NF1696/1697 were used for amplification; optimisation indicated that 1/20 000 was a good dilution factor for C. sordellii cDNA for measurement of 16S rRNA levels. Meanwhile, pilA1A amplification was performed with primers NF3288/3289 using cDNA diluted 1/2, pilA1B amplification with primers NF3290/3291 using cDNA diluted 1/80 and pilB1 amplification with primers NF3365/3366 using cDNA diluted 1/50. Standard curves were produced with defined amounts of gDNA (Section 2.4.4), which were used to calculate the effective amount of cDNA produced per gene, which was then converted to a proportion of the quantity of 16S rRNA cDNA to allow comparison of gene expression levels between cultures. Melt curves were also produced, which demonstrate the number of products formed in the reaction. Melt curves and standard curves are shown in Figure 6.17. The standard curves are all good, showing perfect correlations between the amount of gDNA per sample and its Ct value (the threshold cycle value, i.e. the cycle number when the amount of double stranded DNA present in the sample reaches the fluorescence detection threshold). The melting curves for the 16S rRNA, pilA1B and pilB1 qPCR reactions are also good, though the pilA1A melting curve is not perfect. The gDNA standard curve reactions all produced good melting curves, but the melting curves for the experimental samples are not identical – the main peaks are both higher and located at a slightly higher temperature than those of the gDNA samples, and a second peak is also seen, possibly a primer dimer. The reason for this is uncertain, but it may well be due to the fact that the pilA1A gene was expressed at a very low level under the conditions tested here, meaning the cDNA was used at a very high concentration. The cDNA still contains all the buffers etc. from the reverse transcription reaction, meaning they will be present in the qPCR mix at unusually high levels, which might slightly interfere with the SYBR Green master mix and qPCR reaction. Despite the issue with the melting curve it was felt that the pilA1A qPCR data was still good enough for analysis.

Figure 6.17. Standard Curves (top) and Melting Curves (bottom) for qPCR Reactions Shown in Figure 6.18. L-R: 16S rRNA, pilA1A, pilA1B, pilB1. Standard curves plot sample Ct value (y-axis) against DNA quantity (x-axis). Red squares indicate standards, blue experimental samples and green flagged experimental samples. Here, certain pilA1A samples are flagged due to the presence of a second peak in their melt curves. Melt curves plot derivative reporter values (y-axis) against temperature (x-axis). Red/orange (16S rRNA & pilA1A) and blue (pilA1B & pilB1) curves represent standards, green (16S rRNA & pilA1A) and purple/pink (pilA1B & pilB1) curves represent experimental samples.

The qPCR data was analysed to show the difference in levels of transcription of pilA1A, pilA1B and pilB1 when DccA was expressed compared to the vector control (Figure 6.18). It was apparent that, rather unexpectedly, the effect of DccA expression here in C. sordellii was the exact opposite to its effect in C. difficile, in that it down-regulates expression of all three genes. Of the three genes, the highest level of expression is of pilA1B. In the control strain it is expressed approximately 9x more highly than pilB1 and 76x more highly than pilA1A.


Following expression of the DGC DccA, pilA1B is down-regulated by a factor of approximately 3, as is pilB1; pilA1A is down-regulated by a factor of approximately 6.
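The quantification approach described in this section (a gDNA standard curve, correction for the cDNA dilution used, and expression relative to 16S rRNA) can be sketched in Python as follows. This is a hedged illustration of the arithmetic and not the analysis software used in this work. The Ct values and standard quantities are invented, the function names are hypothetical, and a single standard curve is reused for both targets purely for brevity, whereas in practice each primer pair has its own curve. Only the dilution factors (1/80 for pilA1B and 1/20 000 for 16S rRNA) are taken from the text.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ct_values) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ct_values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def quantity_from_ct(ct, slope, intercept, dilution_factor):
    """Interpolate a quantity from the curve, then undo the cDNA dilution."""
    return 10 ** ((ct - intercept) / slope) * dilution_factor


def relative_expression(gene_quantity, reference_quantity):
    """Expression of a gene as a percentage of the reference (16S rRNA)."""
    return 100.0 * gene_quantity / reference_quantity


def fold_down_regulation(control_expression, induced_expression):
    """Factor by which expression falls in the induced strain versus control."""
    return control_expression / induced_expression


# Hypothetical gDNA standards (arbitrary units) and Ct values; not thesis data.
standard_quantities = [1, 10, 100, 1000]
standard_cts = [30.0, 27.0, 24.0, 21.0]
slope, intercept = fit_standard_curve(standard_quantities, standard_cts)

# Dilution factors from the text: pilA1B cDNA at 1/80, 16S rRNA cDNA at 1/20 000.
pilA1B_quantity = quantity_from_ct(27.0, slope, intercept, dilution_factor=80)
rrna_quantity = quantity_from_ct(21.0, slope, intercept, dilution_factor=20000)
print(relative_expression(pilA1B_quantity, rrna_quantity))
```

Expressing each gene relative to 16S rRNA in both strains, and then taking the ratio of control to induced values, gives fold changes of the kind quoted above.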

[Plots: A – Effect of Cyclic-di-GMP on Expression of pilA1A, pilA1B and pilB1 Genes. y-axis: Gene Expression Level as % of 16S rRNA Expression (0 to 1); x-axis: Gene of Interest (pilA1A, pilA1B, pilB1). Series: pASF85, pECC17. B – enlarged view of pilA1A (y-axis 0 to 0.014). Series: pASF85, pECC17.]

Figure 6.18. The Effect of DGC Expression on Expression of Genes from the Primary T4P Cluster of C. sordellii W3025. Expression levels when the DGC DccA is expressed from plasmid pECC17 are compared with expression levels from a strain carrying the vector control pASF85. Graph A shows expression levels of pilA1A, pilA1B and pilB1. Graph B is a blow-up of the indicated area from graph A to allow proper visualisation of pilA1A expression, which is vastly lower than that of pilA1B or pilB1. Error bars indicate 1 standard deviation either side of the mean.
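The arithmetic behind this kind of comparison can be sketched in a few lines of Python. This is a minimal illustration only, assuming a standard curve of the form Ct = slope × log10(quantity) + intercept and normalisation of each gene to 16S rRNA, as suggested by Figures 6.17 and 6.18. The function names, the curve parameters and the Ct values are all hypothetical and are not data from this work.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve of the form Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def percent_of_reference(target_qty, reference_qty):
    """Express a target gene quantity as a percentage of the reference gene (e.g. 16S rRNA)."""
    return 100.0 * target_qty / reference_qty


def fold_down_regulation(control_level, test_level):
    """Factor by which expression falls in the test strain relative to the vector control."""
    return control_level / test_level


# Hypothetical standard-curve parameters and Ct values, for illustration only.
slope, intercept = -3.32, 38.0
reference_qty = quantity_from_ct(14.0, slope, intercept)   # assumed 16S rRNA Ct

control = percent_of_reference(quantity_from_ct(24.0, slope, intercept), reference_qty)
test = percent_of_reference(quantity_from_ct(25.6, slope, intercept), reference_qty)

fold = fold_down_regulation(control, test)
print(f"control: {control:.4f} % of reference, test: {test:.4f} % of reference")
print(f"down-regulated by a factor of approximately {fold:.1f}")
```

In this sketch the same curve parameters are used for every gene purely for brevity; in practice each primer pair would have its own standard curve.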

Re-analysis of the conserved riboswitch between pilA1A and pilA1B shows that the sequence responsible for c-di-GMP binding and pseudoknot formation in C. difficile 630 is largely well conserved in C. sordellii W3025, with the exception of the end of the pseudoknot sequence section P1. However, the sequence which in C. difficile 630 forms a Rho-independent transcriptional terminator in the absence of c-di-GMP binding to the riboswitch is lost entirely from C. sordellii, meaning c-di-GMP binding is not necessary to enable transcription of pilA1B. A clear RBS is also visible immediately upstream of pilA1B, suggesting that binding of c-di-GMP to the riboswitch is not necessary for translation of the gene. It is therefore logical, based on the sequence of the riboswitch and the region between the riboswitch and pilA1B, that c-di-GMP binding to this riboswitch should down-regulate gene expression, though this is still rather surprising given that such a similar gene cluster is up-regulated by c-di-GMP binding in C. difficile. How this riboswitch works is not immediately clear and is not addressed in this work.

6.2.9 A Minority of Strains from the Collection Encode Cytotoxins

As stated in Section 7.2.3, all C. sordellii strains were screened for presence of the LCC genes tcsL and tcsH. The majority of strains had been isolated from either clinical or veterinary cases of disease, and since tcsL in particular had been postulated to be the major virulence factor of the species (Aronoff, 2013) it was expected that the majority of strains would encode at least one of the LCCs. Initial screening was simply of the genome sequences. Surprisingly, this suggested that only 5 out of the 44 strains encode tcsL (ATCC9714, JGS444, JGS445, JGS6364 and JGS6382), and that of these strains only one (JGS6382) also encodes tcsH. The four tcsL+/tcsH- strains all apparently contain two identical 5’ fragments of tcsH. To confirm this result gDNA from all strains was screened for the presence of tcsL, full-length tcsH and tcsH fragments. Screening for tcsL was performed with primers NF2362/2363 (Figure 6.19), screening for full-length tcsH with primers NF2351/2352 (Figure 6.20) and screening for tcsH fragments with primers NF2351/2353 (Figure 6.21). The results of these screenings confirmed the results obtained by computational genome screening, i.e. that only 5 of the strains encode tcsL and only one of those encodes tcsH. No strain encodes tcsH (whether full-length or fragment) without tcsL. Only 25 of the 44 strains were accessible to me in the UK, and these I screened personally, as shown in Figures 6.19-21. The remaining 19 strains were located with collaborators at Monash University, Melbourne, Australia, where they were identically screened by Dr Milena Awad. Her results are not shown but also confirmed the results obtained from genome screening.

Figure 6.19. PCR Screen of C. sordellii Strains for tcsL Gene. The presence of a PCR product indicates the strain is carrying tcsL; the absence of a PCR product indicates the strain lacks tcsL.

Figure 6.20. PCR screen of C. sordellii Strains for Full-Length tcsH Gene. The presence of a PCR product indicates the strain carries a full-length tcsH gene; the absence of a product shows it does not.

The four tcsL+/tcsH- strains are closely related, all being found in Clade 1 of the phylogenetic tree (Figure 6.1). Also in Clade 1 are strains UMC164 and R15892, which are clearly closely related to the four tcsL+/tcsH- strains (particularly JGS444, JGS445 and ATCC9714) but lack tcsL. It is particularly intriguing that UMC164 lacks tcsL, as a previous study in 2006 found that it did contain tcsL (though lacked tcsH) (Voth et al., 2006). This suggests that since this initial study UMC164 has lost tcsL.

Figure 6.21. PCR Screen of C. sordellii Strains for Fragments of the tcsH Gene. The presence of a PCR product indicates that the strain contains fragments of the tcsH gene; the absence of a product shows that the strain either lacks the tcsH gene entirely, or contains the full-length gene.

6.2.10 The LCC Genes are Localised Within a Pathogenicity Locus

The related C. difficile toxin genes tcdA and tcdB are located within a pathogenicity locus (PaLoc) (Braun et al., 1996). This initiates with the alternative Sigma Factor tcdR (which is required for transcription of tcdA and tcdB (Mani and Dupuy, 2001)), followed by toxin B (tcdB), the holin-like protein tcdE (required for secretion of TcdA and TcdB from C. difficile cells (Govind and Dupuy, 2012)), toxin A (tcdA) and ending with the putative anti-Sigma Factor tcdC (whose function is uncertain (Cartman et al., 2012)). The first four genes (tcdR to tcdA) are all located on the same DNA strand, while tcdC is located on the opposite strand (Figure 6.22A).

Analysis of the C. sordellii toxin genes showed that in every strain wherein they are found the LCC genes are located within a similar (though not identical) PaLoc. The C. sordellii PaLoc initiates with tcsL, after which comes a homologue of tcdE (which we have named tcsE), then tcsH (whether the full-length version of the gene or fragments thereof) and finally a tcdR homologue (which we have named tcsR). No tcdC homologue is present, and unlike in C. difficile, where tcdA and tcdB are encoded on the same DNA strand, in C. sordellii tcsH and tcsR are located on the opposite DNA strand to tcsL and tcsE (“facing them”) and so are transcribed in the opposite direction (Figure 6.22B).

Just upstream of tcsL are found the plasmid partitioning genes parA and parB. ParA and ParB are responsible for ensuring that during cell division, after DNA replication, plasmids (particularly low copy-number plasmids) are partitioned into both daughter cells, ensuring their maintenance within the genome of the bacterial strain/species (Bignell and Thomas, 2001). Though parA and parB can also be used for chromosome partitioning during cell division, these genes were only found in the five C. sordellii strains which encode tcsL, offering evidence that tcsL might be located on a plasmid.

Figure 6.22. Comparison of the PaLoc from C. difficile 630 and C. sordellii ATCC9714. The C. difficile 630 PaLoc is presented in A, and the C. sordellii ATCC9714 PaLoc is presented in B. Clearly, though the gene order is similar, differences exist: the alternative Sigma Factor (tcdR/tcsR) is found at opposite ends of the locus, tcdC is absent from C. sordellii and tcsH and tcsR are encoded in the opposite direction in C. sordellii (relative to tcsL) compared to in C. difficile. In strain ATCC9714 only two 5’ fragments of tcsH are found, but in a tcsH+ strain, such as JGS6382, the full-length version is found in the same location and orientation as the gene fragments in ATCC9714.

6.2.11 The LCC Genes are Located on a Plasmid

Following initial optical mapping of the ATCC9714 chromosome, it became apparent that considerably more DNA remained un-mapped than gaps remained on the chromosome. This suggested that some of the un-mapped DNA was present as extra-chromosomal elements. Interestingly, the contig which contained the tcsL gene was one of those which remained un-mapped. The fact that strain UMC164 had apparently lost the tcsL gene suggested that it might be located on a mobile element (such as a plasmid). It was therefore decided to try to manually improve the assembly of the ATCC9714 genome, with a particular focus on the contig containing the tcsL gene, in order either to locate it on the chromosome or assemble its associated extra-chromosomal element.

Assembly was performed in three ways. Firstly, manual inspection of sequence data resulted in the identification of overlapping ends of contigs, allowing their merging. Secondly, computerised contig assembly had resulted in slightly different assemblies between strains, so comparison between the ATCC9714 genome assembly and the assemblies of other tcsL+ strains allowed certain non-overlapping ATCC9714 contigs to be identified as neighbouring. Thirdly, the C. difficile and C. sordellii genomes are very similar, and many genes or gene clusters are located in identical sequences in the two species, meaning certain contigs could be identified as neighbouring in the C. sordellii ATCC9714 genome based on the contiguous nature of homologous sequences in the completely assembled C. difficile 630 genome. All identified contig joins were confirmed by PCR amplification of the contig junctions followed by Sanger sequencing of the resultant product. PCRs carried out in the process of these initial improvements were performed with primers NF2409-2420, 2425, 2426, 2467, 2468, 2533, 2534 and 2695-2698. This resulted in an overall reduction in the number of contigs from 81 to 59 and the closing up of several gaps in the map of the chromosome.

The tcsL gene was now left located on a 103.8 kb section of un-mapped sequence, which proved impossible to optically map onto the chromosome. PCR using primers NF2634/2635 confirmed the circular nature of this element, identifying it as a plasmid which we named pCS1-1. In addition to the PaLoc and plasmid partitioning genes, also found on pCS1-1 were other genes likely to be required for plasmid replication (a putative topoisomerase, a recA recombinase, a resolvase and a helix-destabilising single-stranded DNA-binding protein) and two genes which may enable conjugative transfer of the plasmid (a Type IV Secretion System DNA conjugation protein and a Type IV Secretion System coupling DNA-binding domain protein). Also of potential interest on pCS1-1 are genes encoding a putative secreted collagen-binding protein and a second copy of sortase (one is encoded on the chromosome), which may attach plasmid-encoded proteins to the cell wall.

An equivalent plasmid, termed pCS1-3, was found in JGS6382, the only tcsH+ strain in the collection. pCS1-3 was assembled by PCR using primers NF2413 and NF2699-2703. It was found to be a similar size (106 kb) and to contain a similar number of ORFs (94 in comparison to 90 on pCS1-1), but actually has multiple differences compared to pCS1-1: in addition to the presence of full-length tcsH in the PaLoc, 29 ORFs present on pCS1-1 are absent from pCS1-3 (including two universal stress proteins and all genes encoding anaerobic sulphite reductase subunits), while 32 ORFs are present on pCS1-3 which are absent from pCS1-1 (including an additional copy of resolvase and three transposase genes).

Visual examination of the genomes of the tcsL+ strains JGS444 and JGS445 suggested that they both carry highly similar plasmids to pCS1-1, though these plasmids were not assembled by PCR. Visual examination of the genome of the fourth tcsL+/tcsH- strain, JGS6364, showed the presence of a plasmid similar to pCS1-1, but pulsed-field gel electrophoresis (carried out by collaborators in Australia) demonstrated that its plasmid was significantly larger than pCS1-1. This prompted us to assemble the plasmid by PCR, performed with primers NF2409-2418, 2634, 2635, 3067, 3068 and 3073. The plasmid was named pCS1-2, which, at 117.3 kb, is 13.5 kb larger than pCS1-1. The arrangement of genes within and around the PaLoc in pCS1-2 is identical to that in pCS1-1, but several major differences between the two plasmids exist elsewhere. Several genes present on pCS1-1 are absent from pCS1-2, but most interestingly a 21 kb insertion is found in pCS1-2, which encodes 23 genes, all novel compared to pCS1-1. These include several putative lantibiotic resistance genes and a putative lantibiotic-binding component of an ABC transporter. It would be of great general interest if strain JGS6364 were found to produce a lantibiotic using these genes, since as far as we are aware no member of the Clostridia has ever previously been found to produce a lantibiotic.

Interestingly, two other strains from our collection, both tcsL- (W10 and UMC2), were also found to carry pCS1-1-like plasmids. The 100 kb plasmid from UMC2 was assembled by PCR using primers NF2634 and NF3057-3061, and named pCS1-4. pCS1-4 is highly similar to pCS1-1, but the entire PaLoc is lost and replaced with a series of genes, most of which are of unknown function but which include a transposase. In both pCS1-3 and pCS1-4 a second transposase gene is found just upstream of a conserved putative MazE-family antitoxin gene (Figure 6.23).
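The first of the three assembly approaches described in this section, merging contigs whose ends overlap, can be illustrated with a short Python sketch. This is not the procedure used in this work (which relied on manual inspection followed by PCR and Sanger confirmation); it is a simplified illustration under stated assumptions: exact matches only, both contigs on the same strand, and hypothetical function names and toy sequences.

```python
def end_overlap(left, right, min_len=20):
    """Length of the longest suffix of `left` that exactly equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    `min_len` should be at least 1.
    """
    max_len = min(len(left), len(right))
    for n in range(max_len, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


def merge_contigs(left, right, min_len=20):
    """Join two contigs across an exact end overlap, or return None if none is found."""
    n = end_overlap(left, right, min_len)
    if n == 0:
        return None
    return left + right[n:]


# Toy sequences, for illustration only; real contigs would need a much larger min_len.
contig_a = "ACGTACGTTTGACC"
contig_b = "TTGACCGGATA"
merged = merge_contigs(contig_a, contig_b, min_len=4)
print(merged)
```

A real comparison would also need to test the reverse complement of each contig and to tolerate sequencing errors within the overlap, which is one reason each join in this work was confirmed experimentally.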

Figure 6.23. Comparison of the PaLoc and Surrounding Region from the Four pCS1-Type Plasmids Identified in C. sordellii Strains in our Collection. ORFs are coloured according to function, as described in the key. Though pCS1-1 and pCS1-2 contain several differences, the region surrounding the PaLoc is identical between them. pCS1-3 contains a full-length tcsH gene and a nearby transposase gene, which may help explain why most of tcsH is lost from pCS1-1 and pCS1-2. pCS1-4 contains a second transposase gene, and the PaLoc is replaced with it and 13 genes of unknown function. Presumably this transposase gene has played a role in creating this difference. Published in (Couchman et al., 2015), reused with permission.

During manual assembly of the ATCC9714 genome a second circular entity was identified; this was assembled by PCR using primers NF2632/2633 and designated pCS2. pCS2 is considerably smaller than pCS1-type plasmids (being only 37.1 kb) and contains 52 ORFs. The element is clearly phage-derived (20 of its 52 ORFs are annotated as phage genes), though it also contains two genes potentially involved in plasmid replication (a copy of parA and a putative replication initiator protein), meaning it is unclear whether pCS2 now represents a plasmid or a phage. It is found in all Clade 1 strains except for the most outlying strain, JGS6364.

6.2.12 Analysis of the ATCC9714 Genome

The above-detailed refinement of the ATCC9714 genome produced a high quality reference genome for the species: the only gaps in the chromosome assembly are now where highly repetitive DNA sequence has prevented effective Sanger sequencing. This is generally due to the presence of rRNA genes, though a couple of highly repetitive ORFs have also resulted in gaps in the map. 3282 ORFs are currently annotated on the chromosome, together with 90 on pCS1-1 and 52 on pCS2, giving a total of 3424 in the genome. This is broadly similar to the numbers found in other Clostridial species: considerably more than C. perfringens or C. tetani (strains 13 and E88 contain 2660 and 2580 respectively) but fewer than C. difficile 630, which contains 3680. ATCC9714 contains the fewest ORFs of any strain in our collection, with the most belonging to UMC2, which has 3957 (the average is 3459). However, these figures are almost certainly overestimates, as the other genomes have undergone little or no improvement to reduce contig number, which would almost certainly reduce the number of ORFs identified. Further analysis would also doubtless indicate the presence of novel plasmids in several strains.

The ATCC9714 genome was analysed to identify secreted proteins and putative virulence factors. 168 putative secreted proteins were identified (six of which were encoded by pCS1-1, one by pCS2 and the others on the chromosome). Of these 168, 66 are predicted to be lipoproteins (including one of those encoded on pCS1-1 and the one on pCS2). 9 predicted cell wall proteins were identified, all of which contain 3 adjacent S-Layer Homology (SLH) domains. SLH domains can non-covalently attach proteins to the cell wall via pyruvylated peptidoglycan (Mesnage et al., 2000). This cell wall pyruvylation is carried out by CsaB (Mesnage et al., 2000), which is found encoded on the chromosome. This shows that C. sordellii cell wall proteins attach to the cell wall by a different method to those of the highly related species C. difficile, wherein cell wall proteins attach to the cell wall via Cell Wall-Binding Repeat 2 (CWB2) domains (Willing et al., 2015).

Four chromosomally-encoded probable exotoxins were identified: the collagenase colA, the neuraminidase nanS, a phospholipase C (csp) and the cholesterol-dependent cytolysin sordellilysin (sdl). All four genes are well conserved across our collection of strains. However, four closely related strains from clade 4 (R29426, SSCC35109, R26833 and R31809) possess point mutations in codon 50 of csp, resulting in the formation of premature stop codons; JGS6382 contains a 5’ truncation of the csp gene resulting in the loss of its signal peptide; and JGS6956 also contains a premature stop codon which will result in C-terminal truncation of Csp. All 6 of these strains are likely to produce non-functional Csp. In addition, 2 strains from clade 2 (SSCC37615 and SSCC18838) contain a mutation in nanS which causes an Asp→Asn substitution and may render their NanS proteins non-functional. The affected Asp residue is located within a structurally important Asp-box, mutation of which has been shown in the homologous C. perfringens NanH to cause incorrect folding of the protein (Chien et al., 1996). Therefore, though all 4 putative exotoxins are encoded by all 44 strains of C. sordellii in our collection, not all toxins will be produced by all strains (and indeed, some strains (such as ATCC9714) seem not to produce Sdl, despite apparently encoding a functional copy (Voth et al., 2006)).

Two putative virulence factors with functions related to immune evasion were also identified. One was aureolysin, a secreted metalloprotease first identified in Staphylococcus aureus, which specifically cleaves Complement protein C3, thereby preventing activation of the Complement system during infection (Laarman et al., 2011), and also cleaves the anti-microbial peptide LL-37, further protecting bacteria from immune killing (Sieprawska-Lupa et al., 2004). The other was a cell wall-binding protein which contains a Mac-1 domain, a domain originally identified in the Streptococcus pyogenes protein IdeS which specifically cleaves human IgG (von Pawel-Rammingen et al., 2002). If both these genes produce proteins with the predicted activities C. sordellii would be able, at least to some degree, to evade both the innate and adaptive immune systems.

Putative adhesion factors were also identified in the genome. A secreted collagen binding protein with at least 7 CnaB repeats was found, as was another large gene containing two Discoidin domains; in eukaryotes Discoidin Domain Receptors (DDRs, which contain Discoidin domains) bind collagen (Fu et al., 2013). As described above in detail, two T4P gene clusters were identified, as was a complete flagellar apparatus.

A distinguishing feature of C. sordellii is its production of urease. Most (though not all) strains of C. sordellii produce urease (Nakamura et al., 1976; Tataki and Huet, 1953), which is a nickel-containing metalloenzyme which hydrolyses urea into ammonia and carbonic acid. It is of microbiological interest as in several bacterial and fungal pathogens (including Helicobacter pylori and Klebsiella spp.) urease acts as a virulence factor (Rutherford, 2014). (That said, there is no evidence that urease functions as a virulence factor in the context of C. sordellii infection.) We found complete urease operons in all strains in our collection except for one (SSCC26591). The complete operon comprises 8 genes (ureABCIEFGD), and is located immediately upstream of nanS on the chromosome. We are unaware of any other Clostridial species which carries urease genes on its chromosome, though certain strains of C. perfringens carry urease genes on a plasmid (Dupuy et al., 1997).

The latter parts of these results have since been published in (Couchman et al., 2015).
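The premature stop codons reported above for csp are the kind of feature that can be flagged automatically when comparing one gene across many strains. The following Python sketch shows one simple way to do so. It is an illustration only, not the analysis pipeline used in this work: the function names are hypothetical, the sequences are toy examples rather than csp, and the coding sequence is assumed to be in frame from its first base.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds):
    """Return the 1-based codon number of the first in-frame stop codon, or None."""
    seq = cds.upper()
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


def has_premature_stop(cds):
    """True if an in-frame stop codon occurs before the final codon of the sequence."""
    n_codons = len(cds) // 3
    stop = first_stop_codon(cds)
    return stop is not None and stop < n_codons


# Toy sequences: the second differs from the first by a single C-to-T change (CAA to TAA).
reference_cds = "ATGAAAGGTCAATAA"
mutant_cds = "ATGAAAGGTTAATAA"

print(first_stop_codon(reference_cds), has_premature_stop(reference_cds))
print(first_stop_codon(mutant_cds), has_premature_stop(mutant_cds))
```

A check of this kind only detects nonsense mutations; truncations that remove the start of a gene, such as the 5’ truncation described for JGS6382, would need to be identified by alignment against a reference.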

6.3 Discussion

6.3.1 Importance of the C. sordellii Genomic Analysis

This analysis of the C. sordellii genome reveals important and interesting facts regarding the pathogenesis of C. sordellii disease and the phylogeny of the species. As shown in Figure 6.1, the species (based on current knowledge) appears to divide into four phylogenetic clades. None of these clades is specifically associated with clinical or veterinary cases of disease (or indeed the absence of disease), suggesting that no group of strains is adapted to any specific host. No clade is associated with strains derived from any particular geographical location either: all clades contain strains isolated in at least 2 continents. Given that the UK, USA and Australia are all geographically isolated from each other, one might have expected the strains from each country to have diverged into distinct clades. The fact that they have not, and are in fact phylogenetically mixed, suggests recent transfer of strains between countries. This is likely to have been enabled by human activity, such as travel and transport of livestock.

Only a small minority of C. sordellii strains in this collection were found to contain the LCC gene tcsL (5 out of 44), and only one of them to contain tcsH. A similar but smaller study from 2006 had similar findings, with only one strain out of 14 found to contain tcsL (and none tcsH) (Voth et al., 2006). This would suggest that tcsL+ strains are rare within the C. sordellii population. However, these results may be misleading. The only strain from the 2006 study found to contain tcsL was UMC164, which was contributed to our collection for sequencing. Whole genome sequencing showed that the tcsL gene was now absent from strain UMC164, a fact confirmed by PCR screening. These facts show that UMC164 had lost the tcsL gene during rounds of sub-culturing. There is evidence that this is a common event in C. sordellii strains; during the course of this work Bouvet et al. published a study in which they reported incidents of C. sordellii disease wherein the responsible strain, upon laboratory isolation and screening, was found to lack the tcsL gene, yet the presence of the TcsL protein itself was demonstrated in tissue samples (Bouvet et al., 2015). I have shown here that tcsL (and tcsH on occasion) are encoded on a plasmid, known as a pCS1-type plasmid. These results together suggest that culturing of C. sordellii under laboratory conditions can frequently result in the loss of this plasmid, meaning that tcsL and tcsH may in fact be produced by a far larger proportion of naturally occurring strains than predicted by screening studies.

It is hoped that this study will enable improved diagnostics of C. sordellii infection. The availability of DNA sequences for the toxin genes (and other plasmid-associated genes) may allow improved PCR screening for them, but the knowledge that the toxin genes are on easily lost mobile elements will also indicate to physicians/clinical microbiologists that such screening is highly fallible. It is also hoped that this study will enable and drive further research into C. sordellii. The absence of a high quality genome sequence was inevitably a hindrance to research into the species, precluding genetics studies and suchlike.

This study also highlights the continued importance of proper genome assembly in biological research. In this day and age, where automated sequencing capabilities far exceed the capacity for proper manual assembly, annotation and analysis of genomes, there is a tendency for genome sequences to be simply deposited into databases with only automated annotation being carried out. In the course of this work two C. sordellii genome sequences were published (Sirigi Reddy et al., 2013), one of a tcsL+/tcsH- strain, the other of a tcsL+/tcsH+ strain. However, no attempt was made to assemble the genomes beyond the automatically assembled contigs, meaning the presence of plasmids was not identified in them. Furthermore, multiple homopolymer errors were seen in these genomes, resulting in frame-shift mutations and limiting the usefulness of these genome sequences. One good example is in the primary T4P gene cluster, which is completely unrecognisable due to the prevalence of frame-shifts, presumably due to sequencing errors. This emphasises the importance of high-quality sequencing machinery to generate genome sequences, particularly in species with AT rich genomes such as the Clostridia.

6.3.2 The Role of the LCCs in Disease

This study, being purely sequencing based, does not directly contribute to our understanding of the roles of tcsL and tcsH in disease. Nevertheless, it does provide insight. Most studies have previously indicated that TcsL is required for severe C. sordellii infections to develop (Hao et al., 2010), and mutational studies in strain ATCC9714 support this, showing TcsL to be a critical virulence factor (Carter et al., 2011). By demonstrating that tcsL is encoded on a mobile genetic element which can be lost by the bacteria, this study reconciles the above theory with the incidence of cases of severe C. sordellii infection caused by apparently tcsL- strains.

Given that the majority of strains in our collection are isolates from clinical/veterinary cases of disease, it is likely that a significant number of them once carried a pCS1-type plasmid which they have since lost. This of course includes strain UMC164, and all others associated with severe infections/toxic shock syndrome. Why some strains lose their plasmids upon laboratory culture while others do not is fundamentally impossible to answer, as plasmids which have been lost by strains are obviously unavailable for sequencing analysis and comparison to more stable plasmids. To speculate, it may be that the plasmids are naturally unstable and prone to loss, but that particular mutations have made certain variants of them stable, or that chromosomally-encoded differences in strains result in certain strains holding on to their plasmids better than others. Alternatively, it may be that the unstable plasmids are completely different in sequence to the stable plasmids, having the LCC genes in a different plasmid backbone (carrying different genes etc.) to those which are stable and have been sequenced in this study.

That said, it should not be assumed that all strains in our collection were once toxigenic but have since lost their pCS1-type plasmids. For instance, the “UMC” strains in our collection were isolated from cadavers of individuals who had never suffered from C. sordellii infections (Voth et al., 2006), meaning some of them at least may be avirulent. Furthermore, only two-thirds of reported cases of C. sordellii infection are associated with toxic-shock syndrome, and this is likely to be an overestimate as less dramatic cases may be overlooked entirely or simply not reported by physicians (Walk et al., 2011). The remaining third of cases are of less severe infections almost certainly not mediated by TcsL. At least one non-toxigenic strain from our collection (DA-108) was isolated from a clinical case of infection not associated with toxic-shock syndrome (Hao et al., 2010), while strain W10 (which was isolated from a veterinary infection), despite appearing to carry a pCS1-like plasmid, nevertheless lacks tcsL and tcsH (it appears to carry a pCS1-4-like plasmid). It seems unlikely that W10 ever carried two pCS1-type plasmids, suggesting that it too caused a non-TcsL-mediated infection. All strains in our collection chromosomally encode other putative virulence factors and toxins (e.g. sdl, nanS etc.) which may well be sufficient to cause infections not associated with toxic-shock syndrome, such as have previously been described in the literature (Abdulla and Yee, 2000; Hao et al., 2010; Valour et al., 2010; Walk et al., 2011). Further investigation of these toxins may shed light on their roles in disease.

6.3.3 Evolution of the LCC Genes

How the LCC genes evolved is a very interesting question. Our phylogenetic analysis suggests that a pCS1-type plasmid has entered C. sordellii on at least 3 occasions. This is shown in Figure 6.24, which is a repeat of the phylogenetic tree from Figure 6.1 annotated to show the developments described in this chapter. One possible chronology is as follows: pCS1-3 first entered C. sordellii (indicated by “1” in Figure 6.24) carrying both tcsL and tcsH. pCS1-3 then entered the ancestor strain of Clade 1 (this event is marked “2”), whereupon a rearrangement occurred resulting in the loss of the majority of tcsH, leading eventually to the formation of pCS1-1 and pCS1-2. Finally, on a third occasion (marked “3”) pCS1-3 entered the ancestor of strains W10 and UMC2 and underwent a second rearrangement, resulting in the exchange of the entire PaLoc for a different gene cassette. Of course, though I believe this chronology is plausible, several other possible chronologies exist, and given the likelihood that several other strains once carried a pCS1-type plasmid but have since lost it, this hypothesis may be incomplete.

Figure 6.24. Phylogenetic Tree as Shown in Figure 6.1 with Additional Annotations. Additional to Figure 6.1, strains carrying pCS1-type plasmids have been annotated ‘p’; if the plasmid encodes tcsL and tcsH the strain is further annotated ‘LH’; if the plasmid encodes tcsL without tcsH the strain is further annotated ‘L’. Encircled numbers indicate hypothesised points of entry of pCS1-type plasmids into the strain, as discussed in Section 6.3.3. Published in (Couchman et al., 2015), reused with permission.

Comparison of the C. sordellii LCCs to their close relations found in C. difficile is also interesting. In C. difficile the LCC genes tcdA and tcdB are chromosomally localised, and the majority of strains isolated from clinical cases of C. difficile infection carry both genes. Only 5-10 % of clinical C. difficile isolates are tcdB+/tcdA- (the equivalent of a tcsL+/tcsH- C. sordellii strain, which appears to be the most common context for toxigenic C. sordellii) (Drudy et al., 2007). No tcsH+/tcsL- strains of C. sordellii have been isolated to date, and an equivalent tcdA+/tcdB- strain of C. difficile was only recently isolated for the first time (Monot et al., 2015), suggesting that strains of either species with this effective toxinotype are rare. The PaLocs of C. difficile and C. sordellii are also distinct (as described in Section 6.2.10), with the relative orientations of the genes in them differing between species and tcdC being absent completely from C. sordellii. It has recently been demonstrated that the C. difficile LCC genes can undergo horizontal transfer between strains (Brouwer et al., 2013), so it would be interesting to see whether the C. sordellii pCS1-type plasmids can be transferred between strains.

6.3.4 The C. sordellii Type IV Pili Genes

The C. sordellii primary and secondary T4P genes clearly bear close resemblance to the equivalent C. difficile genes, indicating the close relation between the two species, and it is unfortunate that the C. sordellii pilin proteins are insufficiently homologous to their C. difficile equivalents to cross-react with antibodies raised against the latter. Given the similarity between the two clusters, and the presence of a conserved riboswitch apparently regulating their expression, it is both surprising and fascinating that they are inversely regulated, with the primary cluster in C. difficile up-regulated by cyclic-di-GMP and the primary cluster in C. sordellii down-regulated by the same. As discussed in Section 6.2.7, the riboswitch in C. sordellii differs from that in C. difficile primarily by the absence of a transcriptional terminator at the end of the riboswitch, which in C. difficile terminates transcription upstream of pilA1 in the absence of c-di-GMP, preventing pilA1 expression (Bordeleau et al., 2015). The absence of this terminator allows transcription of pilA1B in C. sordellii in the absence of c-di-GMP. The binding of c-di-GMP to the C. sordellii riboswitch must therefore somehow either terminate transcription prematurely, or destabilise the transcript so that it is rapidly degraded. How c-di-GMP binding to this riboswitch could initiate transcriptional termination is unclear. Though it is easy to imagine that c-di-GMP binding could induce the formation of a stem-loop structure in the mRNA, Rho-independent transcriptional terminators are also characterised by the presence of a ‘poly-U tail’ immediately after the stem-loop. No poly-U/T stretch is apparent in the vicinity of the riboswitch and it is harder to see how c-di-GMP binding could compensate for that absence. It is therefore more likely that c-di-GMP binding to the riboswitch somehow induces degradation of the mRNA. It is also possible that the presumed riboswitch sequence is not in fact a riboswitch, and is not responsible for the regulation of expression of the C. sordellii primary T4P gene cluster; its function has merely been inferred, rather than experimentally demonstrated. However, this seems unlikely. It would be a remarkable coincidence if, despite its clear homology to a c-di-GMP-binding riboswitch, this proposed riboswitch did not bind c-di-GMP, and yet a completely different mechanism for control of these genes by c-di-GMP had evolved.

The relative expression levels of pilA1A, pilA1B and pilB1 are also interesting. The relative expression levels of pilA1B and pilB1 are as expected based on results seen in C. difficile, with c-di-GMP having the same effect on pilB1 expression as on pilA1B expression, but with overall expression of pilB1 considerably lower than that of pilA1B in either condition, due to the existence of a Rho-independent transcriptional terminator between the two genes; indeed the expression differential is very similar to that between the equivalent C. difficile 630 genes (pilA1 and pilB1). The expression of pilA1A is far lower than that of pilA1B, suggesting that at least under these conditions PilA1B is the major pilin. The level of pilA1A expression seen here may reflect a very low, basal level of transcription of a gene whose expression is fundamentally switched off. It is possible that under particular conditions pilA1A expression is activated, enabling PilA1A to be incorporated into T4P in order, presumably, to play a specific role. It seems unlikely that it is an artefact and never expressed, given its high level of conservation between diverse strains of C. sordellii, though it is presumably not essential for general synthesis and function of T4P. It is interesting that pilA1A expression is also regulated by c-di-GMP, as the c-di-GMP binding riboswitch is downstream of the gene. However, if c-di-GMP binding somehow destabilises the mRNA and causes it to be degraded then that would explain how it causes pilA1A expression levels to drop.
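
The argument above, that a Rho-independent terminator requires a stem-loop followed immediately by a poly-U tail, can be illustrated with a minimal Python sketch. It is not the analysis performed in this study: the function names, the requirement for a perfectly paired stem and all thresholds (stem length, loop size, tail window, minimum number of T residues) are assumptions chosen purely for illustration, and the two test sequences are invented rather than taken from either genome.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def find_stem_loops(seq, stem=6, loop_range=(3, 8), tail_window=8, min_t=5):
    """Scan for perfect stem-loops and report whether a T-rich tail follows.

    Returns a list of (start, end, has_poly_t_tail) tuples, where start/end
    are 0-based coordinates of the hairpin on the coding strand.
    All thresholds are illustrative assumptions.
    """
    seq = seq.upper().replace("U", "T")
    hits = []
    for i in range(len(seq) - 2 * stem - loop_range[0] + 1):
        left = seq[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop          # start of the right-hand arm
            right = seq[j:j + stem]
            if len(right) < stem:
                break                    # ran off the end of the sequence
            if right == revcomp(left):
                tail = seq[j + stem:j + stem + tail_window]
                hits.append((i, j + stem, tail.count("T") >= min_t))
    return hits


# Invented example sequences: the same hairpin with and without a T-rich tail.
with_tail = "CCATGCCGCCTTCGGGCGGCTTTTTTTT"
without_tail = "CCATGCCGCCTTCGGGCGGCGACGACGA"

terminator_like = any(tail for _, _, tail in find_stem_loops(with_tail))
hairpin_only = any(tail for _, _, tail in find_stem_loops(without_tail))
print(terminator_like, hairpin_only)
```

A hairpin without the T-rich tail, as in the second sequence, corresponds to the situation described for the C. sordellii riboswitch region. A real analysis would need to allow imperfect stems and score hairpin stability, which this sketch does not attempt.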

7. Discussion

This study has advanced our understanding of the Type IV Pili of C. difficile. Several genes essential for T4P formation have been identified, while it has also been shown that PilT, the retraction ATPase, is not required for T4P assembly in this species. This work has also shown that the two T4P gene clusters found in C. difficile (the primary and secondary clusters) are independent and function separately. While good progress was made during this project in understanding how the pili encoded by the primary T4P gene cluster work, much remains to be done to bring our understanding of these pili to the same level as that of Gram-negative T4P such as those of P. aeruginosa and N. meningitidis. Furthermore, though it had initially been hoped to investigate the secondary T4P gene cluster (and its products) as well, this proved to be overly ambitious and the plans were not pursued. As described in Chapter 4, it remains unclear how expression of the secondary T4P cluster is induced, so before it can be investigated, conditions under which it is expressed must be identified. Detecting its expression may also prove challenging: as shown in Chapter 3, even exogenous expression of PilA2 in C. difficile (i.e. from a plasmid) appeared unsuccessful, suggesting that investigation of any pilus produced from this cluster may not be straightforward.

7.1 Future Work on the Primary Type IV Pilus Cluster

As described in Chapter 4, various questions about the T4P produced from the primary T4P cluster remain unanswered. For instance, while it has been shown that PilD1 cleaves PilA1 (yielding mature pilin), it did not prove possible to identify which pre-pilin peptidase cleaves PilV, PilU or PilK. It has previously been speculated that these three proteins would be matured by PilD2 (Melville and Craig, 2013) and it would enhance our understanding of the system if this could be confirmed. Furthermore, the primary T4P cluster of C. difficile is somewhat unusual in possessing two separate pre-pilin peptidase genes, so offers an interesting opportunity to enhance our understanding of substrate recognition by these proteases. If the substrates of each protease can be delineated, the differences between them and the differences between the two proteases will help us understand how each set of substrates is recognised, enhancing our understanding of pre-pilin peptidases generally. It would also be of great interest to make double mutant strains, each with deletions of both a gene essential for T4P synthesis and pilT, to investigate whether the situation is the same as in Gram-negative species, in which many genes apparently essential for T4P biogenesis are in fact only required (it seems) to combat the retractive force of PilT (Carbonnelle et al., 2006). It would be interesting to see if the genes truly essential for T4P biogenesis in C. difficile are the same as those in P. aeruginosa or N. meningitidis, or even different to both of them. The structures of C. difficile PilA1 proteins from three different strains have been elucidated (discussed below) (Piepenbrink et al., 2015), and it would also be interesting if the structures of other proteins from the system could be solved. The structures of other C. difficile pilins would be of particular interest, but it is also to be hoped that the structures of other T4P proteins, e.g. the ATPases, are elucidated. Such structures would both have intrinsic value in furthering our understanding of T4P systems and could be compared to equivalent structures from Gram-negative systems to further enhance our understanding of what differences exist between Gram-negative and Gram-positive T4P. Finally, as discussed in Chapter 5, arguably of most interest would be work which enhances our understanding of the function of the T4P of C. difficile, particularly their in vivo function during infection and/or colonisation.

7.2 Are C. difficile Type IV Pili “T4aP” or “T4bP”?

As discussed in Chapter 1, T4P in Gram-negative organisms can be split into two broad families: T4aP and T4bP (Burrows, 2012). There are various differences between these types of pili. For instance, the pilins themselves are distinctive: T4aP pilins tend to be about 150 amino acids long, with a short leader peptide (less than 10 amino acids). T4bP pilins, on the other hand, tend to be either shorter than T4aP pilins (50-60 amino acids) or longer (~200 amino acids) and to have a much longer leader peptide (15-30 amino acids) (Pelicic, 2008). Other distinctions include that T4aP genes tend to be spread out through the genome in multiple clusters, while T4bP genes all tend to be located together, generally in an operon, and have often clearly been acquired by a species/strain by horizontal transfer (Roux et al., 2012). Furthermore, T4bP are generally not associated with motility (Roux et al., 2012). The T4P encoded by the primary cluster of C. difficile appear to have characteristics of both types of T4P. For instance, while the vast majority of genes involved in T4P synthesis are encoded within the primary T4P operon (and indeed clustering of genes in a single operon may in fact be generally characteristic of Gram-positive T4P, being seen in the vast majority of Clostridia (Varga et al., 2006) and in Streptococcus sanguinis (Gurung et al., 2016)), at least one pilin which is incorporated into the pili is encoded externally to the operon (PilJ; Piepenbrink et al., 2014), and if PilW is incorporated into these pili (which remains unclear) then two such pilins are encoded externally to the main operon (in strains in which pilW is present).

Furthermore, it seems unlikely that the primary T4P operon has been acquired by horizontal gene transfer (at least not on a short-term evolutionary time-frame), given that the operon is transcriptionally linked to the essential gene pth. Also, as described in Chapter 6, the primary T4P operon is found at the same genomic locus in C. sordellii, indicating that the primary T4P operon was present at that position in the genome of the common ancestor of those two species. Estimates for the age of C. difficile vary enormously, from 1.1 to 85 million years (He et al., 2010); either way, it is apparent that the primary T4P operon has been stably embedded in the chromosome of these species for an extremely long time. The structure of PilA1, the major pilin, has been solved for three different strains of C. difficile, and was reported to have a T4bP-like fold (Piepenbrink et al., 2015). However, PilA1 from strain 630 has 171 amino acids, which is intermediate between the expected lengths of a T4a pilin and a T4b pilin, and its leader peptide appears to be only 9 amino acids long, which is a length associated with a T4a pilin rather than a T4b pilin. Similarly, PilU has a length in between those expected for T4a pilins and T4b pilins (175 amino acids) and has a leader peptide only 9 amino acids long. PilV is a little more T4b pilin-like, in that it is 189 amino acids long and has an 11 amino acid leader, though both of these values are less than might be predicted for a Gram-negative T4b pilin. It also appears that the T4P encoded by the primary T4P operon in C. difficile do not mediate motility, which is a characteristic associated with Gram-negative T4bP. Overall, it seems that the T4P of C. difficile have characteristics associated with both the T4aP and T4bP of Gram-negative species. This is not necessarily unexpected, as presumably all Gram-negative T4P (both T4a and T4b types) and Gram-positive T4P evolved from a common ancestor. Certain Gram-positive T4P have been specifically identified as being of T4b-type (O'Connell Motherway et al., 2011); arguably, given that T4bP are often horizontally transferred between species, one would expect at least some Gram-positive species to acquire them from Gram-negatives, where they were initially identified and appear to be primarily found. Overall it would have been surprising if Gram-positive T4P in general neatly fitted into either the T4aP or the T4bP box. They should instead be considered as a third type of T4P in their own right.
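
The length and leader-peptide comparisons made in this section can be summarised in a short Python sketch. The reference values are those quoted above (about 150 amino acids with a leader of fewer than 10 residues for T4a pilins; 50-60 or about 200 amino acids with a leader of 15-30 residues for T4b pilins). The 15-residue tolerance around the typical lengths, the function name and the category labels are assumptions made purely for illustration and do not represent an established classification scheme.

```python
def classify_pilin(length_aa, leader_aa, tol=15):
    """Return (length_call, leader_call) for a pilin.

    tol is an assumed tolerance (in residues) around the typical
    lengths of ~150 aa (T4a) and ~200 aa (long T4b) quoted in the text.
    """
    # Length criterion: T4a ~150 aa; T4b either 50-60 aa or ~200 aa.
    if abs(length_aa - 150) <= tol:
        length_call = "T4a-like"
    elif 50 <= length_aa <= 60 or abs(length_aa - 200) <= tol:
        length_call = "T4b-like"
    else:
        length_call = "intermediate"

    # Leader peptide criterion: T4a < 10 residues; T4b 15-30 residues.
    if leader_aa < 10:
        leader_call = "T4a-like"
    elif 15 <= leader_aa <= 30:
        leader_call = "T4b-like"
    else:
        leader_call = "intermediate"

    return length_call, leader_call


# Lengths and leader lengths as given in the text for C. difficile 630.
pilins = {"PilA1": (171, 9), "PilU": (175, 9), "PilV": (189, 11)}
for name, (length_aa, leader_aa) in pilins.items():
    print(name, classify_pilin(length_aa, leader_aa))
```

Under these assumed cut-offs, PilA1 and PilU have T4a-like leaders but intermediate lengths, while PilV has a T4b-like length but an intermediate leader, which reflects the mixed picture described above.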

7.3 C. difficile as a Model for Investigation of Gram-Positive Type IV Pili

One of the stated aims of this project was to develop the T4P of C. difficile as a model of Gram-positive T4P. On the face of it, this may sound odd. Obviously, C. difficile is an obligate anaerobe, requiring specialist equipment to work with. Furthermore, Clostridia generally, and possibly C. difficile in particular, are challenging to work with. However, I would venture that the idea was more plausible than it sounds, though of course, the issue of its being anaerobic and requiring specialist equipment is unalterable. Firstly, though C. difficile is a human pathogen, it is only category 2 and generally poses very little threat to young, healthy researchers. Many other Gram-positive species known to encode T4P, particularly Clostridial species, are or have the potential to be considerably more dangerous to work with (e.g. C. perfringens, C. sordellii etc.). Furthermore, due to recent advances in genetic techniques, strain 630 in particular is now relatively straightforward to work with. These facts supported the potential use of C. difficile T4P as a model. Certain of my results offer further support for using the primary T4P of C. difficile as a model for Gram-positive T4P. For instance, the fact that the primary T4P operon appears to operate independently of the secondary T4P gene cluster is beneficial for this, as obviously if genes from this secondary cluster operated in a redundant fashion to those of the primary cluster, this would significantly increase the complexity of the system. Also, the facts that synthesis of T4P is relatively easy to assay and that T4P mutants are easy to complement are beneficial, as is the fact that it is straightforward to raise antibodies against at least the pilins encoded by the cluster, these antibodies having high affinity and specificity for their targets. The ease with which the expression of these T4P may be manipulated is also advantageous.
However, apart from its requirement for anaerobic conditions, the central difficulty in using the primary T4P of C. difficile as a model for Gram-positive T4P is that there is no simple assay which can be used to identify whether the pili are functional. The only functions currently associated with the primary T4P of C. difficile are aggregation and biofilm formation. However, as discussed in Chapter 5, biofilm assays for use with C. difficile are at best challenging to perform, and at worst impossible to obtain accurate results from. Aggregation assays, on the other hand, though straightforward to perform, can sometimes produce somewhat variable results. Also, the fact that the majority of aggregation seen in C. difficile in response to increased levels of c-di-GMP is not T4P-related counts against using this as a method to identify active T4P. Though the assay may be sufficient to use in research on C. difficile, for a model system one would prefer a T4P system whose functionality can be easily assayed, using an assay which gives a simple yes/no answer for activity rather than requiring statistical analysis and giving results which are not always clear-cut. In this regard, the T4P of Streptococcus sanguinis are ideal, in that they mediate twitching motility. It is obvious from their colony morphology when these T4P are functional, giving an easy and reliable test for T4P functionality (Gurung et al., 2016). S. sanguinis is also highly genetically tractable. Thus, although the T4P of S. sanguinis do not appear ideal to function as a model, given that they have the very unusual trait of possessing two major pilins, and containing within the T4P gene cluster a number of genes of unknown function which may or may not be T4P-related, for the moment these T4P may be a preferable T4P system to use as a model for Gram-positive T4P.

7.4 Conclusions in Relation to C. sordellii and Future Work

Though the majority of this thesis focuses on the T4P of C. difficile, it may be that the most important work performed was that on C. sordellii. As described in Chapter 6, I have herein demonstrated that the C. sordellii cytotoxins TcsL and TcsH are encoded on a plasmid, known as pCS1. This plasmid appears to be unstable: at least one strain of C. sordellii has been shown to have lost a plasmid encoding TcsL during culture since it was isolated last decade, so unless multiple different types of TcsL-encoding plasmids exist, pCS1-type plasmids can be assumed to be (or to have the potential to be) unstable. Either way, it is clear that the tcsL gene, when present in a strain, is located on a mobile element which may easily be lost. This discovery directly impacts upon the methods used in the diagnosis of C. sordellii disease. Such infections have previously been most commonly diagnosed by PCR on isolated, cultured pathogen, targeting tcsL. However, this has resulted in highly virulent infections being improperly diagnosed (Bouvet et al., 2015), apparently because the infecting strain loses the pCS1 plasmid during culture. Having demonstrated that tcsL is plasmid-localised, and that this plasmid is unstable, this study demonstrates that amplification of tcsL is an unsuitable method for C. sordellii identification. On the other hand, several C. sordellii genes have been identified which are distinctive to the species. For instance, the sordellilysin gene (sdl) is specific to C. sordellii and, based on our analysis of 44 strains, appears to be universally present on the C. sordellii chromosome. Amplification of a gene such as sdl would thus appear to be a vastly superior method of identification of the species, which I would recommend be used in clinical settings. It is therefore hoped that this thesis can contribute to improved and faster diagnosis of this terrible disease.

As described in Chapters 1 and 6, in addition to the LCCs TcsL and TcsH, C. sordellii strains tend to encode a large number of other putative exotoxins, including Sordellilysin, NanS, Csp, ColA and possibly others. However, as mentioned, very little work has been done to investigate the roles of any of these proteins in C. sordellii disease. Further work will inevitably and necessarily be undertaken on TcsL, it apparently being the key virulence factor of the species. It is to be hoped that work will also be undertaken to elucidate the roles of TcsH (about which little is currently known) and other exotoxins such as those mentioned above. Several other possible points of interest aside from the toxins have also been identified in the genomes of various of the C. sordellii strains sequenced, as described by Couchman et al. (2015), such as putative lantibiotic production genes and genes predicted to be involved in immune evasion. Research into these genes and others would also be very interesting and possibly medically relevant.

Bibliography

Abbot, E.L., Smith, W.D., Siou, G.P., Chiriboga, C., Smith, R.J., Wilson, J.A., Hirst, B.H., and Kehoe, M.A. (2007). Pili mediate specific adhesion of Streptococcus pyogenes to human tonsil and skin. Cell Microbiol 9, 1822-1833. Abdulla, A., and Yee, L. (2000). The clinical spectrum of Clostridium sordellii bacteraemia: Two case reports and a review of the literature. J Clin Pathol 53, 709-712. Adamu, B.O., and Lawley, T.D. (2013). Bacteriotherapy for the treatment of intestinal dysbiosis caused by Clostridium difficile infection. Curr Opin Microbiol 16, 596-601. Aktories, K., and Wegner, A. (1992). Mechanisms of the cytopathic action of actin-ADP-ribosylating toxins. Mol Microbiol 6, 2905-2908.

Aldape, M.J., Bryant, A.E., Ma, Y., and Stevens, D.L. (2007). The leukemoid reaction in Clostridium sordellii infection: Neuraminidase induction of promyelocytic cell proliferation. J Infect Dis 195, 1838-1845. Aldape, M.J., Bryant, A.E., and Stevens, D.L. (2006). Clostridium sordellii infection: Epidemiology, clinical findings, and current perspectives on diagnosis and treatment. Clin Infect Dis 43, 1436-1446. Anantha, R.P., Stone, K.D., and Donnenberg, M.S. (2000). Effects of bfp mutations on biogenesis of functional enteropathogenic Escherichia coli Type IV pili. J Bacteriol 182, 2498-2506. Anderson, T.F. (1949). The nature of the bacterial surface. (Blackwell, Oxford).

Aronoff, D.M. (2013). Clostridium novyi, sordellii, and tetani: Mechanisms of disease. Anaerobe 24, 98-101. Aronoff, D.M., Hao, Y., Chung, J., Coleman, N., Lewis, C., Peres, C.M., Serezani, C.H., Chen, G.H., Flamand, N., Brock, T.G., et al. (2008). Misoprostol impairs female reproductive tract innate immunity against Clostridium sordellii. J Immunol 180, 8222-8230. Ayers, M., Sampaleanu, L.M., Tammam, S., Koo, J., Harvey, H., Howell, P.L., and Burrows, L.L. (2009). PilM/N/O/P proteins form an inner membrane complex that affects the stability of the Pseudomonas aeruginosa Type IV pilus secretin. J Mol Biol 394, 128-142. Barnes, P., and Leedom, J.M. (1987). Infective endocarditis due to Clostridium sordellii. Am J Med 83, 605.

Barocchi, M.A., Ries, J., Zogaj, X., Hemsley, C., Albiger, B., Kanth, A., Dahlberg, S., Fernebro, J., Moschioni, M., Masignani, V., et al. (2006). A pneumococcal pilus influences virulence and host inflammatory responses. Proc Natl Acad Sci U S A 103, 2857-2862. Barrick, J.E., and Breaker, R.R. (2007). The distributions, mechanisms, and structures of metabolite-binding riboswitches. Genome Biol 8, R239.

Bartlett, J.G., Moon, N., Chang, T.W., Taylor, N., and Onderdonk, A.B. (1978). Role of Clostridium difficile in antibiotic-associated pseudomembranous colitis. Gastroenterology 75, 778-782. Belete, B., Lu, H., and Wozniak, D.J. (2008). Pseudomonas aeruginosa AlgR regulates Type IV pilus biosynthesis by activating transcription of the fimU-pilVWXY1Y2E operon. J Bacteriol 190, 2023-2030. Berg, H.C. (2003). The rotary motor of bacterial flagella. Annu Rev Biochem 72, 19-54. Berry, J.L., Cehovin, A., McDowell, M.A., Lea, S.M., and Pelicic, V. (2013). Functional analysis of the interdependence between DNA uptake sequence and its cognate ComP receptor during natural transformation in Neisseria species. PLoS Genet 9, e1004014. Berry, J.L., and Pelicic, V. (2015). Exceptionally widespread nanomachines composed of Type IV pilins: The prokaryotic Swiss Army knives. FEMS Microbiol Rev 39, 134-154. Berry, J.L., Phelan, M.M., Collins, R.F., Adomavicius, T., Tønjum, T., Frye, S.A., Bird, L., Owens, R., Ford, R.C., Lian, L.Y., et al. (2012). Structure and assembly of a trans-periplasmic channel for Type IV pili in Neisseria meningitidis. PLoS Pathog 8, e1002923. Bian, Z., and Normark, S. (1997). Nucleator function of CsgB for the assembly of adhesive surface organelles in Escherichia coli. EMBO J 16, 5827-5836. Bidet, P., Barbut, F., Lalande, V., Burghoffer, B., and Petit, J.C. (1999). Development of a new PCR-ribotyping method for Clostridium difficile based on ribosomal RNA gene sequencing. FEMS Microbiol Lett 175, 261-266. Bieber, D., Ramer, S.W., Wu, C.Y., Murray, W.J., Tobe, T., Fernandez, R., and Schoolnik, G.K. (1998). Type IV pili, transient bacterial aggregates, and virulence of enteropathogenic Escherichia coli. Science 280, 2114-2118. Bignell, C., and Thomas, C.M. (2001). The bacterial ParA-ParB partitioning proteins. J Biotechnol 91, 1-34. Black, W.P., Xu, Q., and Yang, Z. (2006). Type IV pili function upstream of the Dif chemotaxis pathway in Myxococcus xanthus EPS regulation. Mol Microbiol 61, 447-456. Blank, T.E., and Donnenberg, M.S. (2001). Novel topology of BfpE, a cytoplasmic membrane protein required for Type IV fimbrial biogenesis in enteropathogenic Escherichia coli. J Bacteriol 183, 4435-4450. Blank, T.E., Zhong, H., Bell, A.L., Whittam, T.S., and Donnenberg, M.S. (2000). Molecular variation among Type IV pilin (bfpA) genes from diverse enteropathogenic Escherichia coli strains. Infect Immun 68, 7028-7038. Bordeleau, E., Fortier, L.C., Malouin, F., and Burrus, V. (2011). C-di-GMP turn-over in Clostridium difficile is controlled by a plethora of diguanylate cyclases and phosphodiesterases. PLoS Genet 7, e1002039.

Bordeleau, E., Purcell, E.B., Lafontaine, D.A., Fortier, L.C., Tamayo, R., and Burrus, V. (2015). Cyclic di-GMP riboswitch-regulated Type IV pili contribute to aggregation of Clostridium difficile. J Bacteriol 197, 819-832. Borriello, S.P., Davies, H.A., Kamiya, S., Reed, P.J., and Seddon, S. (1990). Virulence factors of Clostridium difficile. Rev Infect Dis 12 Suppl 2, S185-191. Bouvet, P., Sautereau, J., Le Coustumier, A., Mory, F., Bouchier, C., and Popoff, M.R. (2015). Foot infection by Clostridium sordellii: Case report and review of 15 cases in France. J Clin Microbiol 53, 1423-1427. Bradley, D.E. (1972). Shortening of Pseudomonas aeruginosa pili after RNA-phage adsorption. J Gen Microbiol 72, 303-319. Braun, V., Hundsberger, T., Leukel, P., Sauerborn, M., and von Eichel-Streiber, C. (1996). Definition of the single integration site of the pathogenicity locus in Clostridium difficile. Gene 181, 29-38. Breaker, R.R. (2008). Complex riboswitches. Science 319, 1795-1797. Brinton, C.C. (1959). Non-flagellar appendages of bacteria. Nature 183, 782-786. Brouwer, M.S., Roberts, A.P., Hussain, H., Williams, R.J., Allan, E., and Mullany, P. (2013). Horizontal gene transfer converts non-toxigenic Clostridium difficile strains into toxin producers. Nat Commun 4, 2601. Brown, D.R., Helaine, S., Carbonnelle, E., and Pelicic, V. (2010). Systematic functional analysis reveals that a set of seven genes is involved in fine-tuning of the multiple functions mediated by Type IV pili in Neisseria meningitidis. Infect Immun 78, 3053-3063.

Brown, N.P., Leroy, C., and Sander, C. (1998). MView: A web-compatible database search or multiple alignment viewer. Bioinformatics 14, 380-381. Bucior, I., Pielage, J.F., and Engel, J.N. (2012). Pseudomonas aeruginosa pili and flagella mediate distinct binding and signaling events at the apical and basolateral surface of airway epithelium. PLoS Pathog 8, e1002616. Burrows, L.L. (2005). Weapons of mass retraction. Mol Microbiol 57, 878-888. Burrows, L.L. (2012). Pseudomonas aeruginosa twitching motility: Type IV pili in action. Annu Rev Microbiol 66, 493-520. Busch, A., Phan, G., and Waksman, G. (2015). Molecular mechanism of bacterial Type 1 and P pili assembly. Philos Trans A Math Phys Eng Sci 373. Carbonnelle, E., Helaine, S., Nassif, X., and Pelicic, V. (2006). A systematic genetic analysis in Neisseria meningitidis defines the Pil proteins required for assembly, functionality, stabilization and export of Type IV pili. Mol Microbiol 61, 1510-1522.

Carbonnelle, E., Hélaine, S., Prouvensier, L., Nassif, X., and Pelicic, V. (2005). Type IV pilus biogenesis in Neisseria meningitidis: PilW is involved in a step occurring after pilus assembly, essential for fibre stability and function. Mol Microbiol 55, 54-64. Carter, G.P., Awad, M.M., Hao, Y., Thelen, T., Bergin, I.L., Howarth, P.M., Seemann, T., Rood, J.I., Aronoff, D.M., and Lyras, D. (2011). TcsL is an essential virulence factor in Clostridium sordellii ATCC 9714. Infect Immun 79, 1025-1032. Cartman, S.T., Kelly, M.L., Heeg, D., Heap, J.T., and Minton, N.P. (2012). Precise manipulation of the Clostridium difficile chromosome reveals a lack of association between the tcdC genotype and toxin production. Appl Environ Microbiol 78, 4683-4690. Cartman, S.T., and Minton, N.P. (2010). A mariner-based transposon system for in vivo random mutagenesis of Clostridium difficile. Appl Environ Microbiol 76, 1103-1109.

CDC-Press-Release (2015). Nearly half a million Americans suffered from Clostridium difficile infections in a single year. (http://www.cdc.gov/media/releases/2015/p0225-clostridium-difficile.html). Cehovin, A., Simpson, P.J., McDowell, M.A., Brown, D.R., Noschese, R., Pallett, M., Brady, J., Baldwin, G.S., Lea, S.M., Matthews, S.J., et al. (2013). Specific DNA recognition mediated by a Type IV pilin. Proc Natl Acad Sci U S A 110, 3065-3070. Chakraborty, S., Monfett, M., Maier, T.M., Benach, J.L., Frank, D.W., and Thanassi, D.G. (2008). Type IV pili in Francisella tularensis: roles of pilF and pilT in fiber assembly, host cell adherence, and virulence. Infect Immun 76, 2852-2861. Chaves-Olarte, E., Weidmann, M., Eichel-Streiber, C., and Thelestam, M. (1997). Toxins A and B from Clostridium difficile differ with respect to enzymatic potencies, cellular substrate specificities, and surface binding to cultured cells. J Clin Invest 100, 1734-1741. Chien, C.H., Shann, Y.J., and Sheu, S.Y. (1996). Site-directed mutations of the catalytic and conserved amino acids of the neuraminidase gene, nanH, of Clostridium perfringens ATCC 10543. Enzyme Microb Technol 19, 267-276. Clarke, T.B. (2014). Microbial programming of systemic innate immunity and resistance to infection. PLoS Pathog 10, e1004506. Cole, A. (2013). MRSA and C. difficile deaths continue to fall in England and Wales. BMJ 347, f5278. Comolli, J.C., Hauser, A.R., Waite, L., Whitchurch, C.B., Mattick, J.S., and Engel, J.N. (1999). Pseudomonas aeruginosa gene products PilT and PilU are required for cytotoxicity in vitro and virulence in a mouse model of acute pneumonia. Infect Immun 67, 3625-3630. Conrad, J.C., Gibiansky, M.L., Jin, F., Gordon, V.D., Motto, D.A., Mathewson, M.A., Stopka, W.G., Zelasko, D.C., Shrout, J.D., and Wong, G.C. (2011). Flagella and pili-mediated near-surface single-cell motility mechanisms in P. aeruginosa. Biophys J 100, 1608-1616.

Couchman, E.C., Browne, H.P., Dunn, M., Lawley, T.D., Songer, J.G., Hall, V., Petrovska, L., Vidor, C., Awad, M., Lyras, D., et al. (2015). Clostridium sordellii genome analysis reveals plasmid localized toxin genes encoded within pathogenicity loci. BMC Genomics 16, 392. Craig, L., and Li, J. (2008). Type IV pili: Paradoxes in form and function. Curr Opin Struct Biol 18, 267-277. Craig, L., Pique, M.E., and Tainer, J.A. (2004). Type IV pilus structure and bacterial pathogenicity. Nat Rev Microbiol 2, 363-378. Crowther, L.J., Anantha, R.P., and Donnenberg, M.S. (2004). The inner membrane subassembly of the enteropathogenic Escherichia coli bundle-forming pilus machine. Mol Microbiol 52, 67-79. Das, G., and Varshney, U. (2006). Peptidyl-tRNA hydrolase and its critical role in protein biosynthesis. Microbiology 152, 2191-2195. Dawson, L.F., Valiente, E., Faulds-Pain, A., Donahue, E.H., and Wren, B.W. (2012). Characterisation of Clostridium difficile biofilm formation, a role for Spo0A. PLoS One 7, e50527. Deakin, L.J., Clare, S., Fagan, R.P., Dawson, L.F., Pickard, D.J., West, M.R., Wren, B.W., Fairweather, N.F., Dougan, G., and Lawley, T.D. (2012). The Clostridium difficile spo0A gene is a persistence and transmission factor. Infect Immun 80, 2704-2711. Dembek, M., Barquist, L., Boinett, C.J., Cain, A.K., Mayho, M., Lawley, T.D., Fairweather, N.F., and Fagan, R.P. (2015). High-throughput analysis of gene essentiality and sporulation in Clostridium difficile. MBio 6, e02383.

Drake, S.L., Sandstedt, S.A., and Koomey, M. (1997). PilP, a pilus biogenesis lipoprotein in Neisseria gonorrhoeae, affects expression of PilQ as a high-molecular-mass multimer. Mol Microbiol 23, 657-668. Drudy, D., Fanning, S., and Kyne, L. (2007). Toxin A-negative, toxin B-positive Clostridium difficile. Int J Infect Dis 11, 5-10. Dupuy, B., Daube, G., Popoff, M.R., and Cole, S.T. (1997). Clostridium perfringens urease genes are plasmid borne. Infect Immun 65, 2313-2320. Dupuy, B., Govind, R., Antunes, A., and Matamouros, S. (2008). Clostridium difficile toxin synthesis is negatively regulated by TcdC. J Med Microbiol 57, 685-689. Ðapa, T., Leuzzi, R., Ng, Y.K., Baban, S.T., Adamo, R., Kuehne, S.A., Scarselli, M., Minton, N.P., Serruto, D., et al. (2013). Multiple factors modulate biofilm formation by the anaerobic pathogen Clostridium difficile. J Bacteriol 195, 545-555. Edgar, R.C. (2004). MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32, 1792-1797. Elsayed, S., and Zhang, K. (2004). Human infection caused by Clostridium hathewayi. Emerg Infect Dis 10, 1950-1952.

Emerson, J.E., Reynolds, C.B., Fagan, R.P., Shaw, H.A., Goulding, D., and Fairweather, N.F. (2009). A novel genetic switch controls phase variable expression of CwpV, a Clostridium difficile cell wall protein. Mol Microbiol 74, 541-556. Evans, K.J., Lambert, C., and Sockett, R.E. (2007). Predation by Bdellovibrio bacteriovorus HD100 requires Type IV pili. J Bacteriol 189, 4850-4859. Fagan, R.P., and Fairweather, N.F. (2011). Clostridium difficile has two parallel and essential Sec secretion systems. J Biol Chem 286, 27483-27493. Fagan, R.P., Janoir, C., Collignon, A., Mastrantonio, P., Poxton, I.R., and Fairweather, N.F. (2011). A proposed nomenclature for cell wall proteins of Clostridium difficile. J Med Microbiol 60, 1225-1228. File, T.M., Fass, R.J., and Perkins, R.L. (1977). Pneumonia and empyema caused by Clostridium sordellii. Am J Med Sci 274, 211-212. Fischer, M., Bhatnagar, J., Guarner, J., Reagan, S., Hacker, J.K., Van Meter, S.H., Poukens, V., Whiteman, D.B., Iton, A., Cheung, M., et al. (2005). Fatal toxic shock syndrome associated with Clostridium sordellii after medical abortion. N Engl J Med 353, 2352-2360. Fives-Taylor, P.M., and Thompson, D.W. (1985). Surface properties of Streptococcus sanguis FW213 mutants nonadherent to saliva-coated hydroxyapatite. Infect Immun 47, 752-759. Francetic, O., Buddelmeijer, N., Lewenza, S., Kumamoto, C.A., and Pugsley, A.P. (2007). Signal recognition particle-dependent inner membrane targeting of the PulG pseudopilin component of a Type II secretion system. J Bacteriol 189, 1783-1793. Francis, M.B., Allen, C.A., Shrestha, R., and Sorg, J.A. (2013). Bile acid recognition by the Clostridium difficile germinant receptor, CspC, is important for establishing infection. PLoS Pathog 9, e1003356. Friedrich, C., Bulyha, I., and Søgaard-Andersen, L. (2014). Outside-in assembly pathway of the Type IV pilus system in Myxococcus xanthus. J Bacteriol 196, 378-390. 
Fu, H.L., Valiathan, R.R., Arkwright, R., Sohail, A., Mihai, C., Kumarasiri, M., Mahasenan, K.V., Mobashery, S., Huang, P., Agarwal, G., et al. (2013). Discoidin domain receptors: Unique receptor tyrosine kinases in collagen-mediated signaling. J Biol Chem 288, 7430-7437. Fu, L., Niu, B., Zhu, Z., Wu, S., and Li, W. (2012). CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150-3152. Furci, L., Baldan, R., Bianchini, V., Trovato, A., Ossi, C., Cichero, P., and Cirillo, D.M. (2015). New role for human α-defensin 5 in the fight against hypervirulent Clostridium difficile strains. Infect Immun 83, 986-995. Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R.D., and Bairoch, A. (2003). ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res 31, 3784-3788.

Georgiadou, M., Castagnini, M., Karimova, G., Ladant, D., and Pelicic, V. (2012). Large-scale study of the interactions between proteins involved in Type IV pilus biology in Neisseria meningitidis: Characterization of a subcomplex involved in pilus assembly. Mol Microbiol 84, 857-873. Gerding, D.N., File, T.M., and McDonald, L.C. (2016). Diagnosis and treatment of Clostridium difficile infection. Infectious Diseases in Clinical Practice 24, 3-10. Gerding, D.N., Johnson, S., Rupnik, M., and Aktories, K. (2014). Clostridium difficile binary toxin CDT: Mechanism, epidemiology, and potential clinical importance. Gut Microbes 5, 15-27. Ghosh, A., and Albers, S.V. (2011). Assembly and function of the archaeal flagellum. Biochem Soc Trans 39, 64-69.

Giesemann, T., Guttenberg, G., and Aktories, K. (2008). Human alpha-defensins inhibit Clostridium difficile Toxin B. Gastroenterology 134, 2049-2058. Giltner, C.L., Habash, M., and Burrows, L.L. (2010). Pseudomonas aeruginosa minor pilins are incorporated into Type IV pili. J Mol Biol 398, 444-461. Giltner, C.L., Nguyen, Y., and Burrows, L.L. (2012). Type IV pilin proteins: Versatile molecular modules. Microbiol Mol Biol Rev 76, 740-772. Gold, V.A., Salzer, R., Averhoff, B., and Kühlbrandt, W. (2015). Structure of a Type IV pilus machinery in the open and closed state. Elife 4. Goldenberg, S.D., and French, G.L. (2011). Lack of association of tcdC type and binary toxin status with disease severity and outcome in toxigenic Clostridium difficile. J Infect 62, 355-362. Gough, E., Shaikh, H., and Manges, A.R. (2011). Systematic review of intestinal microbiota transplantation (fecal bacteriotherapy) for recurrent Clostridium difficile infection. Clin Infect Dis 53, 994-1002. Goulding, D., Thompson, H., Emerson, J., Fairweather, N.F., Dougan, G., and Douce, G.R. (2009). Distinctive profiles of infection and pathology in hamsters infected with Clostridium difficile strains 630 and B1. Infect Immun 77, 5478-5485. Govind, R., and Dupuy, B. (2012). Secretion of Clostridium difficile Toxins A and B requires the holin-like protein TcdE. PLoS Pathog 8, e1002727. Gredlein, C.M., Silverman, M.L., and Downey, M.S. (2000). Polymicrobial septic arthritis due to Clostridium species: Case report and review. Clin Infect Dis 30, 590-594. Griffiths, D., Fawley, W., Kachrimanidou, M., Bowden, R., Crook, D.W., Fung, R., Golubchik, T., Harding, R.M., Jeffery, K.J., Jolley, K.A., et al. (2010). Multilocus sequence typing of Clostridium difficile. J Clin Microbiol 48, 770-778.

Gurung, I., Spielman, I., Davies, M.R., Lala, R., Gaustad, P., Biais, N., and Pelicic, V. (2016). Functional analysis of an unusual Type IV pilus in the Gram-positive Streptococcus sanguinis. Mol Microbiol 99, 380-392. Hahn, H.P. (1997). The Type-4 pilus is the major virulence-associated adhesin of Pseudomonas aeruginosa - a review. Gene 192, 99-108. Hall, I.C., and Scott, J.P. (1927). Bacillus sordellii, a cause of malignant edema in man. J Infect Dis 41, 329-335. Hambrook, J., Titball, R., and Lindsay, C. (2004). The interaction of Pseudomonas aeruginosa PAK with human and animal respiratory tract cell lines. FEMS Microbiol Lett 238, 49-55. Hammond, G.A., and Johnson, J.L. (1995). The toxigenic element of Clostridium difficile strain VPI 10463. Microb Pathog 19, 203-213.

Hanawalt, P.C., and Spivak, G. (2008). Transcription-coupled DNA repair: Two decades of progress and surprises. Nat Rev Mol Cell Biol 9, 958-970. Hao, Y., Senn, T., Opp, J.S., Young, V.B., Thiele, T., Srinivas, G., Huang, S.K., and Aronoff, D.M. (2010). Lethal Toxin is a critical determinant of rapid mortality in rodent models of Clostridium sordellii endometritis. Anaerobe 16, 155-160. Hayashi, N., Nishizawa, H., Kitao, S., Deguchi, S., Nakamura, T., Fujimoto, A., Shikata, M., and Gotoh, N. (2015). Pseudomonas aeruginosa injects Type III effector ExoS into epithelial cells through the function of Type IV pili. FEBS Lett 589, 890-896. He, M., Sebaihia, M., Lawley, T.D., Stabler, R.A., Dawson, L.F., Martin, M.J., Holt, K.E., Seth-Smith, H.M., Quail, M.A., Rance, R., et al. (2010). Evolutionary dynamics of Clostridium difficile over short and long time scales. Proc Natl Acad Sci U S A 107, 7527-7532. Heap, J.T., Kuehne, S.A., Ehsaan, M., Cartman, S.T., Cooksley, C.M., Scott, J.C., and Minton, N.P. (2010). The ClosTron: Mutagenesis in Clostridium refined and streamlined. J Microbiol Methods 80, 49-55. Heap, J.T., Pennington, O.J., Cartman, S.T., Carter, G.P., and Minton, N.P. (2007). The ClosTron: A universal gene knock-out system for the genus Clostridium. J Microbiol Methods 70, 452-464. Heiniger, R.W., Winther-Larsen, H.C., Pickles, R.J., Koomey, M., and Wolfgang, M.C. (2010). Infection of human mucosal tissue by Pseudomonas aeruginosa requires sequential and mutually dependent virulence factors and a novel pilus-associated adhesin. Cell Microbiol 12, 1158-1173. Heinlen, L., and Ballard, J.D. (2010). Clostridium difficile infection. Am J Med Sci 340, 247-252. Hélaine, S., Carbonnelle, E., Prouvensier, L., Beretti, J.L., Nassif, X., and Pelicic, V. (2005). PilX, a pilus-associated protein essential for bacterial aggregation, is a key to pilus-facilitated attachment of Neisseria meningitidis to human cells. Mol Microbiol 55, 65-77.

Henriksen, S.D., and Henrichsen, J. (1975). Twitching motility and possession of polar fimbriae in spreading Streptococcus sanguis isolates from the human throat. Acta Pathol Microbiol Scand B 83, 133-140. Hilleringmann, M., Giusti, F., Baudner, B.C., Masignani, V., Covacci, A., Rappuoli, R., Barocchi, M.A., and Ferlenghi, I. (2008). Pneumococcal pili are composed of protofilaments exposing adhesive clusters of RrgA. PLoS Pathog 4, e1000026. Hodgkinson, J.L., Horsley, A., Stabat, D., Simon, M., Johnson, S., da Fonseca, P.C., Morris, E.P., Wall, J.S., Lea, S.M., and Blocker, A.J. (2009). Three-dimensional reconstruction of the Shigella T3SS transmembrane regions reveals 12-fold symmetry and novel features throughout. Nat Struct Mol Biol 16, 477-485. Hope, R. (2015). Surveillance of Clostridium difficile. P.H. England, ed.

Høiby, N., Ciofu, O., and Bjarnsholt, T. (2010). Pseudomonas aeruginosa biofilms in cystic fibrosis. Future Microbiol 5, 1663-1674. Huang, B., Whitchurch, C.B., and Mattick, J.S. (2003). FimX, a multidomain protein connecting environmental signals to twitching motility in Pseudomonas aeruginosa. J Bacteriol 185, 7068-7076. Hussain, H.A., Roberts, A.P., and Mullany, P. (2005). Generation of an erythromycin-sensitive derivative of Clostridium difficile strain 630 (630Δerm) and demonstration that the conjugative transposon Tn916ΔE enters the genome of this strain at multiple sites. J Med Microbiol 54, 137-141. Hyatt, D., Chen, G.L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. (2010). Prodigal: Prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119. Hyyryläinen, H.L., Marciniak, B.C., Dahncke, K., Pietiäinen, M., Courtin, P., Vitikainen, M., Seppala, R., Otto, A., Becher, D., Chapot-Chartier, M.P., et al. (2010). Penicillin-binding protein folding is dependent on the PrsA peptidyl-prolyl cis-trans isomerase in Bacillus subtilis. Mol Microbiol 77, 108-127. Imam, S., Chen, Z., Roos, D.S., and Pohlschröder, M. (2011). Identification of surprisingly diverse Type IV pili, across a broad range of Gram-positive bacteria. PLoS One 6, e28919. Izoré, T., Contreras-Martel, C., El Mortaji, L., Manzano, C., Terrasse, R., Vernet, T., Di Guilmi, A.M., and Dessen, A. (2010). Structural basis of host cell recognition by the pilus adhesin from Streptococcus pneumoniae. Structure 18, 106-115.

Jain, R., Behrens, A.J., Kaever, V., and Kazmierczak, B.I. (2012). Type IV pilus assembly in Pseudomonas aeruginosa over a broad range of cyclic di-GMP concentrations. J Bacteriol 194, 4285-4294. Johnston, C., Martin, B., Fichant, G., Polard, P., and Claverys, J.P. (2014). Bacterial transformation: Distribution, shared mechanisms and divergent control. Nat Rev Microbiol 12, 181-196.

Just, I., and Gerhard, R. (2004). Large Clostridial cytotoxins. Rev Physiol Biochem Pharmacol 152, 23-47. Käll, L., Krogh, A., and Sonnhammer, E.L. (2007). Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server. Nucleic Acids Res 35, W429-432. Kang, H.J., and Baker, E.N. (2012). Structure and assembly of Gram-positive bacterial pili: Unique covalent polymers. Curr Opin Struct Biol 22, 200-207. Kang, H.J., Coulibaly, F., Clow, F., Proft, T., and Baker, E.N. (2007). Stabilizing isopeptide bonds revealed in Gram-positive bacterial pilus structure. Science 318, 1625-1628. Karasawa, T., Wang, X., Maegawa, T., Michiwa, Y., Kita, H., Miwa, K., and Nakamura, S. (2003). Clostridium sordellii phospholipase C: Gene cloning and comparison of enzymatic and biological activities with those of Clostridium perfringens and Clostridium bifermentans phospholipase C. Infect Immun 71, 641-646. Karimova, G., Pidoux, J., Ullmann, A., and Ladant, D. (1998). A bacterial two-hybrid system based on a reconstituted signal transduction pathway. Proc Natl Acad Sci U S A 95, 5752-5756. Karimova, G., Ullmann, A., and Ladant, D. (2001). Protein-protein interaction between Bacillus stearothermophilus tyrosyl-tRNA synthetase subdomains revealed by a bacterial two-hybrid system. J Mol Microbiol Biotechnol 3, 73-82. Karuppiah, V., Collins, R.F., Thistlethwaite, A., Gao, Y., and Derrick, J.P. (2013). Structure and assembly of an inner membrane platform for initiation of Type IV pilus biogenesis. Proc Natl Acad Sci U S A 110, E4638-4647. Karuppiah, V., and Derrick, J.P. (2011). Structure of the PilM-PilN inner membrane Type IV pilus biogenesis complex from Thermus thermophilus. J Biol Chem 286, 24434-24442. Kazmierczak, B.I., Lebron, M.B., and Murray, T.S. (2006). Analysis of FimX, a phosphodiesterase that governs twitching motility in Pseudomonas aeruginosa. Mol Microbiol 60, 1026-1043. Kearns, D.B., Robinson, J., and Shimkets, L.J. (2001). Pseudomonas aeruginosa exhibits directed twitching motility up phosphatidylethanolamine gradients. J Bacteriol 183, 763-767. Kelley, L.A., and Sternberg, M.J. (2009). Protein structure prediction on the Web: A case study using the Phyre server. Nat Protoc 4, 363-371.

Khayatan, B., Meeks, J.C., and Risser, D.D. (2015). Evidence that a modified Type IV pilus-like system powers gliding motility and polysaccharide secretion in filamentous cyanobacteria. Mol Microbiol 98, 1021-1036. Klausen, M., Heydorn, A., Ragas, P., Lambertsen, L., Aaes-Jørgensen, A., Molin, S., and Tolker-Nielsen, T. (2003). Biofilm formation by Pseudomonas aeruginosa wild type, flagella and Type IV pili mutants. Mol Microbiol 48, 1511-1524.

Koo, J., Tammam, S., Ku, S.Y., Sampaleanu, L.M., Burrows, L.L., and Howell, P.L. (2008). PilF is an outer membrane lipoprotein required for multimerization and localization of the Pseudomonas aeruginosa Type IV pilus secretin. J Bacteriol 190, 6961-6969. Korea, C.G., Ghigo, J.M., and Beloin, C. (2011). The sweet connection: Solving the riddle of multiple sugar-binding fimbrial adhesins in Escherichia coli: Multiple E. coli fimbriae form a versatile arsenal of sugar-binding lectins potentially involved in surface-colonisation and tissue tropism. Bioessays 33, 300-311. Korotkov, K.V., and Hol, W.G. (2008). Structure of the GspK-GspI-GspJ complex from the enterotoxigenic Escherichia coli Type 2 secretion system. Nat Struct Mol Biol 15, 462-468. Korotkov, K.V., Sandkvist, M., and Hol, W.G. (2012). The Type II secretion system: biogenesis, molecular architecture and mechanism. Nat Rev Microbiol 10, 336-351.

Kovacs-Simon, A., Leuzzi, R., Kasendra, M., Minton, N., Titball, R.W., and Michell, S.L. (2014). Lipoprotein CD0873 is a novel adhesin of Clostridium difficile. J Infect Dis 210, 274-284. Kreikemeyer, B., Nakata, M., Oehmcke, S., Gschwendtner, C., Normann, J., and Podbielski, A. (2005). Streptococcus pyogenes collagen Type I-binding Cpa surface protein. Expression profile, binding characteristics, biological functions, and potential clinical impact. J Biol Chem 280, 33228-33239. Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L. (2001). Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J Mol Biol 305, 567-580.

Kuehne, S.A., Collery, M.M., Kelly, M.L., Cartman, S.T., Cockayne, A., and Minton, N.P. (2014). Importance of Toxin A, Toxin B, and CDT in virulence of an epidemic Clostridium difficile strain. J Infect Dis 209, 83-86. Kuipers, E.J., and Surawicz, C.M. (2008). Clostridium difficile infection. Lancet 371, 1486-1488. Laarman, A.J., Ruyken, M., Malone, C.L., van Strijp, J.A., Horswill, A.R., and Rooijakkers, S.H. (2011). Staphylococcus aureus metalloprotease Aureolysin cleaves Complement C3 to mediate immune evasion. J Immunol 186, 6445-6453. LaFrance, M.E., Farrow, M.A., Chandrasekaran, R., Sheng, J., Rubin, D.H., and Lacy, D.B. (2015). Identification of an epithelial cell receptor responsible for Clostridium difficile TcdB-induced cytotoxicity. Proc Natl Acad Sci U S A 112, 7073-7078.

Lambert, C., Hobley, L., Chang, C.Y., Fenton, A., Capeness, M., and Sockett, L. (2009). A predatory patchwork: Membrane and surface structures of Bdellovibrio bacteriovorus. Adv Microb Physiol 54, 313-361. LaPointe, C.F., and Taylor, R.K. (2000). The Type 4 prepilin peptidases comprise a novel family of aspartic acid proteases. J Biol Chem 275, 1502-1510.

Lawley, T.D., Clare, S., Walker, A.W., Stares, M.D., Connor, T.R., Raisen, C., Goulding, D., Rad, R., Schreiber, F., Brandt, C., et al. (2012). Targeted restoration of the intestinal microbiota with a simple, defined bacteriotherapy resolves relapsing Clostridium difficile disease in mice. PLoS Pathog 8, e1002995. Lee, E.R., Baker, J.L., Weinberg, Z., Sudarsan, N., and Breaker, R.R. (2010). An allosteric self-splicing ribozyme triggered by a bacterial second messenger. Science 329, 845-848. Leighton, T.L., Buensuceso, R.N., Howell, P.L., and Burrows, L.L. (2015). Biogenesis of Pseudomonas aeruginosa Type IV pili and regulation of their function. Environ Microbiol 17, 4148-4163. Lessa, F.C., Gould, C.V., and McDonald, L.C. (2012). Current status of Clostridium difficile infection epidemiology. Clin Infect Dis 55 Suppl 2, S65-70.

Lessa, F.C., Mu, Y., Bamberg, W.M., Beldavs, Z.G., Dumyati, G.K., Dunn, J.R., Farley, M.M., Holzbauer, S.M., Meek, J.I., Phipps, E.C., et al. (2015). Burden of Clostridium difficile infection in the United States. N Engl J Med 372, 825-834. Lewis, C.J., and Naylor, R.D. (1998). Sudden death in sheep associated with Clostridium sordellii. Vet Rec 142, 417-421. Lieberman, J.A., Petro, C.D., Thomas, S., Yang, A., and Donnenberg, M.S. (2015). Type IV pilus secretins have extracellular C termini. MBio 6, e00322-15. Linke, C., Young, P.G., Kang, H.J., Bunker, R.D., Middleditch, M.J., Caradoc-Davies, T.T., Proft, T., and Baker, E.N. (2010). Crystal structure of the minor pilin FctB reveals determinants of Group A streptococcal pilus anchoring. J Biol Chem 285, 20381-20389.

Maldarelli, G.A., De Masi, L., von Rosenvinge, E.C., Carter, M., and Donnenberg, M.S. (2014). Identification, immunogenicity, and cross-reactivity of Type IV pilin and pilin-like proteins from Clostridium difficile. Pathog Dis 71, 302-314. Manetti, A.G., Zingaretti, C., Falugi, F., Capo, S., Bombaci, M., Bagnoli, F., Gambellini, G., Bensi, G., Mora, M., Edwards, A.M., et al. (2007). Streptococcus pyogenes pili promote pharyngeal cell adhesion and biofilm formation. Mol Microbiol 64, 968-983. Mani, N., and Dupuy, B. (2001). Regulation of toxin synthesis in Clostridium difficile by an alternative RNA polymerase sigma factor. Proc Natl Acad Sci U S A 98, 5844-5849. Manteca, C., Daube, G., Pirson, V., Limbourg, B., Kaeckenbeeck, A., and Mainil, J.G. (2001). Bacterial intestinal flora associated with enterotoxaemia in Belgian Blue calves. Vet Microbiol 81, 21-32. Martin, P.R., Watson, A.A., McCaul, T.F., and Mattick, J.S. (1995). Characterization of a five-gene cluster required for the biogenesis of Type 4 fimbriae in Pseudomonas aeruginosa. Mol Microbiol 16, 497-508. Martinez, R.D., and Wilkins, T.D. (1992). Comparison of Clostridium sordellii toxins HT and LT with Toxins A and B of C. difficile. J Med Microbiol 36, 30-36.

Matthey, N., and Blokesch, M. (2016). The DNA-uptake process of naturally competent Vibrio cholerae. Trends Microbiol 24, 98-110. Mattick, J.S. (2002). Type IV pili and twitching motility. Annu Rev Microbiol 56, 289-314. Mayer, M.J., Narbad, A., and Gasson, M.J. (2008). Molecular characterization of a Clostridium difficile bacteriophage and its cloned biologically active endolysin. J Bacteriol 190, 6734-6740. McDonald, L.C., Killgore, G.E., Thompson, A., Owens, R.C., Kazakova, S.V., Sambol, S.P., Johnson, S., and Gerding, D.N. (2005). An epidemic, toxin gene-variant strain of Clostridium difficile. N Engl J Med 353, 2433-2441. McKee, R.W., Mangalea, M.R., Purcell, E.B., Borchardt, E.K., and Tamayo, R. (2013). The second messenger cyclic di-GMP regulates Clostridium difficile toxin production by controlling expression of sigD. J Bacteriol 195, 5174-5185.

Mellin, J.R., and Cossart, P. (2015). Unexpected versatility in bacterial riboswitches. Trends Genet 31, 150-156. Melville, S., and Craig, L. (2013). Type IV pili in Gram-positive bacteria. Microbiol Mol Biol Rev 77, 323-341. Mendez, M., Huang, I.H., Ohtani, K., Grau, R., Shimizu, T., and Sarker, M.R. (2008). Carbon catabolite repression of Type IV pilus-dependent gliding motility in the anaerobic pathogen Clostridium perfringens. J Bacteriol 190, 48-60. Menendez, A., Willing, B.P., Montero, M., Wlodarska, M., So, C.C., Bhinder, G., Vallance, B.A., and Finlay, B.B. (2013). Bacterial stimulation of the TLR-MyD88 pathway modulates the homeostatic expression of ileal Paneth cell α-defensins. J Innate Immun 5, 39-49.

Merz, A.J., So, M., and Sheetz, M.P. (2000). Pilus retraction powers bacterial twitching motility. Nature 407, 98-102. Mesnage, S., Fontaine, T., Mignot, T., Delepierre, M., Mock, M., and Fouet, A. (2000). Bacterial SLH domain proteins are non-covalently anchored to the cell surface via a conserved mechanism involving wall polysaccharide pyruvylation. EMBO J 19, 4473-4484. Mikkelsen, H., Sivaneson, M., and Filloux, A. (2011). Key two-component regulatory systems that control biofilm formation in Pseudomonas aeruginosa. Environ Microbiol 13, 1666-1681. Monot, M., Eckert, C., Lemire, A., Hamiot, A., Dubois, T., Tessier, C., Dumoulard, B., Hamel, B., Petit, A., Lalande, V., et al. (2015). Clostridium difficile: New insights into the evolution of the pathogenicity locus. Sci Rep 5, 15023.

Morand, P.C., Drab, M., Rajalingam, K., Nassif, X., and Meyer, T.F. (2009). Neisseria meningitidis differentially controls host cell motility through PilC1 and PilC2 components of Type IV Pili. PLoS One 4, e6834. Mullane, K. (2014). Fidaxomicin in Clostridium difficile infection: Latest evidence and clinical guidance. Ther Adv Chronic Dis 5, 69-84.

Mullany, P., Wilks, M., Lamb, I., Clayton, C., Wren, B., and Tabaqchali, S. (1990). Genetic analysis of a tetracycline resistance element from Clostridium difficile and its conjugal transfer to and from Bacillus subtilis. J Gen Microbiol 136, 1343-1349. Nakamura, S., Shimamura, T., and Nishida, S. (1976). Urease-negative strains of Clostridium sordellii. Can J Microbiol 22, 673-676. Natarajan, M., Walk, S.T., Young, V.B., and Aronoff, D.M. (2013). A clinical and epidemiological review of non-toxigenic Clostridium difficile. Anaerobe 22, 1-5. Ng, T.W., Akman, L., Osisami, M., and Thanassi, D.G. (2004). The usher N terminus is the initial targeting site for chaperone-subunit complexes and participates in subsequent pilus biogenesis events. J Bacteriol 186, 5321-5331. Ng, Y.K., Ehsaan, M., Philip, S., Collery, M.M., Janoir, C., Collignon, A., Cartman, S.T., and Minton, N.P. (2013). Expanding the repertoire of gene tools for precise manipulation of the Clostridium difficile genome: Allelic exchange using pyrE alleles. PLoS One 8, e56051. Nguyen, Y., Sugiman-Marangos, S., Harvey, H., Bell, S.D., Charlton, C.L., Junop, M.S., and Burrows, L.L. (2015). Pseudomonas aeruginosa minor pilins prime Type IVa pilus assembly and promote surface display of the PilY1 adhesin. J Biol Chem 290, 601-611. Nivaskumar, M., and Francetic, O. (2014). Type II secretion system: a magic beanstalk or a protein escalator. Biochim Biophys Acta 1843, 1568-1577. Novagen, (2004). User Protocol TB009 "Competent Cells". O'Connell Motherway, M., Zomer, A., Leahy, S.C., Reunanen, J., Bottacini, F., Claesson, M.J., O'Brien, F., Flynn, K., Casey, P.G., Munoz, J.A., et al. (2011). Functional genome analysis of Bifidobacterium breve UCC2003 reveals Type IVb tight adherence (Tad) pili as an essential and conserved host-colonization factor. Proc Natl Acad Sci U S A 108, 11217-11222. Okuda, J., Hayashi, N., Okamoto, M., Sawada, S., Minagawa, S., Yano, Y., and Gotoh, N. (2010). Translocation of Pseudomonas aeruginosa from the intestinal tract is mediated by the binding of ExoS to an Na,K-ATPase regulator, FXYD3. Infect Immun 78, 4511-4522. Palomino, C., Marín, E., and Fernández, L. (2011). The fimbrial usher FimD follows the SurA-BamB pathway for its assembly in the outer membrane of Escherichia coli. J Bacteriol 193, 5222-5230. Papatheodorou, P., Carette, J.E., Bell, G.W., Schwan, C., Guttenberg, G., Brummelkamp, T.R., and Aktories, K. (2011). Lipolysis-stimulated lipoprotein receptor (LSR) is the host receptor for the binary toxin Clostridium difficile Transferase (CDT). Proc Natl Acad Sci U S A 108, 16422-16427. Paredes-Sabja, D., Shen, A., and Sorg, J.A. (2014). Clostridium difficile spore biology: Sporulation, germination, and spore structural proteins. Trends Microbiol 22, 406-416. Parge, H.E., Forest, K.T., Hickey, M.J., Christensen, D.A., Getzoff, E.D., and Tainer, J.A. (1995). Structure of the fibre-forming protein pilin at 2.6 Å resolution. Nature 378, 32-38.

Pelicic, V. (2008). Type IV pili: e pluribus unum? Mol Microbiol 68, 827-837.

Peltier, J., Shaw, H.A., Couchman, E.C., Dawson, L.F., Yu, L., Choudhary, J.S., Kaever, V., Wren, B.W., and Fairweather, N.F. (2015). Cyclic-di-GMP regulates production of sortase substrates of Clostridium difficile and their surface exposure through ZmpI protease-mediated cleavage. J Biol Chem 290, 24453-24469. Pepe, J.C., and Lory, S. (1998). Amino acid substitutions in PilD, a bifunctional enzyme of Pseudomonas aeruginosa. Effect on leader peptidase and N-methyltransferase activities in vitro and in vivo. J Biol Chem 273, 19120-19129. Pereira, F.C., Saujet, L., Tomé, A.R., Serrano, M., Monot, M., Couture-Tosi, E., Martin-Verstraete, I., Dupuy, B., and Henriques, A.O. (2013). The spore differentiation pathway in the enteric pathogen Clostridium difficile. PLoS Genet 9, e1003782. Perelle, S., Gibert, M., Bourlioux, P., Corthier, G., and Popoff, M.R. (1997). Production of a complete binary toxin (actin-specific ADP-ribosyltransferase) by Clostridium difficile CD196. Infect Immun 65, 1402-1407. Permpoonpattana, P., Phetcharaburanin, J., Mikelsone, A., Dembek, M., Tan, S., Brisson, M.C., La Ragione, R., Brisson, A.R., Fairweather, N., Hong, H.A., et al. (2013). Functional characterization of Clostridium difficile spore coat proteins. J Bacteriol 195, 1492-1503. Persat, A., Inclan, Y.F., Engel, J.N., Stone, H.A., and Gitai, Z. (2015). Type IV pili mechanochemically regulate virulence factors in Pseudomonas aeruginosa. Proc Natl Acad Sci U S A 112, 7563-7568. Petersen, T.N., Brunak, S., von Heijne, G., and Nielsen, H. (2011). SignalP 4.0: Discriminating signal peptides from transmembrane regions. Nat Methods 8, 785-786. Petrof, E.O., Gloor, G.B., Vanner, S.J., Weese, S.J., Carter, D., Daigneault, M.C., Brown, E.M., Schroeter, K., and Allen-Vercoe, E. (2013). Stool substitute transplant therapy for the eradication of Clostridium difficile infection: 'RePOOPulating' the gut. Microbiome 1, 3. Pham, T.A., and Lawley, T.D. (2014). Emerging insights on intestinal dysbiosis during bacterial infections. Curr Opin Microbiol 17, 67-74. Piepenbrink, K.H., Maldarelli, G.A., de la Peña, C.F., Mulvey, G.L., Snyder, G.A., De Masi, L., von Rosenvinge, E.C., Günther, S., Armstrong, G.D., Donnenberg, M.S., et al. (2014). Structure of Clostridium difficile PilJ exhibits unprecedented divergence from known Type IV pilins. J Biol Chem 289, 4334-4345. Piepenbrink, K.H., Maldarelli, G.A., Martinez de la Peña, C.F., Dingle, T.C., Mulvey, G.L., Lee, A., von Rosenvinge, E., Armstrong, G.D., Donnenberg, M.S., and Sundberg, E.J. (2015). Structural and evolutionary analyses show unique stabilization strategies in the Type IV pili of Clostridium difficile. Structure 23, 385-396. Pointon, J.A., Smith, W.D., Saalbach, G., Crow, A., Kehoe, M.A., and Banfield, M.J. (2010). A highly unusual thioester bond in a pilus adhesin is required for efficient host cell interaction. J Biol Chem 285, 33858-33866.

Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2 - approximately maximum-likelihood trees for large alignments. PLoS One 5, e9490. Proft, T., and Baker, E.N. (2009). Pili in Gram-negative and Gram-positive bacteria - structure, assembly and their role in disease. Cell Mol Life Sci 66, 613-635. Puente, J.L., Bieber, D., Ramer, S.W., Murray, W., and Schoolnik, G.K. (1996). The bundle-forming pili of enteropathogenic Escherichia coli: Transcriptional regulation by environmental signals. Mol Microbiol 20, 87-100. Puorger, C., Vetsch, M., Wider, G., and Glockshuber, R. (2011). Structure, folding and stability of FimA, the main structural subunit of Type 1 pili from uropathogenic Escherichia coli strains. J Mol Biol 412, 520-535. Purcell, E.B., McKee, R.W., Bordeleau, E., Burrus, V., and Tamayo, R. (2015). Regulation of Type IV pili contributes to surface behaviors of historical and epidemic strains of Clostridium difficile. J Bacteriol 198, 565-577. Purcell, E.B., McKee, R.W., McBride, S.M., Waters, C.M., and Tamayo, R. (2012). Cyclic diguanylate inversely regulates motility and aggregation in Clostridium difficile. J Bacteriol 194, 3307-3316. Purdy, D., O'Keeffe, T.A., Elmore, M., Herbert, M., McLeod, A., Bokori-Brown, M., Ostrowski, A., and Minton, N.P. (2002). Conjugative transfer of clostridial shuttle vectors from Escherichia coli to Clostridium difficile through circumvention of the restriction barrier. Mol Microbiol 46, 439-452. Qi, Y., Chuah, M.L., Dong, X., Xie, K., Luo, Z., Tang, K., and Liang, Z.X. (2011). Binding of cyclic diguanylate in the non-catalytic EAL domain of FimX induces a long-range conformational change. J Biol Chem 286, 2910-2917. Qin, J., Li, R., Raes, J., Arumugam, M., Burgdorf, K.S., Manichanh, C., Nielsen, T., Pons, N., Levenez, F., Yamada, T., et al. (2010). A human gut microbial gene catalogue established by metagenomic sequencing. Nature 464, 59-65. Quigley, B.R., Zähner, D., Hatkoff, M., Thanassi, D.G., and Scott, J.R. (2009). Linkage of T3 and Cpa pilins in the Streptococcus pyogenes M3 pilus. Mol Microbiol 72, 1379-1394. Rakotoarivonina, H., Jubelin, G., Hebraud, M., Gaillard-Martinie, B., Forano, E., and Mosoni, P. (2002). Adhesion to cellulose of the Gram-positive bacterium Ruminococcus albus involves Type IV pili. Microbiology 148, 1871-1880.

Ramer, S.W., Schoolnik, G.K., Wu, C.Y., Hwang, J., Schmidt, S.A., and Bieber, D. (2002). The Type IV pilus assembly complex: Biogenic interactions among the bundle-forming pilus proteins of enteropathogenic Escherichia coli. J Bacteriol 184, 3457-3465. Reese, M.G. (2001). Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Comput Chem 26, 51-56.

Reguera, G., McCarthy, K.D., Mehta, T., Nicoll, J.S., Tuominen, M.T., and Lovley, D.R. (2005). Extracellular electron transfer via microbial nanowires. Nature 435, 1098-1101. Reindl, S., Ghosh, A., Williams, G.J., Lassak, K., Neiner, T., Henche, A.L., Albers, S.V., and Tainer, J.A. (2013). Insights into FlaI functions in archaeal motor assembly and motility from structures, conformations, and genetics. Mol Cell 49, 1069-1082. Remaut, H., Tang, C., Henderson, N.S., Pinkner, J.S., Wang, T., Hultgren, S.J., Thanassi, D.G., Waksman, G., and Li, H. (2008). Fiber formation across the bacterial outer membrane by the chaperone/usher pathway. Cell 133, 640-652. Rogers, L.M., Thelen, T., Fordyce, K., Bourdonnay, E., Lewis, C., Yu, H., Zhang, J., Xie, J., Serezani, C.H., Peters-Golden, M., et al. (2014). EP4 and EP2 receptor activation of protein kinase A by prostaglandin E2 impairs macrophage phagocytosis of Clostridium sordellii. Am J Reprod Immunol 71, 34-43.

Roux, N., Spagnolo, J., and de Bentzmann, S. (2012). Neglected but amazingly diverse type IVb pili. Res Microbiol 163, 659-673. Rudel, T., Scheuerpflug, I., and Meyer, T.F. (1995). Neisseria PilC protein identified as Type-4 pilus tip-located adhesin. Nature 373, 357-359. Rutherford, J.C. (2014). The emerging role of urease as a general microbial virulence factor. PLoS Pathog 10, e1004062. Sampaleanu, L.M., Bonanno, J.B., Ayers, M., Koo, J., Tammam, S., Burley, S.K., Almo, S.C., Burrows, L.L., and Howell, P.L. (2009). Periplasmic domains of Pseudomonas aeruginosa PilN and PilO form a stable heterodimeric complex. J Mol Biol 394, 143-159.

Satyshur, K.A., Worzalla, G.A., Meyer, L.S., Heiniger, E.K., Aukema, K.G., Misic, A.M., and Forest, K.T. (2007). Crystal structures of the pilus retraction motor PilT suggest large domain movements and subunit cooperation drive motility. Structure 15, 363-376. Sauer, F.G., Fütterer, K., Pinkner, J.S., Dodson, K.W., Hultgren, S.J., and Waksman, G. (1999). Structural basis of chaperone function and pilus biogenesis. Science 285, 1058-1061. Sauer, F.G., Mulvey, M.A., Schilling, J.D., Martinez, J.J., and Hultgren, S.J. (2000). Bacterial pili: Molecular mechanisms of pathogenesis. Curr Opin Microbiol 3, 65-72. Schwan, C., Stecher, B., Tzivelekidis, T., van Ham, M., Rohde, M., Hardt, W.D., Wehland, J., and Aktories, K. (2009). Clostridium difficile toxin CDT induces formation of microtubule-based protrusions and increases adherence of bacteria. PLoS Pathog 5, e1000626.

Schwartz, D.C., Li, X., Hernandez, L.I., Ramnarain, S.P., Huff, E.J., and Wang, Y.K. (1993). Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping. Science 262, 110-114.

Sebaihia, M., Wren, B.W., Mullany, P., Fairweather, N.F., Minton, N., Stabler, R., Thomson, N.R., Roberts, A.P., Cerdeño-Tárraga, A.M., Wang, H., et al. (2006). The multidrug-resistant human pathogen Clostridium difficile has a highly mobile, mosaic genome. Nat Genet 38, 779-786. Serganov, A., and Nudler, E. (2013). A decade of riboswitches. Cell 152, 17-24. Shahapure, R., Driessen, R.P., Haurat, M.F., Albers, S.V., and Dame, R.T. (2014). The archaellum: A rotating Type IV pilus. Mol Microbiol 91, 716-723. Sieprawska-Lupa, M., Mydel, P., Krawczyk, K., Wójcik, K., Puklo, M., Lupa, B., Suder, P., Silberring, J., Reed, M., Pohl, J., et al. (2004). Degradation of human antimicrobial peptide LL-37 by Staphylococcus aureus-derived proteinases. Antimicrob Agents Chemother 48, 4673-4679. Sievers, F., Wilm, A., Dineen, D., Gibson, T.J., Karplus, K., Li, W., Lopez, R., McWilliam, H., Remmert, M., Söding, J., et al. (2011). Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 7, 539. Sirigi Reddy, A.R., Girinathan, B.P., Zapotocny, R., and Govind, R. (2013). Identification and characterization of Clostridium sordellii toxin gene regulator. J Bacteriol 195, 4246-4254. Skotnicka, D., Petters, T., Heering, J., Hoppert, M., Kaever, V., and Søgaard-Andersen, L. (2015). Cyclic di-GMP regulates Type IV pilus-dependent motility in Myxococcus xanthus. J Bacteriol 198, 77-90. Smith, T.J., Blackman, S.A., and Foster, S.J. (2000). Autolysins of Bacillus subtilis: Multiple enzymes with multiple functions. Microbiology 146 (Pt 2), 249-262. Smith, W.D., Pointon, J.A., Abbot, E., Kang, H.J., Baker, E.N., Hirst, B.H., Wilson, J.A., Banfield, M.J., and Kehoe, M.A. (2010). Roles of minor pilin subunits Spy0125 and Spy0130 in the serotype M1 Streptococcus pyogenes strain SF370. J Bacteriol 192, 4651-4659. Sorg, J.A., and Sonenshein, A.L. (2008). Bile salts and glycine as cogerminants for Clostridium difficile spores. J Bacteriol 190, 2505-2512. Soutourina, O.A., Monot, M., Boudry, P., Saujet, L., Pichon, C., Sismeiro, O., Semenova, E., Severinov, K., Le Bouguenec, C., Coppée, J.Y., et al. (2013). Genome-wide identification of regulatory RNAs in the human pathogen Clostridium difficile. PLoS Genet 9, e1003493. Stabler, R.A., He, M., Dawson, L., Martin, M., Valiente, E., Corton, C., Lawley, T.D., Sebaihia, M., Quail, M.A., Rose, G., et al. (2009). Comparative genome and phenotypic analysis of Clostridium difficile 027 strains provides insight into the evolution of a hypervirulent bacterium. Genome Biol 10, R102. Stevens, D.L., Aldape, M.J., and Bryant, A.E. (2012). Life-threatening clostridial infections. Anaerobe 18, 254-259. Strom, M.S., and Lory, S. (1987). Mapping of export signals of Pseudomonas aeruginosa pilin with alkaline phosphatase fusions. J Bacteriol 169, 3181-3188.


Strom, M.S., and Lory, S. (1991). Amino acid substitutions in pilin of Pseudomonas aeruginosa. Effect on leader peptide cleavage, amino-terminal methylation, and pilus assembly. J Biol Chem 266, 1656-1664.

Strom, M.S., Nunn, D.N., and Lory, S. (1993). A single bifunctional enzyme, PilD, catalyzes cleavage and N-methylation of proteins belonging to the Type IV pilin family. Proc Natl Acad Sci U S A 90, 2404-2408.

Sudarsan, N., Lee, E.R., Weinberg, Z., Moy, R.H., Kim, J.N., Link, K.H., and Breaker, R.R. (2008). Riboswitches in eubacteria sense the second messenger cyclic di-GMP. Science 321, 411-413.

Takhar, H.K., Kemp, K., Kim, M., Howell, P.L., and Burrows, L.L. (2013). The platform protein is essential for Type IV pilus biogenesis. J Biol Chem 288, 9721-9728.

Tammam, S., Sampaleanu, L.M., Koo, J., Manoharan, K., Daubaras, M., Burrows, L.L., and Howell, P.L. (2013). PilMNOPQ from the Pseudomonas aeruginosa Type IV pilus system form a transenvelope protein interaction network that interacts with PilA. J Bacteriol 195, 2126-2135.

Tammam, S., Sampaleanu, L.M., Koo, J., Sundaram, P., Ayers, M., Chong, P.A., Forman-Kay, J.D., Burrows, L.L., and Howell, P.L. (2011). Characterization of the PilN, PilO and PilP Type IVa pilus subcomplex. Mol Microbiol 82, 1496-1514.

Tataki, H., and Huet, M. (1953). Value of urease test for differentiation of Clostridium sordellii and Clostridium bifermentans. Ann Inst Pasteur (Paris) 84, 890-894.

Taylor, R.K., Miller, V.L., Furlong, D.B., and Mekalanos, J.J. (1987). Use of phoA gene fusions to identify a pilus colonization factor coordinately regulated with cholera toxin. Proc Natl Acad Sci U S A 84, 2833-2837.

Thelen, T., Hao, Y., Medeiros, A.I., Curtis, J.L., Serezani, C.H., Kobzik, L., Harris, L.H., and Aronoff, D.M. (2010). The class A scavenger receptor, macrophage receptor with collagenous structure, is the major phagocytic receptor for Clostridium sordellii expressed by human decidual macrophages. J Immunol 185, 4328-4335.

Theriot, C.M., Koenigsknecht, M.J., Carlson, P.E., Hatton, G.E., Nelson, A.M., Li, B., Huffnagle, G.B., Z Li, J., and Young, V.B. (2014). Antibiotic-induced shifts in the mouse gut microbiome and metabolome increase susceptibility to Clostridium difficile infection. Nat Commun 5, 3114.

Tobe, T., Schoolnik, G.K., Sohel, I., Bustamante, V.H., and Puente, J.L. (1996). Cloning and characterization of bfpTVW, genes required for the transcriptional activation of bfpA in enteropathogenic Escherichia coli. Mol Microbiol 21, 963-975.

Tripathi, S.A., and Taylor, R.K. (2007). Membrane association and multimerization of TcpT, the cognate ATPase ortholog of the Vibrio cholerae toxin-coregulated-pilus biogenesis apparatus. J Bacteriol 189, 4401-4409.


Turner, L.R., Lara, J.C., Nunn, D.N., and Lory, S. (1993). Mutations in the consensus ATP-binding sites of XcpR and PilB eliminate extracellular protein secretion and pilus biogenesis in Pseudomonas aeruginosa. J Bacteriol 175, 4962-4969.

Underwood, S., Guan, S., Vijayasubhash, V., Baines, S.D., Graham, L., Lewis, R.J., Wilcox, M.H., and Stephenson, K. (2009). Characterization of the sporulation initiation pathway of Clostridium difficile and its role in toxin production. J Bacteriol 191, 7296-7305.

Unger-Torroledo, L., Straub, R., Lehmann, A.D., Graber, F., Stahl, C., Frey, J., Gerber, V., Hoppeler, H., and Baum, O. (2010). Lethal Toxin of Clostridium sordellii is associated with fatal equine atypical myopathy. Vet Microbiol 144, 487-492.

Valour, F., Boisset, S., Lebras, L., Martha, B., Boibieux, A., Perpoint, T., Chidiac, C., Ferry, T., and Peyramond, D. (2010). Clostridium sordellii brain abscess diagnosed by 16S rRNA gene sequencing. J Clin Microbiol 48, 3443-3444.

van Eijk, E., Anvar, S.Y., Browne, H.P., Leung, W.Y., Frank, J., Schmitz, A.M., Roberts, A.P., and Smits, W.K. (2015). Complete genome sequence of the Clostridium difficile laboratory strain 630Δerm reveals differences from strain 630, including translocation of the mobile element CTn5. BMC Genomics 16, 31.

Varga, J.J., Nguyen, V., O'Brien, D.K., Rodgers, K., Walker, R.A., and Melville, S.B. (2006). Type IV pili-dependent gliding motility in the Gram-positive pathogen Clostridium perfringens and other Clostridia. Mol Microbiol 62, 680-694.

Varga, J.J., Therit, B., and Melville, S.B. (2008). Type IV pili and the CcpA protein are needed for maximal biofilm formation by the Gram-positive anaerobic pathogen Clostridium perfringens. Infect Immun 76, 4944-4951.

Verger, D., Miller, E., Remaut, H., Waksman, G., and Hultgren, S. (2006). Molecular mechanism of P pilus termination in uropathogenic Escherichia coli. EMBO Rep 7, 1228-1232.

von Pawel-Rammingen, U., Johansson, B.P., and Björck, L. (2002). IdeS, a novel streptococcal cysteine proteinase with unique specificity for Immunoglobulin G. EMBO J 21, 1607-1615.

Voth, D.E., and Ballard, J.D. (2005). Clostridium difficile toxins: Mechanism of action and role in disease. Clin Microbiol Rev 18, 247-263.

Voth, D.E., Martinez, O.V., and Ballard, J.D. (2006). Variations in Lethal Toxin and cholesterol-dependent cytolysin production correspond to differences in cytotoxicity among strains of Clostridium sordellii. FEMS Microbiol Lett 259, 295-302.

Walk, S.T., Jain, R., Trivedi, I., Grossman, S., Newton, D.W., Thelen, T., Hao, Y., Songer, J.G., Carter, G.P., Lyras, D., et al. (2011). Non-toxigenic Clostridium sordellii: Clinical and microbiological features of a case of cholangitis-associated bacteremia. Anaerobe 17, 252-256.

Walker, A.W., and Lawley, T.D. (2013). Therapeutic modulation of intestinal dysbiosis. Pharmacol Res 69, 75-86.


Weber, G.G., and Klose, K.E. (2011). The complexity of ToxT-dependent transcription in Vibrio cholerae. Indian J Med Res 133, 201-206.

Wheeldon, L.J., Worthington, T., Hilton, A.C., Elliott, T.S., and Lambert, P.A. (2008). Physical and chemical factors influencing the germination of Clostridium difficile spores. J Appl Microbiol 105, 2223-2230.

Willing, S.E., Candela, T., Shaw, H.A., Seager, Z., Mesnage, S., Fagan, R.P., and Fairweather, N.F. (2015). Clostridium difficile surface proteins are anchored to the cell wall using CWB2 motifs that recognise the anionic polymer PSII. Mol Microbiol 96, 596-608.

Willing, S.E., Richards, E.J., Sempere, L., Dale, A.G., Cutting, S.M., and Fairweather, N.F. (2015). Increased toxin expression in a Clostridium difficile mfd mutant. BMC Microbiol 15, 280.

Wolfgang, M.C., Lee, V.T., Gilmore, M.E., and Lory, S. (2003). Coordinate regulation of bacterial virulence genes by a novel adenylate cyclase-dependent signaling pathway. Dev Cell 4, 253-263.

Yamagata, A., Milgotina, E., Scanlon, K., Craig, L., Tainer, J.A., and Donnenberg, M.S. (2012). Structure of an essential Type IV pilus biogenesis protein provides insights into pilus and Type II secretion systems. J Mol Biol 419, 110-124.

Yan, W., Qu, T., Zhao, H., Su, L., Yu, Q., Gao, J., and Wu, B. (2010). The effect of c-di-GMP (3'-5'-cyclic diguanylic acid) on the biofilm formation and adherence of Streptococcus mutans. Microbiol Res 165, 87-96.

Yuan, P., Zhang, H., Cai, C., Zhu, S., Zhou, Y., Yang, X., He, R., Li, C., Guo, S., Li, S., et al. (2015). Chondroitin sulfate proteoglycan 4 functions as the cellular receptor for Clostridium difficile Toxin B. Cell Res 25, 157-168.


Appendices

Appendix 1

Plasmid Name | Descriptive Plasmid Name | Relevant Details | Source or Reference
pECC01 | pMTL007C-E5-pilB1 | ClosTron plasmid for insertional mutagenesis of R20291 pilB1. | This Work
pECC12 | pMTL960-Pcwp2-CD630_dccA-His | C. difficile 630 dccA cloned into pRPF144 with C-terminal His-tag. | This Work (Peltier et al., 2015)
pECC15 | pMTL960-Ptet-CD630_pilV | C. difficile 630 pilV cloned into pRPF185. | This Work
pECC17 | pMTL960-Ptet-CD630_dccA-His | CD630_dccA-His sub-cloned from pECC12 into pRPF185. | This Work (Peltier et al., 2015)
pECC19 | pET28a-CD630_pilA2 | DNA encoding C-terminal His-tag removed from pRPF230. | This Work
pECC23 | pET28a-His-CD630_pilA2 | DNA encoding N-terminal His-tag inserted into pECC19. | This Work
pECC24 | pMTL960-Ptet-CDR20291_pilA2 | C. difficile R20291 pilA2 cloned into pRPF185. | This Work
pECC25 | pMTL960-Ptet-CDR20291_pilA2-His | C. difficile R20291 pilA2 cloned into pECC12. | This Work
pECC28 | pMTL960-Ptet-CD630_pilW | C. difficile 630 pilW cloned into pRPF185. | This Work
pECC29 | pMTL960-Pcwp2-CD630_pilU | C. difficile 630 pilU sub-cloned from pECC33 into pRPF144. | This Work
pECC31 | pMTL960-Pcwp2-CDR20291_pilA2 | C. difficile R20291 pilA2 sub-cloned from pECC24 into pRPF144. | This Work
pECC32 | pMTL960-Pcwp2-CDR20291_pilA2-His | C. difficile R20291 pilA2 cloned into pECC17. | This Work
pECC33 | pMTL960-Ptet-CD630_pilU | C. difficile 630 pilU cloned into pRPF185. | This Work
pECC34 | pMTL960-Ptet-CD630_pilA1 | C. difficile 630 pilA1 cloned into pRPF185. | This Work
pECC38 | pET28a-His-CDR20291_pilA2Δ1-33 | Truncated C. difficile R20291 pilA2 cloned into pECC23. | This Work
pECC53 | pET28a-His-CD630_pilX | C. difficile 630 pilX cloned into pECC23. | This Work
pECC54 | pET28a-His-CD630_pilA3 | C. difficile 630 pilA3 cloned into pECC23. | This Work
pECC55 | pET28a-His-CD630_pilJ | C. difficile 630 pilJ cloned into pECC23. | This Work
pECC56 | pET28a-His-CD630_pilK | C. difficile 630 pilK cloned into pECC23. | This Work
pECC58 | pMTL-SC7215-ΔCD630_pilA1 | pilA1 homologous arms assembled into pMTL-SC7215. | This Work
pECC59 | pMTL-SC7215-ΔCD630_pilU | pilU homologous arms assembled into pMTL-SC7215. | This Work
pECC60 | pMTL-SC7215-ΔCD630_pilV | pilV homologous arms assembled into pMTL-SC7215. | This Work
pECC61 | pMTL-SC7215-ΔCD630_pilA1Terminator | pilA1Terminator homologous arms assembled into pMTL-SC7215. | This Work
pECC62 | pMTL-SC7215-ΔCD630_pilB1 | pilB1 homologous arms assembled into pMTL-SC7215. | This Work
pECC63 | pMTL-SC7215-ΔCD630_pilB2 | pilB2 homologous arms assembled into pMTL-SC7215. | This Work
pECC64 | pMTL-SC7215-ΔCD630_pilT | pilT homologous arms assembled into pMTL-SC7215. | This Work
pECC65 | pMTL-SC7215-ΔCD630_pilK | pilK homologous arms assembled into pMTL-SC7215. | This Work

pECC66 | pMTL-SC7215-ΔCD630_pilC1 | pilC1 homologous arms assembled into pMTL-SC7215. | This Work
pECC67 | pMTL-SC7215-ΔCD630_pilMN | pilMN homologous arms assembled into pMTL-SC7215. | This Work
pECC68 | pMTL-SC7215-ΔCD630_pilO | pilO homologous arms assembled into pMTL-SC7215. | This Work
pECC69 | pMTL-SC7215-ΔCD630_pilD2 | pilD2 homologous arms assembled into pMTL-SC7215. | This Work
pECC70 | pMTL-SC7215-ΔCD630_pilD1 | pilD1 homologous arms assembled into pMTL-SC7215. | This Work
pECC71 | pMTL-SC7215-ΔCD630_Cdi2_4 | Cdi2_4 riboswitch homologous arms assembled into pMTL-SC7215. | This Work
pECC72 | pMTL960-Ptet-CD630_dccA-His-SphI | Half a Strep-tag and SphI restriction site inserted into pECC17. | This Work
pECC76 | pMTL960-Ptet-CD630_dccA-His-Strep | Remainder of Strep-tag and SalI restriction site inserted into pECC72. | This Work
pECC77 | pKT25-CD630_pilA1Δ1-34 | Truncated C. difficile 630 pilA1 cloned into pKT25. | This Work
pECC78 | pKT25-CD630_pilUΔ1-33 | Truncated C. difficile 630 pilU cloned into pKT25. | This Work
pECC79 | pKT25-CD630_pilVΔ1-35 | Truncated C. difficile 630 pilV cloned into pKT25. | This Work
pECC80 | pET28a-CSW3025_pilA1A-His | C. sordellii W3025 pilA1A cloned into pET28a. | This Work
pECC81 | pET28a-CSW3025_pilA1B-His | C. sordellii W3025 pilA1B cloned into pET28a. | This Work
pECC82 | pKT25-CD630_pilKΔ1-32 | Truncated C. difficile 630 pilK cloned into pKT25. | This Work
pECC83 | pUT18C-CD630_pilA1Δ1-34 | Truncated C. difficile 630 pilA1 cloned into pUT18C. | This Work
pECC84 | pUT18C-CD630_pilUΔ1-33 | Truncated C. difficile 630 pilU cloned into pUT18C. | This Work
pECC85 | pUT18C-CD630_pilVΔ1-35 | Truncated C. difficile 630 pilV cloned into pUT18C. | This Work
pECC86 | pET28a-His-CD630_pilKΔ1-32 | Truncated C. difficile 630 pilK cloned into pECC38. | This Work
pECC87 | pET28a-GAP-CSW3025_pilU-His | C. sordellii W3025 pilU cloned into pET28a. | This Work
pECC88 | pUT18C-CD630_pilKΔ1-32 | Truncated C. difficile 630 pilK cloned into pUT18C. | This Work
pECC89 | pKT25-CD630_pilJΔ1-35 | Truncated C. difficile 630 pilJ cloned into pKT25. | This Work
pECC90 | pUT18C-CD630_pilJΔ1-35 | Truncated C. difficile 630 pilJ cloned into pUT18C. | This Work
pECC91 | pUT18C-CD630_pilWΔ1-31 | Truncated C. difficile 630 pilW cloned into pUT18C. | This Work
pECC92 | pKT25-CD630_pilWΔ1-31 | Truncated C. difficile 630 pilW cloned into pKT25. | This Work
pECC93 | pET28a-CSW3025_pilU-His | Excess DNA between RBS and pilU removed from pECC87. | This Work
pECC94 | pACYCDuet-1-/-I-pilVΔ1-35 | Truncated C. difficile 630 pilV cloned into pACYCDuet-1 MCS2. | This Work
pECC95 | pMTL960-Ptet-CD630_dccA-His-pilC1 | C. difficile 630 pilC1 cloned into pECC76 site 2. | This Work
pECC96 | pMTL960-Ptet-CD630_dccA-His-pilD1 | C. difficile 630 pilD1 cloned into pECC76 site 2. | This Work
pECC97 | pMTL960-Ptet-CD630_dccA-His-pilMN | C. difficile 630 pilMN cloned into pECC76 site 2. | This Work
pECC98 | pMTL960-Ptet-CD630_dccA-His-pilO | C. difficile 630 pilO cloned into pECC76 site 2. | This Work
pECC99 | pMTL960-Ptet-CD630_dccA-His-pilU | C. difficile 630 pilU cloned into pECC76 site 2. | This Work
pECC103 | pACYCDuet-1-/-pilVΔ1-35 | Insertion between RBS and truncated pilV in pECC94 removed. | This Work
pECC104 | pMTL960-Ptet-CD630_dccA-His-pilV | C. difficile 630 pilV cloned into pECC76 site 2. | This Work
pECC106 | pACYCDuet-1-pilUΔ1-33-pilVΔ1-35 | Truncated C. difficile 630 pilU cloned into pECC103 MCS1. | This Work
pECC108 | pACYCDuet-1-/-pilVΔ1-35-Strep | Strep-tag sequence inserted after truncated pilV in pECC103. | This Work

pECC109 | pMTL960-Ptet-CD630_dccA-His-pilA1 | C. difficile 630 pilA1 cloned into pECC76 site 2. | This Work
pECC115 | pACYCDuet-1-/-pilV1-40-Strep | Truncated C. difficile 630 pilV cloned into pECC108 site 2. | This Work
pECC116 | pACYCDuet-1-/-pilU1-37-Strep | Truncated C. difficile 630 pilU cloned into pECC108 site 2. | This Work
pECC117 | pACYCDuet-1-/-pilK1-39-Strep | Truncated C. difficile 630 pilK cloned into pECC108 site 2. | This Work
pECC118 | pACYCDuet-1-/-pilA11-41-Strep | Truncated C. difficile 630 pilA1 cloned into pECC108 site 2. | This Work
pECC119 | pACYCDuet-1-pilD1-pilV1-40-Strep | C. difficile 630 pilD1 cloned into pECC115 site 1. | This Work
pECC120 | pACYCDuet-1-pilD1-pilU1-37-Strep | C. difficile 630 pilD1 cloned into pECC116 site 1. | This Work
pECC121 | pACYCDuet-1-pilD1-pilK1-39-Strep | C. difficile 630 pilD1 cloned into pECC117 site 1. | This Work
pECC122 | pACYCDuet-1-pilD1-pilA11-41-Strep | C. difficile 630 pilD1 cloned into pECC118 site 1. | This Work
pECC123 | pACYCDuet-1-pilD2-pilV1-40-Strep | C. difficile 630 pilD2 cloned into pECC115 site 1. | This Work
pECC124 | pACYCDuet-1-pilD2-pilU1-37-Strep | C. difficile 630 pilD2 cloned into pECC116 site 1. | This Work
pECC125 | pACYCDuet-1-pilD2-pilK1-39-Strep | C. difficile 630 pilD2 cloned into pECC117 site 1. | This Work
pECC126 | pACYCDuet-1-pilD2-pilA11-41-Strep | C. difficile 630 pilD2 cloned into pECC118 site 1. | This Work
pECC127 | pMTL960-Pcwp2-CD630_dccA-His-pilA1 | The cwp2 promoter swapped with the tet promoter from pECC109. | This Work
pECC128 | pMTL960-Ptet-CD630_dccA-His-pilB1 | C. difficile 630 pilB1 cloned into pECC76 site 2. | This Work
pECC129 | pMTL960-Ptet-CD630_dccA-His-pilB2 | C. difficile 630 pilB2 cloned into pECC76 site 2. | This Work
pECC130 | pMTL960-Ptet-CD630_dccA-His-pilC2 | C. difficile 630 pilC2 cloned into pECC76 site 2. | This Work
pECC131 | pMTL960-Ptet-CD630_dccA-His-pilM | C. difficile 630 pilM cloned into pECC76 site 2. | This Work
pECC132 | pMTL960-Ptet-CD630_dccA-His-pilK | C. difficile 630 pilK cloned into pECC76 site 2. | This Work
pMTL960 | N/A | Empty E. coli/C. difficile shuttle vector. | (Fagan and Fairweather, 2011)
pMTL-SC7215 | N/A | Pseudo-suicide vector used in codA-linked allele exchange. | (Cartman et al., 2012)
pASF85 | pMTL960-Ptet | Ptet empty vector. Derived from pRPF185. | Dr A. Fivian-Hughes
pRPF144 | pMTL960-Pcwp2-gusA | gusA gene under control of constitutive cwp2 promoter. | (Fagan and Fairweather, 2011)
pRPF185 | pMTL960-Ptet-gusA | gusA gene under control of inducible tet promoter. | (Fagan and Fairweather, 2011)
pRPF226 | pET28a-CD630_pilW-His | C. difficile 630 pilW cloned into pET28a. | Dr R. Fagan
pRPF227 | pET28a-CD630_pilA1-His | C. difficile 630 pilA1 cloned into pET28a. | Dr R. Fagan
pRPF228 | pET28a-CD630_pilV-His | C. difficile 630 pilV cloned into pET28a. | Dr R. Fagan
pRPF230 | pET28a-CD630_pilA2-His | C. difficile 630 pilA2 cloned into pET28a. | Dr R. Fagan
pRPF232 | pET28a-CD630_pilU-His | C. difficile 630 pilU cloned into pET28a. | Dr R. Fagan
pET28a | N/A | Protein expression vector (empty). | Novagen
pACYCDuet-1 | N/A | Dual protein expression vector (empty). | Novagen

pKT25 | N/A | B2H vector with T25 fragment of adenylate cyclase. | (Karimova et al., 2001)
pUT18C | N/A | B2H vector with T18 fragment of adenylate cyclase. | (Karimova et al., 2001)
pKT25-Zip | pKT25-GCN4-Zip | A leucine zipper domain of yeast GCN4 cloned into pKT25. | (Karimova et al., 2001)
pUT18C-Zip | pUT18C-GCN4-Zip | A leucine zipper domain of yeast GCN4 cloned into pUT18C. | (Karimova et al., 2001)

Table A1. Plasmids Used in This Study. Where a deletion is indicated within a gene contained in a particular plasmid (e.g. pKT25-CD630_pilA1Δ1-34), the numbers indicate that the DNA sequence encoding those amino acids of the protein is deleted (so in pKT25-CD630_pilA1Δ1-34 the first 102 bp of pilA1, encoding the first 34 amino acids of PilA1, are absent from the construct). Dr A. Fivian-Hughes and Dr R. Fagan are former members of the Fairweather Group.
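The deletion notation described in the caption of Table A1 can be checked arithmetically: each deleted amino acid corresponds to one codon, i.e. 3 bp of coding sequence. The following is a minimal sketch only. The function name `deleted_region` and the regular expression are illustrative assumptions, not part of the cloning workflow used in this work.

```python
import re


def deleted_region(construct_name: str) -> tuple[int, int, int]:
    """Parse the first 'Δa-b' deletion in a construct name.

    Returns (first_residue, last_residue, deleted_bp), where deleted_bp
    assumes one codon (3 bp) per deleted amino acid.
    """
    match = re.search(r"Δ(\d+)-(\d+)", construct_name)
    if match is None:
        # Names such as pMTL-SC7215-ΔCD630_pilA1 carry no residue range.
        raise ValueError(f"no residue-range deletion in {construct_name!r}")
    first, last = int(match.group(1)), int(match.group(2))
    n_residues = last - first + 1
    return first, last, 3 * n_residues


# Example from the Table A1 caption: residues 1-34 correspond to 102 bp.
print(deleted_region("pKT25-CD630_pilA1Δ1-34"))
```

For constructs carrying two truncated genes (e.g. pECC106), this sketch reports only the first deletion in the name.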


Primer Sequence NF34 TAATACGACTCACTATAGGG NF35 GCTAGTTATTGCTCAGCGGT NF408 TCTTGAATATCAAAGGTGAGCCAGTACA NF409 TACAGCGTGGACTACCAGGGTATCTAAT NF793 CACCTCCTTTTTGACTTTAAGCCTACGAATACC NF794 CACCGACGAGCAAGGCAAGACCG NF944 GATAACAATTCCCCTGTAGAAATAATTTTG NF945 CCCTATAGTGAGTCGTATTAATTTCG NF1121 GCATAATCGAAATTAATACGACTCAC NF1122 CCTGAGGTTTCAGCAAAAAACC NF1323 CTGGACTTCATGAAAAACTAAAAAAAATATTG NF1519 G GGATCC TTAAGCTCCTTGTTGATCCATTAC NF1521 G GGATCC TTACCCTATTCTTGACATAACATC NF1525 G GAGCTC AAAAAATTTAAGGGGGAATAAAAATG NF1526 G GAGCTC CTAAGTAAGCTTTAAAGATAGGTG NF1696 GCCGCAGCTAACGCATTAA NF1697 CGTCAATTCCTTTGAGTTTCACTCT NF1708 GATC GAGCTC AAATAAGTAGGTTGATAAAATATGAG NF1709 GATC GGATCC TTATTTATCTTTAAATATTTCAGTTAC NF1906 CAAAGAGAAGTAGGAACTGATACAGAAAGT NF1907 GGGTCTCTCATCTCTCCTATTAGTATTACA NF1908 GTTCTTAGTGAAGGTGATGTGAGCTT NF1909 CATTCATGCTTTTTCCAGTTACAGTT NF1910 AGAAGAGATAATCTACAATCAACCACTAG NF1911 TGACTTTCTTGCACTGATTCATT NF1912 GTGGCAGTTCCAGCTTTATTTAG NF1913 GATAATGCTGCACTCTTAACTGAACTA NF1914 CATTGAAGGAAAAGGGATACCTAAAT NF1915 CGCTTATATCTGTTCCGCTATCATT NF1986 GATC CCATGG GC TTTTCCTTAATTGAAGTGTTAGTAGC NF1987 GATC CTCGAG TTTATCTTTAAATATTTCAGTTACAATTTTAAC NF1990 GATC CCATGG GC TTTACACTTGTTGAAATGATTGTAGTAG NF1991 GATC CTCGAG TTTTGGATAAAATTGCTTAGAATTTATATTTAC NF1994 GATC CCATGG GC CTTACTTTATTAGAAGTAATAATAGCTG NF1995 GATC CTCGAG CCCTATTCTTGACATAACATCAAATG NF1996 GATC CCATGG GC TTCACTTTAGTGGAATTATTGGTAG NF1997 GATC CTCGAG AGCTCCTTGTTGATCCATTAC NF1998 GATC CCATGG GC TTTACTCTAGTGGAATTATTAGTAGTAATTG NF1999 GATC CTCGAG AGTATTCGATTTATCATTTAAAAGTTTTATTAC NF2096 ATATAGAAAGTATATTTTTAGAAGGG NF2097 GTACTTGATATAAATATCTCTACG NF2126 GATC GAGCTC GTATTTTATTTTGGAGAAATTAATATGTTTAAAG NF2169 GGATTTCACATTTGCCGTTTTGTAAAC NF2170 GATCTTTTCTACGGGGTCTGAC NF2199 GATC GGATCC TTA ATGGTGATGGTGATGGTG CTCGAG ATAATCATTTTTATCAAATTTTTTCTTGTTTTTC NF2214 AAACTCCTTTTTGATAATCTCATGACC NF2215 AAACTTAGGGTAACAAAAAACACCG NF2351 TGCAGCATCTGATTTAGTAAGG NF2352 TCAAATTGGTATTTTGCGCTTGC NF2353 ACCATCTATACCAAAGAAATATATAGC NF2362 
GGTAAATGGATAAATAAAGAAGAAAG NF2363 GATATATGAGTAGCATATTCAGAG


NF2409 ACATATATTGAAATGTTTTAATTTATATATTTTGTG NF2410 TATTAAAAGAAGTATTTGAAGGTCAAATATC NF2411 TACAAAGGAACTGAAAATCTATTTCG NF2412 ATATATTCAAAGGAGAGCAAATAGAG NF2413 ACAGTAGATGATAGTAAAGTAGATTGG NF2414 GATAGTATAAAGTTTAGTACTTCTAAATCG NF2415 ATGGAATAACATAACGAAAGATATTAATG NF2416 ATTTATTACAGGCATACTATTTATTACAAC NF2417 CTAAAGATACTAGTGAAGATATATATGC NF2418 ACAAAATATAAAGCTATTCCACTATCAC NF2419 ATAAGTTTTAGAATAGATTAAACAATAATAAGGAG NF2420 TTCTAGGCCTATAATAATCAATTGTCG NF2425 TTCTACACCTAATGTTACTTGGTTTATAG NF2426 TCAATAGAAGAATTATCTGCAACTATAATGG NF2467 TTCATAGTTTAAATTAAATTCAACTTGATCTC NF2468 TTAGATAAGGTGGTAATAATGAAAACG NF2503 AAACGTTGATTTATGTTCTGTAATGTGG NF2504 CCTATAATTGCAATTACTACCAATAATTCC NF2506 TGTTCTGATAGTGCCCATTTAAGC NF2507 ATCAGTAACTATTTATAGAGGAAAAGGTTG NF2509 ACTCTGTGTTGGTACAGATGATAG NF2510 ATCATTTCGTCCATCTCTAGCTTC NF2511 TGATAAAACTCCATTGTATGATAAGATACC NF2513 TCGTAGCTCCAAAGGCAGAAG NF2514 TCTCTATATTGTATTCATAATTATTACCTGAAC NF2516 ACAGTTCCATCTTCTAGTTTATCTAAAC NF2518 ACCTAGATATGGATTTTTCTTGATTTTCAC NF2519 TCAAAGTATAAGGTTGACGCCATTG NF2520 TATTGCGTAATTTCCTTTTTGCCTATATG NF2521 TACAGCAGATGGGAAAGGCAG NF2522 AGTTGATGTAGATATTCCATATAACATAAATAC NF2523 AGCCTTACTAAGCGTTTATGTAGG NF2524 ACCTTGAACTATTATACCACTAACTATTG NF2525 TGCATTGGGAAATAATTATGGGTTTTATG NF2526 ACGTATCTTTCCAACTTCTAAATCTATATC NF2527 AGTAGTATATGATGATATAGATTTAGAAGTTG NF2528 TCTTGAAAACCTAAATATTCAACTTTATCTTC NF2529 AACTTGAGGATAGATTCTCTGATATAC NF2530 AGAATATATTACTTCTGTAGTTATAAGTTGG NF2531 ATGCTAAAGAAACTATAAAAGACCAACTG NF2532 ATAGAATCTAAGTCTGTTATCAACACTC NF2533 AAGCAACTGGTACAAATACTATAGC NF2534 AGTTTCAATACTAGAAAGTGGAACATC NF2583 CTCGAG GATCCGGCTGCTAACAAAGC NF2584 TCA TTTTGGATAAAATTGCTTAGAATTTATATTTACGC NF2585 CATCACCAT GGCTTTACACTTGTTGAAATGATTGTAG NF2586 GTGATGGTG CATGGTATATCTCCTTCTTAAAGTTAAAC NF2618 ATCTTTAGTGTTTGTGAGAGGGTTC NF2619 TATATAATATCTACCATATTTTAGCAACACTG NF2632 AGTAAGCTTGATAAATTTAGTGAAGG NF2633 TTGTTCTAAATTAACACCTAAATCTTTAG NF2634 TAGTTTCATCAGATAGATTATTATGTTG NF2635 TAAGTATACCTAATCCTATAATTGATAC NF2654 GATC GAGCTC 
ACTTTTGAAAAGGAGAGAATATTTAATTG NF2655 GATC GGATCC TTATTTTGGATAAAATTGTTTAGAATTTATATTTAC


NF2656 GATC CTCGAG TTTTGGATAAAATTGTTTAGAATTTATATTTAC NF2695 TAGCTACAAGTATAATAACTGATCC NF2696 TACACCTCCAACTCGTAAATAC NF2697 CTTCAGTTGCTAAGAAAATAGCTG NF2698 TCTTCTTTTAAGCCTCTTGTCAAC NF2699 CAAAGTGAACTACTTATAAATCTAATAAATTTC NF2700 ATAGAGATGGATCCTCAGTTAATG NF2701 ATTTGTATTCCCAGGTGGTTTAG NF2702 TAGAGTTTTCTAGGGTTCCGTAG NF2703 GAAGTAGTAAAAAAGCCCTTAAAATCC NF2742 GATC CCATGG C AAGTATAGTAAGGTTCAAGAAAGTGC NF2743 GATC CTCGAG TTATTTTGGATAAAATTGTTTAGAATTTATATTTACG NF2793 TATACCTTGAACTATTATACCACTAAC NF2794 TATAGAGACATTGAATATGTAGATGC NF2992 GATC CCATGG C AGGAAGTGGAATAAATTTAAAAGTG NF2993 GATC GGATCC TTAGTTTACTTTTTTGTATGAACTTATATCATG NF2994 GATC CCATGG C AATAAAAAGGGTTTTACACTAATTGAATTGTTAG NF2995 GATC GGATCC TTAATTTAAAGGACCCCATCCCTC NF2997 GATC GGATCC TTATTTTCTAGTAACATAGCTATTAAAC NF2999 GATC GGATCC TTATATACTTTCACTAAAATTCCATAC NF3000 GTTTTTTGTTACCCTAAGTTTGGAGTGTTTATCAATTCATCATATATC NF3001 GGGAATAAAAAAGTAAATTGTTATAATGAAAAATATAATAAAAAAAC NF3002 AACAATTTACTTTTTTATTCCCCCTTAAATTTTTTAATTAATAC NF3003 GATTATCAAAAAGGAGTTTAGGATATAAACCTTTTGAGCAAC NF3004 GTTTTTTGTTACCCTAAGTTTTTGCAAATTTCTAAGATAGAATCTC NF3005 AAAAATAATGAATAAGAATTTATTAAAAAATAAGTAGGTTG NF3006 ATAAATTCTTATTCATTATTTTTGTACAAACTCATCAC NF3007 GATTATCAAAAAGGAGTTTTTATAAATATAGAAGGAAATGCTAAAGATTC NF3008 GTTTTTTGTTACCCTAAGTTTTAGTCAAAAGTTGTTATCAAAGCC NF3009 GTTGATAAAATTTTTACTTGGGGGTGAGATTTTG NF3010 CCAAGTAAAAATTTTATCAACCTACTTATTTTTTAATAAATTC NF3011 GATTATCAAAAAGGAGTTTTATGGTATATCAACAACAATAAGTG NF3012 GTTTTTTGTTACCCTAAGTTTTTGGTCAGTACACCTACTAAGTGC NF3013 GGTGAGATTATATAAGAAGCAACTTAGTATAAATAACAAAATAATTTTATTAT TTAC NF3014 CTTCTTATATAATCTCACCCCCAAGTAAAATTATTTATC NF3015 GATTATCAAAAAGGAGTTTTACTGACAATAGTTTTAAGTATTTCTTATAAAG NF3031 GCGTATGTCAGGTAAAGACGTAGAGA NF3033 TTGTTGTACTCAGTCATTGCCTTATTT NF3057 TATTTCAGGGCTAACAAATTTAGAC NF3058 CAGTAATTTTAAGGAGATGAACTATAC NF3059 AGTGACAGGTCTTATGGCTAAC NF3060 TGTATCTTCTCGGAATAATTTCTCC NF3061 CTTAAGTTTATGTCATATTTTCTTGGTTC NF3067 TAAGGATTTTACTTCATCTACTGTCAATC NF3068 CTCCTTGATAAATAGAATCTATATCTTCTTC NF3073 
TTTGGATTATGGCGAGAAATTTTATTTGC NF3078 GTTTTTTGTTACCCTAAGTTTTCTTCTATTGTTAATAGATTTCTATTTTCTG NF3079 TGAAAAATATAAACTATTTTCTTTAATGTTCAATTATGAATG NF3080 AGAAAATAGTTTATATTTTTCATTATAACAATTTACTTTTAAGC NF3081 GATTATCAAAAAGGAGTTTATAGCTTTTAGCTTTCAAATAAATATTAG NF3082 GTTTTTTGTTACCCTAAGTTTATTCAAAGATTTACCTATGCTATTTC NF3083 TGATTAAGATTTAACTGAATTGTATAGTTAAACTTCG NF3084 AATTCAGTTAAATCTTAATCACCCTTTATATACTTTAATTTATTAG NF3085 GATTATCAAAAAGGAGTTTTTTAGTAAATTGGTGTAGCTTTGAGG


NF3086 GTTTTTTGTTACCCTAAGTTTATACGTATATCTCTCTAAATTTTCTG NF3087 AGTAAAGGAGAAGAAATGATTAGAATAATTTTGATGAC NF3088 CATTTCTTCTCCTTTACTATCTCAATTATATCATCATTC NF3089 GATTATCAAAAAGGAGTTTTCCTTACGATATATAATAATTGTAGGG NF3091 TTCATACTCTATAGGGTCTTCTATTG NF3092 AGTTAATAGTTCTTGATACTATACAACTTC NF3093 AGTAATACCTTCTGTTTTATCAATTTCTTG NF3094 AGCAAAAATCTTTTATATGATATGCAAGATG NF3095 TATTACAGATAGGTACAAATACTGAAGG NF3096 AATGTTTCTAATCCAGAAATACTATTTATACC NF3097 AACATTTCCAACATCATAACTGGTTAAG NF3098 AGTTAGAGAAGAGATAATCTACAATCAAC NF3099 AAGGTGATGTGAGCTTATATGATAC NF3106 GTTTTTTGTTACCCTAAGTTTAACCATATATTGTACACATTCTACAC NF3107 GAGTAAATATTGACCAAGAGTTTATGACAAGAATG NF3108 TCTTGGTCAATATTTACTCTCCATATCTTCTATTAG NF3109 GATTATCAAAAAGGAGTTTTATAGACTCAACTACATTGATTGAG NF3110 GATC CCATGG C TTTATATCAATCGAATGTATAATAAGTATTG NF3111 GATC CCATGG C TACTTGCTTTTGGAAAGTGTAG NF3112 GTTTTTTGTTACCCTAAGTTTAAACATAAATAAGGTATAGCAAGAATC NF3113 AGGACGGACTTAGTATATATGATGCCATGTAG NF3114 ATATACTAAGTCCGTCCTCTTTTATTACTGTG NF3115 GATTATCAAAAAGGAGTTTTATATGAGGATGAACTCAGCAC NF3116 GTTTTTTGTTACCCTAAGTTTACTGCTGATGTTATACCATTGAAC NF3117 ATGGCGATAACTATACATTTAGTATGACATTATACC NF3118 ATGTATAGTTATCGCCATAATATGATACATAAACC NF3119 GATTATCAAAAAGGAGTTTAGACTTAGGAGCTGACCAG NF3120 GTTTTTTGTTACCCTAAGTTTTCCAGTTACAGTTATTTTATATATCTGAC NF3121 TTAAAAAAGATAAGTCTAGTGCGAAATAAAGCTAAG NF3122 ACTAGACTTATCTTTTTTAAATATATCTTTATTCATTAGTTAATAC NF3123 GATTATCAAAAAGGAGTTTTGGACATGCAATTATTAAATGTAGTAG NF3124 CTAACTTTGCCTCAATCAATGTAGTTG NF3125 TGTGTCAGAAGCATTAACTGGAAATAG NF3126 ACCATTATTATATCTGGGTCTTGACG NF3127 TTTCGTTTATATAATGATTGTGGCTGAATC NF3128 TATCCTTTTTCGACCAGTTTGTC NF3129 AGATAATATTGTAATGGATCAACAAGG NF3130 CATACTCATAAAATTCTGCCATCTGTC NF3131 AGCTTTCAAATAAATATTAGCAAAGGATGG NF3132 TTATCTAGTTTCCCACTTTCTTCTCC NF3133 ATATGAGATTTGATGAAGTAGGGAAATTAAG NF3152 GTTTTTTGTTACCCTAAGTTTTCTCAGTTAACTTTTCTAAGTCTAAG NF3153 ATGGAGTATTATAATTTGTTATACAAATTTATATAACAGGAG NF3154 AACAAATTATAATACTCCATAGTTTGTAGCACTTTAG NF3155 GATTATCAAAAAGGAGTTTAACCTGATAAGGGAAGGCAAAAC NF3156 
GTTTTTTGTTACCCTAAGTTTACGTATCTTTCCAACTTCTAAATC NF3157 TCTATATGTATCTATGGTTATATAGATGATAGTCAAG NF3158 AACCATAGATACATATAGATTAACGATTATAATATCCATTG NF3159 GATTATCAAAAAGGAGTTTTAGAGCAAGGTATAAGACTAAACG NF3176 GTTTTTTGTTACCCTAAGTTTCCTGACATACGCCTTACATC NF3177 AGGGAGATATTAATTAAAAAATTTAAGGGGGAATAAAAATG NF3178 TTTTTAATTAATATCTCCCTACAATTATTATATATCGTAAG NF3179 GATTATCAAAAAGGAGTTTAGTTCCAATAGCTATAATAGATAAAAGAAG NF3180 AAAATAGGCATGC ATAAAACTTTAAATAGAAAAAGGCTTCTCTC NF3181 TCAAATTGTGGAT AACTTATAGGATCCTTAATGGTGATG


NF3189 GATC GAGCTC TTTATACGGAGGGGAAATAAATG NF3190 GATC GGATCC TTAAGTATTCGATTTATCATTTAAAAG NF3191 TACAGACGTTAAAGAAACTTCCAAAC NF3192 TCCATTTAAGTCTAGTGCGAAATAAAG NF3193 CTTTATTTCTTCGCTTTTACTCTGTATC NF3194 GAAAATGCTGTTAGAATGGAAGCAAG NF3195 TGAACCAGTTTTAGTTAGTTCTTTACAG NF3196 GAAGCCAACGATTTTAATGAAGCTAG NF3197 TATCAATCCAACAATAGAAAGCAATGC NF3198 ACATTATCAACAGCTTACTTAAGAATGC NF3199 TCCATACGAATTGGATTTGTATATTCAAGTG NF3200 TACAGTTTGTGAAGGTGTGGTATC NF3201 CTGCTAAGTCTTGACCAGCTTC NF3202 ATTCAAAGTATAAGGTTGACGCCATTG NF3223 ACTGGAGTC ATCCACAATTTGAAAAATAGGCATGCATAAAAC NF3224 CGAC AACTTATAGGATCCTTAATGGTGATGGTG NF3235 GATC GGATCC C AATGGTATAACATCAGCAGTAAAAAAGC NF3236 GATC GGTACC TTACCCTATTCTTGACATAACATCAAATG NF3237 GATC GGATCC C AATACTAATAACAAAGCAAACACTAAAAATG NF3238 GATC GGTACC TTATTTATCTTTAAATATTTCAGTTACAATTTTAAC NF3239 GATC GGATCC C TCAAATCAGATAGCTAATAGAATAAAATCTACAAAG NF3240 GATC GGTACC TTAGTTTACTTTTTTGTATGAACTTATATCATGATTAATC NF3241 GATC GGATCC C AGTAATATAAACAAGGCTAAGGTAG NF3242 GATC GGTACC TTAAGCTCCTTGTTGATCCATTAC NF3245 GATC CCATGG C TCAAATCAGATAGCTAATAGAATAAAATC NF3246 GATC CTCGAG TTAGTTTACTTTTTTGTATGAACTTATATC NF3247 GATC CCATGG GC AAAAAGGGAGATAGTGGATTTAC NF3248 GATC CTCGAG TTTAGTAATATCATTGGGTTTAGG NF3249 GATC CCATGG GC ATGAAGAAAAAAAGAAATCAAAAAGGG NF3250 GATC CTCGAG CATGTTAGTATTATCTATTAAAACTATGTC NF3251 GATC GGATCC ATGAAACTAAAAAACAAAGGATTTACGC NF3252 GATC CTCGAG AATTCCCCCATTATCAAGTACTTTTG NF3253 GGATCCATGAAGCTAAAAAACAAAG NF3254 TATATCTCCTTCTTAAAGTTAAACAAAATTATTTC NF3259 GATC GGATCC C AGAAATATAGAAAAAAGTAAAGCAGTTAC NF3260 GATC GGTACC TTAATTTAAAGGACCCCATCCC NF3261 GATC GGATCC C AAAAATATAGAAAAAGCAAAGATAGCTAAAC NF3262 GATC GGTACC TTAAGTATTCGATTTATCATTTAAAAGTTTTATTAC NF3263 GATC GGATCC CTAAGTAAGCTTTAAAGATAGGTG NF3264 GATC GCATGC TTACCCTATTCTTGACATAACATC NF3265 GATC GGATCC AAATAAGTAGGTTGATAAAATATGAGTAAAAAG NF3266 GATC GCATGC TTATTTATCTTTAAATATTTCAGTTACAATTTTAACC NF3267 GATC GGATCC TGAAAGAAGGTGTATTAACTAATGAATAAAG NF3268 GATC GCATGC TTATTTCGCACTAGACTTAAATGGATTATATTTTC NF3270 GATC 
GGATCC CATGTAGGAGGGAAAGGAC NF3271 GATC GCATGC TTATTCATTAGTTAATACACCTTCTTTC NF3273 GATC GGATCC TGATTAAAGGATGATATTATGAAAAGTTTTAAATAC NF3274 GATC GCATGC CTACATGGCATCATATATACTAAGC NF3276 GATC GGATCC GAATGATAGGTGGAGTAAACTAC NF3277 GATC GCATGC CTATATAACCATAGAAAAATAAAGTTCAAG NF3279 GATC CCATGG GC AATACTAATAACAAAGCAAACACTAAAAATG NF3281 GATC GTCGAC TTATTTATCTTTAAATATTTCAGTTACAATTTTAAC NF3282 GATC CATATG AATGGTATAACATCAGCAGTAAAAAAGC NF3283 GATC CTCGAG TTACCCTATTCTTGACATAACATCAAATG NF3286 CATCACCATCACCATTAAGG


NF3287 AGGGATTTCTCACATAAAATAGAG NF3288 AGCAGTAGTAGCTTTACCGGCTTT NF3289 TCAAATGCGTTATTCTCTGCATAAT NF3290 GCCTGCTAAAAAAGATGAACTTAAATC NF3291 CATAAGCTCCTCCAAATGCAGAA NF3292 CATATGAATGGTATAACATCAGCAG NF3293 TATATCTCCTTCTTATACTTAACTAATATACTAAG NF3294 TGCCTAACACATGCAAGTCGAG NF3295 TTCACTTCTGGCTTGAAAGACC NF3296 CACAATTTGAAAAATAG TCTGGTAAAGAAACCGCTGCTG NF3297 GATGACTCCAGTCGAC CTCGAGTTACCCTATTCTTGAC NF3298 GATC GGATCC AAAAAATTTAAGGGGGAATAAAAATGAAG NF3299 GATC GCATGC TTAAGCTCCTTGTTGATCCATTAC NF3308 GGAGAAGGCTTAGCAGAAAATTTAGA NF3309 TGCTCCCCTATAGCTATCATCTGTAC NF3310 GAAATGATGGCTTTGGAGTCAGA NF3311 TACCTCTTCCTTTTGTATAGCTCCAAT NF3312 CTGGGTGGAAGGCGAAAAT NF3313 TGTTGCAGCTCTAAGTACACAAGTATTACA NF3314 TATAAGGCTGACTACGACATGACAAA NF3315 CCGCCCCTGAACCATTTT NF3316 AACTATTTGCAATGCTAGGCCTTT NF3317 CATAGACTGCTCCTACATAAACGCTTA NF3318 GGTATAATTTTTTTGGCCTTTTTCA NF3319 AGCCTAATCCTTTTGTTAGTTTAGCA NF3320 GTTGGAAAGATACGTATAAGAAAAAAAGG NF3321 AAAGTCATTTGAACCTAGGCACTTAGT NF3322 CACATTGAAGAAATGGCAAGTATGAT NF3323 AAGTCATTCTTCCATGTGCAACA NF3324 TTCACAAGATACTTCAGCAAGTGAT NF3325 CAACCATTTGACCCCTTGAAA NF3326 GATC CCATGG GC GATATTATAATCGTTAATCTATATGTATTTATAG NF3327 GATC GAGCTC CTATATAACCATAGAAAAATAAAGTTCAAG NF3328 GATC CCATGG GC CTACAAACTATGGAGTATATAATAATAATTTC NF3330 GATC GAGCTC TTATATAAATTTGTATAACAAATTATAATAAAAATTAAATATATTATAC NF3331 GATC CATATG AAGTTAAAAAAGAATAAAAAAGGTTTCACTTTAG NF3332 GATC CTCGAG CTTAGCCTTGTTTATATTACTAAATAAAGC NF3333 GATC CATATG ATGAGTTTGTACAAAAATAATGAAAAAGGTC NF3334 GATC CTCGAG TGATGTTATACCATTGAACACTTTATAAG NF3335 GATC CATATG AGTAAAAAGAATAGTGAAAGAGGATTTTC NF3336 GATC CTCGAG GTTATTAGTATTGATAATATTAAAAAATGCAAATAG NF3337 GATC CATATG TTGAGGAAGTGGAATAAATTTAAAAGTG NF3338 GATC CTCGAG ATTAGCTATCTGATTTGAAAATATCAATCC NF3339 GATC GGATCC AATTGAGATAGTAAAGGAGTGAAACC NF3340 GATC GCATGC TTAATCAGTCATCAAAATTATTCTAATCATTTC NF3342 GATC GGATCC AATTAAAGTATATAAAGGGTGATTAAGATTATATTAG NF3343 GATC GCATGC TTAAATCATTTCATTTACTACTAAACACTCTTC NF3345 GATC GGATCC TGGAAGGTATTATAGTACATGAGATTATC NF3346 GATC GCATGC 
CTAAAAATTAGTAATTGCATCAAACATAGG NF3348 GATC GGATCC TTATTTTTATTTAAGAAGGAGATGGC NF3349 GATC GCATGC TTAAATAATCATTCCTATATTACTTATATAATTTATC NF3351 GATC GGATCC TTACTTGGGGGTGAGATTTTG NF3352 GATC GCATGC TTAGTTTACTTTTTTGTATGAACTTATATC NF3365 GGTACAGGCAAAAGAATAGGTGAAA


NF3366 GAAGCTCCGTTAATACGCTAGTCAA

Table A2. Primers Used in This Study. Underlined bases indicate restriction sites. Bases highlighted red indicate insertions.
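Restriction sites in the primers of Table A2 can also be located directly from the sequence. The sketch below searches a primer for a handful of common six-base recognition sequences. The enzyme list is an assumption chosen for illustration (these are the standard recognition sequences for the named enzymes, but the list does not claim to reproduce the set of enzymes used in this work), and `find_sites` is a hypothetical helper name.

```python
# Standard recognition sequences for a few common restriction enzymes.
# Assumption: this selection is illustrative, not the author's enzyme list.
SITES = {
    "BamHI": "GGATCC",
    "SacI": "GAGCTC",
    "NcoI": "CCATGG",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
    "SphI": "GCATGC",
    "NdeI": "CATATG",
    "SalI": "GTCGAC",
}


def find_sites(primer: str) -> list[tuple[str, int]]:
    """Return (enzyme, 0-based position) for every site found in a primer.

    Spaces in the primer string are ignored before searching.
    """
    seq = primer.replace(" ", "").upper()
    hits = []
    for enzyme, site in SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Primer NF1519 from Table A2.
print(find_sites("G GGATCC TTAAGCTCCTTGTTGATCCATTAC"))
```

A match inside the annealing region of a primer may be coincidental, so any hit should be compared against the intended cloning strategy.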

Appendix 2

Alignments were performed using Clustal Omega and visualised using MView. Conserved residues are coloured according to type. Light green indicates hydrophobic residues; dark green indicates aromatic residues; red indicates basic residues; dark blue indicates acidic residues; light blue and purple indicate polar residues; yellow indicates cysteine residues; light yellow indicates histidine residues. Unconserved residues are uncoloured.
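
The colouring scheme described above can be expressed as a lookup from residue to colour. The following is a minimal Python sketch under stated assumptions: the grouping of amino acids into classes is a common textbook grouping and not necessarily MView's exact palette, the split of the polar residues between light blue and purple is assumed, and a column is treated as conserved only when every sequence carries the same residue. The names `RESIDUE_CLASSES` and `colour_alignment` are hypothetical.

```python
# Assumption: residue groupings are illustrative, not MView's own definitions.
RESIDUE_CLASSES = [
    ("AILMV", "light green"),   # hydrophobic
    ("FWY", "dark green"),      # aromatic
    ("KR", "red"),              # basic
    ("DE", "dark blue"),        # acidic
    ("ST", "light blue"),       # polar (assumed split)
    ("NQ", "purple"),           # polar (assumed split)
    ("C", "yellow"),            # cysteine
    ("H", "light yellow"),      # histidine
]

COLOUR_BY_RESIDUE = {
    residue: colour
    for residues, colour in RESIDUE_CLASSES
    for residue in residues
}


def colour_alignment(sequences):
    """Return one colour name (or None) for each alignment column.

    A column is coloured only if all sequences share the same residue;
    gaps and residues outside the listed classes are left uncoloured.
    """
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    colours = []
    for column in zip(*sequences):
        if len(set(column)) == 1:
            colours.append(COLOUR_BY_RESIDUE.get(column[0]))
        else:
            colours.append(None)
    return colours


if __name__ == "__main__":
    # C-terminal PilA1 block for strains 196 and 630.
    for colour in colour_alignment(["IVLIDNTVMDSTK-", "IVLIDNIVMDQQGA"]):
        print(colour)
```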

C. difficile Type IV Pilin Sequence Alignments

Sequences are ordered alphabetically by strain.

PilA1

1 [ . . . . : . . . 80 196 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE 305 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLNVLE 630 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE BI9 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE CF5 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLNVLE Liv22 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE Liv24 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE M68 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLNVLE M120 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSIKSAALSYYSDTNKIPVTPDGQTGLNVLE R20291 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE TL176 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTTTDKAGLTILE TL178 MKLKKNKKGFTLVELLVVIAIIGILAVVAVPALFSNINKAKVASVESDYSSVKSAALSYYSDTNKIPVTPDGQTGLSVLE

81 . 100 . . . . : . 160 196 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTNDEGVTLTEAQSAKLLSDIGENKIYTSVTADNLGNPLTSNTKVDNKVLY 305 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTNDDGVTLTEAQSAKLLSDIGKDKIYTGVTGDNFGDQLKDTTKIDNKALY 630 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTNTEGVTLTEAQSAKLLSDIGENKIYTNAA---LSAKLTSTTKVNNEALY BI9 TYMESLPDKADIGGEYKLIKVGSKLVLQIGTNTEGVTLTEAQSAKLLSDIGEKKIYTSATTNSLGDPLTSNTKIDNKVLY CF5 TYMESLPDKADIGGKYKLIKVGNKLVLQIGKDGEGVTLTEAQSAKLLSDIGKDKIYTGVTGDNFGDQLKDTTKIDNKALY Liv22 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTNTEGVTLTEAQSAKLLSDIGENKIYTSTTTNSLGNPLTSNTKIDNNVLY Liv24 TYMESLPDKADIGGEYKLIKVGSKLVLQIGTNTEGVTLTEAQSAKLLSDIGEKKIYTSATTNSLGDPLTSNTKIDNKVLY M68 TYMESLPDKADIGGKYKLIKVGNKLVLQIGKDGEGVTLTEAQSAKLLSDIGKDKIYTGVTGDNFGDQLKDTTKIDNKALY M120 TYMESLPDKADIGGEYKLIKVGNKLVLQIGKDGEGVTLTEAQSAKLLSDIGKDKIYTGVTGDNFGEQLKDTTKIDNKALY R20291 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTNDEGVTLTEAQSAKLLSDIGENKIYTSVTADNLGNPLTSNTKVDNKVLY TL176 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTDTEGVTLTEAQSAKLLSDIGKDKIYTGVTGDNFGDELTDTTKIDNKALY TL178 TYMESLPDKADIGGKYKLIKVGNKLVLQIGTNTEGVTLTEAQSAKLLSDIGENKIYTSATTNSLGNPLTSNTKIDNNVLY

161 . ] 174 196 IVLIDNTVMDSTK- 305 IVLIDNTVMD---- 630 IVLIDNIVMDQQGA BI9 IVLIDNTVMDTTK- CF5 IVLIDNTVMDSTK- Liv22 IVLIDNTVMDTTK- Liv24 IVLIDNTVMDTTK- M68 IVLIDNTVMDSTK- M120 IVLIDNTVMDSTK- R20291 IVLIDNTVMDSTK- TL176 IVLIDNTVMDSTK- TL178 IVLIDNTVMDTTK-

PilU

1 [ . . . . : . . . 80 196 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG 305 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG 630 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG BI9 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG CF5 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG Liv22 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG Liv24 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG M68 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG M120 MSKKNSKRGFSLIEVLVAMAIMGIILFAFFNIINTNNKANIKNDTDINSLNYVQSEIENLREKIKSGEFDFDSLDKMEDG R20291 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG TL176 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG TL178 MSKKNSERGFSLIEVLVAMAIMGIVLFAFFNIINTNNKANTKNDTDITSLNYVQSEIENLREKIKSGEFDFDSLDKLEDG

81 . 100 . . . . : . 160 196 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS 305 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKENITNKIKTIVEDKSGQIYKIAVTGKSMNDYS 630 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKITVTGKSMNDYS BI9 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS CF5 TVVYEKLIDKSKKVVYDKVLSESDVSLYDIPYEKITTIKDEDGNLIDKENITNKIKTIVEDKSGQIYKVAVTGKSMNDYS Liv22 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS Liv24 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS M68 TVVYEKLIDKSKKVVYDKVLSESDVSLYDIPYEKITTIKDEDGNLIDKENITNKIKTIVEDKSGQIYKVAVTGKSMNDYS M120 TVVYEKLIDKSKKIVYDKVLSEGNVSLYDTPYEKITTIKDEDGNLIDKENITNKIKTIVEDKSGQIYKIAVTGKSMNDYS R20291 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS TL176 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS TL178 TVVYEKLIDKSKKVVYDKVLSEGDVSLYDTPYEKITTIKDEDGNLIDKGNITNKIKTIVEDKSGQIYKIAVTGKSMNDYS

161 . ] 175 196 SKKEVKIVTEIFKDK 305 SKKEVKIVTEIFKDK 630 SKKEVKIVTEIFKDK BI9 SKKEVKIVTEIFKDK CF5 SKKEVKIVTEIFKDK Liv22 SKKEVKIVTEIFKDK Liv24 SKKEVKIVTEIFKDK M68 SKKEVKIVTEIFKDK M120 SKKEVKIVTEIFKDK R20291 SKKEVKIVTEIFKDK TL176 SKKEVKIVTEIFKDK TL178 SKKEVKIVTEIFKDK

PilV

1 [ . . . . : . . . 80 196 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY 305 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY 630 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY BI9 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY CF5 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKRQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY Liv22 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY Liv24 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY M68 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKRQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY M120 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGSNYEY R20291 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY TL176 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY TL178 LYKNNEKGLTLLEVIIAVFILTIVLSISYKVFNGITSAVKKQQIITDAQVNINLINKYLNRDLENCKELTKTGSGNNYEY

81 . 100 . . . . : . 160 196 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN 305 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQKNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN 630 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN BI9 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN CF5 NIEMPDNVVKYEVSIETKKHTEVYSVTRIQKNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN Liv22 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN Liv24 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN M68 NIEMPDNVVKYEVSIETKKHTEVYSVTRIQKNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN M120 NIETPDNIVKYEVSIETKKNTEVYSVTRIEKNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTDKSIYTVSIFYN R20291 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN TL176 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN TL178 NIEMPDNVVKYEVSIETKKNTEVYSVTRIQNNTIDTENEVREEIIYNQPLVQNNKEMKETPFKIEKQTGKSIYTVSIYYN

161 . . ] 186 196 ESVQESHKNSNLNNKTYTFDVMSRIG 305 ESVQESHKNSNLNNKTYTFDVMSRIG 630 ESVQESHKNSNLNNKTYTFDVMSRIG BI9 ESVQESHKNSNLNNKTYTFDVMSRIG CF5 ESVQESHKNINLNNKTYTFDVMSRIG Liv22 ESVQESHKNSNLNNKTYTFDVMSRIG Liv24 ESVQESHKNSNLNNKTYTFDVMSRIG M68 ESVQESHKNINLNNKTYTFDVMSRIG M120 ESVQEGHKNSNLNNKTYTFDVMSRIG R20291 ESVQESHKNSNLNNKTYTFDVMSRIG TL176 ESVQESHKNSNLNNKTYTFDVMSRIG TL178 ESVQESHKNSNLNNKTYTFDVMSRIG

PilK

1 [ . . . . : . . . 80 196 LRKWNKFKSERGAALVLVLIVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN 305 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN 630 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN BI9 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN CF5 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN Liv22 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN Liv24 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN M68 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN M120 LRKWNKFKSERGAALVLVLVVVALLSIVGLMFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN R20291 LRKWNKFKSERGAALVLVLIVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN TL176 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN TL178 LRKWNKFKSERGAALVLVLVVVALLSIVGLIFSNQIANRIKSTKTTNEGIQAKYLAETCVENSIDKAYEKLYDELEKMDN

81 . 100 . . . . : . 160 196 EFKSENQEKSISRSKLRNISDEDFNNQDEKNIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA 305 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA 630 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA BI9 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA CF5 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSELEKASMELKKLYDLDMLDYRDIEYVDA Liv22 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA Liv24 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA M68 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSELEKASMELKKLYDLDMLDYRDIEYVDA M120 EFKSENQQKSISRSKLRNISDEDFNNQNEENLEVERLGYMNNINFYLNKASSDLEKASIELKKLYDLDMLDYRDIEYVDE R20291 EFKSENQEKSISRSKLRNISDEDFNNQDEKNIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA TL176 EFKSENQEKSISRSKLRNISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLDMLDYRDIEYVDA TL178 EFKSENQEKSISRSKLRDISDEDFNNQDEENIEAERLGYMNNINFYLNKASSDLEKASMELKKLYDLYMLDYRDIEYVDA

161 . . . 200 . . . . 240 196 NIISHRDSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL 305 NIISHRDSILEICKNYTSGDISKINEHILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL 630 NIISHRDSILEICKNYTSGDISKINEYILEEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL BI9 NIISHRDSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL CF5 NIISHRDSILEICKNYTSGDISKINEHILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL Liv22 NIISHRDSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKKFLEENKIENEHLNSAFSHTYKALDNISL Liv24 NIISHRDSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL M68 NIISHRDSILEICKNYTSGDISKINEHILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL M120 NIISHRNSILDICKNYTSGDISKINEYILDEDINSTTLVEAKLVNNDILLKMFLEENKIENEHLNSAFSYTYKALDKISL R20291 NIISHRDSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL TL176 NIISHRNSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKKFLEENKIENEHLNSAFSHTYKALDNISL TL178 NIISHRDSILEICKNYTSGDISKINEYILKEDIDSTTLIEAKLVNNDILLKMFLEENKIENEHLNSAFSHTYKALDNISL

241 : . . . . 300 . . 320 196 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYIWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR 305 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSLEYSWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR 630 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYIWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR BI9 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYSWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR CF5 AMQNMIEYRHTFHIDEPKVEVSNGIPNSQEYYELIQNPIINSMEYSWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR Liv22 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYSWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR Liv24 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYSWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR M68 AMQNMIEYRHTFHIDEPKVEVSNGIPNSQEYYELIQNPIINSMEYSWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR M120 AMQNMIEYRHTFHIDEQKVEVSNGIPNSQQYYELIQNPIINSKEYSWNSKWDTLENLLETLPNQTQGFNSLRVHLRNNVR R20291 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYIWNSKWDTLENLLEILPNQTQGFNSLRVHLRNNVR TL176 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYSWNSKCDTLENLLEILPNQTQGFNSLRVHLRNNVR TL178 AMQNMIEYRHTFHIDEPKVEVSNGIPDSQQYYELIQNPIINSMEYIWNSKWDTLEKLLEILPNQTQGFNSLRVHLRNNVR

321 . . : . . . . . 400 196 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE 305 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE 630 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE BI9 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE CF5 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE Liv22 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE Liv24 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE M68 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE M120 KFEELSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE R20291 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE TL176 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE TL178 KFEKLSDNISSGKKNTAKNFLKYKELLYEISDQCNQLKSMSYEKIPVKYDNMALITTFDYIQNELLAEIKCRLKELKPQE

401 . . . . : . . . 480
196 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKDGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN 305 VDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGNKDGIKEVEVTDGKKNIIGLGVEENSNSKYKVEAIVN 630 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKNGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN BI9 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKNGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN CF5 VDKTEGISIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKDGIKEVEVTDSKKNIIGLGVEENSNSKYKVEAIVN Liv22 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKNGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN Liv24 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKNGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN M68 VDKTEGISIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKDGIKEVEVTDSKKNIIGLGVEENSNSKYKVEAIVN M120 VDKTEDITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKDGIKEVEVTDSKKNIIGLGVEENSNSKYKVEAIIN R20291 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKDGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN TL176 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKNGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN TL178 IDKTEGITIKIPFYKADYDMTKEGWPKLKENGSGAELSLMVTGDKDGIKEVEVTDGKKNIIGLGVEENSNSKYKVDAIVN

481 . 500 . ] 512 196 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN 305 FNLNIDTNVVGNYNIKDKILINHDISSYKKVN 630 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN BI9 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN CF5 FNLNIDTNVVGNYNIKDKILINHDISSYKKVN Liv22 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN Liv24 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN M68 FNLNIDTNVVGNYNIKDKILINHDISSYKKVN M120 FNLNIDTNIVGNYDIKDKILINHDISSYKKVN R20291 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN TL176 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN TL178 FNLNIDTNVVGNYDIKDKILINHDISSYKKVN

PilA2

1 [ . . . . : . . . 80 196 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK 305 ---LINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKIIDSLSVETLKEK 630 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK BI9 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK CF5 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK Liv22 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK Liv24 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK M68 ---LINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKIIDSLSVETLKEK M120 ---LINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDENIIDSLSVEALKEK R20291 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK TL176 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK TL178 LINLINKKRKGFTLVEMIVVVTILGVISSIALVKYSKVQESAKLNADYTNAANIVTAASMAINDDEKTIDSLSVETLKEK

81 . 100 . ] 119 196 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK 305 GYLNTVPVPQSTSGKFELVINDNGTDISVNINSKQFYPK 630 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK BI9 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK CF5 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK Liv22 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK Liv24 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK M68 GYLNTVPVPQSTSGKFELVINDNGTDISVNINSKQFYPK M120 GYLNTVPVPQSTSGKFELVINDNGTDISVNINSKQFYPK R20291 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK TL176 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK TL178 GYLNTVPVPQSTSGKFELVINDSGTDISVNINSKQFYPK

PilJ

1 [ . . . . : . . . 80 196 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKDKNEVIKEVLENKDGKYF 305 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEEPSKDKNEVIKEVLENKDGEYF 630 MNKKGFTLIELLVVISIIGILVIVAIPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKGKNEVMKEVLENKDGKYF BI9 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKDKNEVIKEVLENKDGKYF CF5 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEEQSKDKNEVIKEVLQNKDGKYF Liv22 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKGKNEVMKEVLENKDGKYF Liv24 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKDKNEVIKEVLENKDGKYF M68 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEEQSKDKNEVIKEVLQNKDGKYF M120 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEEPSKDKNKVIKDVLENKDGKYF R20291 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKDKNEVIKEVLENKDGKYF TL176 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKGKNEVMKEVLENKDGKYF TL178 MNKKGFTLIELLVVISIIGILVIVAVPALFRNIEKSKAVTCLSNRENIKTQIVIAMAEESSKGKNEVMKEVLENKDGKYF

81 . 100 . . . . : . 160 196 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGIEMARDIHQSMKDLIASFAQDPSIIPGASKGNDDFRKY 305 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDIHQSMKDLIASFAQDPSIIPGASKGNDDFRKY 630 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDIHQSMKDLIASFSQDPSIIPGASKGNDDFRKY BI9 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDVHQSMKDLIASFAQDPSIIPGASKGNDDFRKY CF5 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDVHQSMKDLIASFAQDPSIIPGASKGNDDFRKY Liv22 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDIHQSMKDLIASFAQDPSIIPGASKGNDDFRKY Liv24 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDVHQSMKDLIASFAQDPSIIPGASKGNDDFRKY M68 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDVHQSMKDLIASFAQDPSIIPGASKGNDDFRKY M120 ETEPKCKSGGIYSATFDDGYDGITGGESIAKVYVTCTEHPDGVEMARDVHQSMKDLIASFAQDPSIIPGASKSNDDFRKY R20291 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGIEMARDIHQSMKDLIASFAQDPSIIPGASKGNDDFRKY TL176 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDIHQSMKDLIASFSQDPSIIPGASKGNDDFRKY TL178 ETEPKCKSGGIYSATFDDGYDGITGIESIAKVYVTCTKHPDGVEMARDIHQSMKDLIASFAQDPSIIPGASKGNDDFRKY

161 . . . 200 . . . . 240 196 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG 305 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVIFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG 630 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYSPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG BI9 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG CF5 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG Liv22 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYSPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG Liv24 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG M68 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG M120 LLDNKYKKGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKPDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG R20291 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG TL176 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYSPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG TL178 LLDNKYKNGWPTIPDEFKAKYGLSKDTLYIQPYAYNPTKSDATVVVFANNKTGGNWYTSLVYDYDEGRWYKGKNGISVAG

241 : . ] 267 196 RSWDVDTDSVKSVKTEIHSKEGWGPLN 305 RSWDVDTDSVKSVKTDIHSKEGWGPLN 630 RSWDVDTDSVKSVKTEIHSKEGWGPLN BI9 RSWDVDTDSVKSVKTEIHSKEGWGPLN CF5 RSWDVDTDSAKSVKTEIHSKEGWGPLN Liv22 RSWDVDTDSVKSVKTEIHSKEGWGPLN Liv24 RSWDVDTDSVKSVKTEIHSKEGWGPLN M68 RSWDVDTDSAKSVKTEIHSKEGWGPLN M120 RSWDVDTDSVKSVKTEIHSKEGWGPLN R20291 RSWDVDTDSVKSVKTEIHSKEGWGPLN TL176 RSWDVDTDSVKSVKTEIHSKEGWGPLN TL178 RSWDVDTDSVKSVKTEIHSKEGWGPLN

PilW

1 [ . . . . : . . . 80 196 MKNKKGFTLVELLVVIAIIGILAIIALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTDGGMISWVKKDGKIII 630 MKNKKGFTLVELLVVIAIIGILAIVALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTDGGMISWVKKDGKIII BI9 MKNKKGFTLVELLVVIAIIGILAIVALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTEGNIIWWTKKDGKITV Liv22 MKNKKGFTLVELLVVIAIIGILAIVALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTEGNIIWWTKKDGKITV Liv24 MKNKKGFTLVELLVVIAIIGILAIVALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTEGNIIWWTKKDGKITV R20291 MKNKKGFTLVELLVVIAIIGILAIIALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTDGGMISWVKKDGKIII TL176 MKNKKGFTLVELLVVIAIIGILAIVALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTDGGMISWVKKDGKIII TL178 MKNKKGFTLVELLVVIAIIGILAIVALPALFKNIEKAKIAKLEADISAIKSASLSYYADESKYTEGNIIWWTKKDGKITV

81 . 100 . . . . : . 160 196 NGGFK-DDPLADKIENLGMPYNGSYLLMSSPGHEKYLELSILPEGEISKSGLDKLKNDYGNLIDITNDQNKINIVIKLLN 630 NGGFK-DDPLADKIENLGMPYNGSYLLMSSPGHEKYLELSILPEGEISKSGLDKLKSDYGSSIDIKNDQNKIDIVIKLLN BI9 NSGIGDEDPLAHKIENLGMPYNGSYTLVSSNGSEEYLELNIIIDGEISKSGLDKLEEDYGSSITIPNDKNMI---ITFLS Liv22 NSGIGDEDPLAHKIENLGMPYNGSYTLVSSNGSEEYLELNIIIDGEISKSGLDKLEEDYGSSIKIPNDKNMI---ITFLS Liv24 NSGIGDEDPLAHKIENLGMPYNGSYTLVSSNGSEEYLELNIIIDGEISKSGLDKLEEDYGSSITIPNDKNMI---ITFLS R20291 NGGFK-DDPLADKIENLGMPYNGSYLLMSSPGHEKYLELSILPEGEISKSGLDKLKNDYGNLIDITNDQNKINIVIKLLN TL176 NGGFK-DDPLADKIENLGMPYNGSYLLMSSPGHEKYLELSILPEGEISKSGLDKLKSDYGSSIDIMNEQNKIDIVIKFLN TL178 NSGIGDEDPLAHKIENLGMPYNGSYTLVSSNGSEEYLELNIIIDGEISKSGLDKLEEDYGSSIKIPNDKNMI---ITFLS

161 ] 165 196 NKSNT 630 DKSNT BI9 NKSDN Liv22 NKSDN Liv24 NKSDN R20291 NKSNT TL176 DKSNT TL178 NKSDN

PilA3

1 [ . . . . : . . . 80 196 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT 305 MKHKYGYLLFESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT 630 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIIGNSMGIMNISNYEETFKKT BI9 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIIGNSMGIMNISNYEETFKKT CF5 MKHKYGYLLLESVVSLSSMVIIMLVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT Liv22 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT Liv24 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIIGNSMGIMNISNYEETFKKT M68 MKHKYGYLLLESVVSLSSMVIIMLVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT M120 MKHKYGYLLLESVVSLSSMLIMILVLYSIFLSTISLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIINVSNYEDTFKKA R20291 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT TL176 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT TL178 MKHKYGYLLLESVVSLSSMVIIILVLYSIFLSTINLKLKVEDKIELQQQSLEIIKSMEGIISNSMGIMNVSNYEETFKKT

81 . 100 . . . . : . 160 196 TSIKCRYVDEN--NNEESISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVLITNNGQYVNIKLKLSKRSQK 305 TSIKCRYVDEN--NNEENISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK 630 MSIKSRYVDEN--NNEEGISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK BI9 TSIKCRYVDEN--NNEEGISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK CF5 TSIKCRYVDEN--NNEENISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK Liv22 TSIKCRYLDEN--NNEEGISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK Liv24 TSIKCRYVDEN--NNEEGISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK M68 TSIKCRYVDEN--NNEENISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK M120 TSIKCRYVDENVNNNEESISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSINNNGQYVNIKLKLSKRSQK R20291 TSIKCRYVDEN--NNEESISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVLITNNGQYVNIKLKLSKRSQK TL176 TSIKCRYLDEN--NNEEGISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKCSQK TL178 TSIKCRYLDEN--NNEEGISNKEIILNERRNKLFVNSLNGESSQAGGYEIGDYVDEMYVSITNNGQYVNIKLKLSKRSQK

161 . ] 176 196 YETDFKIKVWNFSESI 305 YETDFKIKVWNFSESI 630 YETDFKIKVWNFSESI BI9 YETDFKIKVWNFSESI CF5 YETDFKIKVWNFSESV Liv22 YETDFKIKVWNFSESI Liv24 YETDFKIKVWNFSESI M68 YETDFKIKVWNFSESV M120 YETEFKIKVWNFSENI R20291 YETDFKIKVWNFSESI TL176 YETDFKIKVWNFSESI TL178 YETDFKIKVWNFSESI

PilX

1 [ . . . . : . . . 80 196 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN 305 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN 630 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN BI9 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN CF5 LKIRKSGFISIECIISIAILYVVVYLVSTSLYNCYSFISRSMSDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN Liv22 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN Liv24 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN M68 LKIRKSGFISIECIISIAILYVVVYLVSTSLYNCYSFISRSMSDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN M120 LKIRKRGFISIECVISIAILYVAVYLVSTSLYSCYSFVSRNISDRKMLSTAKKYIEDEKYRIQNSKYELIENKIEKNYIN R20291 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN TL176 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN TL178 LKIRKSGFISIECIISIAILYVAVYLVSTSLYNCYSFISRNISDREMLSTAKKYIEDEKYRIQNSKYELIEDKIEKNYIN

81 . 100 . ] 120 196 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK 305 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK 630 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK BI9 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK CF5 GYEINSRIKQILDYYQCYEINIEIKNEFKELRFNSYVTRK Liv22 GYEINSKIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK Liv24 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK M68 GYEINSRIKQILDYYQCYEINIEIKNEFKELRFNSYVTRK M120 GYEISSRVEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK R20291 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK TL176 GYEINSRIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK TL178 GYEINSKIEQILDYYQCYEINIEIKNEFKKLRFNSYVTRK

7/24/2016 RightsLink Printable License

THE ROYAL SOCIETY LICENSE TERMS AND CONDITIONS Jul 24, 2016

This Agreement between Edward Couchman ("You") and The Royal Society ("The Royal Society") consists of your license details and the terms and conditions provided by The Royal Society and Copyright Clearance Center.

All payments must be made in full to CCC. For payment instructions, please see information listed at the bottom of this form.

License Number: 3915560973053
License date: Jul 24, 2016
Licensed Content Publisher: The Royal Society
Licensed Content Publication: Philosophical Transactions A
Licensed Content Title: Molecular mechanism of bacterial type 1 and P pili assembly
Licensed Content Author: Andreas Busch, Gilles Phan, Gabriel Waksman
Licensed Content Date: 2015-03-06
Licensed Content Volume Number: 373
Licensed Content Issue Number: 2036
Volume number: 373
Issue number: 2036
Type of Use: Thesis/Dissertation
Requestor type: academic/educational
Format: print and electronic
Portion: figures/tables/images
Quantity: 1
Will you be translating? no
Circulation: 5
Order reference number:
Title of your thesis / dissertation: Investigating the Type IV Pili of Clostridium difficile and Clostridium sordellii
Expected completion date: Aug 2016
Estimated size (number of pages): 270
Requestor Location: Edward Couchman, 57 Bowman Mews, London, SW18 5TN, United Kingdom, Attn: Edward Couchman
Publisher Tax ID: GB927580009
Billing Type: Credit Card
Credit card info: Master Card ending in 1772
Credit card expiration: 11/2017
Price: 3.20 GBP

Tax/VAT (20%): 0.64 GBP
Total: 3.84 GBP

Terms and Conditions

STANDARD TERMS AND CONDITIONS FOR REPRODUCTION OF MATERIAL FROM A ROYAL SOCIETY JOURNAL

1. Use of the material is restricted to the type of use specified in your order details.
2. The publisher for this copyrighted material is the Royal Society. By clicking "accept" in connection with completing this licensing transaction, you agree that the following terms and conditions apply to this transaction (along with the Billing and Payment terms and conditions established by Copyright Clearance Center, Inc. ("CCC"), at the time that you opened your Rightslink account and that are available at any time at http://myaccount.copyright.com).
3. The following credit line appears wherever the material is used: author, title, journal, year, volume, issue number, pagination, by permission of the Royal Society.
4. For the reproduction of a full article from a Royal Society journal for whatever purpose, the corresponding author of the material concerned should be informed of the proposed use. Contact details for the corresponding authors of all Royal Society journals can be found alongside either the abstract or full text of the article concerned, accessible from royalsocietypublishing.org.
5. If the credit line in our publication indicates that any of the figures, images or photos was reproduced from an earlier source it will be necessary for you to clear this permission with the original publisher as well. If this permission has not been obtained, please note that this material cannot be included in your publication/photocopies.
6. Licenses may be exercised anywhere in the world.
7. While you may exercise the rights licensed immediately upon issuance of the license at the end of the licensing process for the transaction, provided that you have disclosed complete and accurate details of your proposed use, no license is finally effective unless and until full payment is received from you (either by publisher or by CCC) as provided in CCC's Billing and Payment terms and conditions. If full payment is not received on a timely basis, then any license preliminarily granted shall be deemed automatically revoked and shall be void as if never granted. Further, in the event that you breach any of these terms and conditions or any of CCC's Billing and Payment terms and conditions, the license is automatically revoked and shall be void as if never granted. Use of materials as described in a revoked license, as well as any use of the materials beyond the scope of an unrevoked license, may constitute copyright infringement and publisher reserves the right to take any and all action to protect its copyright in the materials.
8. Publisher reserves all rights not specifically granted in the combination of (i) the license details provided by you and accepted in the course of this licensing transaction, (ii) these terms and conditions and (iii) CCC's Billing and Payment terms and conditions.
9. Publisher makes no representations or warranties with respect to the licensed material.
10. You hereby indemnify and agree to hold harmless publisher and CCC, and their respective officers, directors, employees and agents, from and against any and all claims arising out of your use of the licensed material other than as specifically authorized pursuant to this license.
11. This license may not be amended except in a writing signed by both parties (or, in the case of publisher, by CCC on publisher's behalf).

Questions? [email protected] or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.

ELSEVIER LICENSE TERMS AND CONDITIONS Jul 24, 2016

This Agreement between Edward Couchman ("You") and Elsevier ("Elsevier") consists of your license details and the terms and conditions provided by Elsevier and Copyright Clearance Center.

License Number 3915561267647 License date Jul 24, 2016 Licensed Content Publisher Elsevier Licensed Content Publication Current Opinion in Structural Biology Licensed Content Title Structure and assembly of Gram-positive bacterial pili: unique covalent polymers Licensed Content Author Hae Joo Kang,Edward N Baker Licensed Content Date April 2012 Licensed Content Volume 22 Number Licensed Content Issue 2 Number Licensed Content Pages 8 Start Page 200 End Page 207

Type of Use reuse in a thesis/dissertation Intended publisher of new other work Portion figures/tables/illustrations

Number of 1 figures/tables/illustrations Format both print and electronic Are you the author of this No Elsevier article? Will you be translating? No Order reference number Original figure numbers Figure 3

Title of your Investigating the Type IV Pili of Clostridium difficile and Clostridium thesis/dissertation sordellii Expected completion date Aug 2016 Estimated size (number of 270 pages)

Elsevier VAT number GB 494 6272 12

Requestor Location: Edward Couchman, 57 Bowman Mews, London, SW18 5TN, United Kingdom, Attn: Edward Couchman

Total: 0.00 GBP

Terms and Conditions

INTRODUCTION

1. The publisher for this copyrighted material is Elsevier. By clicking "accept" in connection with completing this licensing transaction, you agree that the following terms and conditions apply to this transaction (along with the Billing and Payment terms and conditions established by Copyright Clearance Center, Inc. ("CCC"), at the time that you opened your Rightslink account and that are available at any time at http://myaccount.copyright.com).

GENERAL TERMS

2. Elsevier hereby grants you permission to reproduce the aforementioned material subject to the terms and conditions indicated.

3. Acknowledgement: If any part of the material to be used (for example, figures) has appeared in our publication with credit or acknowledgement to another source, permission must also be sought from that source. If such permission is not obtained then that material may not be included in your publication/copies. Suitable acknowledgement to the source must be made, either as a footnote or in a reference list at the end of your publication, as follows: "Reprinted from Publication title, Vol /edition number, Author(s), Title of article / title of chapter, Pages No., Copyright (Year), with permission from Elsevier [OR APPLICABLE SOCIETY COPYRIGHT OWNER]." Also Lancet special credit - "Reprinted from The Lancet, Vol. number, Author(s), Title of article, Pages No., Copyright (Year), with permission from Elsevier."

4. Reproduction of this material is confined to the purpose and/or media for which permission is hereby given.

5. Altering/Modifying Material: Not Permitted. However figures and illustrations may be altered/adapted minimally to serve your work. Any other abbreviations, additions, deletions and/or any other alterations shall be made only with prior written authorization of Elsevier Ltd. (Please contact Elsevier at [email protected])

6. If the permission fee for the requested use of our material is waived in this instance, please be advised that your future requests for Elsevier materials may attract a fee.

7. Reservation of Rights: Publisher reserves all rights not specifically granted in the combination of (i) the license details provided by you and accepted in the course of this licensing transaction, (ii) these terms and conditions and (iii) CCC's Billing and Payment terms and conditions.

8. License Contingent Upon Payment: While you may exercise the rights licensed immediately upon issuance of the license at the end of the licensing process for the transaction, provided that you have disclosed complete and accurate details of your proposed use, no license is finally effective unless and until full payment is received from you (either by publisher or by CCC) as provided in CCC's Billing and Payment terms and conditions. If full payment is not received on a timely basis, then any license preliminarily granted shall be deemed automatically revoked and shall be void as if never granted. Further, in the event that you breach any of these terms and conditions or any of CCC's Billing and Payment terms and conditions, the license is automatically revoked and shall be void as if never granted. Use of materials as described in a revoked license, as well as any use of the materials beyond the scope of an unrevoked license, may constitute copyright infringement and publisher reserves the right to take any and all action to protect its copyright in the materials.

9. Warranties: Publisher makes no representations or warranties with respect to the licensed material.

10. Indemnity: You hereby indemnify and agree to hold harmless publisher and CCC, and their respective officers, directors, employees and agents, from and against any and all claims arising out of your use of the licensed material other than as specifically authorized pursuant to this license.
11. No Transfer of License: This license is personal to you and may not be sublicensed, assigned, or transferred by you to any other person without publisher's written permission.

12. No Amendment Except in Writing: This license may not be amended except in a writing signed by both parties (or, in the case of publisher, by CCC on publisher's behalf).

13. Objection to Contrary Terms: Publisher hereby objects to any terms contained in any purchase order, acknowledgment, check endorsement or other writing prepared by you, which terms are inconsistent with these terms and conditions or CCC's Billing and Payment terms and conditions. These terms and conditions, together with CCC's Billing and Payment terms and conditions (which are incorporated herein), comprise the entire agreement between you and publisher (and CCC) concerning this licensing transaction. In the event of any conflict between your obligations established by these terms and conditions and those established by CCC's Billing and Payment terms and conditions, these terms and conditions shall control.

14. Revocation: Elsevier or Copyright Clearance Center may deny the permissions described in this License at their sole discretion, for any reason or no reason, with a full refund payable to you. Notice of such denial will be made using the contact information provided by you. Failure to receive such notice will not alter or invalidate the denial. In no event will Elsevier or Copyright Clearance Center be responsible or liable for any costs, expenses or damage incurred by you as a result of a denial of your permission request, other than a refund of the amount(s) paid by you to Elsevier and/or Copyright Clearance Center for denied permissions.

LIMITED LICENSE

The following terms and conditions apply only to specific license types:

15. Translation: This permission is granted for non-exclusive world English rights only unless your license was granted for translation rights. If you licensed translation rights you may only translate this content into the languages you requested. A professional translator must perform all translations and reproduce the content word for word preserving the integrity of the article.

16. Posting licensed content on any Website: The following terms and conditions apply as follows:

Licensing material from an Elsevier journal: All content posted to the web site must maintain the copyright information line on the bottom of each image; A hyper-text must be included to the Homepage of the journal from which you are licensing at http://www.sciencedirect.com/science/journal/xxxxx or the Elsevier homepage for books at http://www.elsevier.com; Central Storage: This license does not include permission for a scanned version of the material to be stored in a central repository such as that provided by Heron/XanEdu.

Licensing material from an Elsevier book: A hyper-text link must be included to the Elsevier homepage at http://www.elsevier.com . All content posted to the web site must maintain the copyright information line on the bottom of each image.

Posting licensed content on Electronic reserve: In addition to the above the following clauses are applicable: The web site must be password-protected and made available only to bona fide students registered on a relevant course. This permission is granted for 1 year only. You may obtain a new license for future website posting.

17. For journal authors: the following clauses are applicable in addition to the above:

Preprints: A preprint is an author's own write-up of research results and analysis, it has not been peer-reviewed, nor has it had any other value added to it by a publisher (such as formatting, copyright, technical enhancement etc.). Authors can share their preprints anywhere at any time. Preprints should not be added to or enhanced in any way in order to appear more like, or to substitute for, the final versions of articles however authors can update their preprints on arXiv or RePEc with their Accepted Author Manuscript (see below). If accepted for publication, we encourage authors to link from the preprint to their formal publication via its DOI. Millions of researchers have access to the formal publications on ScienceDirect, and so links will help users to find, access, cite and use the best available version. Please note that Cell Press, The Lancet and some society-owned journals have different preprint policies. Information on these policies is available on the journal homepage.

Accepted Author Manuscripts: An accepted author manuscript is the manuscript of an article that has been accepted for publication and which typically includes author-incorporated changes suggested during submission, peer review and editor-author communications. Authors can share their accepted author manuscript:

- immediately
  - via their non-commercial personal homepage or blog
  - by updating a preprint in arXiv or RePEc with the accepted manuscript
  - via their research institute or institutional repository for internal institutional uses or as part of an invitation-only research collaboration work-group
  - directly by providing copies to their students or to research collaborators for their personal use
  - for private scholarly sharing as part of an invitation-only work group on commercial sites with which Elsevier has an agreement
- after the embargo period
  - via non-commercial hosting platforms such as their institutional repository
  - via commercial sites with which Elsevier has an agreement

In all cases accepted manuscripts should:

- link to the formal publication via its DOI
- bear a CC-BY-NC-ND license - this is easy to do
- if aggregated with other manuscripts, for example in a repository or other site, be shared in alignment with our hosting policy
- not be added to or enhanced in any way to appear more like, or to substitute for, the published journal article.

Published journal article (PJA): A published journal article (PJA) is the definitive final record of published research that appears or will appear in the journal and embodies all value-adding publishing activities including peer review co-ordination, copy-editing, formatting, (if relevant) pagination and online enrichment. Policies for sharing published journal articles differ for subscription and gold open access articles:

Subscription Articles: If you are an author, please share a link to your article rather than the full-text. Millions of researchers have access to the formal publications on ScienceDirect, and so links will help your users to find, access, cite, and use the best available version. Theses and dissertations which contain embedded PJAs as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect. If you are affiliated with a library that subscribes to ScienceDirect you have additional private sharing rights for others' research accessed under that agreement. This includes use for classroom teaching and internal training at the institution (including use in course packs and courseware programs), and inclusion of the article for grant funding purposes.

Gold Open Access Articles: May be shared according to the author-selected end-user license and should contain a CrossMark logo, the end user license, and a DOI link to the formal publication on ScienceDirect. Please refer to Elsevier's posting policy for further information.

18. For book authors the following clauses are applicable in addition to the above: Authors are permitted to place a brief summary of their work online only. You are not allowed to download and post the published electronic version of your chapter, nor may you scan the printed edition to create an electronic version. Posting to a repository: Authors are permitted to post a summary of their chapter only in their institution's repository.

19. Thesis/Dissertation: If your license is for use in a thesis/dissertation your thesis may be submitted to your institution in either print or electronic form. Should your thesis be published commercially, please reapply for permission. These requirements include permission for the Library and Archives of Canada to supply single copies, on demand, of the complete thesis and include permission for Proquest/UMI to supply single copies, on demand, of the complete thesis. Should your thesis be published commercially, please reapply for permission. Theses and dissertations which contain embedded PJAs as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect.

Elsevier Open Access Terms and Conditions

You can publish open access with Elsevier in hundreds of open access journals or in nearly 2000 established subscription journals that support open access publishing. Permitted third party re-use of these open access articles is defined by the author's choice of Creative Commons user license. See our open access license policy for more information.

Terms & Conditions applicable to all Open Access articles published with Elsevier: Any reuse of the article must not represent the author as endorsing the adaptation of the article nor should the article be modified in such a way as to damage the author's honour or reputation. If any changes have been made, such changes must be clearly indicated. The author(s) must be appropriately credited and we ask that you include the end user license and a DOI link to the formal publication on ScienceDirect. If any part of the material to be used (for example, figures) has appeared in our publication with credit or acknowledgement to another source it is the responsibility of the user to ensure their reuse complies with the terms and conditions determined by the rights holder.

Additional Terms & Conditions applicable to each Creative Commons user license:

CC BY: The CC-BY license allows users to copy, to create extracts, abstracts and new works from the Article, to alter and revise the Article and to make commercial use of the Article (including reuse and/or resale of the Article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made and the licensor is not represented as endorsing the use made of the work. The full details of the license are available at http://creativecommons.org/licenses/by/4.0.

CC BY NC SA: The CC BY-NC-SA license allows users to copy, to create extracts, abstracts and new works from the Article, to alter and revise the Article, provided this is not done for commercial purposes, and that the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made and the licensor is not represented as endorsing the use made of the work. Further, any new works must be made available on the same conditions. The full details of the license are available at http://creativecommons.org/licenses/by-nc-sa/4.0.

CC BY NC ND: The CC BY-NC-ND license allows users to copy and distribute the Article, provided this is not done for commercial purposes and further does not permit distribution of the Article if it is changed or edited in any way, and provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, and that the licensor is not represented as endorsing the use made of the work. The full details of the license are available at http://creativecommons.org/licenses/by-nc-nd/4.0.

Any commercial reuse of Open Access articles published with a CC BY NC SA or CC BY NC ND license requires permission from Elsevier and will be subject to a fee. Commercial reuse includes:

- Associating advertising with the full text of the Article
- Charging fees for document delivery or access
- Article aggregation
- Systematic distribution via e-mail lists or share buttons
- Posting or linking by commercial companies for use by customers of those companies.

20. Other Conditions:

v1.8

Questions? [email protected] or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.

11200 Rockville Pike Suite 302 Rockville, Maryland 20852

August 19, 2011

American Society for Biochemistry and Molecular Biology

To whom it may concern,

It is the policy of the American Society for Biochemistry and Molecular Biology to allow reuse of any material published in its journals (the Journal of Biological Chemistry, Molecular & Cellular Proteomics and the Journal of Lipid Research) in a thesis or dissertation at no cost and with no explicit permission needed. Please see our copyright permissions page on the journal site for more information.

Best wishes,

Sarah Crespi

American Society for Biochemistry and Molecular Biology 11200 Rockville Pike, Rockville, MD Suite 302 240-283-6616 JBC | MCP | JLR

Tel: 240-283-6600 • Fax: 240-881-2080 • E-mail: [email protected]

THE AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE LICENSE TERMS AND CONDITIONS Jul 24, 2016

This Agreement between Edward Couchman ("You") and The American Association for the Advancement of Science ("The American Association for the Advancement of Science") consists of your license details and the terms and conditions provided by The American Association for the Advancement of Science and Copyright Clearance Center.

License Number: 3915570997963
License date: Jul 24, 2016
Licensed Content Publisher: The American Association for the Advancement of Science
Licensed Content Publication: Science
Licensed Content Title: Stabilizing Isopeptide Bonds Revealed in Gram-Positive Bacterial Pilus Structure
Licensed Content Author: Hae Joo Kang, Fasséli Coulibaly, Fiona Clow, Thomas Proft, Edward N. Baker
Licensed Content Date: Dec 7, 2007
Licensed Content Volume Number: 318
Licensed Content Issue Number: 5856
Volume number: 318
Issue number: 5856
Type of Use: Thesis / Dissertation
Requestor type: Scientist/individual at a research institution
Format: Print and electronic
Portion: Figure
Number of figures/tables: 1
Order reference number:
Title of your thesis / dissertation: Investigating the Type IV Pili of Clostridium difficile and Clostridium sordellii
Expected completion date: Aug 2016
Estimated size (pages): 270

Requestor Location: Edward Couchman, 57 Bowman Mews, London, SW18 5TN, United Kingdom, Attn: Edward Couchman

Billing Type: Invoice

Billing Address: Edward Couchman, 57 Bowman Mews, London, United Kingdom SW18 5TN, Attn: Edward Couchman

Total: 0.00 GBP

Terms and Conditions

American Association for the Advancement of Science TERMS AND CONDITIONS

Regarding your request, we are pleased to grant you non-exclusive, non-transferable permission, to republish the AAAS material identified above in your work identified above, subject to the terms and conditions herein. We must be contacted for permission for any uses other than those specifically identified in your request above.

The following credit line must be printed along with the AAAS material: "From [Full Reference Citation]. Reprinted with permission from AAAS."

All required credit lines and notices must be visible any time a user accesses any part of the AAAS material and must appear on any printed copies an authorized user might make.

This permission does not apply to figures / photos / artwork or any other content or materials included in your work that are credited to non-AAAS sources. If the requested material is sourced to or references non-AAAS sources, you must obtain authorization from that source as well before using that material. You agree to hold harmless and indemnify AAAS against any claims arising from your use of any content in your work that is credited to non-AAAS sources.

If the AAAS material covered by this permission was published in Science during the years 1974 - 1994, you must also obtain permission from the author, who may grant or withhold permission, and who may or may not charge a fee if permission is granted. See original article for author's address. This condition does not apply to news articles.

The AAAS material may not be modified or altered except that figures and tables may be modified with permission from the author. Author permission for any such changes must be secured prior to your use.

Whenever possible, we ask that electronic uses of the AAAS material permitted herein include a hyperlink to the original work on AAAS's website (hyperlink may be embedded in the reference citation).

AAAS material reproduced in your work identified herein must not account for more than 30% of the total contents of that work.

AAAS must publish the full paper prior to use of any text.

AAAS material must not imply any endorsement by the American Association for the Advancement of Science.

This permission is not valid for the use of the AAAS and/or Science logos.

AAAS makes no representations or warranties as to the accuracy of any information contained in the AAAS material covered by this permission, including any warranties of merchantability or fitness for a particular purpose.

If permission fees for this use are waived, please note that AAAS reserves the right to charge for reproduction of this material in the future.

Permission is not valid unless payment is received within sixty (60) days of the issuance of this permission. If payment is not received within this time period then all rights granted herein shall be revoked and this permission will be considered null and void.

In the event of breach of any of the terms and conditions herein or any of CCC's Billing and Payment terms and conditions, all rights granted herein shall be revoked and this permission will be considered null and void.

AAAS reserves the right to terminate this permission and all rights granted herein at its discretion, for any purpose, at any time. In the event that AAAS elects to terminate this permission, you will have no further right to publish, publicly perform, publicly display, distribute or otherwise use any matter in which the AAAS content had been included, and all fees paid hereunder shall be fully refunded to you.
Notification of termination will be sent to the contact information as supplied by you during the request process and termination shall be immediate upon sending the notice. Neither AAAS nor CCC shall be liable for any costs, expenses, or damages you may incur as a result of the termination of this permission, beyond the refund noted above.

This Permission may not be amended except by written document signed by both parties.

The terms above are applicable to all permissions granted for the use of AAAS material. Below you will find additional conditions that apply to your particular type of use.

FOR A THESIS OR DISSERTATION

If you are using figure(s)/table(s), permission is granted for use in print and electronic versions of your dissertation or thesis. A full text article may be used in print versions only of a dissertation or thesis.

Permission covers the distribution of your dissertation or thesis on demand by ProQuest / UMI, provided the AAAS material covered by this permission remains in situ.

If you are an Original Author on the AAAS article being reproduced, please refer to your License to Publish for rules on reproducing your paper in a dissertation or thesis.

FOR JOURNALS:

Permission covers both print and electronic versions of your journal article, however the AAAS material may not be used in any manner other than within the context of your article.

FOR BOOKS/TEXTBOOKS:

If this license is to reuse figures/tables, then permission is granted for non-exclusive world rights in all languages in both print and electronic formats (electronic formats are defined below).

If this license is to reuse a text excerpt or a full text article, then permission is granted for non-exclusive world rights in English only. You have the option of securing either print or electronic rights or both, but electronic rights are not automatically granted and do garner additional fees. Permission for translations of text excerpts or full text articles into other languages must be obtained separately.

Licenses granted for use of AAAS material in electronic format books/textbooks are valid only in cases where the electronic version is equivalent to or substitutes for the print version of the book/textbook. The AAAS material reproduced as permitted herein must remain in situ and must not be exploited separately (for example, if permission covers the use of a full text article, the article may not be offered for access or for purchase as a stand-alone unit), except in the case of permitted textbook companions as noted below.

You must include the following notice in any electronic versions, either adjacent to the reprinted AAAS material or in the terms and conditions for use of your electronic products: "Readers may view, browse, and/or download material for temporary copying purposes only, provided these uses are for noncommercial personal purposes. Except as provided by law, this material may not be further reproduced, distributed, transmitted, modified, adapted, performed, displayed, published, or sold in whole or in part, without prior written permission from the publisher."

If your book is an academic textbook, permission covers the following companions to your textbook, provided such companions are distributed only in conjunction with your textbook at no additional cost to the user:

- Password-protected website
- Instructor's image CD/DVD and/or PowerPoint resource
- Student CD/DVD

All companions must contain instructions to users that the AAAS material may be used for non-commercial, classroom purposes only. Any other uses require the prior written permission from AAAS.

If your license is for the use of AAAS Figures/Tables, then the electronic rights granted herein permit use of the Licensed Material in any Custom Databases that you distribute the electronic versions of your textbook through, so long as the Licensed Material remains within the context of a chapter of the title identified in your request and cannot be downloaded by a user as an independent image file.

Rights also extend to copies/files of your Work (as described above) that you are required to provide for use by the visually and/or print disabled in compliance with state and federal laws.

This permission only covers a single edition of your work as identified in your request.

FOR NEWSLETTERS:

Permission covers print and/or electronic versions, provided the AAAS material reproduced as permitted herein remains in situ and is not exploited separately (for example, if permission covers the use of a full text article, the article may not be offered for access or for purchase as a stand-alone unit).

FOR ANNUAL REPORTS:

Permission covers print and electronic versions provided the AAAS material reproduced as permitted herein remains in situ and is not exploited separately (for example, if permission covers the use of a full text article, the article may not be offered for access or for purchase as a stand-alone unit).

FOR PROMOTIONAL/MARKETING USES:

Permission covers the use of AAAS material in promotional or marketing pieces such as information packets, media kits, product slide kits, brochures, or flyers limited to a single print run.
The AAAS Material may not be used in any manner which implies endorsement or promotion by the American Association for the Advancement of Science (AAAS) or Science of any product or service. AAAS does not permit the reproduction of its name, logo or text on promotional literature.

If permission to use a full text article is permitted, The Science article covered by this permission must not be altered in any way. No additional printing may be set onto an article copy other than the copyright credit line required above. Any alterations must be approved in advance and in writing by AAAS. This includes, but is not limited to, the placement of sponsorship identifiers, trademarks, logos, rubber stamping or self-adhesive stickers onto the article copies.

Additionally, article copies must be a freestanding part of any information package (i.e. media kit) into which they are inserted. They may not be physically attached to anything, such as an advertising insert, or have anything attached to them, such as a sample product. Article copies must be easily removable from any kits or informational packages in which they are used. The only exception is that article copies may be inserted into three-ring binders.

FOR CORPORATE INTERNAL USE:

The AAAS material covered by this permission may not be altered in any way. No additional printing may be set onto an article copy other than the required credit line. Any alterations must be approved in advance and in writing by AAAS. This includes, but is not limited to the placement of sponsorship identifiers, trademarks, logos, rubber stamping or self-adhesive stickers onto article copies.

If you are making article copies, copies are restricted to the number indicated in your request and must be distributed only to internal employees for internal use.

If you are using AAAS Material in Presentation Slides, the required credit line must be visible on the slide where the AAAS material will be reprinted.

If you are using AAAS Material on a CD, DVD, Flash Drive, or the World Wide Web, you must include the following notice in any electronic versions, either adjacent to the reprinted AAAS material or in the terms and conditions for use of your electronic products: "Readers may view, browse, and/or download material for temporary copying purposes only, provided these uses are for noncommercial personal purposes. Except as provided by law, this material may not be further reproduced, distributed, transmitted, modified, adapted, performed, displayed, published, or sold in whole or in part, without prior written permission from the publisher." Access to any such CD, DVD, Flash Drive or Web page must be restricted to your organization's employees only.

FOR CME COURSE and SCIENTIFIC SOCIETY MEETINGS:

Permission is restricted to the particular Course, Seminar, Conference, or Meeting indicated in your request. If this license covers a text excerpt or a Full Text Article, access to the reprinted AAAS material must be restricted to attendees of your event only (if you have been granted electronic rights for use of a full text article on your website, your website must be password protected, or access restricted so that only attendees can access the content on your site).

If you are using AAAS Material on a CD, DVD, Flash Drive, or the World Wide Web, you must include the following notice in any electronic versions, either adjacent to the reprinted AAAS material or in the terms and conditions for use of your electronic products: "Readers may view, browse, and/or download material for temporary copying purposes only, provided these uses are for noncommercial personal purposes. Except as provided by law, this material may not be further reproduced, distributed, transmitted, modified, adapted, performed, displayed, published, or sold in whole or in part, without prior written permission from the publisher."

FOR POLICY REPORTS:

These rights are granted only to non-profit organizations and/or government agencies. Permission covers print and electronic versions of a report, provided the required credit line appears in both versions and provided the AAAS material reproduced as permitted herein remains in situ and is not exploited separately.

FOR CLASSROOM PHOTOCOPIES:

Permission covers distribution in print copy format only. Article copies must be freestanding and not part of a course pack. They may not be physically attached to anything or have anything attached to them.

FOR COURSEPACKS OR COURSE WEBSITES:

These rights cover use of the AAAS material in one class at one institution. Permission is valid only for a single semester after which the AAAS material must be removed from the Electronic Course website, unless new permission is obtained for an additional semester. If the material is to be distributed online, access must be restricted to students and instructors enrolled in that particular course by some means of password or access control.

FOR WEBSITES:

You must include the following notice in any electronic versions, either adjacent to the reprinted AAAS material or in the terms and conditions for use of your electronic products: "Readers may view, browse, and/or download material for temporary copying purposes only, provided these uses are for noncommercial personal purposes. Except as provided by law, this material may not be further reproduced, distributed, transmitted, modified, adapted, performed, displayed, published, or sold in whole or in part, without prior written permission from the publisher."
Permissions for the use of Full Text articles on third party websites are granted on a case by case basis and only in cases where access to the AAAS Material is restricted by some means of password or access control. Alternately, an E­Print may be purchased through our reprints department ([email protected]). REGARDING FULL TEXT ARTICLE USE ON THE WORLD WIDE WEB IF YOU ARE AN ‘ORIGINAL AUTHOR’ OF A SCIENCE PAPER If you chose "Original Author" as the Requestor Type, you are warranting that you are one of authors listed on the License Agreement as a "Licensed content author" or that you are acting on that author's behalf to use the Licensed content in a new work that one of the authors listed on the License Agreement as a "Licensed content author" has written. Original Authors may post the ‘Accepted Version’ of their full text article on their personal or on their University website and not on any other website. The ‘Accepted Version’ is the version of the paper accepted for publication by AAAS including changes resulting from peer review but prior to AAAS’s copy editing and production (in other words not the AAAS published version). FOR MOVIES / FILM / TELEVISION: Permission is granted to use, record, film, photograph, and/or tape the AAAS material in connection with your program/film and in any medium your program/film may be shown or heard, including but not limited to broadcast and cable television, radio, print, world wide web, and videocassette. The required credit line should run in the program/film's end credits. FOR MUSEUM EXHIBITIONS: Permission is granted to use the AAAS material as part of a single exhibition for the duration of that exhibit. Permission for use of the material in promotional materials for the exhibit must be cleared separately with AAAS (please contact us at [email protected]). FOR TRANSLATIONS: Translation rights apply only to the language identified in your request summary above. 
The following disclaimer must appear with your translation, on the first page of the article, after the credit line: "This translation is not an official translation by AAAS staff, nor is it endorsed by AAAS as accurate. In crucial matters, please refer to the official English-language version originally published by AAAS." FOR USE ON A COVER: Permission is granted to use the AAAS material on the cover of a journal issue, newsletter issue, book, textbook, or annual report in print and electronic formats provided the AAAS material reproduced as permitted herein remains in situ and is not exploited separately. By using the AAAS Material identified in your request, you agree to abide by all the terms and conditions herein. Questions about these terms can be directed to the AAAS Permissions department [email protected]. Other Terms and Conditions: v 2

Questions? [email protected] or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.

OXFORD UNIVERSITY PRESS LICENSE TERMS AND CONDITIONS Jul 24, 2016

This Agreement between Edward Couchman ("You") and Oxford University Press ("Oxford University Press") consists of your license details and the terms and conditions provided by Oxford University Press and Copyright Clearance Center.

License Number: 3915561436214
License date: Jul 24, 2016
Licensed content publisher: Oxford University Press
Licensed content publication: FEMS Microbiology Reviews
Licensed content title: Exceptionally widespread nanomachines composed of type IV pilins: the prokaryotic Swiss Army knives
Licensed content author: Jamie-Lee Berry, Vladimir Pelicic
Licensed content date: 2015-01-01
Type of Use: Thesis/Dissertation
Institution name:
Title of your work: Investigating the Type IV Pili of Clostridium difficile and Clostridium sordellii
Publisher of your work: n/a
Expected publication date: Aug 2016
Permissions cost: 0.00 GBP
Value added tax: 0.00 GBP
Total: 0.00 GBP
Requestor Location: Edward Couchman, 57 Bowman Mews, London, SW18 5TN, United Kingdom. Attn: Edward Couchman
Publisher Tax ID: GB125506730
Billing Type: Invoice
Billing Address: Edward Couchman, 57 Bowman Mews, London, United Kingdom SW18 5TN. Attn: Edward Couchman
Total: 0.00 GBP

Terms and Conditions

STANDARD TERMS AND CONDITIONS FOR REPRODUCTION OF MATERIAL FROM AN OXFORD UNIVERSITY PRESS JOURNAL

1. Use of the material is restricted to the type of use specified in your order details.

2. This permission covers the use of the material in the English language in the following territory: world. If you have requested additional permission to translate this material, the terms and conditions of this reuse will be set out in clause 12.

3. This permission is limited to the particular use authorized in (1) above and does not allow you to sanction its use elsewhere in any other format other than specified above, nor does it apply to quotations, images, artistic works etc that have been reproduced from other sources which may be part of the material to be used.

4. No alteration, omission or addition is made to the material without our written consent. Permission must be re-cleared with Oxford University Press if/when you decide to reprint.

5. The following credit line appears wherever the material is used: author, title, journal, year, volume, issue number, pagination, by permission of Oxford University Press or the sponsoring society if the journal is a society journal. Where a journal is being published on behalf of a learned society, the details of that society must be included in the credit line.

6. For the reproduction of a full article from an Oxford University Press journal for whatever purpose, the corresponding author of the material concerned should be informed of the proposed use. Contact details for the corresponding authors of all Oxford University Press journal contact can be found alongside either the abstract or full text of the article concerned, accessible from www.oxfordjournals.org. Should there be a problem clearing these rights, please contact [email protected]

7. If the credit line or acknowledgement in our publication indicates that any of the figures, images or photos was reproduced, drawn or modified from an earlier source it will be necessary for you to clear this permission with the original publisher as well. If this permission has not been obtained, please note that this material cannot be included in your publication/photocopies.

8. While you may exercise the rights licensed immediately upon issuance of the license at the end of the licensing process for the transaction, provided that you have disclosed complete and accurate details of your proposed use, no license is finally effective unless and until full payment is received from you (either by Oxford University Press or by Copyright Clearance Center (CCC)) as provided in CCC's Billing and Payment terms and conditions. If full payment is not received on a timely basis, then any license preliminarily granted shall be deemed automatically revoked and shall be void as if never granted. Further, in the event that you breach any of these terms and conditions or any of CCC's Billing and Payment terms and conditions, the license is automatically revoked and shall be void as if never granted. Use of materials as described in a revoked license, as well as any use of the materials beyond the scope of an unrevoked license, may constitute copyright infringement and Oxford University Press reserves the right to take any and all action to protect its copyright in the materials.

9. This license is personal to you and may not be sublicensed, assigned or transferred by you to any other person without Oxford University Press’s written permission.

10. Oxford University Press reserves all rights not specifically granted in the combination of (i) the license details provided by you and accepted in the course of this licensing transaction, (ii) these terms and conditions and (iii) CCC’s Billing and Payment terms and conditions.

11. You hereby indemnify and agree to hold harmless Oxford University Press and CCC, and their respective officers, directors, employees and agents, from and against any and all claims arising out of your use of the licensed material other than as specifically authorized pursuant to this license.

12. Other Terms and Conditions: v1.4

Questions? [email protected] or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.

ELSEVIER LICENSE TERMS AND CONDITIONS Jul 24, 2016

This Agreement between Edward Couchman ("You") and Elsevier ("Elsevier") consists of your license details and the terms and conditions provided by Elsevier and Copyright Clearance Center.

License Number: 3915570182112
License date: Jul 24, 2016
Licensed Content Publisher: Elsevier
Licensed Content Publication: Structure
Licensed Content Title: Structural and Evolutionary Analyses Show Unique Stabilization Strategies in the Type IV Pili of Clostridium difficile
Licensed Content Author: Kurt H. Piepenbrink, Grace A. Maldarelli, Claudia F. Martinez de la Peña, Tanis C. Dingle, George L. Mulvey, Amanda Lee, Erik von Rosenvinge, Glen D. Armstrong, Michael S. Donnenberg, Eric J. Sundberg
Licensed Content Date: 3 February 2015
Licensed Content Volume Number: 23
Licensed Content Issue Number: 2
Licensed Content Pages: 12
Start Page: 385
End Page: 396
Type of Use: reuse in a thesis/dissertation
Intended publisher of new work: other
Portion: figures/tables/illustrations
Number of figures/tables/illustrations: 1
Format: both print and electronic
Are you the author of this Elsevier article? No
Will you be translating? No
Order reference number:
Original figure numbers: Figure 6
Title of your thesis/dissertation: Investigating the Type IV Pili of Clostridium difficile and Clostridium sordellii
Expected completion date: Aug 2016
Estimated size (number of pages): 270
Elsevier VAT number: GB 494 6272 12
Requestor Location: Edward Couchman, 57 Bowman Mews, London, SW18 5TN, United Kingdom. Attn: Edward Couchman
Total: 0.00 GBP

Terms and Conditions

INTRODUCTION 1. The publisher for this copyrighted material is Elsevier. By clicking "accept" in connection with completing this licensing transaction, you agree that the following terms and conditions apply to this transaction (along with the Billing and Payment terms and conditions established by Copyright Clearance Center, Inc. ("CCC"), at the time that you opened your Rightslink account and that are available at any time at http://myaccount.copyright.com). GENERAL TERMS 2. Elsevier hereby grants you permission to reproduce the aforementioned material subject to the terms and conditions indicated. 3. Acknowledgement: If any part of the material to be used (for example, figures) has appeared in our publication with credit or acknowledgement to another source, permission must also be sought from that source. If such permission is not obtained then that material may not be included in your publication/copies. Suitable acknowledgement to the source must be made, either as a footnote or in a reference list at the end of your publication, as follows: "Reprinted from Publication title, Vol /edition number, Author(s), Title of article / title of chapter, Pages No., Copyright (Year), with permission from Elsevier [OR APPLICABLE SOCIETY COPYRIGHT OWNER]." Also Lancet special credit - "Reprinted from The Lancet, Vol. number, Author(s), Title of article, Pages No., Copyright (Year), with permission from Elsevier." 4. Reproduction of this material is confined to the purpose and/or media for which permission is hereby given. 5. Altering/Modifying Material: Not Permitted. However figures and illustrations may be altered/adapted minimally to serve your work. Any other abbreviations, additions, deletions and/or any other alterations shall be made only with prior written authorization of Elsevier Ltd.
(Please contact Elsevier at [email protected]) 6. If the permission fee for the requested use of our material is waived in this instance, please be advised that your future requests for Elsevier materials may attract a fee. 7. Reservation of Rights: Publisher reserves all rights not specifically granted in the combination of (i) the license details provided by you and accepted in the course of this licensing transaction, (ii) these terms and conditions and (iii) CCC's Billing and Payment terms and conditions. 8. License Contingent Upon Payment: While you may exercise the rights licensed immediately upon issuance of the license at the end of the licensing process for the transaction, provided that you have disclosed complete and accurate details of your proposed use, no license is finally effective unless and until full payment is received from you (either by publisher or by CCC) as provided in CCC's Billing and Payment terms and conditions. If full payment is not received on a timely basis, then any license preliminarily granted shall be deemed automatically revoked and shall be void as if never granted. Further, in the event that you breach any of these terms and conditions or any of CCC's Billing and Payment terms and conditions, the license is automatically revoked and shall be void as if never granted. Use of materials as described in a revoked license, as well as any use of the materials beyond the scope of an unrevoked license, may constitute copyright infringement and publisher reserves the right to take any and all action to protect its copyright in the materials. 9. Warranties: Publisher makes no representations or warranties with respect to the licensed material. 10. Indemnity: You hereby indemnify and agree to hold harmless publisher and CCC, and their respective officers, directors, employees and agents, from and against any and all claims arising out of your use of the licensed material other than as specifically authorized pursuant to this license. 
11. No Transfer of License: This license is personal to you and may not be sublicensed, assigned, or transferred by you to any other person without publisher's written permission. 12. No Amendment Except in Writing: This license may not be amended except in a writing signed by both parties (or, in the case of publisher, by CCC on publisher's behalf). 13. Objection to Contrary Terms: Publisher hereby objects to any terms contained in any purchase order, acknowledgment, check endorsement or other writing prepared by you, which terms are inconsistent with these terms and conditions or CCC's Billing and Payment terms and conditions. These terms and conditions, together with CCC's Billing and Payment terms and conditions (which are incorporated herein), comprise the entire agreement between you and publisher (and CCC) concerning this licensing transaction. In the event of any conflict between your obligations established by these terms and conditions and those established by CCC's Billing and Payment terms and conditions, these terms and conditions shall control. 14. Revocation: Elsevier or Copyright Clearance Center may deny the permissions described in this License at their sole discretion, for any reason or no reason, with a full refund payable to you. Notice of such denial will be made using the contact information provided by you. Failure to receive such notice will not alter or invalidate the denial. In no event will Elsevier or Copyright Clearance Center be responsible or liable for any costs, expenses or damage incurred by you as a result of a denial of your permission request, other than a refund of the amount(s) paid by you to Elsevier and/or Copyright Clearance Center for denied permissions. LIMITED LICENSE The following terms and conditions apply only to specific license types: 15.
Translation: This permission is granted for non­exclusive world English rights only unless your license was granted for translation rights. If you licensed translation rights you may only translate this content into the languages you requested. A professional translator must perform all translations and reproduce the content word for word preserving the integrity of the article. 16. Posting licensed content on any Website: The following terms and conditions apply as follows: Licensing material from an Elsevier journal: All content posted to the web site must maintain the copyright information line on the bottom of each image; A hyper­text must be included to the Homepage of the journal from which you are licensing at http://www.sciencedirect.com/science/journal/xxxxx or the Elsevier homepage for books at http://www.elsevier.com; Central Storage: This license does not include permission for a scanned version of the material to be stored in a central repository such as that provided by Heron/XanEdu. Licensing material from an Elsevier book: A hyper­text link must be included to the Elsevier homepage at http://www.elsevier.com . All content posted to the web site must maintain the copyright information line on the bottom of each image.

Posting licensed content on Electronic reserve: In addition to the above the following clauses are applicable: The web site must be password-protected and made available only to bona fide students registered on a relevant course. This permission is granted for 1 year only. You may obtain a new license for future website posting. 17. For journal authors: the following clauses are applicable in addition to the above: Preprints: A preprint is an author's own write-up of research results and analysis, it has not been peer-reviewed, nor has it had any other value added to it by a publisher (such as formatting, copyright, technical enhancement etc.). Authors can share their preprints anywhere at any time. Preprints should not be added to or enhanced in any way in order to appear more like, or to substitute for, the final versions of articles however authors can update their preprints on arXiv or RePEc with their Accepted Author Manuscript (see below). If accepted for publication, we encourage authors to link from the preprint to their formal publication via its DOI. Millions of researchers have access to the formal publications on ScienceDirect, and so links will help users to find, access, cite and use the best available version. Please note that Cell Press, The Lancet and some society-owned have different preprint policies. Information on these policies is available on the journal homepage. Accepted Author Manuscripts: An accepted author manuscript is the manuscript of an article that has been accepted for publication and which typically includes author-incorporated changes suggested during submission, peer review and editor-author communications. Authors can share their accepted author manuscript:

• immediately
  - via their non-commercial person homepage or blog
  - by updating a preprint in arXiv or RePEc with the accepted manuscript
  - via their research institute or institutional repository for internal institutional uses or as part of an invitation-only research collaboration work-group
  - directly by providing copies to their students or to research collaborators for their personal use
  - for private scholarly sharing as part of an invitation-only work group on commercial sites with which Elsevier has an agreement
• after the embargo period
  - via non-commercial hosting platforms such as their institutional repository
  - via commercial sites with which Elsevier has an agreement

In all cases accepted manuscripts should:

• link to the formal publication via its DOI
• bear a CC-BY-NC-ND license - this is easy to do
• if aggregated with other manuscripts, for example in a repository or other site, be shared in alignment with our hosting policy
• not be added to or enhanced in any way to appear more like, or to substitute for, the published journal article.

Published journal article (PJA): A published journal article (PJA) is the definitive final record of published research that appears or will appear in the journal and embodies all value-adding publishing activities including peer review co-ordination, copy-editing, formatting, (if relevant) pagination and online enrichment. Policies for sharing publishing journal articles differ for subscription and gold open access articles: Subscription Articles: If you are an author, please share a link to your article rather than the full-text. Millions of researchers have access to the formal publications on ScienceDirect, and so links will help your users to find, access, cite, and use the best available version. Theses and dissertations which contain embedded PJAs as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect. If you are affiliated with a library that subscribes to ScienceDirect you have additional private sharing rights for others' research accessed under that agreement. This includes use for classroom teaching and internal training at the institution (including use in course packs and courseware programs), and inclusion of the article for grant funding purposes. Gold Open Access Articles: May be shared according to the author-selected end-user license and should contain a CrossMark logo, the end user license, and a DOI link to the formal publication on ScienceDirect. Please refer to Elsevier's posting policy for further information. 18. For book authors the following clauses are applicable in addition to the above: Authors are permitted to place a brief summary of their work online only. You are not allowed to download and post the published electronic version of your chapter, nor may you scan the printed edition to create an electronic version. Posting to a repository: Authors are permitted to post a summary of their chapter only in their institution's repository.

19. Thesis/Dissertation: If your license is for use in a thesis/dissertation your thesis may be submitted to your institution in either print or electronic form. Should your thesis be published commercially, please reapply for permission. These requirements include permission for the Library and Archives of Canada to supply single copies, on demand, of the complete thesis and include permission for Proquest/UMI to supply single copies, on demand, of the complete thesis. Should your thesis be published commercially, please reapply for permission. Theses and dissertations which contain embedded PJAs as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect.

Elsevier Open Access Terms and Conditions You can publish open access with Elsevier in hundreds of open access journals or in nearly 2000 established subscription journals that support open access publishing. Permitted third party re­use of these open access articles is defined by the author's choice of Creative Commons user license. See our open access license policy for more information. Terms & Conditions applicable to all Open Access articles published with Elsevier: Any reuse of the article must not represent the author as endorsing the adaptation of the article nor should the article be modified in such a way as to damage the author's honour or reputation. If any changes have been made, such changes must be clearly indicated. The author(s) must be appropriately credited and we ask that you include the end user license and a DOI link to the formal publication on ScienceDirect. If any part of the material to be used (for example, figures) has appeared in our publication with credit or acknowledgement to another source it is the responsibility of the user to ensure their reuse complies with the terms and conditions determined by the rights holder. Additional Terms & Conditions applicable to each Creative Commons user license: CC BY: The CC­BY license allows users to copy, to create extracts, abstracts and new works from the Article, to alter and revise the Article and to make commercial use of the Article (including reuse and/or resale of the Article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made and the licensor is not represented as endorsing the use made of the work. The full details of the license are available at http://creativecommons.org/licenses/by/4.0. 
CC BY NC SA: The CC BY-NC-SA license allows users to copy, to create extracts, abstracts and new works from the Article, to alter and revise the Article, provided this is not done for commercial purposes, and that the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made and the licensor is not represented as endorsing the use made of the work. Further, any new works must be made available on the same conditions. The full details of the license are available at http://creativecommons.org/licenses/by-nc-sa/4.0. CC BY NC ND: The CC BY-NC-ND license allows users to copy and distribute the Article, provided this is not done for commercial purposes and further does not permit distribution of the Article if it is changed or edited in any way, and provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, and that the licensor is not represented as endorsing the use made of the work. The full details of the license are available at http://creativecommons.org/licenses/by-nc-nd/4.0. Any commercial reuse of Open Access articles published with a CC BY NC SA or CC BY NC ND license requires permission from Elsevier and will be subject to a fee. Commercial reuse includes:

• Associating advertising with the full text of the Article
• Charging fees for document delivery or access
• Article aggregation
• Systematic distribution via e-mail lists or share buttons

Posting or linking by commercial companies for use by customers of those companies.

20. Other Conditions:

v1.8

Questions? [email protected] or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.

ELSEVIER LICENSE TERMS AND CONDITIONS Jul 24, 2016

This Agreement between Edward Couchman ("You") and Elsevier ("Elsevier") consists of your license details and the terms and conditions provided by Elsevier and Copyright Clearance Center.

License Number: 3915570325454
License date: Jul 24, 2016
Licensed Content Publisher: Elsevier
Licensed Content Publication: Structure
Licensed Content Title: Crystal Structures of the Pilus Retraction Motor PilT Suggest Large Domain Movements and Subunit Cooperation Drive Motility
Licensed Content Author: Kenneth A. Satyshur, Gregory A. Worzalla, Lorraine S. Meyer, Erin K. Heiniger, Kelly G. Aukema, Ana M. Misic, Katrina T. Forest
Licensed Content Date: March 2007
Licensed Content Volume Number: 15
Licensed Content Issue Number: 3
Licensed Content Pages: 14
Start Page: 363
End Page: 376
Type of Use: reuse in a thesis/dissertation
Intended publisher of new work: other
Portion: figures/tables/illustrations
Number of figures/tables/illustrations: 1
Format: both print and electronic
Are you the author of this Elsevier article? No
Will you be translating? No
Order reference number:
Original figure numbers: Figure 5
Title of your thesis/dissertation: Investigating the Type IV Pili of Clostridium difficile and Clostridium sordellii
Expected completion date: Aug 2016
Estimated size (number of pages): 270
Elsevier VAT number: GB 494 6272 12
Requestor Location: Edward Couchman, 57 Bowman Mews, London, SW18 5TN, United Kingdom. Attn: Edward Couchman
Total: 0.00 GBP

Terms and Conditions

INTRODUCTION 1. The publisher for this copyrighted material is Elsevier. By clicking "accept" in connection with completing this licensing transaction, you agree that the following terms and conditions apply to this transaction (along with the Billing and Payment terms and conditions established by Copyright Clearance Center, Inc. ("CCC"), at the time that you opened your Rightslink account and that are available at any time at http://myaccount.copyright.com). GENERAL TERMS 2. Elsevier hereby grants you permission to reproduce the aforementioned material subject to the terms and conditions indicated. 3. Acknowledgement: If any part of the material to be used (for example, figures) has appeared in our publication with credit or acknowledgement to another source, permission must also be sought from that source. If such permission is not obtained then that material may not be included in your publication/copies. Suitable acknowledgement to the source must be made, either as a footnote or in a reference list at the end of your publication, as follows: "Reprinted from Publication title, Vol /edition number, Author(s), Title of article / title of chapter, Pages No., Copyright (Year), with permission from Elsevier [OR APPLICABLE SOCIETY COPYRIGHT OWNER]." Also Lancet special credit - "Reprinted from The Lancet, Vol. number, Author(s), Title of article, Pages No., Copyright (Year), with permission from Elsevier." 4. Reproduction of this material is confined to the purpose and/or media for which permission is hereby given. 5. Altering/Modifying Material: Not Permitted. However figures and illustrations may be altered/adapted minimally to serve your work. Any other abbreviations, additions, deletions and/or any other alterations shall be made only with prior written authorization of Elsevier Ltd.
(Please contact Elsevier at [email protected]) 6. If the permission fee for the requested use of our material is waived in this instance, please be advised that your future requests for Elsevier materials may attract a fee. 7. Reservation of Rights: Publisher reserves all rights not specifically granted in the combination of (i) the license details provided by you and accepted in the course of this licensing transaction, (ii) these terms and conditions and (iii) CCC's Billing and Payment terms and conditions. 8. License Contingent Upon Payment: While you may exercise the rights licensed immediately upon issuance of the license at the end of the licensing process for the transaction, provided that you have disclosed complete and accurate details of your proposed use, no license is finally effective unless and until full payment is received from you (either by publisher or by CCC) as provided in CCC's Billing and Payment terms and conditions. If full payment is not received on a timely basis, then any license preliminarily granted shall be deemed automatically revoked and shall be void as if never granted. Further, in the event that you breach any of these terms and conditions or any of CCC's Billing and Payment terms and conditions, the license is automatically revoked and shall be void as if never granted. Use of materials as described in a revoked license, as well as any use of the materials beyond the scope of an unrevoked license, may constitute copyright infringement and publisher reserves the right to take any and all action to protect its copyright in the materials. 9. Warranties: Publisher makes no representations or warranties with respect to the licensed material. 10. Indemnity: You hereby indemnify and agree to hold harmless publisher and CCC, and their respective officers, directors, employees and agents, from and against any and all claims arising out of your use of the licensed material other than as specifically authorized pursuant to this license. 
11. No Transfer of License: This license is personal to you and may not be sublicensed, assigned, or transferred by you to any other person without publisher's written permission.

12. No Amendment Except in Writing: This license may not be amended except in a writing signed by both parties (or, in the case of publisher, by CCC on publisher's behalf).

13. Objection to Contrary Terms: Publisher hereby objects to any terms contained in any purchase order, acknowledgment, check endorsement or other writing prepared by you, which terms are inconsistent with these terms and conditions or CCC's Billing and Payment terms and conditions. These terms and conditions, together with CCC's Billing and Payment terms and conditions (which are incorporated herein), comprise the entire agreement between you and publisher (and CCC) concerning this licensing transaction. In the event of any conflict between your obligations established by these terms and conditions and those established by CCC's Billing and Payment terms and conditions, these terms and conditions shall control.

14. Revocation: Elsevier or Copyright Clearance Center may deny the permissions described in this License at their sole discretion, for any reason or no reason, with a full refund payable to you. Notice of such denial will be made using the contact information provided by you. Failure to receive such notice will not alter or invalidate the denial. In no event will Elsevier or Copyright Clearance Center be responsible or liable for any costs, expenses or damage incurred by you as a result of a denial of your permission request, other than a refund of the amount(s) paid by you to Elsevier and/or Copyright Clearance Center for denied permissions.

LIMITED LICENSE

The following terms and conditions apply only to specific license types:

15. Translation: This permission is granted for non-exclusive world English rights only unless your license was granted for translation rights. If you licensed translation rights you may only translate this content into the languages you requested. A professional translator must perform all translations and reproduce the content word for word preserving the integrity of the article.

16. Posting licensed content on any Website: The following terms and conditions apply as follows:

Licensing material from an Elsevier journal: All content posted to the web site must maintain the copyright information line on the bottom of each image; A hyper-text link must be included to the Homepage of the journal from which you are licensing at http://www.sciencedirect.com/science/journal/xxxxx or the Elsevier homepage for books at http://www.elsevier.com; Central Storage: This license does not include permission for a scanned version of the material to be stored in a central repository such as that provided by Heron/XanEdu.

Licensing material from an Elsevier book: A hyper-text link must be included to the Elsevier homepage at http://www.elsevier.com . All content posted to the web site must maintain the copyright information line on the bottom of each image.

Posting licensed content on Electronic reserve: In addition to the above the following clauses are applicable: The web site must be password-protected and made available only to bona fide students registered on a relevant course. This permission is granted for 1 year only. You may obtain a new license for future website posting.

17. For journal authors: the following clauses are applicable in addition to the above:

Preprints: A preprint is an author's own write-up of research results and analysis, it has not been peer-reviewed, nor has it had any other value added to it by a publisher (such as formatting, copyright, technical enhancement etc.). Authors can share their preprints anywhere at any time. Preprints should not be added to or enhanced in any way in order to appear more like, or to substitute for, the final versions of articles however authors can update their preprints on arXiv or RePEc with their Accepted Author Manuscript (see below). If accepted for publication, we encourage authors to link from the preprint to their formal publication via its DOI. Millions of researchers have access to the formal publications on ScienceDirect, and so links will help users to find, access, cite and use the best available version. Please note that Cell Press, The Lancet and some society-owned journals have different preprint policies. Information on these policies is available on the journal homepage.

Accepted Author Manuscripts: An accepted author manuscript is the manuscript of an article that has been accepted for publication and which typically includes author-incorporated changes suggested during submission, peer review and editor-author communications.

Authors can share their accepted author manuscript:

- immediately
  - via their non-commercial personal homepage or blog
  - by updating a preprint in arXiv or RePEc with the accepted manuscript
  - via their research institute or institutional repository for internal institutional uses or as part of an invitation-only research collaboration work-group
  - directly by providing copies to their students or to research collaborators for their personal use
  - for private scholarly sharing as part of an invitation-only work group on commercial sites with which Elsevier has an agreement
- after the embargo period
  - via non-commercial hosting platforms such as their institutional repository
  - via commercial sites with which Elsevier has an agreement

In all cases accepted manuscripts should:

- link to the formal publication via its DOI
- bear a CC-BY-NC-ND license - this is easy to do
- if aggregated with other manuscripts, for example in a repository or other site, be shared in alignment with our hosting policy
- not be added to or enhanced in any way to appear more like, or to substitute for, the published journal article.

Published journal article (PJA): A published journal article (PJA) is the definitive final record of published research that appears or will appear in the journal and embodies all value-adding publishing activities including peer review co-ordination, copy-editing, formatting, (if relevant) pagination and online enrichment. Policies for sharing published journal articles differ for subscription and gold open access articles:

Subscription Articles: If you are an author, please share a link to your article rather than the full-text. Millions of researchers have access to the formal publications on ScienceDirect, and so links will help your users to find, access, cite, and use the best available version. Theses and dissertations which contain embedded PJAs as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect. If you are affiliated with a library that subscribes to ScienceDirect you have additional private sharing rights for others' research accessed under that agreement. This includes use for classroom teaching and internal training at the institution (including use in course packs and courseware programs), and inclusion of the article for grant funding purposes.

Gold Open Access Articles: May be shared according to the author-selected end-user license and should contain a CrossMark logo, the end user license, and a DOI link to the formal publication on ScienceDirect. Please refer to Elsevier's posting policy for further information.

18. For book authors the following clauses are applicable in addition to the above: Authors are permitted to place a brief summary of their work online only. You are not allowed to download and post the published electronic version of your chapter, nor may you scan the printed edition to create an electronic version. Posting to a repository: Authors are permitted to post a summary of their chapter only in their institution's repository.

19. Thesis/Dissertation: If your license is for use in a thesis/dissertation your thesis may be submitted to your institution in either print or electronic form. Should your thesis be published commercially, please reapply for permission. These requirements include permission for the Library and Archives of Canada to supply single copies, on demand, of the complete thesis and include permission for Proquest/UMI to supply single copies, on demand, of the complete thesis. Should your thesis be published commercially, please reapply for permission. Theses and dissertations which contain embedded PJAs as part of the formal submission can be posted publicly by the awarding institution with DOI links back to the formal publications on ScienceDirect.

Elsevier Open Access Terms and Conditions

You can publish open access with Elsevier in hundreds of open access journals or in nearly 2000 established subscription journals that support open access publishing. Permitted third party re-use of these open access articles is defined by the author's choice of Creative Commons user license. See our open access license policy for more information.

Terms & Conditions applicable to all Open Access articles published with Elsevier: Any reuse of the article must not represent the author as endorsing the adaptation of the article nor should the article be modified in such a way as to damage the author's honour or reputation. If any changes have been made, such changes must be clearly indicated. The author(s) must be appropriately credited and we ask that you include the end user license and a DOI link to the formal publication on ScienceDirect. If any part of the material to be used (for example, figures) has appeared in our publication with credit or acknowledgement to another source it is the responsibility of the user to ensure their reuse complies with the terms and conditions determined by the rights holder.

Additional Terms & Conditions applicable to each Creative Commons user license:

CC BY: The CC-BY license allows users to copy, to create extracts, abstracts and new works from the Article, to alter and revise the Article and to make commercial use of the Article (including reuse and/or resale of the Article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made and the licensor is not represented as endorsing the use made of the work. The full details of the license are available at http://creativecommons.org/licenses/by/4.0.

CC BY NC SA: The CC BY-NC-SA license allows users to copy, to create extracts, abstracts and new works from the Article, to alter and revise the Article, provided this is not done for commercial purposes, and that the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made and the licensor is not represented as endorsing the use made of the work. Further, any new works must be made available on the same conditions. The full details of the license are available at http://creativecommons.org/licenses/by-nc-sa/4.0.

CC BY NC ND: The CC BY-NC-ND license allows users to copy and distribute the Article, provided this is not done for commercial purposes and further does not permit distribution of the Article if it is changed or edited in any way, and provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, and that the licensor is not represented as endorsing the use made of the work. The full details of the license are available at http://creativecommons.org/licenses/by-nc-nd/4.0.

Any commercial reuse of Open Access articles published with a CC BY NC SA or CC BY NC ND license requires permission from Elsevier and will be subject to a fee. Commercial reuse includes:

- Associating advertising with the full text of the Article
- Charging fees for document delivery or access
- Article aggregation
- Systematic distribution via e-mail lists or share buttons
- Posting or linking by commercial companies for use by customers of those companies.

20. Other Conditions:

v1.8

Questions? [email protected] or +1-855-239-3415 (toll free in the US) or +1-978-646-2777.
