Elucidating Legionella pneumophila effector function using proteomic approaches

Ernest Cheng So

Imperial College London Institute of Chemical Biology Department of Chemistry

This thesis is presented for the degree of Doctor of Philosophy of Imperial College London and Diploma of Imperial College London 2016

Abstract

Legionella pneumophila is the causative agent of Legionnaires’ disease, a severe and potentially fatal pneumonia. This intracellular pathogen proliferates by creating a replicative niche, the Legionella-containing vacuole (LCV), inside the host and subverting host signalling pathways. Critical to L. pneumophila’s virulence strategy is its defect in organelle trafficking/intracellular multiplication (Dot/Icm) type IVB secretion system. Using the Dot/Icm system, L. pneumophila translocates over 300 effector proteins into the host cell to manipulate signalling pathways.

The novel effector LtpG localises to the nucleus upon Dot/Icm-dependent translocation. Genomic deletion of ltpG did not cause an L. pneumophila intracellular growth defect in any of the infection models tested. Although LtpG expression did not cause toxicity in mammalian cells, its filamentation induced by cAMP (Fic) domain caused cytotoxicity in yeast and exhibited auto-AMPylation activity. However, small molecule substrate binding assays suggest that a guanosine-containing metabolite is the preferred substrate.

Determination of LtpG host targets using in vitro protein-protein interaction assays did not yield satisfactory results and consequently a more physiologically relevant infection-based mass spectrometric method was developed. Using the biotin ligase BirA, tagged effectors were biotinylated in a translocation-dependent manner. Effector-host protein complexes formed during infection were subsequently isolated and their composition deciphered using quantitative mass spectrometry. The method was downscaled by over 100-fold from the proof-of-concept study and critical parameters such as number of purifications, lysis conditions and crosslinker reactivity were tested. This revealed the infection-dependent Rab GTPase binding profiles of the promiscuous Rab binding effectors SidM and LidA. Additionally, HSP90, EEF2 and NACA were identified as high confidence physiological binding partners of LtpG, suggesting a role in manipulation of host translation and autophagic pathways as part of L. pneumophila’s virulence strategy.

Declaration of Originality

I, Ernest So, declare that this thesis constitutes my own work and that any external contributions are appropriately acknowledged or referenced.

Copyright Declaration

The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or transmit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work.

Acknowledgements

“This is not science, this is bullshit…” Gadi Frankel 2016 (completely out of context)

Firstly, I would like to thank my supervisors Gadi and Ed for welcoming me into their research groups and giving me the opportunity to pursue a PhD. Your insight and guidance have been essential for not only the progression of the project but also my development as a scientist.

An insurmountable thanks to Gunnar who taught me everything I know in the lab. Thank you for your support and wisdom over the many years and always finding time to help me out. You have the patience of a saint for putting up with me for so long. A big thank you to Team Legionella (Corinna and Dani) for lending me a helping hand whenever I needed it and also providing much needed moral support.

Thank you to everyone in the Frankel and Tate groups for general scientific discussions and especially for listening to my rants. Also thanks to everyone on CMBI1 for putting up with me for so many years and potentially a few more yet to come… In particular, I would like to thank Goska and Julia for all their help and discussions with the MS work and especially for looking after the rather temperamental Q-Exactive. I would also like to thank Jyoti Choudhary for invaluable help and expertise with mass spectrometry. Thanks to David Charles and Andreas Förster for all their help with the crystallography work and Eleni for her constant help with protein purifications. Also, thanks to Dan Brown for helping me with the radioactivity work.

Thanks to the ICB and EPSRC for funding my PhD and in particular for allowing me to be in the same cohort as Jenny and Ben. Our frequent lunches kept me vaguely sane throughout the years.

Finally, I would like to thank my wife-to-be Joanna. Thank you for your love and support throughout my PhD and in particular during the thesis writing period. I know it wasn’t easy watching me write in the most chaotic way possible and so I would like to dedicate this thesis to you.

Abbreviations

AA amino acid
ABC ATP-binding cassette
ACES N-(2-Acetamido)-2-aminoethanesulfonic acid
AD GAL4 activation domain
ADP adenosine diphosphate
AMBIC ammonium bicarbonate
Amp ampicillin
AMP adenosine monophosphate
ARF1 ADP-ribosylation factor 1
ATP adenosine triphosphate
AYE ACES buffered yeast extract
AzTB Azide-TAMRA-biotin capture reagent
BCA bicinchoninic acid
BD GAL4 DNA binding domain
BSA bovine serum albumin
cAMP cyclic adenosine monophosphate
CDK7 cyclin dependent kinase 7
CDP cytidine diphosphate
CFU colony forming unit
CHAPS 3-((3-cholamidopropyl) dimethylammonio)-1-propanesulfonate
Cm chloramphenicol
CMP cytidine monophosphate
COPI coat protein complex I
CTP cytidine triphosphate
CYE buffered charcoal yeast extract
DAPI 4',6-diamidino-2-phenylindole
DDO double dropout SD (-Leu, -Trp)
DMEM Dulbecco's Modified Eagle Medium
DNA deoxyribonucleic acid
dNTP deoxynucleoside triphosphate
Dot/Icm defect in organelle trafficking/intracellular multiplication
DSF differential scanning fluorimetry
DSP dithiobis(succinimidyl propionate)
DTME dithiobismaleimidoethane
DTT dithiothreitol
EDTA ethylenediaminetetraacetic acid
EEF2 elongation factor 2
EM electron microscopy
EPF exponential phase form
ER endoplasmic reticulum
FACS fluorescence activated cell sorting
FCS fetal calf serum
FDR false discovery rate
FF filamentous form
Fic filamentation induced by cAMP
FL full length
GAP GTPase activating protein
GDI guanine nucleotide dissociation inhibitor
GDP guanosine diphosphate
GEF guanine nucleotide exchange factor
GFP green fluorescent protein
GMP guanosine monophosphate
GnCl guanidinium chloride
GST glutathione S-transferase
GTP guanosine triphosphate
h hour
HEPES 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
HRP horseradish peroxidase
HSP90 heat shock protein 90
HYPE Huntingtin-interacting protein E
ID identification
IDA iminodiacetic acid
IMPA1 inositol(myo)-1(or 4)-monophosphatase 1
IPTG isopropyl β-D-1-thiogalactopyranoside
iTRAQ isobaric tagging for relative and absolute quantification
Kn kanamycin
LB Luria-Bertani
LC-MS/MS liquid chromatography tandem mass spectrometry
LCV Legionella-containing vacuole
LPS lipopolysaccharide
Lsp Legionella secretion pathway
MIF mature infectious form
min minute
MOI multiplicity of infection
MS mass spectrometry
MVB multivesicular body
NAC nascent polypeptide-associated complex
NAD nicotinamide adenine dinucleotide
NAPPA nucleic acid programmable protein array
NDP nucleoside diphosphate
NMP nucleoside monophosphate
NTA nitrilotriacetic acid
NTP nucleoside triphosphate
OD600 optical density at 600 nm
PBS phosphate buffered saline
PBST PBS-Tween buffer
PCR polymerase chain reaction
PEG polyethylene glycol
PFA paraformaldehyde
PI phosphatidylinositol
PI(3)P phosphatidylinositol 3-phosphate
PI(3,4)P2 phosphatidylinositol 3,4-bisphosphate
PI(3,4,5)P3 phosphatidylinositol 3,4,5-trisphosphate
PI(3,5)P2 phosphatidylinositol 3,5-bisphosphate
PI(4)P phosphatidylinositol 4-phosphate
PI(4,5)P2 phosphatidylinositol 4,5-bisphosphate
PI4K phosphatidylinositol 4-kinase
PIP phosphatidylinositol phosphate
PLA1 patatin-like phospholipase A1
PMA phorbol 12-myristate 13-acetate
PPi pyrophosphate
PVDF polyvinylidene fluoride
QDO quadruple dropout SD (-Leu, -Trp, -Ade, -His)
RNA ribonucleic acid
RNAi RNA interference
RPF replicative phase form
RPMI Roswell Park Memorial Institute medium
RT room temperature
SAP single Neutravidin affinity purification
SCF SKP1-Cullin-F-box
SD synthetic defined media
SDB-XC poly(styrenedivinylbenzene) copolymer
SDS sodium dodecyl sulphate
SDS-PAGE SDS polyacrylamide gel electrophoresis
Sec general secretory
SILAC stable isotope labelling by amino acids in cell culture
SMCC 4-(N-maleimidomethyl)cyclohexane-1-carboxylate
SNARE soluble N-ethylmaleimide-sensitive factor attachment protein receptor
SNX sorting nexin
SOC super optimal broth with catabolite repression
SPF stationary phase form
SRP signal recognition particle
T1SS type I secretion system
T2SS type II secretion system
T3SS type III secretion system
T4SS type IV secretion system
T5SS type V secretion system
T6SS type VI secretion system
TAE Tris-acetate-EDTA
TAMRA 5,6-carboxytetramethylrhodamine
TAP tandem-affinity purification
Tat twin arginine translocation
TBTA Tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine
TCEP tris(2-carboxyethyl)phosphine
TEM β-lactamase
TGS Tris-glycine-SDS
Tm melting temperature
TMT tandem mass tag
TRITC tetramethylrhodamine isothiocyanate
UAT urinary antigen test
UBE2T ubiquitin-conjugating enzyme E2 T
UMP uridine monophosphate
UTP uridine triphosphate
UV ultraviolet
VAMP4 vesicle-associated membrane protein 4
VBNC viable but not culturable
WB Western blot
WT wild type
XIC extracted ion current
X-α-Gal 5-bromo-4-chloro-3-indolyl-α-D-galactopyranoside
Y2H yeast-2-hybrid
Yn-6-TP clickable ATP analogue
YPDA yeast extract peptone dextrose adenine

Table of Contents

Abstract ...... 1
Declaration of Originality ...... 2
Copyright Declaration ...... 2
Acknowledgements ...... 3
Abbreviations ...... 4
List of Figures ...... 12
List of Tables ...... 15
Chapter 1: Introduction ...... 17
1.1. History and epidemiology of Legionella pneumophila ...... 17
1.2. Legionella spp. diagnostics ...... 19
1.3. The life cycle of L. pneumophila ...... 20
1.4. The intracellular lifestyle of L. pneumophila ...... 22
1.5. Protein secretion by Gram-negative bacteria ...... 24
1.5.1. Type I secretion system ...... 24
1.5.2. Type III secretion system ...... 24
1.5.3. Type VI secretion system ...... 25
1.5.4. Sec and Tat pathways ...... 25
1.5.5. Type V secretion system ...... 26
1.5.6. Type II secretion system ...... 27
1.5.7. Type IV secretion system ...... 29
1.6. Dot/Icm T4SS effectors ...... 32
1.6.1. Avoiding phagolysosomal maturation ...... 33
1.6.2. Lipid manipulation ...... 34
1.6.3. Manipulating lipid compositions of early endosomes ...... 36
1.6.4. Preventing acidification of the LCV ...... 37
1.6.5. Inhibiting retrograde trafficking ...... 37
1.6.6. Avoiding detection from host-defence mechanisms ...... 38
1.6.7. Rab GTPases mature the LCV into a unique replicative organelle ...... 39
1.6.8. Acquiring nutrients ...... 47
1.6.9. Manipulating host transcription and translation ...... 50
1.7. Proteomics ...... 53
1.7.1. Top-down proteomics ...... 53
1.7.2. Bottom-up proteomics ...... 53
1.7.3. Quantitative proteomics ...... 54
1.7.4. From raw data to peptide identifications ...... 56
1.8. Fic domain proteins ...... 58
1.8.1. Fic domain architecture ...... 60
1.8.2. Enzymatic activity of Fic domains ...... 61
1.8.3. Substrate specificity of Fic domains ...... 63
1.8.4. Protein substrate recognition by Fic domains ...... 65
1.8.5. Functional consequences of Fic activity on target proteins ...... 66
1.8.6. Regulation of Fic domain enzymatic activity ...... 67
1.9. Project aims ...... 69
Chapter 2: Materials and methods ...... 70
2.1 Strains and cells ...... 70
2.1.1 Strains and growth conditions ...... 70
2.1.2 Preparation of competent bacteria ...... 71
2.1.3 Transformation of competent cells ...... 72
2.1.4 Preparation and transformation of chemically competent S. cerevisiae ...... 72
2.2 Molecular biology techniques ...... 73
2.2.1 DNA manipulations ...... 73
2.2.2 Table of plasmids and primers ...... 73
2.2.3 Plasmid DNA purification ...... 77
2.2.4 Agarose gel electrophoresis ...... 77
2.2.5 Restriction digestion of PCR products ...... 77
2.2.6 Ligation of digested DNA products ...... 77
2.3 Cell culture-based techniques ...... 78
2.3.1 Cell culture ...... 78
2.3.2 Generation of stably transduced cell lines by viral transduction ...... 78
2.3.3 Legionella infection for immunofluorescence ...... 79
2.3.4 Transfection for immunofluorescence ...... 79
2.3.5 L. pneumophila growth curves in amoebae ...... 79
2.3.6 Legionella infection for mass spectrometry (Bio/BirA) ...... 79
2.3.7 Immunofluorescence (IF) preparation ...... 80
2.3.8 Microscopy ...... 80
2.3.9 Co-immunoprecipitation from infected cells ...... 82
2.4 Biochemical techniques ...... 83
2.4.1 Protein purification ...... 83
2.4.2 SDS polyacrylamide gel electrophoresis (SDS-PAGE) ...... 84
2.4.3 Coomassie staining ...... 84
2.4.4 Silver staining ...... 84
2.4.5 Western Blot (WB) ...... 84
2.4.6 In vitro AMPylation assays ...... 84
2.4.7 Differential scanning fluorimetry (DSF) ...... 85
2.4.8 Radioactive NMPylation ...... 86
2.4.9 Setting up crystal trays ...... 86
2.4.10 X-ray diffraction ...... 87
2.4.11 Thrombin cleavage ...... 87
2.5 Yeast techniques ...... 87
2.5.1 Yeast cytotoxicity screen ...... 87
2.5.2 Yeast-2-Hybrid screen ...... 87
2.5.3 Direct Yeast-2-Hybrid ...... 88
2.6 Mass spectrometry-based techniques ...... 88
2.6.1 In vitro pulldown ...... 88
2.6.2 Competition in vitro pulldown ...... 88
2.6.3 On-bead reduction and alkylation ...... 89
2.6.4 Purification of effectors from infected cells for MS (Bio/BirA) ...... 89
2.6.5 StageTipping and dimethyl labelling ...... 92
2.6.6 MS sample preparation ...... 92
2.6.7 Mass spectrometry ...... 93
2.6.8 MS data processing – MaxQuant ...... 93
2.6.9 MS data processing – Perseus (in vitro pulldown) ...... 94
2.6.10 MS data processing – Perseus (competition pulldown) ...... 94
2.6.11 MS data processing – Perseus (Bio/BirA experiments) ...... 94
Chapter 3: Results: The novel Legionella pneumophila Dot/Icm Fic domain effector LtpG ...... 96
3.1. Introduction ...... 96
3.2. Contribution of LtpG to virulence and intracellular growth ...... 99
3.3. Localisation of LtpG in infection ...... 100
3.4. Localisation of ectopically-expressed LtpG ...... 102
3.5. LtpG causes Fic-dependent cytotoxicity in yeast ...... 102
3.6. Purification of recombinant LtpG ...... 103
3.7. Recombinant LtpG exhibits auto-AMPylation activity in vitro ...... 105
3.8. LtpG preferentially binds diphosphate-containing metabolites ...... 109
3.9. LtpG does not have alternative NMPylation or kinase activity ...... 115
3.10. LtpG crystallisation trials ...... 116
3.11. Identifying effector-host protein interactions ...... 128
3.12. Yeast-2-hybrid screen ...... 128
3.13. In vitro pulldowns ...... 137
3.13.1. In-gel digest proteomics ...... 138
3.13.2. On-bead digest proteomics ...... 141
3.13.3. Competition pulldown and quantitative proteomics by dimethyl labelling ...... 146
3.14. Discussion ...... 151
Chapter 4: Results – Developing mass spectrometric methods to determine effector binding proteins ...... 154
4.1. Introduction ...... 154
4.2. TAP aids interactor identification in addition to reducing background binders compared to SAP ...... 157
4.3. Denaturing conditions are detrimental to the SidM-Rab1 interaction ...... 164
4.4. Moderate crosslinking increases the detectable SidM interactome ...... 167
4.5. The SidM interactome is dependent on both crosslinker length and reactivity ...... 171
4.6. The effector LidA binds a specific subset of Rab GTPases during infection ...... 176
4.7. Co-immunoprecipitations confirm Rab10 as a genuine interactor of both SidM and LidA ...... 178
4.8. SidM and LidA are not ubiquitinated during infection ...... 179
4.9. The MavP interactome ...... 180
4.10. Direct Y2H to confirm SidM interactions ...... 180
4.11. The LtpG interactome ...... 181
4.12. Discussion ...... 185
Chapter 5: General discussion ...... 192
Chapter 6: References ...... 199

List of Figures

Figure 1.1. L. pneumophila has evolved to proliferate inside phagocytes...... 23

Figure 1.2. Schematic model of the Dot/Icm type IVB secretion system...... 30

Figure 1.3. L. pneumophila avoids host defence mechanisms to survive inside macrophages...... 33

Figure 1.4. Manipulation of Rab GTPases by L. pneumophila effectors...... 41

Figure 1.5. Nutrient acquisition by L. pneumophila and exploitation of host ubiquitination machinery...... 48

Figure 1.6. L. pneumophila manipulates both host transcription and translation...... 50

Figure 1.7. Taxonomic distribution of Fic domains found amongst all domains of life...... 59

Figure 1.8. Comparison of Fic domain architectures...... 60

Figure 1.9. Schematic of the conserved Fic domain enzymatic activity...... 61

Figure 1.10. Crystal structure of IbpA in complex with AMPylated Cdc42 reveals its small molecule substrate specificity...... 63

Figure 1.11. Thr166 of RIN4 is similarly located as AMPylated Tyr32 of Cdc42 relative to AvrB and IbpA respectively...... 65

Figure 3.1. Homology structural model of LtpG by Phyre2...... 96

Figure 3.2. Primary sequence alignment of LtpG with BT_2513 by Clustal Omega...... 98

Figure 3.3. Deletion of ltpG does not affect L. pneumophila intracellular replication in amoebae...... 99

Figure 3.4. LtpG localises to the nucleus 24 h post-infection...... 101

Figure 3.5. Ectopically expressed LtpG localises to the nucleus...... 102

Figure 3.6. Ectopic expression of LtpG in yeast causes Fic-dependent cytotoxicity...... 103

Figure 3.7. Recombinant His-LtpG expression tests...... 104

Figure 3.8. Batch purification of recombinant His-LtpG...... 105

Figure 3.9. Chemical structures of click reagents for in vitro AMPylation...... 105

Figure 3.10. In vitro AMPylation schematic...... 106

Figure 3.11. LtpG auto-AMPylates in a Fic-dependent manner...... 107

Figure 3.12. No additional AMPylation targets can be seen when LtpG is incubated with THP-1 cell lysate whilst VopS AMPylates Rho GTPases...... 108

Figure 3.13. Schematic of the differential scanning fluorimetry (DSF) assay...... 109

Figure 3.14. GDP causes the largest shift in LtpG melting temperature...... 110

Figure 3.15. Comparison of nucleotide binding preferences of LtpG and VopS...... 112

Figure 3.16. Mutation of the catalytic histidine does not alter the nucleotide binding preferences of LtpG...... 113

Figure 3.17. MgCl2 is critical for LtpG stability...... 114

Figure 3.18. LtpG does not appear to have GMPylation or kinase activity...... 115

Figure 3.19. Schematic of sitting drop and hanging drop vapour diffusion methodology for protein crystallisation...... 116

Figure 3.20. Purification of recombinant LtpG for crystallography...... 117

Figure 3.21. LtpG crystals formed in 100 mM BisTris, 3 M NaCl...... 118

Figure 3.22. LtpG crystals grown in 100 mM BisTris, 2.6 M NaCl, pH 6.1...... 119

Figure 3.23. X-ray diffraction patterns of His-LtpG crystals...... 121

Figure 3.24. LtpG does not appear to bind to sugars in the DSF assay...... 124

Figure 3.25. The His-tag cannot be efficiently cleaved from recombinant His-LtpG...... 126

Figure 3.26. Comparison of linker lengths of N-terminally His-tagged LtpG constructs...... 126

Figure 3.27. Crystal forms of His-LtpG SL...... 127

Figure 3.28. Schematic of the yeast-2-hybrid (Y2H) assay...... 129

Figure 3.29. Yeast expressing BD-fused LtpG exhibits a growth defect in a Fic-dependent manner...... 130

Figure 3.30. LtpG interacts with the CDK7 fragment from the Y2H screen...... 132

Figure 3.31. LtpG does not interact with full length CDK7 or UBE2T...... 133

Figure 3.32. The non-coding regions of the Y2H CDK7 hit do not affect its ability to bind LtpG...... 135

Figure 3.33. LtpG is unable to interact with full length CDK7 regardless of activation state...... 136

Figure 3.34. Novel protein bands appear in LtpG pulldowns...... 138

Figure 3.35. Putative LtpG interactors cluster into three main groups...... 144

Figure 3.36. LtpG does not interact with any of the COPI subunits in a Y2H assay...... 145

Figure 3.37. Schematic of the competition pulldown experiment...... 146

Figure 3.38. Dose response curves of normalised log2 L/H (competed/non-competed) ratios of all quantifiable proteins...... 149

Figure 4.1. Schematic of the BirA/Bio-tag translocation dependent biotinylation methodology...... 155

Figure 4.2. His6-Bio-SidM is specifically biotinylated in A549-BirA cells and both tags can be used for enrichment...... 158

Figure 4.3. His6-Bio-SidM is specifically biotinylated in THP-1-BirA cells and both tags can be used for enrichment...... 160

Figure 4.4. TAP enhances the detectable SidM interactome...... 161

Figure 4.5. Rab1A is found in SidM effector complexes during infection of THP-1 cells...... 162

Figure 4.6. Guanidinium chloride aids solubilisation of crosslinked effector complexes...... 165

Figure 4.7. Denaturing conditions hinder detection of the SidM/Rab1 interaction...... 166

Figure 4.8. Increasing formaldehyde concentration decreases effector complex solubility...... 168

Figure 4.9. Moderate formaldehyde crosslinking enhances the detectable SidM interactome...... 169

Figure 4.10. Alternative crosslinkers do not alter the ability to purify effector complexes using TAP...... 172

Figure 4.11. The detectable SidM interactome is dependent on both crosslinker reactivity and crosslinker length...... 173

Figure 4.12. The LidA interactome reveals Rab GTPase binding preferences during infection...... 177

Figure 4.13. SidM and LidA co-immunoprecipitate with Rab10...... 178

Figure 4.14. SidM and LidA are not ubiquitinated during infection...... 179

Figure 4.15. The MavP interactome identifies multiple Legionella effectors...... 180

Figure 4.16. SidM is not amenable to Y2H analysis...... 181

Figure 4.17. The LtpG interactome during infection...... 182

Figure 4.18. Elongation factor 2 is a potential interaction partner of LtpG...... 183

Figure 4.19. MS/MS spectrum of NACA peptide...... 184

List of Tables

Table 2.1. Table of bacterial and yeast strains...... 71

Table 2.2. Table of nutrient supplements for yeast SD media...... 71

Table 2.3. Table of plasmids and primers...... 77

Table 2.4. Table of antibodies...... 81

Table 2.5. Table of commercially available crystallisation screens...... 86

Table 2.6. Table of crosslinking solutions for Bio/BirA pulldowns...... 91

Table 2.7. Table of buffers for Bio/BirA pulldowns...... 91

Table 2.8. Table of dimethyl labelling solutions...... 92

Table 3.1. Table of broad optimisation conditions for LtpG crystallisation based on BisTris and NaCl base condition...... 118

Table 3.2. Table of fine optimisation conditions for LtpG crystallisation based on BisTris and NaCl base condition...... 119

Table 3.3. Table of additives in the additive screen on the base BisTris and NaCl crystallisation condition...... 124

Table 3.4. Table of putative LtpG interactors from the Y2H screen...... 131

Table 3.5. Table of MS protein identifications for each gel slice...... 140

Table 3.6. Table showing spectral counts of the bait proteins Lem28, LtpG and the putative interactor HSP90 for each sample...... 142

Table 3.7. Table showing spectral counts of the top 25 putative LtpG interaction partners after stringent filtering...... 143

Table 3.8. Table of number of quantifiable proteins for each MS sample...... 148

Table 4.1. Table of average log2 intensities and enrichment factors of SidM, Rab1A and Rab1B across all lysis conditions...... 167

Table 4.2. Table of average log2 intensities and difference in log2 intensities with respect to SidM for SidM, Rab1A, Rab1B and ubiquitin for each formaldehyde concentration...... 170

Table 4.3. Table of the high confidence interaction partners of SidM...... 175

Table 5.1. Summary of LtpG results...... 192

Table 5.2. Summary of high confidence interaction partners (Top 10 ranked enriched proteins) of SidM, LidA, MavP and LtpG as determined by the Bio/BirA methodology...... 192

Table 5.3. Summary of parameters tested in the Bio/BirA pulldown optimisation...... 193

Chapter 1: Introduction

1.1. History and epidemiology of Legionella pneumophila

Legionella pneumophila, a Gram-negative bacterial pathogen, is the causative agent of Legionnaires’ disease, a severe and potentially fatal pneumonia. It was first discovered in 1976 during a convention for American Legion veterans in Philadelphia where it was responsible for a severe pneumonia outbreak with high mortality [Fraser et al., 1977; McDade et al., 1977]. Of the 182 patients from the convention, 34 died. Despite identification of L. pneumophila as the causative agent and substantial progress in our understanding of the molecular mechanisms underlying infection in the past 40 years, the mortality of hospitalised Legionnaires’ disease cases remains at 8-10% [Dooling et al., 2015; European Centre for Disease Prevention and Control, 2016]. L. pneumophila is now known to be endemic worldwide, causing both community- and hospital-acquired pneumonia [Phin et al., 2014].
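As a quick check on the figures quoted above, the case fatality of the 1976 outbreak can be worked out from the two counts given in the text. The sketch below is illustrative only and the function name is an assumption introduced here.

```python
# Counts quoted in the text for the 1976 Philadelphia outbreak.
deaths = 34
cases = 182


def case_fatality_rate(deaths: int, cases: int) -> float:
    """Return the case fatality rate as a percentage of all cases."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return 100.0 * deaths / cases


outbreak_cfr = case_fatality_rate(deaths, cases)
print(f"Case fatality of the 1976 outbreak: {outbreak_cfr:.1f}%")
```

This gives roughly 19%, about twice the 8-10% mortality currently reported for hospitalised cases.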

L. pneumophila is a water-borne pathogen which causes disease in humans upon inhalation of contaminated aerosolised water. Outbreaks can typically be associated with contaminated stagnant water sources such as potable water sources, air conditioning units, spas and water cooling towers. L. longbeachae infections, in contrast, have been associated with contact with potting composts. However, its environmental distribution and transmission are not well understood. Legionellosis, the generic term for diseases caused by the genus Legionella, can be classified by the means by which the disease was acquired: community acquired, domestically acquired, nosocomial or travel associated.

Legionellosis can manifest itself in two major forms of disease: Pontiac fever and Legionnaires’ disease [Fraser et al., 1977; Glick et al., 1978]. Whilst Pontiac fever presents with mild flu-like symptoms and typically manifests in otherwise healthy patients, Legionnaires’ disease causes a severe acute pneumonia with high mortality rates. Between 2005 and 2009, 99.5% of reported legionellosis cases in the United States were classified as Legionnaires’ disease whilst only 0.5% were identified as Pontiac fever [Hicks et al., 2012]. Risk factors for the manifestation of Legionnaires’ disease include old age, immunosuppression, smoking and male gender [Phin et al., 2014]. Most cases of Legionnaires’ disease are isolated and sporadic with only 8% of European cases associated with outbreaks [Beaute et al., 2013; Joseph et al., 2010].

There are 16 serogroups of L. pneumophila [Bartram, 2007]. However, ~85% of Legionella isolates obtained from hospitalised patients belong to L. pneumophila serogroup 1 [European Centre for Disease Prevention and Control, 2013; Yu et al., 2002]. Nevertheless, at least half of the 50 known species of Legionella have been shown to cause disease in humans [Lamoth et al., 2010; Muder et al., 2002]. L. longbeachae, comprising two serogroups, causes up to 50% of legionellosis cases in New Zealand and Australia, highlighting the regional differences in causative agents [Whiley et al., 2011]. Interestingly, although the majority of Legionnaires’ disease cases are reportedly caused by L. pneumophila serogroup 1 strains, their clinical dominance is not reflected in the environment [Doleans et al., 2004]. Much research has therefore been focused on determining virulence traits of serogroup 1 strains to understand their prevalence in clinical settings.

Legionella species, together with Mycoplasma pneumoniae and Chlamydophila pneumoniae, are classed as atypical nonzoonotic bacterial respiratory pathogens which are responsible for 22% of community-acquired pneumonias worldwide [Arnold et al., 2007]. Although Legionella spp. account for only 4% of community-acquired pneumonias, this is likely an underestimate [von Baum et al., 2008; Yu et al., 2008]. Legionella is not routinely tested for amongst patients admitted with mild respiratory infections and there is currently no effective diagnostic test which covers a broad range of Legionella species. Legionnaires’ disease is almost indistinguishable from other more common forms of pneumonia by radiography [Tan et al., 2000]. Furthermore, L. pneumophila is sensitive to most common antibiotics and, unlike for M. pneumoniae and Streptococcus pneumoniae, development of antibiotic resistance has rarely been reported and does not currently seem to represent a major health issue. Nevertheless, Legionnaires’ disease presents with higher severity and fatality than community-acquired pneumonias attributed to other atypical pathogens [Edelstein, 1995; Ishiguro et al., 2013; Lewis et al., 1978].

Although Legionella can cause disease in humans, it has been termed an accidental pathogen as there were no known cases of person-to-person spread of legionellosis until 2014 (although this case was only reported in February 2016) [Correia et al., 2016]. Infection of humans by Legionella has long been thought of as an evolutionary dead-end that provides no long-term advantage for the pathogen. This paradigm may begin to shift due to the first reported case of human-to-human transmission. Genetic sequencing of this strain revealed that it belonged to L. pneumophila subspecies fraseri [Borges et al., 2016].

As Legionella is a water-borne pathogen, potential outbreaks would be difficult to control if the source is not isolated and treated quickly. As such, all public spaces undergo routine Legionella surveillance. Moreover, prevention of Legionella contamination and in particular biofilm formation in plumbing systems is a key factor in engineering design.

1.2. Legionella spp. diagnostics

Current tests for Legionella include urinary antigen tests, culture and PCR [Mercante et al., 2015]. Whilst culture remains the gold standard for confirmation of Legionella detection, its turnaround time of 3-5 days for L. pneumophila and up to two weeks for non-pneumophila species means it seldom alters treatment plans for suspected Legionnaires’ disease cases [Fields, 1994; Wilkinson, 1987]. However, isolation of a specific strain from an initial patient is indispensable for control and management of a potential outbreak [Mandell et al., 2007]. In addition, culture is technically challenging and hence requires experienced personnel for it to be a robust detection method.

Due to the difficulties of culture, the urinary antigen test (UAT) has become the most common diagnostic assay for Legionella detection [Beaute et al., 2013; Hicks et al., 2012]. The UAT exists in two formats: a 96-well plate enzyme immunoassay and an immunochromatographic test. Its popularity in diagnostic labs can be attributed to its low cost, speed and ease of use. However, the main drawback of the UAT is its lack of sensitivity towards non-serogroup 1 Legionella [Helbig et al., 2001; Olsen et al., 2009]. As such, the UAT is only efficient at detecting L. pneumophila serogroup 1 strains. Furthermore, this likely skews the apparent distribution of Legionnaires’ disease-causing Legionella towards serogroup 1, which may therefore be overrepresented in current statistics given that most diagnoses are made using the UAT. Between 1996 and 2006 in Denmark, where more extensive diagnostic testing is performed, only 60% of cases were attributed to serogroup 1 strains [Jespersen et al., 2009; St-Martin et al., 2013]. This is in line with statistics in the US before the decline of culture as the primary diagnostic tool and in contrast to the reported ~85% of cases in Europe and the US [Benin et al., 2002; European Centre for Disease Prevention and Control, 2013; Yu et al., 2002]. Worryingly, the mortality rates of Legionnaires’ disease caused by non-serogroup 1 strains are higher than those caused by serogroup 1 L. pneumophila [St-Martin et al., 2013]. There is therefore a distinct need for novel rapid diagnostic tools to detect a wide range of Legionella species and serogroups.

PCR is an attractive alternative to UAT. With appropriate design of probes/primers, its sensitivity is essentially 100% due to its ability to exponentially amplify the signal. It also produces results in a short time frame, enabling direct influence over treatment plans. Furthermore, it has no bias towards particular Legionella species or serogroups. This is a large advantage over culture, where culture medium bias skews sensitivity towards specific species and serogroups. In particular, some Legionella cannot be cultured in typical Legionella culture media and can only be grown within their protozoan hosts; these have been termed Legionella-like amoebal pathogens and some have been considered to be human pathogens [Adeleke et al., 2001; La Scola et al., 2004]. Currently, PCR is the only diagnostic technique which can detect all species and serogroups of Legionella within a suitable timeframe to alter patient treatment.

However, PCR is not without flaws. Although specialist personnel and equipment are required, its main disadvantage is its inability to determine whether the DNA source originated from viable bacteria.

Hence PCR may be prone to false positives where Legionella DNA is detected but may not be the cause of the clinical symptoms.

1.3. The life cycle of L. pneumophila

L. pneumophila is a facultative intracellular pathogen as it can survive as planktonic bacteria, in biofilm communities and inside host cells. However, in the context of replication, L. pneumophila behaves more similarly to an obligate intracellular pathogen in nature. Intracellular growth accounts for the majority of L. pneumophila replication with extracellular growth only playing a minor role [Kuiper et al., 2004; Temmerman et al., 2006]. The life cycle of L. pneumophila is a complex multiphasic network which consists of at least 14 forms described so far (reviewed in [Robertson et al., 2014]). The growth of extracellularly grown L. pneumophila can be divided into two phenotypically distinct phases: the replicative phase and the infectious/transmissive phase. In the replicative phase, L. pneumophila are not infectious and are known as the exponential phase form (EPF). However, upon depletion of local nutrients as the culture reaches stationary phase, there is a switch in gene regulation promoting virulence genes.

Whilst L. pneumophila is non-motile in the replicative phase, it gains motility through expression of flagella in the transmissive phase. In this state, extracellularly grown Legionella is able to infect host cells and proliferate intracellularly and is termed the stationary phase form (SPF). EPF and SPF have been hypothesised to be the naturally occurring L. pneumophila forms found in biofilms.

L. pneumophila’s life cycle is also multiphasic in intracellular environments. However, the intracellular forms can be generalised into two major forms: the replicative phase form (RPF) and the mature infectious form (MIF). Upon infection of a host cell, L. pneumophila switches to the RPF where it begins replicating in its replicative niche. As the host cell becomes filled with bacteria, L. pneumophila differentiates into MIFs, enabling infection of new host cells. Multiple intermediate forms are present during these transitions and some forms appear to be host-cell dependent. Inside Acanthamoeba castellanii, L. pneumophila fully develops into MIFs whilst they are only partially differentiated and less infectious in human macrophage cell lines such as U937 and THP-1 [Abdelhady et al., 2013].

Furthermore, inside Tetrahymena (a ciliated protozoan), L. pneumophila does not replicate at 30°C or below but does differentiate [Berk et al., 2008; Faulkner et al., 2008]. Although SPF and MIF are both transmissive forms of L. pneumophila, they are distinct from one another [Garduno et al., 2002]. In particular, MIFs are 10-fold more infectious than their SPF counterparts and appear more resistant to stress.

Besides these major forms, the filamentous form (FF) and viable but not culturable (VBNC) L. pneumophila have been described. FFs have been linked to increased stress signals such as limiting nutrients, presence of antibiotics, high temperature and UV radiation [Charpentier et al., 2011; Piao et al., 2006; Smalley et al., 1980; Warren et al., 1979]. FFs are infectious and able to infect both lung epithelial and macrophage cells [Prashar et al., 2013; Prashar et al., 2012]. Furthermore, filamentous L. pneumophila has been found in clinical lung tissues [Rodgers et al., 1978]. Although filamentation of other bacterial pathogens such as uropathogenic and have been associated with enhanced virulence, its role in L. pneumophila pathogenesis requires more research [Allison et al., 1994; Rosen et al., 2007].

The VBNC form represents the dormant state of L. pneumophila under conditions of severe stress. VBNC cells can be derived from EPFs, SPFs and MIFs [Al-Bana et al., 2014; Ohno et al., 2003]. Whilst EPF- and SPF-derived VBNC L. pneumophila can be resuscitated in the presence of amoebae, conditions to resuscitate MIF-derived VBNC cells have not yet been determined [Ohno et al., 2003; Steinert et al., 1997]. There is some controversy regarding whether VBNC cells represent a dormant state which requires a signal to fully activate and enter the life cycle or are just remnants of dying cells [Nystrom, 2003].

1.4. The intracellular lifestyle of L. pneumophila

The evolutionary pressure on L. pneumophila stems from its primary hosts, environmental water-borne protozoa. Its interactions with protozoa have led to its ability to infect humans, who are still generally regarded as an evolutionary dead end. The similarities in cellular signalling processes of protozoa and alveolar macrophages enable Legionella to effectively employ means selected by its environmental hosts to survive within macrophages. Although phagocytes are its primary hosts and Legionella therefore does not require additional invasion mechanisms, the bacteria have been shown to further upregulate phagocytosis in human macrophage-like cells as well as facilitate uptake into non-phagocytic cells [Garduno et al., 1998; Hilbi et al., 2001]. This suggests that Legionella has evolved additional mechanisms to ensure its replication within host cells rather than solely relying on the host cell’s innate ability to engulf bacteria. Furthermore, multiple morphologically distinct phagocytic uptake mechanisms have been reported for different host cells [Cirillo et al., 1994; Horwitz, 1984; Prashar et al., 2012].

Upon uptake by macrophages, Legionella subverts the host endosomal-lysosomal pathway, which would typically lead to destruction of avirulent bacteria, and forms a replicative niche: the Legionella-containing vacuole (LCV) (Figure 1.1). Although the LCV is derived from plasma membrane material, it quickly acquires ER-derived vesicles as well as mitochondria. However, although the LCV acquires host components, its composition is distinct from any host organelle and comprises a mix of both host and bacterial proteins. As the LCV matures, L. pneumophila begins to replicate inside the vacuole until the host cell is drained of resources and filled with bacteria. This leads to lysis of the host cell, releasing the bacteria for a subsequent round of infection. Although Legionella typically escape the host cell through lysis, non-lytic escape of bacteria has also been reported [Chen et al., 2004].

Essential to Legionella’s virulence strategy and its ability to survive and replicate intracellularly is its defect in organelle trafficking/intracellular multiplication (Dot/Icm) type IVB secretion system.

Figure 1.1. L. pneumophila has evolved to proliferate inside phagocytes. Upon uptake by macrophages, L. pneumophila subverts host signalling pathways using its Dot/Icm T4SS to avoid the endosomal-lysosomal system. Instead, it forms a replicative niche, the Legionella containing vacuole (LCV) to which mitochondria and ribosomes are recruited. L. pneumophila replicates in the LCV until it escapes from the host cell to reinitiate the infection cycle. Taken from [So et al., 2015].


1.5. Protein secretion by Gram-negative bacteria

Bacteria have evolved various secretion systems to fulfil a host of different functions, from transporting proteins onto the outer membrane to translocating virulence factors into host cells to manipulate cellular functions. There are currently six (I-VI) known classes of bacterial secretion system found in Gram-negative bacteria (reviewed in [Costa et al., 2015]). These secretion systems have been broadly categorised into two groups: those which translocate material across both the inner and outer membranes and those which translocate molecules from the periplasm through just the outer membrane.

1.5.1. Type I secretion system

The type I secretion system (T1SS) is able to transport proteins across the inner and outer membranes in a single step. The system comprises three components: an ATP-binding cassette (ABC) transporter, a membrane fusion protein and an outer-membrane protein. The Legionella T1SS Lss comprises the three proteins LssB, LssD and TolC [Fuche et al., 2015; Jacobi et al., 2003]. This system has been shown to be important for invasion into host cells. However, it appears dispensable for intracellular replication as recruitment of ER material to the LCV is independent of a functional Lss. The role of Lss in Legionella virulence has only recently been studied and as such only one substrate, the toxin RtxA, has been identified so far. RtxA is a pore-forming toxin which aids entry into both protozoan and human macrophage hosts but does not affect ER recruitment to the LCV [Cirillo et al., 2001; Cirillo et al., 2002; Fuche et al., 2015]. A ΔlssB/lssD mutant shows a similar pore-forming defect phenotype to that of a ΔrtxA strain, indicating that RtxA is a substrate of the T1SS during infection.

1.5.2. Type III secretion system

The type III injectisome is a molecular syringe which is able to translocate proteins from the bacterial cytoplasm directly into the host cell cytoplasm in a single step (reviewed in [Galan et al., 2014]). Although Legionella does not encode a T3SS, it is found in many Gram-negative pathogens such as Salmonella, Shigella, enterohaemorrhagic and enteropathogenic Escherichia coli and has been shown to be essential for virulence. The T3SS consists of approximately 20 proteins which form three distinct substructures: the basal body, the needle complex and the needle tip. The basal body spans both the inner and outer membranes whilst the needle complex enables the T3SS to traverse the extracellular space. The needle tip helps form the translocon pore between the needle and the host membrane, enabling translocation of effector proteins from the bacterium into the host cell.

1.5.3. Type VI secretion system

The T6SS spans both inner and outer membranes and is utilised to translocate toxins into both eukaryotic and prokaryotic cells. It is made up of two complexes: the membrane complex and the tail complex, which has similarities with contractile bacteriophage tails [Leiman et al., 2009]. The tail complex contains a VgrG tip/spike component around which the rest of the tail complex forms [Brunet et al., 2014], creating a spike-tube complex. Upon receiving a stimulus, the tail complex contracts and ejects the spike-tube complex through the target membrane to deliver its effector proteins [Basler et al., 2012; Kudryashev et al., 2015].

1.5.4. Sec and Tat pathways

The general secretory (Sec) pathway and twin arginine translocation (Tat) protein export pathway are the main mechanisms by which bacteria secrete proteins through the inner membrane. The Sec pathway shuttles proteins through the cytoplasmic membrane in an unfolded state whilst the Tat pathway transports folded proteins.

The Sec pathway can operate co-translationally or post-translationally [Lycklama et al., 2012]. A typical Sec signal peptide is 20 amino acids in length with a tripartite structure: a positively charged N-terminus followed by a hydrophobic region and finally a polar C-terminus [Natale et al., 2008]. Some signal peptides contain cleavage sites which enable the signal peptide to be removed upon translocation by signal peptidases. However, other than conserved cleavage motifs, there is no consensus amino acid sequence for a Sec signal peptide. During the co-translational process, the nascent polypeptide chain containing a highly hydrophobic N-terminal signal peptide is bound by signal recognition particles (SRP) as it emerges from the ribosome. This slows down the translation process, allowing the SRP to bind its membrane receptor FtsY. Upon binding to FtsY, the nascent polypeptide in complex with the ribosome is transferred onto SecYEG, a transmembrane complex termed the protein conducting channel. Here, translation is resumed, which provides the energy for membrane insertion.


For proteins which contain less hydrophobic signal peptides, trigger factors bind instead of SRPs.

Unlike SRPs, binding of trigger factors does not affect translation. The chaperone activities of trigger factors are replaced by SecB after the elongation process, keeping the translated protein in an unfolded state. SecB then transports the protein to the ATPase SecA which provides the energy for the translocation process in conjunction with the proton motive force. Upon binding to SecA, the protein substrate is recruited to the SecYEG complex to initiate the translocation process. SecD and SecF are involved in the latter stages of the pathway.

The Tat pathway differs between organisms, with some requiring only two components whilst others need three [Berks, 2015; Palmer et al., 2012]. TatA and TatC are always needed, in contrast to TatB whose activity can be covered by TatA in some systems. All three components are integral membrane proteins with TatBC existing in complex. This TatBC complex recognises the twin arginine sequence (S/T-R-R-x-F-L-K) on the signal peptide. The Tat sequence is similar in structure to Sec signal peptides with a positively charged N-terminal stretch followed by a hydrophobic core and a polar C-terminal region. However, Tat sequences are typically longer than Sec sequences. Upon forming a substrate-TatBC complex, TatA is recruited and oligomerises to form a channel in the inner membrane through which the substrate protein is translocated using the proton motive force. Upon translocation, the signal peptide is typically cleaved from the substrate by a signal peptidase [Luke et al., 2009]. Interestingly, unlike other , L. pneumophila does not encode the Tat translocon as an operon [De Buck et al., 2004]. Instead, whilst tatA and tatB are in an operon, tatC is located 27 kb downstream. The Tat pathway has been shown to play a role in intracellular replication of L. pneumophila in both protozoa and human macrophages as well as being important for biofilm formation [De Buck et al., 2005].
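As a rough illustration of how the twin arginine consensus described above could be searched for in a signal peptide, the following Python sketch encodes (S/T)-R-R-x-F-L-K as a regular expression. The function name `find_tat_motif` and the example peptides are hypothetical, and the strict pattern is an assumption made for clarity: it matches the consensus exactly as written and would miss any signal peptide that deviates from it.

```python
import re

# Consensus as written in the text: (S/T)-R-R-x-F-L-K, where x is any residue.
# Assumption: a strict match is used; degenerate variants are not covered.
TAT_MOTIF = re.compile(r"[ST]RR.FLK")


def find_tat_motif(signal_peptide: str):
    """Return the 0-based start of the first strict consensus match, or None."""
    match = TAT_MOTIF.search(signal_peptide.upper())
    return match.start() if match else None


# Hypothetical peptides used only to exercise the function.
with_motif = "MNNSRRDFLKAAG"
without_motif = "MKKLLAAA"
print(find_tat_motif(with_motif), find_tat_motif(without_motif))
```

A strict pattern keeps the sketch easy to read; a position-specific scoring approach would be a more realistic choice for screening genomes.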

1.5.5. Type V secretion system

The type V secretion system is also known as the autotransporter system and requires the Sec machinery to move substrates into the periplasm initially [Costa et al., 2015]. The system is so called because a single polypeptide encodes both the substrate and the pore, with the C-terminal region capable of forming β-barrel structures, enabling the N-terminal cargo to be transported through the outer membrane. Currently, one autotransporter has been identified in L. pneumophila strain Paris [Cazalet et al., 2004].

1.5.6. Type II secretion system

Type II secretion is a two-step process which requires substrates to first be translocated into the periplasm by either the Sec or Tat pathway followed by secretion through the outer membrane by the type II machinery. Although the type II secretion system (T2SS) only transports its substrate proteins through the outer membrane, its structure spans both the inner and outer membranes. L. pneumophila encodes one T2SS termed the Legionella secretion pathway (Lsp). Discovery of the pseudopilin peptidase gene pilD in L. pneumophila strain 130b provided the first clue that it may encode a T2SS [Liles et al., 1998]. Although PilD is not a structural component of the T2SS, it is involved in the maturation of prepilins. The T2SS is made up of four parts: an outer membrane complex, a pseudopilus which spans the periplasm, an inner membrane platform and a cytoplasmic ATPase [Costa et al., 2015].

The outer membrane complex is formed by oligomerisation of the secretin LspD. The pseudopilus consists of the major pseudopilin LspG along with the four minor pseudopilins LspH, LspI, LspJ and LspK. The inner membrane platform consists of LspC, LspF, LspL and LspM. The outer membrane complex is linked to the inner membrane platform by the interaction of LspC with the periplasmic domain of LspD. Finally, LspE acts as the cytoplasmic ATPase. The T2SS is used to secrete >25 Legionella proteins and has been shown to play a role in virulence, with deletion of T2SS components resulting in lower intracellular replication of L. pneumophila in both amoebae and macrophages [De Buck et al., 2007]. Additionally, L. pneumophila lacking a functional T2SS is less fit than wild type (WT) in competition experiments in a mouse infection model [Rossier et al., 2005]. The Lsp system has also been shown to aid growth of Legionella at temperatures below 37°C [Liles et al., 1998; Soderberg et al., 2004]. Hence the T2SS appears important for L. pneumophila to survive in tap water at 4-17°C as well as in aquatic amoebae at temperatures between 22-25°C [Soderberg et al., 2008].

T2SS substrates were initially identified through detection of T2SS-dependent enzymatic activities in the bacterial culture supernatant (reviewed in [Cianciotto, 2009]) [Aragon et al., 2000; Rossier et al., 2001; Rossier et al., 2004]. The distinct activities discovered were tartrate-sensitive (Map) and tartrate-resistant acid phosphatases, phospholipase A, phospholipase C (PlcA), lysophospholipase A (PlaA), glycerophospholipid:cholesterol acyltransferase (PlaC), mono-, di- and triacylglycerol lipases (LipA and LipB), ribonuclease and metalloprotease (ProA) [Aragon et al., 2001; Aragon et al., 2002; Banerji et al., 2005; Flieger et al., 2001; Flieger et al., 2002; Hales et al., 1999; Liles et al., 1999]. Mutation of single identified genes did not abolish enzymatic activity in the supernatant completely, suggesting that there were additional secreted proteins. Further T2SS substrates were identified by comparing two-dimensional gel electrophoresis profiles of supernatants from WT bacteria and ΔT2SS mutants [DebRoy et al., 2006]. Twenty proteins were identified including the previously reported T2SS substrates ProA, PlaA and Map. SrnA was found to account for the ribonuclease activity previously observed in supernatants [Rossier et al., 2009]. ChiA was discovered as a T2SS-dependent chitinase [DebRoy et al., 2006]. LapA and LapB were identified as aminopeptidases [Rossier et al., 2008]. Three other T2SS substrates shared homology with other bacterial enzymes: an amidase (Lpg0264), cysteine protease (Lpg2622) and endoglucanase (Lpg1918) [DebRoy et al., 2006]. Two further substrates were eukaryotic-like, with LegP (Lpg2999) resembling an astacin-like zinc protease and Lpg2644 sharing similarity with collagen-like proteins [DebRoy et al., 2006]. Lpg1809, 1385, 0873, 0189 and 0956 shared no homology with any known proteins. IcmX, LvrE and VirK have been implicated in type IV secretion systems [Brand et al., 1994; Matthews et al., 2000; Ridenour et al., 2003; Segal et al., 1999]. Whilst IcmX and LvrE have been directly linked to the Legionella Dot/Icm system, VirK has been associated with the type IV secretion system of Agrobacterium tumefaciens [Kalogeraki et al., 1998; Li et al., 2005].

The number of currently identified T2SS substrates is likely an underestimate due to the limitations of 2D gel electrophoresis. Certain in silico algorithms have suggested that the actual number of T2SS substrates is around 60 [DebRoy et al., 2006]. Prediction of signal sequences of T2SS substrates revealed that most were translocated across the inner membrane via the Sec pathway. However, PlcA uses the Tat pathway [Rossier et al., 2005].


1.5.7. Type IV secretion system

The type IV secretion system (T4SS) shares similarities with DNA-conjugation systems and bacterial pili. In addition to translocation of proteins, T4SS are also able to transport DNA. Recently, the structure of the prototypical T4ASS encoded on the R388 plasmid of E. coli was elucidated by electron microscopy [Low et al., 2014]. Legionella encodes at least two T4SS: the type IVA Lvh system and the type IVB Dot/Icm. In some strains, multiple putative type IVA secretion systems have been reported [Gomez-Valero et al., 2011; Schroeder et al., 2010]. The Lvh T4ASS is thought to be dispensable for virulence. However, it has been implicated in entry into host cells and intracellular replication in the absence of the Dot/Icm T4BSS and at lower temperatures [Bandyopadhyay et al., 2007; Ridenour et al., 2003]. Furthermore, a third type of type IV secretion system, the genomic island-associated (GI)-T4SS, has also been found in genomic studies of L. pneumophila 130b [Juhas et al., 2007; Schroeder et al., 2010]. Whether these GI-T4SS play a role in pathogenesis is currently unknown.

In 1998, a number (~20) of dot/icm genes were discovered simultaneously by the groups of Ralph Isberg and Howard Shuman whilst trying to identify essential genes for Legionella pathogenesis using forward genetic approaches [Segal et al., 1998; Vogel et al., 1998]. These genes were predicted to encode a T4SS due to similarities of dot/icm genes to conjugation system components in addition to L. pneumophila’s ability to transfer IncQ plasmids in a Dot/Icm-dependent manner.

Unlike the T4ASS, the structure of the T4BSS is poorly characterised. A putative core complex spanning both the inner and outer bacterial membranes consists of five proteins: DotC, DotD, DotF, DotG and DotH/IcmK (Figure 1.2) [Vincent et al., 2006a]. This was determined by mapping the subcellular localisation of Dot/Icm components. DotC, DotD and DotH localise to the outer membrane. DotF and DotG reside in the inner membrane but can be found in both inner and outer membrane fractions, suggesting that they interact with Dot/Icm components on the outer membrane. Electron microscopy studies suggest that DotG forms the central channel which spans both the inner and outer membranes with DotF playing an important role in stabilising the core complex [Kubori et al., 2014]. The functions of most other Dot/Icm genes are yet to be determined. However, localisation studies and bioinformatics predictions have shed some light onto their roles in effector translocation. DotA is an integral inner membrane protein with unknown function [Roy et al., 1997]. However, DotA is essential for Dot/Icm-dependent translocation and a ΔdotA mutant is frequently used as a Dot/Icm null mutant. DotB, a homologue of PilT and VirB11, is an ATPase which has been implicated in Dot/Icm substrate export [Sexton et al., 2004b; Sexton et al., 2005]. DotI and DotJ form a complex on the inner membrane with DotI sharing structural homology to the T4ASS VirB8 [Kuroda et al., 2015]. DotK/IcmN contains an OmpA domain, suggesting a role in anchoring the Dot/Icm T4BSS onto the peptidoglycan [Morozova et al., 2004]. DotL is related to type IV coupling proteins and has been suggested to function as an inner membrane receptor for Dot/Icm substrates [Buscher et al., 2005]. DotU and IcmF have been shown to stabilise the Dot/Icm complex [Sexton et al., 2004a]. IcmX localises to the periplasm in a Sec-dependent manner and is essential for intracellular replication of Legionella [Matthews et al., 2000]. IcmR acts as a chaperone for IcmQ, a protein capable of forming pores in membranes through insertion of its N-terminal domain [Dumenil et al., 2001; Dumenil et al., 2004].

Figure 1.2. Schematic model of the Dot/Icm type IVB secretion system. DotC, DotD, DotF, DotG and DotH form the core complex spanning both inner (IM) and outer membranes (OM). DotU and IcmF are inner membrane proteins which stabilise the core complex. DotA is a crucial IM protein required for Dot/Icm-dependent translocation. IcmS and IcmW act as effector chaperones and DotL may be an IM effector receptor. IcmS also forms a complex with LvgA. DotI and DotJ form a complex on the IM. DotB is a cytoplasmic ATPase implicated in effector translocation. IcmR is a chaperone for the pore-forming protein IcmQ. IcmX is an essential periplasmic protein required for intracellular replication. DotK is an OM protein linked with anchoring the secretion apparatus to the peptidoglycan. The core complex is in green, OM proteins in purple, periplasmic proteins in cyan, IM proteins in yellow and cytoplasmic proteins in blue.


Two small acidic cytoplasmic proteins, IcmS and IcmW, resemble type III secretion system chaperones (Figure 1.2). These two proteins have been shown to interact with each other as well as to form complexes with substrate proteins [Cambronne et al., 2007; Ninio et al., 2005]. Deletion of IcmS/IcmW either individually or together results in a partial intracellular growth defect in human U937 cells [Coers et al., 2000]. Indeed, deletion of IcmS/IcmW severely hinders translocation of many Dot/Icm effectors [Cambronne et al., 2007; Lifshitz et al., 2013]. IcmS has also been shown to bind another small acidic cytoplasmic protein, LvgA. Whilst deletion of lvgA results in a slight defect in intracellular replication in U937 cells and A. castellanii, the defect is much more apparent in mouse macrophages, suggesting a host-specific role of LvgA in Legionella pathogenesis [Vincent et al., 2006b].

In 2002, the first protein substrate of the Dot/Icm system was discovered, the effector RalF [Nagai et al., 2002]. Currently, more than 300 effectors have been found in L. pneumophila [de Felipe et al., 2008; Huang et al., 2011; Lifshitz et al., 2013; Zhu et al., 2011]. Dot/Icm substrates are typically characterised by a disordered C-terminal secretion signal of ~20 amino acids [Amor et al., 2005; Nagai et al., 2005]. There is no consensus sequence but there are trends based on the biophysical properties of the amino acid residues. At the -3/-4 position relative to the C-terminus, there is a hydrophobic residue [Nagai et al., 2005]. There is an overrepresentation of small amino acids (G, A, S and T) at positions -2 to -8 relative to this hydrophobic residue [Kubori et al., 2008]. Within the C-terminal 30 amino acids, there is commonly (~50% of known effectors) an E-block motif (EEXXE - rich in glutamic acid residues) [Huang et al., 2011]. Using a machine learning approach, the Pupko lab was able to identify novel Legionella effectors as well as deduce a common T4SS secretion signal based on biophysical properties of amino acids [Burstein et al., 2009]. They were able to further detect and generalise the secretion signal to generate a synthetic “optimal” secretion signal which translocated proteins in a Dot/Icm-dependent manner [Lifshitz et al., 2013]. It has recently emerged that some effectors such as SidJ may encode an alternative translocation signal which is not located at the C-terminus [Jeong et al., 2015b]. Deletion of the SidJ C-terminus results in delayed translocation, suggesting that the C-terminus enables SidJ to be translocated into host cells at early infection timepoints whilst the internal sequence mediates translocation at later timepoints. The possibility of having multiple translocation signals within one protein has been hypothesised to enable Legionella to create a hierarchy of effectors, thereby allowing precise temporal control of specific effector translocation. Dot/Icm substrates are likely to be translocated in an unfolded state, as fusion of substrates with tightly folded proteins such as dihydrofolate reductase and ubiquitin reduces their translocation efficiency [Amyot et al., 2013].
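The C-terminal trends described above can be expressed as a simple sequence scan. The Python sketch below is only an illustration of those three trends and is not the machine learning approach of [Burstein et al., 2009]. The function name `cterminal_signal_features`, the choice of which residues count as hydrophobic, the literal reading of EEXXE as a pattern and the example sequence are all assumptions introduced here.

```python
import re

# Residue classes. G, A, S and T are named as "small" in the text; the
# hydrophobic set is an assumption made for this sketch only.
SMALL = set("GAST")
HYDROPHOBIC = set("VILMFWY")

# Literal reading of the EEXXE pattern (X = any residue); an assumption.
E_BLOCK = re.compile(r"EE..E")


def cterminal_signal_features(protein: str) -> dict:
    """Report simple C-terminal features of a one-letter protein sequence."""
    seq = protein.upper()
    tail = seq[-30:]  # the E-block is described within the C-terminal 30 residues

    # Look for a hydrophobic residue at position -3 or -4 from the C-terminus.
    hydrophobic_pos = None
    for offset in (3, 4):
        if len(seq) >= offset and seq[-offset] in HYDROPHOBIC:
            hydrophobic_pos = len(seq) - offset  # 0-based index
            break

    # Fraction of small residues 2 to 8 positions upstream of that residue.
    small_fraction = None
    if hydrophobic_pos is not None:
        start = max(0, hydrophobic_pos - 8)
        end = max(0, hydrophobic_pos - 1)
        window = seq[start:end]
        if window:
            small_fraction = sum(aa in SMALL for aa in window) / len(window)

    return {
        "has_e_block": bool(E_BLOCK.search(tail)),
        "hydrophobic_at_minus_3_or_4": hydrophobic_pos is not None,
        "small_fraction_upstream": small_fraction,
    }


# Hypothetical sequence used only to exercise the function.
print(cterminal_signal_features("MKKKEEAAEGASTGASLKK"))
```

Because the text stresses that there is no consensus sequence, such features are better treated as weak indicators to be combined than as a yes/no classifier.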

1.6. Dot/Icm T4SS effectors

Using its Dot/Icm T4SS, Legionella translocates over 300 effector proteins into the host cell to affect host cell processes. This is currently the largest number of effector proteins secreted by a bacterial pathogen. This large arsenal of molecular weaponry is attributed to Legionella’s broad protozoan host range. By accumulating a diverse range of tools, Legionella is able to survive and efficiently replicate inside a plethora of different hosts. However, this also leads to difficulties in identifying functions of effectors as individual effector deletions seldom lead to an intracellular growth phenotype. This has been attributed to functional redundancy of the large number of effectors whereby multiple effectors target the same pathway to ensure bacterial proliferation. In addition to functional redundancies, groups of effectors can act consecutively to fine tune the manipulation of a particular pathway.

The functional redundancy between effectors is exemplified by large deletions of the L. pneumophila genome which do not affect intracellular replication [O'Connor et al., 2011]. Up to 30% of effectors were removed without changing the ability of Legionella to replicate in bone marrow-derived mouse macrophages. However, this mutant did not replicate as efficiently in amoebae, supporting the hypothesis that effectors have host-specific activities.


1.6.1. Avoiding phagolysosomal maturation

In order to survive and subsequently replicate inside phagocytic cells, Legionella must first avoid endosomal-lysosomal maturation before it can successfully create its replicative niche (Figure 1.3).

Figure 1.3. L. pneumophila avoids host defence mechanisms to survive inside macrophages. The typical phago-lysosomal pathway in which early endosomes (EE) mature into late endosomes (LE) and finally into lysosomes (LY) is prevented by L. pneumophila in a Dot/Icm T4SS-dependent manner. Upon phagocytosis, the effector LegG1 enables L. pneumophila residing inside the Legionella-containing vacuole (LCV) to traverse the cell along microtubules. The phosphatase activities of SidP and SidF allow the phosphatidylinositol phosphate (PIP) composition of the LCV to be changed. VipA, VipD and AnkX inhibit endosomal vesicular trafficking. RidL prevents retrograde trafficking whilst SidK inhibits acidification of the LCV. RavZ and LpSpl inhibit autophagy via removal of Atg8 from the LCV and degradation of sphingosine-1-phosphate respectively. SdhA is critical for maintenance of LCV integrity to prevent inflammasome activation. Adapted from [So et al., 2015].

Several effectors such as LegC3 and LegC7/YlfA have been shown to disrupt the endosomal system in yeast [Bennett et al., 2013; Campodonico et al., 2005; de Felipe et al., 2008; Heidtman et al., 2009; O'Brien et al., 2015; Shohdy et al., 2005]. However, the roles of the majority of these effectors during infection have yet to be confirmed. The mechanisms behind Legionella’s ability to decouple its LCV from the endosomal-lysosomal pathway remain largely unknown. A potential explanation is the bacteria’s ability to move around the cell along microtubules [Lu et al., 2005]. This mobility is dependent on the effector LegG1, which aids in the indirect activation of the small GTPase Ran during infection (Figure 1.3) [Rothmeier et al., 2013]. Although LegG1 has sequence homology to the Ran GEF RCC1, recombinant LegG1 did not have direct Ran GEF activity [Rothmeier et al., 2013]. The mechanism behind its ability to activate Ran during infection remains to be determined. However, this activity results in the recruitment of the Ran GTP-binding protein RanBP1 onto the LCV and stabilisation of microtubules, in addition to promoting LCV movement and phagocyte migration [Rothmeier et al., 2013; Simon et al., 2014]. Both LegG1 and Ran appear to play important roles in Legionella pathogenesis as depletion of either results in intracellular growth defects of Legionella. However, whether this is a result of LegG1 and Ran helping LCVs avoid the endosomal-lysosomal pathway through LCV motility, or of a direct enhancement of intracellular growth through an alternative, yet-to-be-determined mechanism, remains to be established.

1.6.2. Lipid manipulation

To further disguise itself, Legionella actively changes the lipid composition of the LCV to resemble the Golgi apparatus. The lipid composition of membranes is important because it not only affects the biophysical properties of the membrane, such as curvature, but lipids are also common scaffolds to which proteins can bind. Hence by altering the LCV lipid composition, Legionella ensures selective recruitment of proteins. A key class of lipids which Legionella regulates is the phosphatidylinositol phosphates (PIPs). Studies on PIP dynamics during infection of Dictyostelium discoideum revealed that PI(3,4,5)P3 is present on early LCVs [Weber et al., 2014b]. However, this is converted into PI(3)P within 1 minute. Over the first 2 h of infection, PI(3)P is depleted from the LCV and PI(4)P takes over as the dominant PIP species. Whilst PI(3)P and its polyphosphorylated forms are usually found on endosomal vesicles, PI(4)P is predominantly found on Golgi membranes [Behnia et al., 2005]. Not only is PI(3)P a marker for early endosomes, it also serves as a docking motif for proteins which promote phagolysosomal maturation.

Two Legionella effectors, SidF and SidP, have been shown to directly manipulate the PIP composition of LCVs (Figure 1.3). SidF is a PI 3-phosphatase using both PI(3,4)P2 and PI(3,4,5)P3 as substrates to create PI(4)P [Hsu et al., 2012]. SidP also shows PI 3-phosphatase activity in vitro [Toulabi et al., 2013]. However, it utilises PI(3)P and PI(3,5)P2 as substrates and completely dephosphorylates its substrates.

Many Legionella effectors utilise PIPs as anchors for subcellular localisation. So far LtpD, RidL, SetA, LidA, LpnE and RavZ have shown PI(3)P binding preferences whilst SidC, Lem4, Lem28 and SidM specifically bind PI(4)P [Brombacher et al., 2009; Finsel et al., 2013; Harding et al., 2013a; Horenkamp et al., 2015; Hubber et al., 2014; Jank et al., 2012; Ragaz et al., 2008; Weber et al., 2009]. Legionella appears to have independently evolved PIP binding domains as they show no sequence or structural homology with known eukaryotic PIP binding motifs [Del Campo et al., 2014]. Certain Legionella effectors have developed alternative methods to ensure membrane localisation. Rather than binding lipids, some effectors hijack host lipidation machinery to become covalently lipidated [Ivanov et al., 2010].

In addition to translocating its own PI phosphatases, Legionella is able to recruit host cell proteins to further alter the LCV PIP composition. The PI 5-phosphatase OCRL1, which dephosphorylates PI(4,5)P2 to PI(4)P, is recruited to the LCV [Weber et al., 2009]. Although the effector LpnE has been shown to bind OCRL1 in vitro, deletion of LpnE neither abolishes Legionella’s ability to recruit OCRL1 to the LCV nor prevents the accumulation of PI(4)P on the LCV. Whether OCRL1 is directly recruited by a redundant effector or its recruitment is a downstream effect of the presence of ARF1 and Rab1 on the LCV remains to be determined [Hyvola et al., 2006; Lichter-Konecki et al., 2006].
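The phosphatase activities described above can be pictured as simple set operations on the phosphorylated positions of the inositol ring. The following is a minimal illustrative sketch, not a kinetic model: the function names and the set-based representation are assumptions made for illustration, and the sequential route in which SidF first converts PI(3,4,5)P3 into PI(4,5)P2 before OCRL1 generates PI(4)P is an inference from the stated enzyme specificities rather than a result reported in the cited studies.

```python
# Illustrative model only: a PIP species is represented by the set of
# phosphorylated ring positions, e.g. PI(3,4,5)P3 -> {3, 4, 5}.

def pip_name(pip):
    """Format a set of phosphorylated positions as a conventional PIP name."""
    if not pip:
        return "PI"
    positions = ",".join(str(p) for p in sorted(pip))
    count = "" if len(pip) == 1 else str(len(pip))
    return f"PI({positions})P{count}"

def sidf(pip):
    """SidF (hypothetical model): 3-phosphatase acting on PI(3,4)P2 and PI(3,4,5)P3."""
    pip = frozenset(pip)
    if pip in (frozenset({3, 4}), frozenset({3, 4, 5})):
        return pip - {3}
    return pip  # not a substrate in this sketch

def ocrl1(pip):
    """OCRL1 (hypothetical model): 5-phosphatase acting on PI(4,5)P2."""
    pip = frozenset(pip)
    if pip == frozenset({4, 5}):
        return pip - {5}
    return pip  # not a substrate in this sketch

# Direct route: SidF alone converts PI(3,4)P2 into PI(4)P.
print(pip_name(sidf({3, 4})))
# Assumed two-step route: SidF, then host OCRL1, acting on PI(3,4,5)P3.
print(pip_name(ocrl1(sidf({3, 4, 5}))))
```

In this sketch both routes end at PI(4)P, which is consistent with the accumulation of PI(4)P on the LCV described in the text.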

Not only can Legionella utilise PI phosphatases to generate PI(4)P but PI 4-kinases (PI4K) have also been implicated in decorating the LCV with PI(4)P [Brombacher et al., 2009; Hubber et al., 2014]. Two isoforms, PI4K IIIα and IIIβ, have so far been associated with LCV PI(4)P generation. However, studies have shown that the importance of each isoform is dependent on the host cell with PI4K IIIα playing a more prominent role in human- and mouse-derived cells whilst PI4K IIIβ is more important during infection of Drosophila melanogaster cells.


Although PIPs play an important role for Legionella to establish its LCV, Legionella also modulates other phospholipid species. The effector CegC1/PlcC promotes virulence through its phosphatidylcholine-specific phospholipase C activity [Aurass et al., 2013]. LpdA acts as a phospholipase D to generate phosphatidic acid from PI, PI(3)P, PI(4)P and phosphatidylglycerol in vitro [Schroeder et al., 2015]. Deletion of LpdA results in a growth defect in murine lungs. Furthermore, LpdA is post-translationally modified with a palmitoyl group at its C-terminus by the host palmitoylation machinery to enable membrane localisation. Phosphatidic acid may also be generated with the help of the effector LecE which acts as an activator of Pah1, a host cell phosphatidic acid phosphatase [Viner et al., 2012]. The host enzyme inositol(myo)-1(or 4)-monophosphatase 1 (IMPA1) dephosphorylates inositol monophosphates to generate free inositol. The effector LtpD is capable of binding to IMPA1 [Harding et al., 2013a]. As the generation of free inositol is the rate-limiting step in the biosynthesis of PIs, LtpD may play an important role in providing the LCV with sufficient resources for remodelling.

1.6.3. Manipulating lipid compositions of early endosomes

L. pneumophila translocates the effector VipD to alter the lipid composition of early endosomes (Figure 1.3). VipD has an N-terminal patatin-like phospholipase A1 (PLA1) domain which removes PI(3)P from membranes [VanRheenen et al., 2006]. Interestingly, only in the presence of either Rab5 or Rab22 does the PLA1 become active [Gaspar et al., 2014; Lucas et al., 2014]. The selective C-terminal Rab GTPase binding domain of VipD therefore allows spatial control of the activation of the PLA1 domain, allowing Legionella to only remove PI(3)P from specific membranous organelles [Ku et al., 2012]. In a VipD null mutant, more Rab5 is recruited to LCVs suggesting fusion of early endosomes with the LCVs. This suggests that VipD plays a role in helping Legionella avoid the endosomal-lysosomal pathway.

Legionella is likely to translocate more effectors to manipulate early endosomal pathways. The effector Lpg0393 has been shown to have GEF activity against Rab5, Rab21 and Rab22 in vitro [Sohn et al., 2015]. However, its role during infection is yet to be determined. The coiled-coil effector VipA was shown to localise to early endosomes but not the LCV during infection [Franco et al., 2012]. The hypothesis that VipA disrupts endosomal trafficking along actin polymers is supported by its ability to bind and polymerise actin in vitro and by the defects in the multivesicular body (MVB) pathway that it causes when ectopically expressed in yeast.

1.6.4. Preventing acidification of the LCV

Typical phagolysosomal maturation involves the acidification of the compartment to aid digestion of its contents through activation of hydrolytic enzymes. The effector SidK has been implicated in inhibiting ATP hydrolysis of VatA, a component of the mammalian vacuolar H+-ATPase (vATPase), thereby inactivating the vacuolar proton pump (Figure 1.3) [Xu et al., 2010]. However, vATPase is not universally identified in LCV proteomes from all host cell types tested, suggesting a difference in LCV development between hosts [Hoffmann et al., 2014; Lu et al., 2005]. In addition, the localisation of SidK has not yet been reported and hence SidK may act on other vesicles such as endosomes. Deletion of SidK from the genome causes neither a change in LCV pH nor an intracellular growth defect, suggesting that there may be undiscovered functionally redundant effectors.

1.6.5. Inhibiting retrograde trafficking

The retrograde trafficking pathway transports vesicles from the plasma membrane and endosomes to the ER via the Golgi apparatus [Johannes et al., 2011]. This pathway has been shown to limit intracellular growth of Legionella as disruption of retrograde trafficking by RNA interference (RNAi) leads to increased replication of Legionella [Finsel et al., 2013]. The effector RidL is translocated by L. pneumophila to disrupt this pathway (Figure 1.3). Deletion of RidL leads to an intracellular growth defect in RAW264.7 murine macrophages, D. discoideum and A. castellanii. RidL localises to the LCV bacterial poles at early infection time points with 70% of LCVs being RidL positive after 1 h infection of D. discoideum. This drops to 45% at 6 h post-infection. In vitro, RidL specifically binds to Vps29, a subunit of the heterotrimeric retrograde cargo recognition (retromer) complex which is essential for cargo sorting and function of the retrograde trafficking pathway. Although RidL binds Vps29 in vitro, recruitment of the cargo recognition subcomplex (a heterotrimer of Vps26, Vps29 and Vps35) to the LCV occurs in a RidL independent but Dot/Icm dependent manner. However, recruitment to the LCV of other proteins important for the formation of full retrograde cargo vesicles, such as cargo recognition receptors (Vps10 and CIMPR) and sorting nexins (SNXs), is reduced when RidL is present. This suggests that RidL may inhibit retrograde trafficking by sterically blocking Vps29 from binding its endogenous cargo complex binding partners. RidL appears to be an important effector in preventing phago-lysosomal maturation of the LCV as the lysosome marker protein LAMP1 is recruited to the LCV in the absence of RidL.

1.6.6. Avoiding detection from host-defence mechanisms

Escaping the vacuole would not only enable Legionella to avoid the phago-lysosomal degradation pathway but also facilitate uptake of host nutrients by residing in the cytoplasm. However, Legionella has evolved to remain in a vacuole and replicate inside rather than enter the nutrient-rich cytoplasm. By remaining in the LCV, Legionella is able to more efficiently avoid detection by the host-defence mechanisms. Key to maintaining LCV integrity is the effector SdhA, one of the few effectors in which a strong intracellular growth defect is observed upon deletion (Figure 1.3) [Laguna et al., 2006]. SdhA has been shown to be an important effector for intracellular replication across a wide range of infection models including G. mellonella, mice and murine bone-marrow-derived macrophages whilst providing a weaker phenotype in human U937 macrophage-like cells and in amoebae [Harding et al., 2013b; Laguna et al., 2006]. Although SdhA was found to be crucial, its molecular mechanism for stabilising the LCV is yet to be determined. SdhA likely functions in delicate balance with the effector lipase PlaA whose adverse effects are not fully defined [Creasey et al., 2012]. Deletion of plaA in a ΔsdhA background is sufficient to restore LCV integrity. Hence the lipase activity of PlaA appears to have drastic consequences on LCV stability. The interplay between SdhA and PlaA requires further investigation to fully understand their respective roles in Legionella pathogenesis. Upon disruption of the LCV, Legionella-derived molecules are released into the host cytosol where cytosolic intracellular pattern recognition receptors are able to detect these signals. One such receptor is AIM2 which detects bacterial DNA. This subsequently triggers a type I interferon immune response, leading to an inflammasome-mediated caspase-1 activation response resulting in pyroptosis (proinflammatory cell death) [Ge et al., 2012; Monroe et al., 2009].

Legionella also activates autophagy upon internalisation with acquisition of the autophagy markers Atg7 and Atg8 on the LCV at early time points of infection [Amer et al., 2005]. Autophagy is a host mechanism to degrade and recycle cellular components. However, autophagy has also been shown to be a method to control microbial infection [Huang et al., 2014]. By encapsulating the target cargo in a double membrane to create an autophagosome, degradation of its contents is triggered by fusion with lysosomes, maturing the autophagosome into an autophagolysosome. Although the LCV acquires Atg7 and Atg8, they are quickly lost from the LCV thus masking it from further autophagolysosome maturation. In the absence of SdhA, more bacteria are Atg8 positive, suggesting that not only pyroptosis but also autophagy is involved in restricting Legionella proliferation [Creasey et al., 2012].

Three effectors have so far been found to interfere with the host autophagy pathway. Whilst the effector LegA9 promotes recognition of the LCV by autophagy components, the effector RavZ removes Atg8 from the LCV to prevent further autophagosome maturation (Figure 1.3) [Choy et al., 2012; Khweek et al., 2013]. RavZ is a cysteine protease which cleaves the lipid anchor from Atg8, thereby removing it from the LCV membrane. Neither of these effectors is conserved amongst all strains, implying that some strains may promote autophagy whilst others inhibit this pathway as part of their respective virulence strategies.

In contrast to the strain-specific LegA9 and RavZ, LpSpl is a conserved effector protein found in all currently sequenced L. pneumophila strains [Rolando et al., 2016]. LpSpl is a sphingosine-1-phosphate lyase whose activity is able to inhibit autophagy in macrophages (Figure 1.3). Furthermore, a ΔlpSpl mutant is outcompeted by wild type L. pneumophila in a mouse model, suggesting a critical role for LpSpl in Legionella pathogenesis.

1.6.7. Rab GTPases mature the LCV into a unique replicative organelle

Having avoided the host lysosomal/endosomal pathway, the LCV starts to resemble the ER. Within 15 min of infection, ER-derived vesicles can be seen to be recruited to the LCV in transmission electron micrographs [Tilney et al., 2001]. Although the LCV has ER-like characteristics, it is very much a unique organelle as shown by profiling of the LCV proteome by mass spectrometry [Hoffmann et al., 2014; Shevchuk et al., 2009; Urwyler et al., 2009]. These differences are highlighted by the presence of 60 T4SS effectors and at least 12 Rab GTPases (Rab1, 2, 4, 5, 7, 8, 9, 10, 11, 14, 21 and 32) on the LCV, many of which would typically not be localised on the same host membrane.


Rab GTPases have been shown to be essential regulators of vesicular trafficking (reviewed in [Stenmark, 2009]). They are part of the Ras superfamily of small GTPases and exist in an inactive GDP-bound or an active GTP-bound form. Their activity is determined by their nucleotide binding state and they cycle between the states with the help of other proteins. Guanine nucleotide exchange factors (GEFs) activate Rab GTPases by switching GDP for GTP whilst GTPase activating proteins (GAPs) encourage hydrolysis of GTP to GDP, deactivating the Rab GTPase. Not only is their activity governed by GDP/GTP binding but also their localisation. To exert their function, Rab GTPases need to be localised to specific membranes. Membrane association is enabled by prenylation of a C-terminal CaaX box motif. However, due to this post-translational modification, Rab GTPases become insoluble when not associated with membranes. In their inactive GDP-bound state, Rab GTPases localise to the cytoplasm whilst bound to guanine nucleotide dissociation inhibitor (GDI) which binds the prenyl group, aiding solubility.

The involvement of Rab GTPases in membrane dynamics earmarks them as frequent targets of Legionella effectors to aid maturation of the LCV as well as avoiding the endosomal system (Figure 1.4). Their importance to intracellular replication is exemplified by RNA interference studies [Hoffmann et al., 2014]. Depletion of Rab5a, Rab14 and Rab21 promoted intracellular replication of L. pneumophila whilst depletion of Rab8a, Rab10 and Rab32 inhibited bacterial proliferation (Figure 1.4D). Recruitment of secretory pathway-associated Rab GTPases (Rab8a, 10 and 32) to the LCV appears beneficial for the bacteria whilst endocytic Rab GTPases (Rab5a, 14 and 21) have deleterious effects on Legionella.


Figure 1.4. Manipulation of Rab GTPases by L. pneumophila effectors. (A) SidM anchors onto the Legionella-containing vacuole (LCV) via its PI(4)P binding domain. It recruits Rab1 onto the LCV and activates it using its guanine nucleotide exchange factor (GEF) domain. SidM modifies Rab1 with AMP resulting in Rab1 remaining constitutively active. SidD deAMPylates Rab1 which enables the GTPase activating protein (GAP) effector LepB to deactivate Rab1. Host guanine nucleotide dissociation inhibitors (GDI) extract Rab1 from the LCV. (B) SidM binds syntaxins and recruits Sec22b-positive vesicles to the LCV via activation of Rab1, thereby forcing non-canonical SNARE pairing between syntaxin-3A (Stx3) and Sec22b which results in fusion of ER-derived vesicles onto the LCV. The ARF1 GEF effector RalF promotes fusion of ER-derived vesicles with the LCV by recruiting and activating ARF1. (C) Phosphocholination of Rab1 by AnkX prevents Rab1 activation and extraction by GEFs and GDIs respectively. Lem3 removes the phosphocholination modification from Rab1. (D) PieE and LidA localise onto the LCV and bind multiple Rab GTPases. Rab GTPases involved in the secretory pathway (Rab8a, 10 and 32) are beneficial for L. pneumophila intracellular replication whilst endocytic Rab GTPases (Rab5a, 14 and 21) have a negative impact. Taken from [So et al., 2015].


Of the Rab GTPases manipulated by Legionella effectors, Rab1 has been the best characterised. Legionella exhibits precise control over Rab1 activity, with at least 6 effectors shown to alter its function. Rab1 is therefore a prime example of how Legionella effectors have evolved to regulate cellular processes with intricate spatio-temporal precision (Figure 1.4). Rab1 is involved in the trafficking of ER-derived vesicles between the ER and Golgi. Legionella hijacks this function to recruit ER-derived vesicles to the LCV. As mentioned previously, ER-derived material can be seen on the LCV as early as 15 min post-infection [Tilney et al., 2001]. Similarly, recruitment of Rab1 onto the LCV also occurs during the early stages of infection. The effector SidM (DrrA) is critical for Rab1 recruitment to the LCV (Figure 1.4A). Upon translocation, SidM localises to the LCV using its PI(4)P binding domain [Brombacher et al., 2009]. Its GEF domain then aids dissociation of Rab1 from its GDI as well as activates Rab1 by exchanging GDP for GTP, resulting in active Rab1 residing on the LCV [Machner et al., 2006; Machner et al., 2007; Murata et al., 2006]. Finally, SidM maintains Rab1 in an active state through AMPylation of tyrosine 77, preventing binding of GAPs [Muller et al., 2010]. Both the GEF and AMPylation activities of SidM have been shown to be vital for recruitment of Rab1 onto the LCV [Hardiman et al., 2014]. Furthermore, some host binders of Rab1 such as MICAL-3 are unable to bind AMPylated Rab1. SidM represents a remarkable multifunctional effector which not only recruits and activates Rab1 on the LCV but also selectively influences its interactome to ensure only a specific subset of Rab1-dependent signalling pathways are activated.

As indicated by the presence of multiple Rab GTPases in LCV proteomes, Legionella translocates multiple Rab GTPase-binding effectors. LidA has been shown to have picomolar affinity for Rab1, 6 and 8 in vitro (Figure 1.4D) [Schoebel et al., 2011]. Additional in vitro studies have shown that it is able to bind at least 24 Rab GTPases [Cheng et al., 2012; Yu et al., 2015a]. LidA appears to act in concert with SidM as it is able to bind AMPylated Rab1 [Muller et al., 2010]. Rab GTPases are stabilised in the GTP-bound form when interacting with LidA as this interaction abrogates binding to Rab GAPs [Chen et al., 2013]. LidA localises to the LCV during early stages of infection but can also be seen to localise to other membranous structures as the infection progresses [Derre et al., 2005]. Although LidA has been implicated in Rab1-dependent recruitment of early secretory vesicles in vitro, its role in infection is not fully understood [Machner et al., 2006].

The effector PieE is also able to bind a host of different Rab GTPases (Rab1A, 1B, 5C, 6A, 7 and 10) during infection and is predicted to act as a Rab GTPase binding hub on the LCV (Figure 1.4D) [Mousnier et al., 2014]. Ectopic expression of PieE causes distinct rearrangement of the smooth ER to form laminar structures. However, PieE does not induce fusion of these ER sheets. This suggests PieE helps tether vesicles to the LCV and perhaps acts in concert with LidA in a functionally redundant manner.

To aid fusion of vesicles to membranes, eukaryotic cells utilise soluble N-ethylmaleimide-sensitive factor activating protein receptors (SNAREs) [Jahn et al., 2006]. These proteins exist on both vesicular (v) and target (t) membranes but are distinct from one another depending on the membrane. When v- and t-membranes are sufficiently close, v-SNAREs and t-SNAREs are able to complex. Together with supporting proteins, this interaction generates the force required for membrane fusion. In order to maintain specificity and prevent undesirable fusion events, each v-SNARE can only complex with a particular subset of t-SNAREs. As the LCV membrane is initially plasma membrane-like due to the engulfment process, it contains plasma membrane associated SNAREs such as syntaxins. However, these are typically incompatible with ER SNAREs such as Sec22b [Arasaki et al., 2012]. To overcome these incompatible SNARE pairings and efficiently recruit ER-derived vesicles to the LCV, Legionella translocates effectors to force fusion between non-canonical SNARE pairings. Remarkably, SidM has additional functions beyond Rab1 manipulation. Its ability to directly bind syntaxins and also recruit Sec22b-positive vesicles to the LCV forces these two typically incompatible SNAREs into such close proximity that a fusion event occurs (Figure 1.4B) [Arasaki et al., 2012].

In addition to utilising host SNAREs, some Legionella effectors such as LseA, LegC2/YlfB, LegC3 and LegC7 share homology with eukaryotic SNARE proteins [Campodonico et al., 2016; King et al., 2015; Shi et al., 2016]. Ectopic expression of LseA revealed that it was able to interact with a subset of host SNAREs. Although LseA localises to Golgi-associated membranes during infection, its role in L. pneumophila pathogenesis remains to be determined. LegC2, LegC3 and LegC7 mimic Q-SNARE proteins and are able to complex with R-SNARE vesicle-associated membrane protein 4 (VAMP4). These proteins hence promote fusion of VAMP4-positive vesicles with the LCV [Shi et al., 2016]. Deletion of legC2 and legC7 results in a defect in the number of replication-viable LCVs and a decrease in recruitment of ER-derived vesicles to the LCV [Campodonico et al., 2016].

Although Rab1 recruitment and activation on the LCV is important for delivery of ER-vesicles to the LCV during early stages of infection, protein levels of both Rab1 and SidM begin to decline 2 h post-infection in macrophages [Ingmundson et al., 2007]. To deactivate Rab1 and subsequently signal its extraction from the LCV by GDIs, the AMPylation modification must first be removed. The effector SidD antagonises SidM: it is encoded next to SidM in the genome and is capable of deAMPylating Rab1 (Figure 1.4A) [Neunuebel et al., 2011; Tan et al., 2011b]. To further exert control over Rab1 regulation, Legionella translocates its own Rab1 GAP, the effector LepB, which can now bind the deAMPylated Rab1 and aid hydrolysis of GTP to GDP [Ingmundson et al., 2007]. Upon deactivation, Rab1 can then be extracted by host GDIs to remove it from the LCV.
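The order of events in this effector cascade can be summarised as a small state machine. The sketch below is purely illustrative and makes simplifying assumptions: the class, field and function names are hypothetical, each effector is reduced to a single deterministic step, and kinetics, stoichiometry and the other Rab1 modifications are ignored.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Rab1:
    nucleotide: str = "GDP"    # "GDP" (inactive) or "GTP" (active)
    ampylated: bool = False    # AMPylation of tyrosine 77
    location: str = "cytosol"  # "cytosol" (GDI-bound) or "LCV"

def sidm_gef(r):
    """SidM releases Rab1 from its GDI and exchanges GDP for GTP on the LCV."""
    if r.location == "cytosol" and r.nucleotide == "GDP":
        return replace(r, nucleotide="GTP", location="LCV")
    return r

def sidm_ampylate(r):
    """SidM AMPylates active, LCV-bound Rab1, which blocks GAP binding."""
    if r.location == "LCV" and r.nucleotide == "GTP":
        return replace(r, ampylated=True)
    return r

def sidd_deampylate(r):
    """SidD removes the AMP modification."""
    return replace(r, ampylated=False)

def lepb_gap(r):
    """LepB stimulates GTP hydrolysis, but only on non-AMPylated Rab1."""
    if r.nucleotide == "GTP" and not r.ampylated:
        return replace(r, nucleotide="GDP")
    return r

def gdi_extract(r):
    """Host GDI extracts inactive, GDP-bound Rab1 from the LCV."""
    if r.location == "LCV" and r.nucleotide == "GDP":
        return replace(r, location="cytosol")
    return r

def run_cycle(r):
    """Apply the cascade in the order described in the text."""
    for step in (sidm_gef, sidm_ampylate, sidd_deampylate, lepb_gap, gdi_extract):
        r = step(r)
    return r

start = Rab1()
active = sidm_ampylate(sidm_gef(start))
print(active)             # active, AMPylated Rab1 on the LCV
print(lepb_gap(active))   # unchanged: AMPylation protects Rab1 from LepB
print(run_cycle(start))   # back to cytosolic, GDP-bound Rab1
```

The point of the sketch is the ordering constraint: in this simplified model LepB has no effect until SidD has acted, and GDI extraction only follows GTP hydrolysis.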

Although this elaborate effector cascade already describes manipulation of Rab1 at every stage of its GDP-GTP cycle, there appear to be additional mechanisms by which Legionella is able to fine-tune its activity. Surprisingly, AMPylation of tyrosine 77 is not the only post-translational modification exerted on Rab1 by Legionella effectors. Serine 79 can be modified with a phosphocholine moiety by the effector AnkX whilst mono-ubiquitination of Rab1 is catalysed by the effector SidE [Mukherjee et al., 2011; Qiu et al., 2016].

AnkX contains a filamentation induced by cAMP (Fic) domain which catalyses the phosphocholine transfer from CDP-choline onto target Rab GTPases (Figure 1.4C) [Mukherjee et al., 2011]. So far, both Rab1 and Rab35 have been identified as substrates of AnkX. In contrast to SidM, AnkX modifies Rab1 preferentially in its GDP-bound state. In this phosphocholinated GDP-bound state, binding of Rab GEFs and GDIs is inhibited. Therefore, AnkX prevents activation and extraction of Rab1 from the LCV membrane. This suggests AnkX acts in an antagonistic manner to SidM. However, phosphocholination of GTP-bound Rab1 also prevents GAP interaction. Although phosphocholination inhibits many interactions of Rab GTPases with their cognate binding partners, it differs from AMPylation as phosphocholinated Rab1 is still able to bind MICAL-3, albeit with weaker affinity than the unmodified protein, whilst AMPylation completely abolishes this interaction [Goody et al., 2012]. A dually AMPylated and phosphocholinated Rab1 species was identified under in vitro conditions. However, the kinetics and the artificial in vitro conditions suggest that these two modifications are unlikely to occur on the same Rab1 molecule in physiological environments. It is currently unknown how Legionella controls these multiple modifications of Rab1 but it is likely linked to the spatial and temporal regulation of effector translocation and localisation. Furthermore, pre-binding of Rab1 to LidA reduces the efficiency of modification by both SidM and AnkX, suggesting that other effectors may influence the modification preferences of Rab1 during infection. Similar to the antagonistic effects of SidD to SidM, L. pneumophila also encodes a dephosphocholinase enzyme (Lem3) to remove the AnkX-dependent modification on Rab1 (Figure 1.4C) [Tan et al., 2011a].

Recently the effector SidE was shown to ubiquitinate Rab GTPases using a novel mechanism [Qiu et al., 2016]. In contrast to the canonical activation of ubiquitin by sequential E1, E2 and E3 enzymes, SidE directly modifies ubiquitin with an ADP-ribose moiety which consequently activates it. As a result SidE is able to ubiquitinate Rab GTPases in an E1- and E2-independent manner. Currently only ubiquitination of Rab1 and Rab33 by SidE has been shown during infection. However, ectopic expression of Rab6A and Rab30 in conjunction with SidE also reveals ubiquitination of these Rab GTPases. Although the exact mechanisms are still unclear, SidJ is able to regulate the half-life of SidE on the LCV [Havey et al., 2015; Jeong et al., 2015a]. SidJ has also been implicated in the efficient recruitment of ER proteins onto the LCV [Liu et al., 2007]. However, whether this is an additional direct function of SidJ or a consequence of its interaction with SidE requires further inquiry.

In addition to Rab GTPases, the small GTPase ADP-ribosylation factor 1 (ARF1) is also recruited to the LCV. ARF1 is typically involved in secretory vesicular trafficking and has been shown to aid efficient fusion of ER-derived vesicles to the LCV [Robinson et al., 2006]. ARF1 localisation to the LCV is dependent on the effector RalF, the first discovered Dot/Icm effector (Figure 1.4B) [Nagai et al., 2002]. Using its Sec7-homology domain, RalF activates ARF1 by acting as a GEF. Activity of the Sec7-homology domain is regulated by the RalF capping domain. This capping domain localises RalF onto the LCV and, upon membrane contact, initiates a conformational change to expose the Sec7-homology domain [Alix et al., 2012]. Currently, RalF is the only effector known to affect ARF signalling pathways. However, although ARF1 depletion affects intracellular replication, RalF appears dispensable for efficient intracellular growth. This suggests alternative effectors may exist to play a functionally redundant role to RalF [Hoffmann et al., 2014; Nagai et al., 2002; Rothmeier et al., 2013].

In addition to RalF, the homologous effectors SidC and SdcA play a role in ARF1 recruitment to the LCV [Horenkamp et al., 2014]. Additional studies have strengthened the hypothesis that SidC/SdcA is involved in ER recruitment to the LCV [Dolinsky et al., 2014; Ragaz et al., 2008]. A crystal structure of SidC recently revealed that SidC represents a novel family of E3 ubiquitin ligases [Hsu et al., 2014]. Recruitment of ER vesicles was dependent on this ubiquitination activity. The ubiquitination targets of SidC are currently unknown. Although mono-ubiquitination of Rab1 was dependent on the presence of SidC during infection, SidC did not appear to directly modify Rab1 [Horenkamp et al., 2014; Hsu et al., 2014]. As SidE appears to directly ubiquitinate Rab1, this suggests that SidC/SdcA may have additional roles in regulating SidE activity.

Effective ER recruitment to the LCV has been shown to be reliant on additional phosphorylation-dependent mechanisms. The kinase activity of the effector LegK2 has been shown to be important for both virulence in amoebae and ER recruitment to the LCV [Hervet et al., 2011]. LegK2 is capable of phosphorylating itself and generic protein kinase substrates in vitro [Hervet et al., 2011]. However, its host cell targets are currently unknown.

Creation and maintenance of a replication permissive LCV does not appear to be a trivial task and this is exemplified by the extensive number of effectors translocated by L. pneumophila to manipulate vesicular trafficking pathways.


1.6.8. Acquiring nutrients

As Legionella resides in the LCV during replication, it must retrieve nutrients from the host cell via the vacuolar membrane. IroT (MavN/DimB) is an essential effector for Legionella intracellular replication tested across multiple strains (Philadelphia, Paris and 130b) and host infection models [Isaac et al., 2015; O'Connor et al., 2011; Portier et al., 2015]. IroT is involved in iron acquisition and ΔiroT strains show a strong defect in iron uptake (Figure 1.5A). As there does not appear to be a functionally redundant effector for IroT, the source of iron is likely to be very similar in all hosts, such that a single effector is able to meet Legionella’s iron requirements. However, there appear to be some differences in uptake of the secreted siderophore legiobactin between different hosts. Both LbtP and LbtU are outer membrane Legionella proteins which have been implicated in iron uptake [Burnside et al., 2015; O'Connor et al., 2016]. Whilst both proteins are dispensable for intracellular growth in A. castellanii, deletion of lbtP results in an intracellular growth defect in macrophages. In contrast, a ΔlbtU mutant behaves like wild type in macrophages. Furthermore, a ΔlbtP mutant escapes the host cell earlier than wild type, suggesting that L. pneumophila has evolved a mechanism to detect low levels of iron availability in the host and escape when nutrients are exhausted.

In addition to IroT, the effector LppA may indirectly aid iron uptake (Figure 1.5A) [Weber et al., 2014a]. LppA acts as a phytate phosphatase. The metabolite phytate has two main roles in eukaryotic cells: as a phosphorus storage compound and as a siderophore [Smith et al., 1994]. Phytate also has bacteriostatic effects on Legionella which can be abolished by the presence of LppA. Although the mechanism by which phytate restricts growth is unclear, it may be due to its chelation of metal ions such as iron, preventing their uptake by Legionella. LppA remains the only example of a Legionella effector with the capability to disarm potentially harmful host metabolites.


Figure 1.5. Nutrient acquisition by L. pneumophila and exploitation of host ubiquitination machinery. (A) The effector IroT is critical for L. pneumophila to absorb iron from the host. LppA disarms the antimicrobial activity of metal-chelating phytate using its phosphatase activity. (B) L. pneumophila hijacks the host ubiquitination system for both nutrients and signalling. The ubiquitin E3 ligase effectors AnkB, SidC, SdcA and LegU1 polyubiquitinate host proteins, signalling them for proteasomal degradation resulting in a release of free amino acids. L. pneumophila recruits host amino acid transporters such as SLC1A5 onto the LCV to ensure a sufficient supply of amino acids for efficient intracellular replication. Modulation of BAT3 and ParvB ubiquitination states by LegU1 and AnkB respectively may manipulate host apoptosis signalling. SidE monoubiquitinates Rab1 using a non-canonical mechanism. LubX controls the half-life of SidH by signalling SidH for proteasomal degradation via ubiquitination. Adapted from [So et al., 2015].

Amino acids are the major source of carbon and energy for intracellular Legionella. Although carbohydrate metabolism is important for intracellular replication, amino acids are preferred even in the presence of glucose [Harada et al., 2010; Schunder et al., 2014]. Additionally, isotopologue profiling of amino acid incorporation indicated that Legionella obtains and utilises amino acids from the host cell [Schunder et al., 2014].

Legionella uses the host ubiquitin-proteasome system for the generation of amino acids [Price et al., 2011]. Ubiquitin is a 76 amino acid (8.5 kDa) protein which acts as a signalling marker upon conjugation to lysine residues. As ubiquitin contains seven lysines of its own, polyubiquitin chains can form. The nature of the chain type determines the outcome of a ubiquitination signal. K48-linked ubiquitin signals a protein for degradation by the proteasome. This process creates short peptides which are further degraded into amino acids by host peptidases.


The LCV is decorated with K48-linked host proteins and this process is dependent on the Dot/Icm T4SS (Figure 1.5B) [Price et al., 2011]. The effector AnkB (LegAU13) was the first effector to be shown to aid in the ubiquitination of LCV proteins. AnkB encodes an F-box domain which is involved in determining substrate specificity in the eukaryotic SKP1-Cullin-F-box (SCF) E3 ligase complex. In some L. pneumophila strains, AnkB also encodes a C-terminal CaaX box, allowing it to be anchored onto the LCV by post-translational farnesylation using host machinery [Price et al., 2010].

Ubiquitination of the LCV appears solely dependent on AnkB in certain strains such as 130b whilst other effectors are also involved in decorating the LCV with ubiquitin in the Philadelphia and Paris strains [Hsu et al., 2014; Ivanov et al., 2009; Lomma et al., 2010; Price et al., 2009]. The effectors SidC and SdcA have been shown to be involved in the ubiquitination of the LCVs of L. pneumophila Philadelphia [Horenkamp et al., 2014; Hsu et al., 2014].

Interestingly, AnkB has other roles in addition to signalling proteins for degradation. AnkB was reported to bind the host protein ParvB [Lomma et al., 2010]. However, AnkB does not polyubiquitinate ParvB for degradation but instead prevents its endogenous ubiquitination. Consequently, there is more ParvB inside the host cells, leading to enhanced intracellular replication for Legionella. ParvB levels have also been shown to control caspase 3 activation and apoptosis.

In addition to AnkB, Legionella encodes at least a further five effectors with homology to eukaryotic ubiquitin ligases: LegU1, LicA, PpgA, Lpg2525 and LubX. Similar to AnkB, LegU1, LicA, PpgA and Lpg2525 also contain F-box domains. However, only LegU1 and LicA in addition to AnkB interact with the SCF complex component SKP1 [Ensminger et al., 2010; Lomma et al., 2010]. Furthermore, only LegU1 formed a functional SCF E3 ligase complex in vitro. LegU1 has been shown to polyubiquitinate the host protein BAT3, which is involved in protein quality control, ER stress and apoptosis [Kawahara et al., 2013]. BAT3 may be an important target for Legionella as another effector, Lpg2160, has also been shown to interact with it [Ensminger et al., 2010]. However, the implications of BAT3 polyubiquitination by LegU1 and binding by Lpg2160 in Legionella’s infection strategy require further investigation.


LubX contains two U-box domains which, unlike F-box domains, only require an E2 enzyme to ubiquitinate substrates without the need for a Cullin-RING E3 scaffold [Kubori et al., 2008]. LubX has been shown to polyubiquitinate the host cell cycle kinase Clk1 (Figure 1.5B) [Kubori et al., 2008].

However, LubX does not only target host proteins but is also able to polyubiquitinate the effector SidH, thereby controlling its half-life using the host proteasome machinery [Kubori et al., 2010].

1.6.9. Manipulating host transcription and translation

There have so far been three Legionella effectors reported to be involved in altering host transcription.

The effector RomA/LegAS4 acts as a histone methyltransferase using its eukaryotic-like SET domain.

It is able to methylate histone 3 at two different sites: lysine 14 (H3K14) and lysine 4 (H3K4) (Figure 1.6A) [Li et al., 2013; Rolando et al., 2013]. H3K14 methylation is a novel histone methylation site and acts in direct competition with acetylation of the same residue. As acetylation of H3K14 results in upregulation of transcription, methylation has the opposing effect and represses transcription by preventing H3K14 acetylation. In contrast, methylation of H3K4 has been implicated in rDNA transcription.

Figure 1.6. L. pneumophila manipulates both host transcription and translation. (A) The effector RomA/LegAS4 methylates histone H3 at both positions K14 and K4. Methylation of K14 inhibits acetylation of the same site, resulting in transcription inhibition. Methylation of K4 activates rDNA transcription. LegK1 and LnaB activate NF-κB signalling, resulting in transcription of anti-apoptotic genes. (B) Lgt1, 2 and 3 inhibit host translation by glucosylation of translation elongation factor eEF1A. SidI also prevents translation by inhibiting eEF1A. The translation inhibition mechanism of SidL is currently unknown. Taken from [So et al., 2015].


In addition to directly modifying histones, Legionella secretes effectors to activate the NF-κB pathway, resulting in transcription of anti-apoptotic genes and hence preventing apoptosis-driven cell death of Legionella infected macrophages [Abu-Zant et al., 2007; Bartfeld et al., 2009; Losick et al., 2006; Shin et al., 2008]. So far two effectors, LegK1 and LnaB, have been implicated in NF-κB activation (Figure 1.6A) [Ge et al., 2009; Losick et al., 2010]. LegK1 resembles the host IκB kinase and phosphorylates multiple IκB-family proteins [Ge et al., 2009]. These proteins typically bind to NF-κB and hence prevent its translocation into the nucleus. However, upon phosphorylation IκB-family proteins are signalled for ubiquitination and subsequent proteasomal degradation, thereby relieving the nuclear translocation inhibition of NF-κB. Once in the nucleus, NF-κB can act as a transcription factor for anti-apoptotic genes. The mechanism of LnaB is yet to be elucidated.

Although Legionella is able to alter the host proteome via manipulation of transcription, it gains further control of the host cell by affecting host translation. This presumably aims to prevent the host cell from mounting an immune response against the infection as well as maximising the potential nutrients available to be taken up by Legionella. Five Legionella effectors have been implicated in preventing host translation (Figure 1.6B). The effectors Lgt1, Lgt2 and Lgt3 act as glucosyltransferases and modify eEF1A, an important protein in the elongation of nascent polypeptide chains [Belyi et al., 2008]. Upon glucosylation, eEF1A is rendered inactive and protein translation is inhibited. Although the three Lgt proteins are functionally equivalent in vitro, their translocation is temporally regulated with Lgt1 and Lgt2 being expressed in the stationary phase whereas Lgt3 is produced in the lag (pre-log) phase [Belyi et al., 2008]. The effector SidI similarly targets eEF1A and to a lesser extent eEF1Bγ to inhibit translation [Shen et al., 2009]. SidL is also able to inhibit translation; however, its mechanism has not yet been determined [Fontana et al., 2011]. Although these five effectors act to prevent translation, the block is not complete and the host is still able to trigger a pro-inflammatory response through translation of pro-inflammatory cytokines [Asrat et al., 2014; Ivanov et al., 2013]. Production of IL-1α and IL-1β bypasses the inhibition imposed by Legionella and this response is dependent on MyD88 signalling [Asrat et al., 2014].


Furthermore, not only do Legionella effectors inhibit translation but there is also a host response to restrict translation. Upon detection of pathogenic Legionella, multiple members of the mTOR pathway become ubiquitinated by the host [Ivanov et al., 2013]. Proteasomal degradation of mTOR leads to a cap-dependent inhibition of translational initiation via eIF4F. This results in a translational bias whereby abundant transcripts (e.g. IL6-proinflammatory cytokines and calnexin) are still readily translated whilst less abundant transcripts (e.g. IL10) are not. This suggests a complicated relationship between Legionella and the host in which Legionella actively inhibits translation at the elongation phase whilst the host inhibits it at the cap-dependent translation initiation stage.

Although many aspects of host cell signalling have already been shown to be influenced by Dot/Icm effectors, most effector functions are still uncharacterised. Research on effector function is important not only for understanding bacterial virulence mechanisms but also because it reveals intricate host immune signalling mechanisms that counteract infection. Recently, proteomics has emerged as a vital tool for studying bacterial pathogenesis.


1.7. Proteomics

In the post-genomic era, more emphasis has been put on understanding cellular responses at a transcriptome and proteome level. Whilst the human genome contains ~30,000 open-reading frames, this leads to ~100,000 different mRNA transcripts. Furthermore, these mRNA transcripts can lead to an estimated 10 million different proteins [Jensen, 2004]. The transcriptome provides the first port of call to understanding how a biological system responds to a stimulus. However, an increase in transcription does not necessarily correlate to an increase in translation of said transcript [Maier et al., 2009]. Furthermore, proteins may exist in different proteoforms and hence only a subset may be active at any one time [Smith et al., 2013]. As proteins are the cellular machinery which enables a cell to respond to stimuli, proteomic studies are of paramount importance in understanding cellular processes.

1.7.1. Top-down proteomics

The proteome can be studied from a top-down or a bottom-up approach [Aebersold et al., 2016]. In a top-down approach, whole intact proteins are ionised and their identities evaluated through fragmentation to peptides. The main advantage of top-down proteomics is the ability to distinguish between proteoforms, particularly being able to detect post-translational modifications. However, top-down approaches are technically more challenging and hence are currently less common than bottom-up studies, although recent technological advances have made top-down proteomics more accessible [Toby et al., 2016].

1.7.2. Bottom-up proteomics

Bottom-up proteomics involves pre-digestion of proteins into peptides. These peptides are then analysed by mass spectrometry in which intact peptide masses are first measured (MS1) and the peptides are then fragmented, enabling peptide sequences to be elucidated (MS2). MS2 spectra consist of b and y ions where b ions represent N-terminal fragments of the peptide and y ions contain C-terminal fragments. The mass differences between series of b or y ions allow the amino acid sequence of the peptide to be inferred. The presence of particular proteins in the initial mixture is then inferred from peptide identifications [Marcotte, 2007]. Bottom-up approaches have gained in popularity due to their ease. No specialised equipment is required and powerful software enables peptides to be identified from complex raw data. Furthermore, peptides are readily solubilised and easily separated by liquid chromatography in contrast to intact proteins. Additionally, fragmentation of the amide bonds between amino acids is robust and hence peptide fragmentation patterns can be predicted with high confidence. Recent advances have enabled over 4000 yeast proteins to be identified from a single-shot 70 minute LC-MS/MS run [Richards et al., 2015].
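To illustrate how b and y ion series encode a peptide sequence, the following minimal sketch computes singly charged b and y ion m/z values from standard monoisotopic residue masses. The function name `fragment_ions` and the example peptide are illustrative assumptions and do not correspond to any particular search engine.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added to C-terminal (y) fragments
PROTON = 1.007276   # charge carrier for singly charged ions


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] covers the first i+1 residues (N-terminal fragment);
    y_ions[i] covers the last i+1 residues (C-terminal fragment).
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("PEPTIDE")  # hypothetical example peptide
    for i, (bi, yi) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

The difference between consecutive b ions (or consecutive y ions) equals the mass of a single residue, which is what allows the sequence to be read from a spectrum. Leucine and isoleucine share the same mass and cannot be distinguished this way.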

1.7.3. Quantitative proteomics

Although the identity of proteins within complex mixtures is important, their abundances are key in understanding regulation of cellular processes. Proteins can be quantified by mass spectrometry absolutely and relatively [Bantscheff et al., 2012]. Although absolute quantification is possible, it is technically challenging and typically requires standards to be spiked in to act as a reference. Relative quantification is sufficient under many circumstances and provides a method to compare between multiple conditions, for instance between different disease states.

As each peptide behaves differently in a mass spectrometer, it is difficult to compare intensities between peptides. However, isotopic differences between peptides of the same sequence should not affect their physical properties. Therefore, peptides of the same identity but different isotopic composition should behave identically in the mass spectrometer and hence their comparative intensities enable a measurement of their relative abundance.

Isotopic labels may be introduced metabolically using Stable Isotope Labelling in Cell Culture (SILAC) [Ong et al., 2002]. Between the two comparative samples, one is grown in light media containing normal lysine and arginine whilst the other is grown in heavy media containing heavier isotopes of lysine and arginine. Typically the carbon (12C) and nitrogen (14N) are replaced with 13C and 15N respectively but hydrogens (1H) can also be substituted for deuterium (2H) as an alternative heavy isotope. Lysine and arginine residues are typically labelled as the enzyme trypsin cleaves after these two residues. Therefore, all tryptically digested peptides will contain at least one lysine or arginine, enabling all tryptic peptides to be quantified theoretically. Additionally, metabolic labelling allows for the tags to be incorporated at the beginning of the experiment. Therefore, different samples can be combined earlier in the process, thereby reducing the errors associated with manual handling. However, not all systems are amenable to SILAC. For example, clinical samples cannot be metabolically labelled retrospectively nor can organisms which are prototrophic for the typically labelled amino acids.
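As a rough sketch of how a SILAC pair appears in the MS1 spectrum, the code below computes the mass difference between the heavy and light forms of a tryptic peptide. It assumes fully 13C/15N-substituted lysine (six carbons, two nitrogens) and arginine (six carbons, four nitrogens); the specific labels are an assumption, as the text does not state which heavy amino acids are used. The function names and example peptides are hypothetical.

```python
# Mass differences between heavy and light stable isotopes (Da).
DELTA_13C = 1.0033548  # 13C minus 12C
DELTA_15N = 0.9970349  # 15N minus 14N

# Assumed labels (not specified in the text): fully 13C/15N-substituted
# lysine (6 C, 2 N) and arginine (6 C, 4 N).
HEAVY_SHIFT = {
    "K": 6 * DELTA_13C + 2 * DELTA_15N,
    "R": 6 * DELTA_13C + 4 * DELTA_15N,
}


def silac_mass_shift(peptide):
    """Mass difference (Da) between the heavy and light forms of a peptide."""
    return sum(HEAVY_SHIFT.get(aa, 0.0) for aa in peptide)


def silac_mz_spacing(peptide, charge):
    """Spacing of the heavy/light pair on the m/z axis for a given charge."""
    return silac_mass_shift(peptide) / charge


def heavy_to_light_ratio(light_intensity, heavy_intensity):
    """Relative abundance of the heavy sample versus the light sample."""
    return heavy_intensity / light_intensity
```

Under these assumptions a peptide ending in a single lysine is shifted by about 8.014 Da and one ending in a single arginine by about 10.008 Da, so the two forms co-elute but are resolved in MS1 and their intensity ratio reports the relative abundance.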

Alternatively, isotope labels can be incorporated at the peptide level. Dimethyl labelling provides a cheap and fast method of labelling peptides [Boersema et al., 2009]. As nearly all peptides have a free amino group on the N-terminus and many tryptic peptides will have a C-terminal lysine, dimethyl labelling enables most tryptic peptides to be quantified. Using differentially isotopically labelled formaldehyde and sodium cyanoborohydride, three distinct methyl group compositions can be achieved: -CH3, -CHD2 and -13CD3. Each group adds 14, 16 and 18 daltons respectively. As primary amines can be dimethylated, this results in a mass shift of 28, 32 and 36 daltons per N-terminal amine/lysine on each peptide. Although dimethyl labelling is cheap and fast, due to the isotopic mass differences stemming from hydrogen to deuterium changes, there are larger isotopic effects compared with isotope changes of other elements. The resulting C-D bond is more polar than a C-H bond, leading to differences in retention time during liquid chromatography. However, advances in algorithms are able to correct for these differences, allowing pairs of isotopic peptides to be identified and quantified.
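The arithmetic behind the dimethyl mass shifts can be sketched as follows, using the nominal values quoted above (28, 32 and 36 daltons per labelled amine). The channel names `light`, `intermediate` and `heavy` are labels chosen here for illustration, and the sketch assumes every peptide carries an unmodified N-terminal primary amine.

```python
# Nominal mass added per dimethylated primary amine, as given in the text.
DIMETHYL_SHIFT = {"light": 28, "intermediate": 32, "heavy": 36}


def labelling_sites(peptide):
    """Count primary amines: the N-terminal amine plus one per lysine."""
    return 1 + peptide.count("K")


def dimethyl_mass_shift(peptide, channel):
    """Nominal mass (Da) added to a peptide in the given labelling channel."""
    return labelling_sites(peptide) * DIMETHYL_SHIFT[channel]


def channel_spacing(peptide, channel_a, channel_b):
    """Nominal mass difference between two channels for the same peptide."""
    return (dimethyl_mass_shift(peptide, channel_b)
            - dimethyl_mass_shift(peptide, channel_a))
```

For a tryptic peptide with a C-terminal lysine there are two labelling sites, so the light and heavy channels are separated by 16 daltons, whereas an arginine-terminated peptide without lysine is separated by only 8 daltons.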

Both SILAC and dimethyl labelling quantify peptides at the MS1 (intact peptide) level. Alternative peptide isotope labelling strategies such as isobaric tags for relative and absolute quantitation (iTRAQ) and tandem mass tag (TMT) utilise isobaric tags to enable quantification at the MS2 (fragmentation) level [Ross et al., 2004; Thompson et al., 2003]. Each tag has the same mass but their fragmentation patterns differ, allowing relative quantification between differentially tagged peptides. Therefore, the complexity of the sample at the MS1 level does not increase with these labels. Furthermore, iTRAQ strategies enable samples to be multiplexed up to 8-plex whilst 10-plex TMT is possible [Choe et al., 2007; Pierce et al., 2008; Werner et al., 2014]. However, interference effects have been reported whereby ions with similar masses are co-isolated and fragmented with the isobaric ions of interest. This results in contaminating signals in the MS2, masking the true intensities of the reporter ions required for accurate quantification.

In addition to isotope labelling strategies, advances in algorithms have allowed for peptides and proteins to be quantified accurately in a label-free manner [Tate et al., 2013]. However, as samples to be compared cannot be mixed together without labels, they must be run on the LC-MS/MS individually, resulting in more experimental bias. Normalisation strategies enable the data from these individual runs to be analysed together and compared. There are two common label-free strategies: intensity-based and spectral counts. Intensity-based methods utilise the XIC (extracted ion current) associated with each peptide as an indicator of peptide abundance. Although this method does not take into account the unique physical properties of each peptide, intensities remain a good indicator of abundance in simple samples. Software packages such as MaxLFQ (MaxQuant) enable datasets obtained from different LC-MS/MS runs to be normalised and compared, providing more accurate label-free quantification [Cox et al., 2014]. For instance, comparing peptide pairs across LC-MS/MS runs allows MaxLFQ to quantify peptides in a similar way to isotopic label-based methods. Spectral counting quantifies peptides based on the number of MS2 spectra obtained per peptide. Although this has its merits, spectral counting is becoming less popular in the field. This is in part due to technological advances in mass spectrometers, which now offer high mass resolution. With increasing resolution, XIC for individual peptides can be extracted more accurately without interfering signals from nearby peptides. In particular, lowly abundant peptides are more accurately quantified with intensity-based methods as they enable a continuous distribution of intensities whilst spectral counting only provides discrete information. Furthermore, spectral counting biases towards larger proteins as they inherently will have more peptides to be analysed.
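The contrast between the two label-free read-outs can be sketched with a toy summary of peptide-spectrum matches (PSMs). This is not the MaxLFQ algorithm; it only illustrates that spectral counts are discrete whereas summed XIC values are continuous. The input format and the function name `quantify` are assumptions made for this example.

```python
from collections import defaultdict


def quantify(psms):
    """Summarise peptide-spectrum matches per protein.

    psms: list of (protein, peptide, xic_intensity) tuples, one per
    identified MS2 spectrum. Returns {protein: (spectral_count, summed_xic)}.
    The XIC of each distinct peptide is counted once per protein.
    """
    counts = defaultdict(int)
    peptide_xic = defaultdict(dict)
    for protein, peptide, xic in psms:
        counts[protein] += 1                 # discrete: one per MS2 spectrum
        peptide_xic[protein][peptide] = xic  # continuous: peak area per peptide
    return {
        protein: (counts[protein], sum(peptide_xic[protein].values()))
        for protein in counts
    }
```

A protein observed through a single spectrum always receives a count of one regardless of how intense that peptide was, which is why low-abundance proteins are poorly resolved by spectral counting.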

1.7.4. From raw data to peptide identifications

Once raw MS data have been obtained in the form of mass spectra, the spectra must be assigned to peptides. Although de novo sequencing is possible by inferring amino acid sequence from MS2, it requires substantial computational power in addition to high quality MS2 spectra [Ma et al., 2012]. Imperfect fragmentation patterns can only provide partial peptide sequencing. To simplify the problem, typically a complete proteome of the organism of interest is used as a target database. This target database should contain all the possible proteins within the sample of interest. An in silico digest of the target proteome generates a list of peptides which the raw data is then matched to. A subset of candidate peptides based on intact peptide mass (MS1) is identified before matching sequences using fragmentation patterns (MS2). To determine the false discovery rate (FDR), a target-decoy approach is used [Elias et al., 2007]. Typically, the decoy database uses the sequences from the reference proteome but reverses all the sequences. This generates a nonsense database in which no spectra should theoretically be assigned a positive ID. By setting a score threshold and counting the “positive” identifications made against this decoy database, an FDR can be estimated.
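The in silico digest and target-decoy steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: trypsin is modelled as cleaving after every K or R (the proline exception is ignored), decoys are plain reversed sequences with a hypothetical `REV_` prefix, and the FDR is estimated as the number of decoy hits divided by the number of target hits above a score threshold. Real search engines differ in detail.

```python
import re


def tryptic_digest(protein, missed_cleavages=0):
    """Cleave after K or R (simplified rule; the proline exception is ignored)."""
    pieces = re.findall(r"[^KR]*[KR]|[^KR]+", protein)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(pieces[i:j + 1]))
    return peptides


def make_decoys(proteins):
    """Reverse each target sequence to build a decoy database."""
    return {"REV_" + name: seq[::-1] for name, seq in proteins.items()}


def estimate_fdr(psm_scores, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    psm_scores: list of (score, is_decoy) pairs, one per spectrum match.
    """
    targets = sum(1 for s, d in psm_scores if s >= threshold and not d)
    decoys = sum(1 for s, d in psm_scores if s >= threshold and d)
    return decoys / targets if targets else 0.0
```

In practice the score threshold is lowered until the estimated FDR reaches an accepted value, and all target matches above that threshold are reported.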

Once the spectra have been assigned to peptides, their parent protein identifications can be inferred. However, only peptides unique to a particular protein can unequivocally infer the presence of a protein within the sample. In cases in which a single protein cannot be assigned, peptides may be postulated to originate from a protein group. However, only looking at unique peptides results in a loss of information from the remaining identified peptides. Often ambiguous peptides (peptides which could originate from multiple entries of the search space) are assigned to the protein with the most unique peptides. These are termed razor peptides, based on the principle of Occam’s razor in which the simplest explanation (i.e. the one with the fewest assumptions) is the most likely explanation. Most intensity-based quantification algorithms utilise both razor and unique peptides.
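The razor peptide rule as described above can be sketched in a few lines. The sketch follows the wording of the text (shared peptides go to the candidate protein with the most unique peptides); actual software may use different criteria, and the alphabetical tie-break is an arbitrary choice made for this example.

```python
def assign_razor_peptides(peptide_to_proteins):
    """Assign each shared peptide to the candidate protein with the most
    unique peptides (ties broken alphabetically, an arbitrary choice).

    peptide_to_proteins: {peptide: set of protein IDs it could come from}.
    Returns {peptide: (protein, "unique" or "razor")}.
    """
    unique_counts = {}
    for proteins in peptide_to_proteins.values():
        for protein in proteins:
            unique_counts.setdefault(protein, 0)
        if len(proteins) == 1:
            unique_counts[next(iter(proteins))] += 1

    assignment = {}
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) == 1:
            assignment[peptide] = (next(iter(proteins)), "unique")
        else:
            best = min(proteins, key=lambda p: (-unique_counts[p], p))
            assignment[peptide] = (best, "razor")
    return assignment
```

For example, if a hypothetical protein P1 has two unique peptides and P2 has one, a peptide shared by both is assigned to P1 as a razor peptide and contributes to the quantification of P1 only.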


1.8. Fic domain proteins

Filamentation induced by cAMP (Fic) domains are found in all kingdoms of life. Currently, there are over 5000 proteins annotated with Fic domains across 2192 species according to Pfam (Figure 1.7) [Finn et al., 2016]. Although Fic domain-containing proteins are widespread in bacteria, they are less common in eukaryotes and archaea. The human genome encodes a single Fic domain protein: Huntingtin yeast-interacting protein E (HYPE). The widespread distribution of Fic domains indicates important physiological functions of these proteins. Currently, several characterised Fic domain proteins are bacterial virulence factors, suggesting that these proteins have diversified over time to function in different roles including pathogenesis.


Figure 1.7. Taxonomic distribution of Fic domains found amongst all domains of life. Downloaded from Pfam (PF02661) on 21/08/2016 [Finn et al., 2016].


1.8.1. Fic domain architecture

Fic domains are characterised by a conserved HPFx(D/E)GN(G/K)R motif flanked by three alpha-helices on either side [Garcia-Pino et al., 2014]. This helical core with the Fic loop represents the minimal Fic domain (Figure 1.8). There are currently 74 different architectures of Fic domain containing proteins. However, these extensions onto the core domain can be broadly grouped into three classes: N-terminal insertions, C-terminal insertions and β-hairpin flaps. Interestingly, the Pseudomonas syringae T3SS effector protein AvrB shares a similar fold to Fic domains but does not encode the Fic motif [Kinch et al., 2009]. However, AvrB-like proteins are still considered part of the Fic family of proteins due to structural similarity. Additionally, some proteins encode multiple Fic domains within the same polypeptide. Although many Fic domain structures have been solved (structures for 19 distinct Uniprot IDs have been reported in Pfam), the functions of most of these proteins remain unknown.
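As a simple illustration, the conserved HPFx(D/E)GN(G/K)R motif can be written as a regular expression and used to scan a protein sequence. The function name `find_fic_motifs` and the sequences used to exercise it are made up for this sketch and are not real protein sequences; domain databases such as Pfam use profile models rather than a single pattern.

```python
import re

# HPFx(D/E)GN(G/K)R written as a regular expression; "x" is any residue.
FIC_MOTIF = re.compile(r"HPF.[DE]GN[GK]R")


def find_fic_motifs(sequence):
    """Return (1-based start position, matched motif) for each occurrence."""
    return [(m.start() + 1, m.group()) for m in FIC_MOTIF.finditer(sequence)]
```

A strict pattern of this kind will miss proteins with degenerate motifs and, as noted above, AvrB-like proteins that share the fold but lack the motif entirely.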

Figure 1.8. Comparison of Fic domain architectures. The helical Fic core is highlighted in red with the canonical Fic motif (or AvrB equivalent) highlighted in blue. N-terminal and C-terminal extensions are in green and yellow respectively. β-hairpin flaps are in orange. (A) IbpA from Histophilus somni (PDB ID: 3N3U). (B) AvrB from Pseudomonas syringae (PDB ID: 1NH1). (C) AnkX from Legionella pneumophila (PDB ID: 4BEP). (D) VopS from Vibrio parahaemolyticus (PDB ID: 3LET).


1.8.2. Enzymatic activity of Fic domains

The first Fic domain proteins were discovered in 1982 as stress response proteins in Escherichia coli, where they promote filamentous bacterial growth in the presence of excess cAMP [Kawamukai et al., 1989; Utsumi et al., 1982]. However, it was not until 2009 that an enzymatic activity was discovered. Yarbrough et al. showed that the Fic domain of the Vibrio parahaemolyticus T3SS effector VopS utilised ATP as a substrate to transfer AMP onto a target threonine on Rho GTPases, a process termed AMPylation [Yarbrough et al., 2009]. Since then multiple other diphosphate-containing small molecules have been confirmed as substrates of Fic domains. In addition to ATP, the other NTPs (CTP, GTP and UTP) can all be utilised by Fic domains [Mattoo et al., 2011]. The hypothesis that only a diphosphate moiety is required stems from the ability of AnkX to use CDP-choline as a substrate molecule [Mukherjee et al., 2011].

Figure 1.9. Schematic of the conserved Fic domain enzymatic activity. Fic motif residues are in blue, the target protein in green and small molecule substrate in red. The catalytic Fic histidine acts as a base to activate the hydroxyl group of the target residue. Increased nucleophilicity enables the hydroxyl group to attack the small molecule substrate and cleave the phosphodiester bond, resulting in the transfer of the proximal group of the substrate onto the target residue. The small molecule substrate is bound in the active site using a coordinated Mg2+ ion and arginine residue in the Fic motif.


All characterised Fic domains to date share the same catalytic mechanism in which the histidine residue at the beginning of the Fic motif activates a hydroxyl-containing target residue (threonine, tyrosine or serine) for nucleophilic attack onto the diphosphate-containing small molecule (Figure 1.9) [Campanacci et al., 2013; Castro-Roa et al., 2013; Luong et al., 2010; Xiao et al., 2010]. This histidine residue is essential as mutation to alanine renders the Fic domain inactive. The diphosphate is bound to the Fic active site by a conserved arginine residue and via a coordinated Mg2+ ion by an acidic residue, both of which are encoded in the Fic motif. Nucleophilic attack onto the diphosphate cleaves the phosphodiester bond and transfers part of the substrate onto the hydroxyl-containing amino acid. The transferred portion is always located closer to the target residue. So far Fic domains have been shown to act as nucleoside monophosphate (NMP) transferases, kinases and phosphocholine transferases [Castro-Roa et al., 2013; Cruz et al., 2014; Mattoo et al., 2011; Mukherjee et al., 2011]. The overall conserved catalytic mechanism and diverse substrate range suggest that any diphosphate-containing metabolite may potentially be a substrate of Fic domains and hence novel post-translational modifications are likely to be discovered in the future.

Although AvrB shares structural similarity with Fic domains, it does not have any of the Fic catalytic residues in the AvrB P(D/E)X(R/K)G(S/A)(A/G)(A/G) motif [Kinch et al., 2009; Lee et al., 2004]. As such, AvrB-like proteins have a different catalytic mechanism to Fic domains. Indeed, mutational analysis revealed a different set of essential residues for AvrB function [Desveaux et al., 2007]. However, the enzymatic function of AvrB remains to be determined.


1.8.3. Substrate specificity of Fic domains

Although many small molecule substrates for Fic domains have been reported, each individual Fic domain has varying degrees of specificity. VopS preferentially utilises ATP and GTP as substrates but can also transfer CMP and UMP from CTP and UTP respectively [Mattoo et al., 2011]. Similarly, the Fic containing toxin IbpA from Histophilus somni (a cattle pathogen) can utilise all four NTPs but has preferences for ATP and CTP [Mattoo et al., 2011]. Analysis of its crystal structure reveals a glutamine residue (Q3757) in its active site which forms a hydrogen bond with the adenine base (Figure 1.10) [Xiao et al., 2010]. As cytosine has a similar amine group, it is likely also able to form this hydrogen bond whereas uracil and guanine bases would lack this stabilising interaction.

Figure 1.10. Crystal structure of IbpA in complex with AMPylated Cdc42 reveals its small molecule substrate specificity. (A) Cdc42 is in green and IbpA in teal. Critical Fic motif residues (H3717, E3721 and R3725) are highlighted in yellow. A3717 mutation was performed in Pymol to restore the catalytic histidine for illustrative purposes. The crucial hydrogen bond governing IbpA small molecule substrate specificity is highlighted by the dashed yellow line. (PDB: 4ITR) (B) Comparison of nucleobase structures reveals the potential for cytosine to exploit the same hydrogen bonding interaction with IbpA as adenine.


Whilst VopS and IbpA appear more promiscuous, the T3SS effector AvrAC from the plant pathogen Xanthomonas campestris is only able to act as a UMP transferase and is unable to AMPylate its target proteins BIK1 and RIPK [Feng et al., 2012]. Although there appears to be some substrate promiscuity for NMP transferases, these characterisations were conducted in recombinant in vitro assays. Therefore, whether these enzymes are still able to use multiple substrates in a physiological setting remains to be determined. Furthermore, these assays would have utilised NTPs at equimolar concentrations whilst their actual preferences are likely to be influenced by the in vivo endogenous concentrations of these metabolites at relevant cellular localisations.

Whilst VopS, IbpA and AvrAC are all NMP transferases, Doc is a kinase and AnkX is a phosphocholine transferase [Castro-Roa et al., 2013; Cruz et al., 2014; Mukherjee et al., 2011]. Doc is part of a toxin-antitoxin module with Phd on the E. coli P1 prophage. Structural comparisons between these Fic domains revealed that Doc and AnkX bind their respective substrates ATP and CDP-choline in an inverted conformation relative to VopS and IbpA [Campanacci et al., 2013; Castro-Roa et al., 2013]. As the proximal portion of the small molecule substrate is still transferred onto their target proteins, Doc phosphorylates EF-Tu instead of catalysing an AMPylation reaction whilst AnkX phosphocholinates Rab1 instead of acting as a CMP transferase. These differences indicate that although the catalytic mechanism of all Fic domains so far is identical, their substrate binding sites have diverged, enabling them to catalyse a broad range of PTMs.

Interestingly, AvrB also binds ADP in an inverted orientation similar to AnkX and therefore suggests that it may act as a putative kinase [Desveaux et al., 2007]. Although kinase activity could not be detected for AvrB, a comparison of co-crystal structures of AvrB and a RIN4 peptide with IbpA and Cdc42 revealed that the phosphorylated residue (Thr166) of RIN4 is at the same location as the AMPylated tyrosine (Tyr32) of Cdc42 (Figure 1.11) [Roy et al., 2015].


Figure 1.11. Thr166 of RIN4 is similarly located as AMPylated Tyr32 of Cdc42 relative to AvrB and IbpA respectively. Comparison of IbpA and AvrB (teal) co-crystal structures with their substrates Cdc42 and RIN4 (green). The Fic and AvrB motifs are highlighted in blue. AMPylated Tyr32 of Cdc42 and Thr166 of RIN4 are highlighted in red. (PDB IDs: 4ITR and 2NUD).

1.8.4. Protein substrate recognition by Fic domains

Little is known about protein substrate specificity of Fic domain proteins as currently the only structure of a Fic protein with its target protein is a co-crystal structure of IbpA and Cdc42 [Xiao et al., 2010]. The switch 1 region of Cdc42 adopts an extended conformation, enabling it to interact with a β-hairpin flap next to the Fic motif. As such, the target Tyr residue is positioned for modification in the Fic active site. VopS, AnkX and HYPE all encode β-hairpins close to the Fic motif and these structures may also be involved in substrate recognition [Bunney et al., 2014; Campanacci et al., 2013; Luong et al., 2010]. It is likely that additional domains to the Fic core are required for substrate binding.

The location of the residue to be modified in the Fic active site appears essential. Both VopS and IbpA target hydroxyl residues in the switch 1 region of Rho GTPases. Although the threonine and tyrosine residues they target respectively are adjacent to each other, these two enzymes are unable to modify the other residue [Mattoo et al., 2011]. Hence precise positioning of the substrate protein in the active site is critical for efficient modification.


1.8.5. Functional consequences of Fic activity on target proteins

VopS is able to AMPylate Rho GTPases such as Cdc42 and Rac on a threonine residue (Thr35 for Rac and Cdc42, Thr37 for RhoA) in the switch I region, typically involved in protein binding to initiate signalling cascades [Yarbrough et al., 2009]. Modification, however, results in steric hindrance preventing binding of interacting partners, rendering the Rho GTPase inactive. As Rho GTPases are involved in actin rearrangements, this results in a collapse of the cell cytoskeleton which can be observed as a cell rounding phenotype.

IbpA also targets Rho GTPases in the switch I region. However, it modifies them on a conserved tyrosine residue (Tyr32 for Rac and Cdc42, Tyr34 for RhoA) instead of a threonine residue, although this does not alter the functional outcome of the modification [Worby et al., 2009].

AMPylation of Rho GTPases not only causes collapse of the cytoskeleton but also prevents the host from mounting an NF-κB immune response and producing reactive oxygen species [Woolery et al., 2014]. However, the host also appears to have adapted to these modifications and signals for inflammasome activation upon recognition of Rho GTPase modification [Xu et al., 2014].

The human Fic protein HYPE was recently shown to AMPylate BiP, a protein involved in the unfolded protein response during ER stress [Ham et al., 2014]. Modification of BiP at Ser-365 and Thr-366 enhances its ATPase activity, which is critical for refolding misfolded proteins [Sanyal et al., 2015].

However, there is some controversy regarding both the identity of the modified residue and its functional consequences. Preissler et al. claim that Thr-518 of BiP is the only AMPylated residue both in vitro and in cells, and that this modification hinders the ATPase activity [Preissler et al., 2015]. This is further supported by detection of the modified Thr-518 peptide, whilst modified Ser-365 and Thr-366 were not detected, using a chemical proteomics approach in our group [Broncel et al., 2016]. Although there is controversy regarding the AMPylation site and its direct influence on BiP activity, it is clear that the presence of HYPE is essential for the cell to mount an unfolded protein response (UPR). Both siRNA knockdown and CRISPR/Cas9 knockout of HYPE inhibit UPR induction [Preissler et al., 2015; Sanyal et al., 2015].


AnkX is able to phosphocholinate Rab1 and Rab35 preferentially in the GDP-bound state [Goody et al., 2012; Mukherjee et al., 2011]. This modification on Rab1 prevents binding of GEFs and GDIs to the small GTPase, likely forcing the Rab GTPase to stay on the membrane in an inactivated state.

The targets of AvrAC, BIK1 and RIPK, are both involved in immune signalling in plant cells [Liu et al., 2011; Lu et al., 2010; Zhang et al., 2010]. Therefore AvrAC likely subverts immunity pathways as part of the virulence strategy of X. campestris. However, the exact mechanisms are still to be determined.

Doc is able to phosphorylate the bacterial elongation factor EF-Tu [Castro-Roa et al., 2013; Cruz et al., 2014]. This results in an inhibition of protein translation. In addition to Doc, there are many Fic toxin-antitoxin modules encoded in bacterial genomes, many of which are unlikely to be virulence factors. One such toxin is VbhT from Bartonella schoenbuchensis, which exists with its antitoxin VbhA. VbhT uses its Fic domain to AMPylate DNA gyrase and topoisomerase IV in their ATP-binding pockets [Harms et al., 2015]. This modification prevents hydrolysis of ATP by either enzyme and consequently causes their inactivation.

1.8.6. Regulation of Fic domain enzymatic activity

As Fic domain activity can have drastic consequences for cellular signalling, such as actin rearrangements and translation inhibition, these activities are likely to be under tight regulation inside cells. The activities of some Fic domains were recently found to be regulated by an inhibitory helix motif [Engel et al., 2012]. This SxxxE motif has been classified into three classes: N-terminal, C-terminal and as part of an antitoxin. The glutamate residue blocks the ATP binding pocket and as such limits AMPylation activity. Upon mutation to a glycine, ATP is able to bind without the need for the helix to be displaced, resulting in a hyperactive enzyme. Activity of certain Fic domains (e.g. HYPE) could only be confirmed when the hyperactive mutant was used. The natural activation mechanism leading to displacement of the auto-inhibitory helix remains unknown. An alternative theory suggests that the inhibitory helix may act as a substrate specificity element as opposed to limiting enzymatic turnover. This is exemplified by the crystal structure of the Clostridium difficile Fic domain protein CdFic [Dedic et al., 2016]. CdFic encodes an N-terminal inhibitory helix. However, CdFic is able to catalyse auto-AMPylation without mutation of the glutamate residue. The crystal structure shows that it binds ATP in a non-canonical manner and hence, even in the presence of the inhibitory helix, ATP can bind as a substrate.

Some Fic domain proteins exist as toxin-antitoxin modules such as Doc-Phd and VbhT-VbhA. Whilst VbhA encodes a SxxxE inhibitory helix to regulate VbhT Fic activity, Phd does not. However, structural analysis of co-crystal structures of these toxins with their antitoxins revealed that Phd binds Doc in a similar but distinct way to VbhA with VbhT [Engel et al., 2012; Garcia-Pino et al., 2008].

Alternatively, Fic domain activity can be regulated by hydrolysis of the resulting PTM as evidenced by the ability of Lem3 to dephosphocholinate Rab1 after initial modification by the AnkX Fic domain [Tan et al., 2011a]. Currently, AnkX is the only Fic domain with a hydrolase counterpart but there are likely more to be discovered.

Many of the characterised Fic domain proteins show auto-modification. Whether these self-modifications are solely an in vitro artefact given the high concentrations of small molecule substrate typically used in these assays or play roles in regulation remains to be determined. Initial evidence suggests the former. An auto-modification null mutant of AnkX, created through removing a flexible region outside the Fic core, is still able to modify Rab1 as a substrate [Campanacci et al., 2013].

Furthermore, mutational analysis of auto-AMPylation sites of HYPE (Ser79, Thr80 and Thr183) revealed that they did not have a significant effect on the apoptotic phenotype caused by ectopic expression of HYPE E234G [Sanyal et al., 2015]. Kinetic analysis of the Fic enzymatic mechanism suggests that the AMP moiety is directly transferred from ATP onto the hydroxyl group of the target residue in a concerted mechanism [Luong et al., 2010]. This suggests that the AMP group should not be attached onto the enzyme active site even in a transient manner during the catalytic cycle.

On the other hand, a recent study suggests that auto-AMPylation of NmFic from Neisseria meningitidis is a mechanism to displace the inhibitory helix [Stanger et al., 2016]. Two auto-AMPylation sites were mapped to Y183 and Y188 located on the inhibitory helix. Y183 is conserved amongst all Fic domains containing a C-terminal inhibitory helix and is buried in the hydrophobic core of NmFic. Therefore, for modification of Y183 to occur, the helix must first be displaced. Although the mechanism for this displacement is still unknown, AMPylation of the residue would prevent the helix from rebinding into the active site. Hence auto-AMPylation of Y183 enables activation of the Fic domain. Furthermore, this auto-modification appears to occur in cis (i.e. self-modification by a single polypeptide) as the reaction follows first-order enzyme kinetics (independent of the initial concentration of NmFic). Although Y183 appears conserved amongst the C-terminal inhibitory helix class of Fic domains, the consequences of auto-AMPylation of the other two classes remain to be determined.
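The kinetic argument for modification in cis can be written out explicitly. The following is a minimal sketch, assuming irreversible modification, ATP in excess, and (for the trans case) that one unmodified molecule modifies another; the symbols $[E]$ (unmodified NmFic), $[E]_0$ (its initial concentration), $k_{cis}$ and $k_{trans}$ are introduced here for illustration and do not come from the cited study.

```latex
% Intramolecular (cis) modification: first order in enzyme
\frac{d[E]}{dt} = -k_{cis}\,[E]
\quad\Rightarrow\quad
\frac{[E](t)}{[E]_0} = e^{-k_{cis} t},
\qquad
t_{1/2} = \frac{\ln 2}{k_{cis}}

% Intermolecular (trans) modification: second order in enzyme
\frac{d[E]}{dt} = -k_{trans}\,[E]^2
\quad\Rightarrow\quad
\frac{[E](t)}{[E]_0} = \frac{1}{1 + k_{trans}\,[E]_0\,t},
\qquad
t_{1/2} = \frac{1}{k_{trans}\,[E]_0}
```

Under these assumptions the fraction of enzyme modified at a given time does not depend on $[E]_0$ in the cis case, whereas in the trans case the half-time shortens as $[E]_0$ increases. A time course that is unchanged across enzyme concentrations is therefore consistent with self-modification by a single polypeptide.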

In addition to auto-modification, NmFic also appears to be regulated by oligomerisation into tetramers [Stanger et al., 2016]. Mutants with defects in oligomerisation caused cytotoxicity in E. coli, presumably due to the Fic activity of NmFic on its target protein DNA gyrase as the phenotype was also dependent on the Fic motif histidine. However, wild-type NmFic does not exhibit this growth defect, suggesting that oligomerisation of NmFic inhibits its Fic domain catalytic activity.

1.9. Project aims

The aims of the project were:

1. To characterise the novel Dot/Icm effector protein LtpG both biochemically and in cell culture models.

2. To develop mass spectrometric-based methods to identify protein-protein interactions of Legionella effectors during infection.


Chapter 2 Materials and methods

2.1 Strains and cells

2.1.1 Strains and growth conditions

Bacterial strains used in this study are listed in Table 2.1. E. coli strains were routinely grown in Luria-Bertani (LB) broth at 37°C shaking at 200 rpm. Antibiotics ampicillin (Amp) (100 µg/ml), kanamycin (Kn) (50 µg/ml) and chloramphenicol (Cm) (30 µg/ml) were used as appropriate.

L. pneumophila strains were grown for 3 days on buffered charcoal yeast extract (CYE) agar plates at 37°C. The bacteria were diluted to an optical density at 600 nm (OD) of 0.1 and subcultured in N-(2-Acetamido)-2-aminoethanesulfonic acid (ACES) buffered yeast extract (AYE) broth for 21 h at 37°C shaking at 200 rpm. Antibiotics Cm (6 µg/ml) and Kn (25 µg/ml) were used as appropriate. AYE consisted of 10 g/L ACES, 10 g/L yeast extract (Merck), 1 g/L α-ketoglutarate (Sigma), pH 6.9. CYE consisted of AYE base supplemented with 1.5 g/L activated charcoal powder (Sigma) and 15 g/L agar. Both AYE and CYE were supplemented with 3.3 mM L-cysteine and 330 µM Fe(NO3)3 prior to use.
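The dilution to an OD of 0.1 before subculturing follows the usual C1V1 = C2V2 relationship. The sketch below is illustrative only: it assumes that OD scales linearly with cell density over the range used, and the starting OD of 2.0 and the 10 ml culture volume are hypothetical values, not taken from the protocol.

```python
def dilution_volumes(od_stock, od_target, final_volume_ml):
    """Return (suspension_ml, broth_ml) needed to reach od_target.

    Assumes OD is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if od_stock <= 0 or od_target <= 0:
        raise ValueError("OD values must be positive")
    if od_target > od_stock:
        raise ValueError("target OD cannot exceed the OD of the suspension")
    suspension_ml = od_target * final_volume_ml / od_stock
    return suspension_ml, final_volume_ml - suspension_ml


# Hypothetical example: a plate-grown suspension measured at OD 2.0,
# diluted to OD 0.1 in a 10 ml AYE subculture.
suspension_ml, broth_ml = dilution_volumes(od_stock=2.0, od_target=0.1,
                                           final_volume_ml=10.0)
print(f"{suspension_ml:.2f} ml suspension + {broth_ml:.2f} ml AYE")
```

For dense suspensions, measuring a pre-diluted sample keeps the reading within the linear range of the spectrophotometer.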

Saccharomyces cerevisiae strains AH109 and BY4741 were cultured at 30°C at 200 rpm in either yeast extract peptone dextrose adenine (YPDA) or synthetic defined medium (SD) for plasmid selection. YPDA was made up with 20 g/L Difco peptone, 10 g/L yeast extract, 20 g/L agar (if required) and supplemented with 0.003% adenine hemisulphate and 2% glucose prior to use. SD consisted of 6.7 g/L yeast nitrogen base without amino acids, 20 g/L agar (if required), pH 5.8 supplemented with 2% glucose (or 2% galactose for pYES2 induction) and appropriate amino acids and nucleobases (Table 2.2) prior to use. X-α-gal was used at a final concentration of 80 mg/L if required.

| Strain | Serogroup/genotype | Reference |
|---|---|---|
| L. pneumophila | | |
| 130b (ATCC BAA-74) | O1; clinical isolate | [Edelstein, 1986; Engleberg et al., 1984] |
| 130b ΔdotA | dotA gene disrupted with a kanamycin resistance cassette | [Sansom et al., 2007] |
| 130b ΔltpG | ltpG gene (lpw_20091) replaced with a kanamycin resistance cassette | G. Schroeder, unpublished |
| E. coli | | |
| TOP10 | F- mcrA Δ(mrr-hsdRMS-mcrBC) φ80lacZΔM15 nupG recA1 araD139 Δ(ara-leu)7697 rpsL(StrR) endA1 λ- | Invitrogen |
| BL21(DE3)Star | F- ompT hsdSB (rB- mB-) gal dcm rne131 (DE3) | Novagen |
| S. cerevisiae | | |
| AH109 | MATa, trp1-901, leu2-3, 112, ura3-52, his3-200, gal4Δ, gal80Δ, LYS2::GAL1UAS-GAL1TATA-HIS3, GAL2UAS-GAL2TATA-ADE2, URA3::MEL1UAS-MEL1TATA-lacZ | Clontech |
| BY4741 | MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0 | Ilan Rosenshine (Hebrew University of Jerusalem) |

Table 2.1. Table of bacterial and yeast strains.

| Nutrient | Final concentration in SD | Absent in following media | Relevant plasmid |
|---|---|---|---|
| L-adenine hemisulphate salt | 20 mg/L | QDO | |
| L-arginine HCL | 20 mg/L | | |
| L-histidine HCL monohydrate | 20 mg/L | QDO | |
| L-isoleucine | 30 mg/L | | |
| L-leucine | 100 mg/L | DDO, QDO | pGADT7 |
| L-lysine HCL | 30 mg/L | | |
| L-methionine | 20 mg/L | | |
| L-phenylalanine | 50 mg/L | | |
| L-threonine | 200 mg/L | | |
| L-tryptophan | 20 mg/L | DDO, QDO | pGBKT7 |
| L-tyrosine | 30 mg/L | | |
| L-uracil | 20 mg/L | SD -uracil | pYES2 |
| L-valine | 150 mg/L | | |

Table 2.2. Table of nutrient supplements for yeast SD media.

2.1.2 Preparation of competent bacteria

For chemically competent E. coli TOP10 cells, overnight bacterial cultures were subcultured at a dilution of 1:100 until they reached an OD of 0.4-0.6. All subsequent steps were carried out at 4°C using ice-cold buffers. The cells were pelleted and resuspended in ½ volume of ice-cold 100 mM RbCl, 50 mM MgCl2, 30 mM CH3COOK, 10 mM CaCl2, 15% glycerol. They were re-pelleted and resuspended in 1/20th volume of 10 mM MOPS, 10 mM RbCl, 75 mM CaCl2, 15% glycerol. Competent cells were flash frozen using an ethanol/dry ice bath and stored at -80°C. Competent TOP10 cells were made by Lukasz Bukowski (Imperial College London).

For electrocompetent L. pneumophila, bacteria grown on plates for 3 days were resuspended to an OD of 0.1 and grown for 17-18 h. The bacteria were diluted to an OD of 0.1 and subcultured in 100 ml until the OD reached 0.3-0.5 (5-6 h). All subsequent steps were carried out at 4°C using ice-cold buffers. The bacteria were pelleted and washed four times with 10% glycerol (35 ml, 2 x 10 ml and 8 ml). Washed cells were resuspended in 750 µl 10% glycerol, aliquoted, flash frozen in an ethanol/dry ice bath and stored at -80°C.

2.1.3 Transformation of competent cells

Chemically competent E. coli were incubated with 5 µl ligation mix or 100 ng plasmid DNA for 30 min on ice. The cells were heat shocked at 42°C for 1 min and put on ice immediately for 10 min. 1 ml SOC media was added and cells allowed to recover at 37°C, 200 rpm for 1 h prior to plating onto antibiotic selection LB agar plates. Colonies were allowed to grow overnight at 37°C and positive clones screened by colony PCR.

Electrocompetent L. pneumophila were thawed on ice and 200 ng plasmid DNA added. The suspension was transferred into a pre-chilled electroporation cuvette (0.2 cm electrode, Biorad) and electroporated using the bacteria EC2 setting (2.5 kV) on a Biorad Micropulser (Biorad). 450 µl of warm AYE was added and the bacterial suspension incubated at 37°C, 200 rpm for 6 h before plating onto antibiotic selection CYE plates. Plates were incubated at 37°C for 3-4 days.

2.1.4 Preparation and transformation of chemically competent S. cerevisiae

Overnight cultures of S. cerevisiae strains AH109 and BY4741 were diluted 1:20 and grown to an OD of 0.4-0.6 (100 ml culture for 10 transformations). The cells were harvested by centrifugation at 3,200 x g for 5 min at RT. Each 50 ml culture cell pellet was washed with 1 x 10 ml water and 2 x 1 ml 100 mM lithium acetate, then divided into 50 µl aliquots for individual transformations. To each aliquot was added 240 µl 50% (w/v) PEG3350, 36 µl 1 M lithium acetate, 25 µl heat denatured herring sperm DNA, 50 µl water and 1 µg plasmid DNA (0.5 µg of each plasmid for co-transformations). The cells were resuspended by gentle pipetting and incubated at 30°C statically for 30 min followed by heat shock at 42°C for 25 min. The cells were pelleted, resuspended in sterile water and plated onto synthetic defined (SD) agar plates. Plates were grown at 30°C for 3 days.


2.2 Molecular biology techniques

2.2.1 DNA manipulations

PCR: DNA sequences were amplified by PCR for insertion into plasmids using primers shown in Table 2.3. KOD polymerase (Merck) was used with 50 ng DNA template (genomic DNA or plasmid), 0.2 mM dNTPs and 0.3 µM primers (Thermo) in a 50 µl reaction volume. Colony screening PCR was performed using OneTaq Master Mix (NEB) in 13 µl reaction volumes with 0.5 µl primers. Typical PCR settings were 95°C for 5 min followed by 30 cycles of 95°C for 30 s, 50°C for 30 s and 68/70°C for 60/30 s per 1 kb (OneTaq and KOD respectively), and a final elongation step at 68/70°C for 10 min.
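The per-kilobase extension rule above can be turned into a small helper when planning cycling programmes. This is only a sketch: the rates and temperatures are those quoted in the text, the amplicon length in the example is hypothetical, and scaling the time proportionally (rather than rounding up to whole kilobases) is an assumption.

```python
# Extension rate (seconds per kb) and extension temperature (°C)
# as quoted in the text for each polymerase.
SECONDS_PER_KB = {"OneTaq": 60, "KOD": 30}
EXTENSION_TEMP_C = {"OneTaq": 68, "KOD": 70}


def extension_seconds(polymerase, amplicon_bp):
    """Extension time per cycle, scaled linearly with amplicon length."""
    if amplicon_bp <= 0:
        raise ValueError("amplicon length must be positive")
    return SECONDS_PER_KB[polymerase] * amplicon_bp / 1000


# Hypothetical 1.5 kb amplicon.
for enzyme in ("OneTaq", "KOD"):
    secs = extension_seconds(enzyme, 1500)
    print(f"{enzyme}: {secs:.0f} s at {EXTENSION_TEMP_C[enzyme]}°C")
```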

Site-directed mutagenesis: Quikchange II Site-Directed Mutagenesis kit (Agilent) was used as per the manufacturer’s instructions. Typical PCR settings were 95°C for 3 min followed by 18 cycles of 95°C for 50 s, 50°C for 60 s and 68°C for 2.5 min per kb of plasmid, and a final elongation step at 68°C for 7 min.
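The site-directed mutagenesis primer pairs listed in Table 2.3 appear to be exact reverse complements of one another. A quick check of this kind can catch transcription errors in primer tables. The sketch below uses the H263A pair listed for pICC1949, assuming that the spaces inside the sequences in the table are line-wrap artefacts and can be removed; the helper name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# LtpG H263A mutagenesis primers (pICC1949, Table 2.3), with the
# wrapped fragments joined.
sdm_fwd = "CAGCTTACAGAAGTGGCTCCTTTTGCCAATGCTAATGGAAGAACAGCAACCT"
sdm_rev = "AGGTTGCTGTTCTTCCATTAGCATTGGCAAAAGGAGCCACTTCTGTAAGCTG"

primers_are_complementary = reverse_complement(sdm_fwd) == sdm_rev
print("complementary pair:", primers_are_complementary)
```

The same function can be used to confirm that a restriction site in a cloning primer is palindromic, for example `reverse_complement("GGATCC") == "GGATCC"` for BamHI.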

2.2.2 Table of plasmids and primers

| Plasmid | Description / expressed protein | Primers 5’-3’ | RS/template | Source or reference |
|---|---|---|---|---|
| pRK5 | Vector for the expression of proteins with N-terminal Myc-tag in mammalian cells | | | Clontech |
| pICC1946 | Myc-LtpG | CATACTGGATCCTTTGATTTAGCAAAACTGACTGAATTT; TGACGTCTGCAGTCAAAACATCGCAACAAATTCTG | BamHI; PstI | This study |
| pICC1341 | pRK5-HA; vector for the expression of proteins with N-terminal HA-tag in mammalian cells | | | [Harding et al., 2013a] |
| pICC1988 | HA-LtpG | Same as pICC1946 | | This study |
| pICC1989 | HA-LtpG H263A | Same as pICC1949 | pICC1988 | This study |
| pET28a | Vector for expression of proteins with N-/C-terminal His-tag in bacterial cells | | | Novagen |
| pICC1947 | His-LtpG | CATACTGGATCCTTTGATTTAGCAAAACTGACTGAATTT; TGACGTCTCGAGTCAAAACATCGCAACAAATTCTG | BamHI; XhoI | This study |
| pICC1948 | His-LtpG | CATACTCATATGTTTGATTTAGCAAAACTGACTGAATTT; TGACGTCTCGAGTCAAAACATCGCAACAAATTCTG | NdeI; XhoI | This study |
| pICC1949 | His-LtpG H263A | SDM: CAGCTTACAGAAGTGGCTCCTTTTGCCAATGCTAATGGAAGAACAGCAACCT; SDM: AGGTTGCTGTTCTTCCATTAGCATTGGCAAAAGGAGCCACTTCTGTAAGCTG | pICC1947 | This study |
| pICC1950 | His-LtpG-Strep | Inv: CGGGTGGCTCCAGCCGCCAAACATCGCAACAAATTCTGCG; Inv: CAGTTCGAAAAATGACTCGAGCACCACCACCACCACCAC | pICC1947 | This study |
| pICC1951 | His-LtpG H263A-Strep | Inv: CGGGTGGCTCCAGCCGCCAAACATCGCAACAAATTCTGCG; Inv: CAGTTCGAAAAATGACTCGAGCACCACCACCACCACCAC | pICC1949 | This study |
| pQE-HS2 | Vector for expression of proteins with N-terminal His-tag and C-terminal Strep II tag in bacterial, mammalian and insect cells | | | Qiagen |
| pICC1952 | His-Lem28-Strep | TACTGGTACCCCTAAAATAAATAAAATTGTTAATGGAACAGA; TACTAAGCTTAATGCTTACATTAGGGCTATTCAG | KpnI; HindIII | This study |
| pYES2 | Vector for inducible expression of proteins in yeast | | | Ilan Rosenshine (Hebrew University of Jerusalem) |
| pICC1953 | HA-LtpG | TACTGAGCTCAACACAATGTCTTACCCATACGATGTTCCAGATTAC; TGACGTCTCGAGTCAAAACATCGCAACAAATTCTG | SacI; XhoI; pICC1988 | This study |
| pICC1954 | HA-LtpG H263A | Same as pICC1953 | pICC1989 | This study |
| pGBKT7 | Vector for expression of GAL4-BD-fused proteins in yeast | | | |
| pICC1955 | BD-LtpG | CATACTCATATGTTTGATTTAGCAAAACTGACTGAATTT; TGACGTGGATCCTCAAAACATCGCAACAAATTCTG | NdeI; BamHI | This study |
| pICC1956 | BD-LtpG H263A | Same as pICC1955 | pICC1949 | This study |
| pICC1957 | BD-CDK7 | TACTGAATTCGCTCTGGACGTGAAGTCTC; TACTCTCGAGTTAAAAAATTAGTTTCTTGGGCAATCCT | EcoRI; XhoI | This study |
| pICC1958 | BD-CDK7 K41A | Same as pICC1957 | pICC1969 | This study |
| pICC1959 | BD-CDK7 T170A | Same as pICC1957 | pICC1970 | This study |
| pICC1960 | BD-CDK7 AA157-346 | TACTGAATTCGGCCTGGCCAAATCTTTTGG; TACTCTCGAGTTAAAAAATTAGTTTCTTGGGCAATCCT | EcoRI; XhoI | This study |
| pICC1961 | BD-CDK7 AA157-346 T170A | Same as pICC1960 | pICC1970 | This study |
| pICC1962 | BD-COPG | TACTCATATGCTGAAGAAATTCGACAAGAAGGAC; TACTGAATTCTTAGCCCACAGACGCCAAGA | NdeI; EcoRI | This study |
| pICC1963 | BD-SidM | TACTCATATGAGTGTTAATGAAGAGCAATTTGGTAG; TACTGAATTCTTATTTTATCTTAATGGTTTGTCTTTCTTGA | NdeI; EcoRI | This study |
| pICC1964 | BD-SidM DD102/104AA | SDM: GCACAAGCCACTGAGTATAGTGCTTTGGCTGCCTTTGTTATTGTTAAAAAT; SDM: ATTTTTAACAATAACAAAGGCAGCCAAAGCACTATACTCAGTGGCTTGTGC | pICC1963 | This study |
| pICC1965 | BD-MavP | TACTCATATGACATTAAAACAATTTGCAACTGGCG; TACTGAATTCTTACTTTGAGCCAGTGAGAGATA | NdeI; EcoRI | This study |
| pGADT7 | Vector for expression of GAL4-AD-fused proteins in yeast | | | |
| pICC1966 | AD-LtpG | TACTGGATCCAATTTGATTTAGCAAAACTGACTGAATTT; TGACGTCTCGAGTCAAAACATCGCAACAAATTCTG | BamHI; XhoI | This study |
| pICC1967 | AD-LtpG H263A | Same as pICC1966 | pICC1949 | This study |
| pICC1968 | AD-CDK7 | TACTGAATTCGCTCTGGACGTGAAGTCTC; TACTCTCGAGTTAAAAAATTAGTTTCTTGGGCAATCCT | EcoRI; XhoI | This study |
| pICC1969 | AD-CDK7 K41A | SDM: AACACCAACCAAATTGTCGCCATTGCGAAAATCAAACTTGGACATAGATC; SDM: GATCTATGTCCAAGTTTGATTTTCGCAATGGCGACAATTTGGTTGGTGTT | pICC1968 | This study |
| pICC1970 | AD-CDK7 T170A | SDM: GGAGCCCCAATAGAGCTTATGCACATCAGGTTGTAAC; SDM: GTTACAACCTGATGTGCATAAGCTCTATTGGGGCTCC | pICC1968 | This study |
| pICC1971 | AD-CDK7 K41A T170A | Same as pICC1970 | pICC1969 | This study |
| pICC1972 | AD-CDK7 AA157-346 | TACTGAATTCGGCCTGGCCAAATCTTTTGG; TACTCTCGAGTTAAAAAATTAGTTTCTTGGGCAATCCT | EcoRI; XhoI | This study |
| pICC1973 | AD-CDK7 AA157-346 T170A | Same as pICC1972 | pICC1970 | This study |
| pICC1974 | AD-UBE2T | TACTGAATTCCAGAGAGCTTCACGTCTGA; TACTGGATCCCTAAACATCAGGATGAAATTTCTTTTC | EcoRI; BamHI | This study |
| pICC1975 | AD-COPA | TACTCATATGTTTGAGACCAAGAGTGCGCG; TACTCTCGAGTCAGCGAAACTGCAGAGGG | NdeI; XhoI | This study |
| pICC1976 | AD-COPB1 | TACTCATATGACCGCAGCTGAGAACGTGT; TACTCTCGAGTTAGAGACTAGTCTTCTTTTGAGACA | NdeI; XhoI | This study |
| pICC1977 | AD-COPB2 | TACTGAATTCATGCCTCTGCGACTTGATATCAA; TACTCTCGAGTTAGTCGTCCAAAATATCTTCATCTAG | EcoRI; XhoI | This study |
| pICC1978 | AD-COPG | TACTCATATGCTGAAGAAATTCGACAAGAAGGAC; TACTGAATTCTTAGCCCACAGACGCCAAGA | NdeI; EcoRI | This study |
| pICC1979 | AD-ARCN1 | TACTCATATGGTGCTGTTGGCAGCAGCA; TACTCTCGAGCTACAGAATTTCATACTTATCCACTAG | NdeI; XhoI | This study |
| pICC1980 | AD-COPE | TACTCATATGGCTCCTCCGGCTCCTG; TACTCTCGAGTCAGGCGCTGGGGGCA | NdeI; XhoI | This study |
| pICC1981 | AD-COPZ | TACTCATATGGAGGCGCTGATTTTGCAACC; TACTCTCGAGTCACCGAAGGAGGGACCA | NdeI; XhoI | This study |
| pICC1552 | AD-Rab1A | | | [Mousnier et al., 2014] |
| pICC1553 | AD-Rab1B | | | [Mousnier et al., 2014] |
| pICC1554 | AD-Rab2A | | | [Mousnier et al., 2014] |
| pICC1555 | AD-Rab5C | | | [Mousnier et al., 2014] |
| pICC1556 | AD-Rab6A | | | [Mousnier et al., 2014] |
| pICC1557 | AD-Rab7 | | | [Mousnier et al., 2014] |
| pICC1558 | AD-Rab10 | | | [Mousnier et al., 2014] |
| pICC1982 | AD-MavP | TACTCATATGACATTAAAACAATTTGCAACTGGCG; TACTGAATTCTTACTTTGAGCCAGTGAGAGATA | NdeI; EcoRI | This study |
| pICC562 | pMMB207c-HA4; vector for the expression of proteins with four N-terminal HA-tags in L. pneumophila | | | [Dolezal et al., 2012] |
| pICC1983 | HA4-LtpG | TCGCACTGAGGTACCTTTGATTTAGCAAAACTGACTGAAT; GACGTCGCATCTAGATCAAAACATCGCAACAAATTCTG | KpnI; XbaI | This study |
| pICC1565 | HA4-SidM | TCGCACTGAGGTACCATGAGTGTTAATGAAGAGCAATTTG; GACGTCGCATCTAGATTATTTTATCTTAATGGTTTGTCTTTCTTG | KpnI; XbaI | This study [Mousnier et al., 2014] |
| pICC1935 | HA4-SidCPI4P | GTCATAGGATCCAAATATTCCTCCAAGCCATTATTGG; GGCTATCTAGACTATTTCTTTATAACTCCCGTGTAC | BamHI; XbaI | This study [So et al., 2016] |
| pICC1936 | HA4-LidA | GCACTGAGGTACCGCAAAAGATAACAAATCACATCAAG; GTCGCATCTAGATTATGATGTCTTGAATGGAGATAAAG | KpnI; XbaI | This study [So et al., 2016] |
| pICC1544 | pMMB207c-His6-Bio; vector for the expression in L. pneumophila of proteins with 2 N-terminal hexahistidine tags and a BirA biotinylation sequence | | | [Mousnier et al., 2014] |
| pICC1937 | His6-Bio-SidM | TCGCACTGAGGTACCATGAGTGTTAATGAAGAGCAATTTG; GACGTCGCATCTAGATTATTTTATCTTAATGGTTTGTCTTTCTTG | KpnI; XbaI | This study [So et al., 2016] |
| pICC1938 | His6-Bio-LidA | Same as pICC1936 | | This study [So et al., 2016] |
| pICC1984 | His6-Bio-MavP | TACTGGTACCACATTAAAACAATTTGCAACTGGCG; TACTTCTAGATTACTTTGAGCCAGTGAGAGATA | KpnI; XbaI | This study |
| pICC1985 | His6-Bio-LtpG H263A | TCGCACTGAGGTACCTTTGATTTAGCAAAACTGACTGAAT; GACGTCGCATCTAGATCAAAACATCGCAACAAATTCTG | KpnI; XbaI | This study |
| pICC1939 | pMMB207c-His6-Bio K/A; pMMB207c-His6-Bio derivative in which the lysine to which the biotin is attached is mutated to alanine. Cloned from pMMB207c-His6-Bio (pICC1544) by site directed mutagenesis | CATCTTCGAGGCCCAGGCGATCGAGTGGCACGAG; CTCGTGCCACTCGATCGCCTGGGCCTCGAAGATG | pICC1544 | This study [So et al., 2016] |
| pICC1940 | His6-Bio K/A-SidM | Same as pICC1937 | | This study [So et al., 2016] |
| pICC1941 | His6-Bio K/A-LidA | Same as pICC1936 | | This study [So et al., 2016] |
| pICC1986 | His6-Bio K/A-MavP | Same as pICC1984 | | This study |
| pICC1987 | His6-Bio K/A-LtpG H263A | Same as pICC1985 | | This study |
| pMXs-IRES-Puro | pMXs-IP; viral transduction vector for the expression of proteins in mammalian cells | | | Clontech |
| pICC1942 | GFP-Rab2a | TACTGGATCCGCCACCATGGTGAGCAAGGGCGAGGAG; TACTTACTTAGCGGCCGCTCAACAGCAGCCTCCCCC | BamHI; NotI | This study [So et al., 2016] |
| pICC1943 | GFP-Rab5c | TACTGGATCCGCCACCATGGTGAGCAAGGGCGAGGAG; TACTTACTTAGCGGCCGCTCAGTTGCTGCAGCACTGGCT | BamHI; NotI | This study [So et al., 2016] |
| pICC1944 | GFP-Rab10 | TACTGGATCCGCCACCATGGTGAGCAAGGGCGAGGAG; TACTTACTTAGCGGCCGCTCAGCAGCACTTGCTCTTCCAGCC | BamHI; NotI | This study [So et al., 2016] |
| pICC1945 | GFP-BirA | CTAGGGATCCGCCACCATGGTGAGCAAGG; TCTAGCGGCCGCTTATTTTTCTGCACTACGCAGGGA | BamHI; NotI | This study [So et al., 2016] |

Table 2.3. Table of plasmids and primers. Each vector is followed by the plasmids derived from it. RS: restriction site; SDM: site directed mutagenesis; Inv: inverse PCR and blunt ligation.

2.2.3 Plasmid DNA purification

For small scale extraction, plasmid-containing bacteria were grown in 5 ml LB overnight and plasmid DNA purification was carried out using Plasmid Miniprep Kit (peqlab) according to the manufacturer’s protocol.

2.2.4 Agarose gel electrophoresis

PCR products were separated at 120 V on agarose gels (1-2% w/v agarose (Invitrogen) in TAE buffer (40 mM Tris-acetate, 1 mM EDTA)) mixed with a 1:10000 dilution of SYBRSafe DNA stain (Invitrogen). The gels were imaged on a SafeImage blue light trans-illuminator (Invitrogen).

2.2.5 Restriction digestion of PCR products

PCR products were purified either by Qiaquick PCR Purification kit (Qiagen) or extracted from the agarose gel using Qiaquick Gel Extraction kit (Qiagen) according to the manufacturer’s protocols. Backbone plasmid vectors and PCR products were digested at 37°C for 3 h using appropriate restriction enzymes (NEB) (1 unit of enzyme per µg DNA). Vectors were dephosphorylated at 37°C for 1 h following digestion by addition of 1 µl (10 units) calf intestinal alkaline phosphatase. Digested products were purified using Qiaquick PCR Purification kit.

2.2.6 Ligation of digested DNA products

100 ng of digested, dephosphorylated vector was ligated with an insert at a molar ratio of 1:5 using T4 DNA ligase (NEB) for 1 h at RT in 10-20 µl reaction volumes.
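The mass of insert needed for a given molar ratio depends on the relative lengths of vector and insert. The sketch below assumes the 1:5 ratio is vector:insert and that both fragments are double-stranded DNA with the same average mass per base pair; the fragment lengths in the example are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_per_vector):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Moles scale as mass / length, so the insert mass is the vector mass
    scaled by the length ratio and by the desired molar excess.
    """
    if min(vector_ng, vector_bp, insert_bp, insert_per_vector) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_per_vector


# Hypothetical example: 100 ng of a 5 kb vector, a 1 kb insert,
# and a five-fold molar excess of insert.
mass = insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=1000,
                      insert_per_vector=5)
print(f"{mass:.0f} ng insert")
```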


2.3 Cell culture-based techniques

2.3.1 Cell culture

HeLa and A549 cells were cultured in 1000 mg/l glucose Dulbecco’s Modified Eagle Medium (DMEM) (Sigma) supplemented with 10% heat-inactivated fetal calf serum (FCS), 2 mM GlutaMAX (Gibco) and non-essential amino acids (Sigma). HEK293E cells were cultured in 4500 mg/l glucose DMEM supplemented as above. THP-1 cells were maintained in RPMI (Sigma) supplemented with 10% FCS and 2 mM GlutaMAX. All cell lines were maintained in a 5% CO2 humidified atmosphere at 37°C.

Adherent cell lines were passaged every 3-4 days and diluted by 8-10 fold following detachment by trypsinisation (PBS, 1 mM EDTA, 2 mM trypsin). THP-1 cells were maintained between 3 x 10^5 and 1 x 10^6 cells per ml. THP-1 cells were differentiated over three days by addition of 80 nM phorbol 12-myristate 13-acetate (PMA) if required.

2.3.2 Generation of stably transduced cell lines by viral transduction

HEK293E cells were transfected with pMXs-IP containing the target gene, pCMV-VSV-G envelope and pCMV-MMLV-gag-pol packaging plasmids using Lipofectamine 2000 (Life Technologies) according to the manufacturer’s protocol. Briefly, 125 µl OptiMEM was mixed with 3 µl Lipofectamine 2000. Separately, a DNA mix containing 125 µl OptiMEM, 500 ng pMXs-IP containing the target gene, 100 ng pCMV-VSV-G and 400 ng pCMV-MMLV-gag-pol was made. The DNA mix and diluted Lipofectamine were mixed at RT for 5 min and 250 µl added to a 6-well plate well of HEK293E cells. After 24 h, the media was replaced and virion production allowed to continue for a further 24 h. The supernatant containing virions was collected and HEPES added to a final concentration of 20 mM. The solution was sterile filtered through a 0.45 µm membrane. 100 µl of the filtered virion supernatant was added to each 24-well plate well containing the cell line of interest. After 24 h, the media was replaced and puromycin added to select for transduced cells. Puromycin was used at 1.5 µg/ml and 2 µg/ml for A549 and THP-1 cells respectively. Selection was maintained for a minimum of 24 h. If the gene of interest was fused with GFP, a homogeneous fluorescent population was obtained by FACS (BD Fortessa). FACS was performed by Danielle Carson (Imperial College London).


2.3.3 Legionella infection for immunofluorescence

A549 cells were seeded on 12 mm glass coverslips overnight at a density of 1.5 x 10^5 cells per 24-well plate well. The media was replaced prior to infection and supplemented with 1 mM IPTG and Cm (6 µg/ml) as required. Cells were infected with L. pneumophila strains at a multiplicity of infection (MOI) of 50. Infection was synchronised by centrifugation at 500 x g for 5 min. Extracellular bacteria were washed away after 2 h with 3 x 1 ml PBS. Fresh media supplemented as above was added for longer infection time points.

2.3.4 Transfection for immunofluorescence

HeLa cells (4 x 10^4 cells per 24-well plate well) were seeded overnight. For each well of a 24-well plate, 0.75 µl GeneJuice transfection reagent (Merck Millipore) was mixed with 20 µl OptiMEM (Gibco) at RT for 5 min. 0.25 µg plasmid DNA was added to the transfection mix and incubated at RT for 15 min. The transfection mix was added to cells and incubated in a 5% CO2 humidified atmosphere at 37°C for 24 h.

2.3.5 L. pneumophila growth curves in amoebae

Dictyostelium discoideum and Acanthamoeba castellanii were seeded in black 96-well plates at a density of 8.5 x 10^4 cells per well for 4 h in LoFlo media supplemented with 1 mM IPTG and Cm (6 µg/ml). Cells were infected with L. pneumophila strains expressing either GFP or mCherry at a MOI of 1. Growth was measured by fluorescence in a plate reader over 72 and 94 h for D. discoideum and A. castellanii respectively. L. pneumophila growth curves in amoebae were performed by Corinna Mattheis (Imperial College London, UK).

2.3.6 Legionella infection for mass spectrometry (Bio/BirA)

A549-BirA cells were seeded at a density of 4 x 10^6 per 10 cm dish 14 h before infection. THP-1-BirA cells were seeded at 1 x 10^7 cells per 10 cm dish and differentiation induced by addition of 80 nM phorbol 12-myristate 13-acetate (PMA) for 3 days. On the day of infection, the media was changed and supplemented with 4 µM biotin (Sigma), 6 µg/ml Cm and 1 mM IPTG. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio (K/A)-effectors at a MOI of 15, assuming 5 x 10^6 cells per 10 cm dish at the time of infection. THP-1-BirA cells were infected with L. pneumophila strains expressing His6-Bio (K/A)-effectors at a MOI of 1, assuming 1 x 10^7 cells per 10 cm dish at the time of infection. Cells were washed with 3 x 5 ml PBS 2 h post-infection and media supplemented as above added. Infections of A549-BirA and THP-1-BirA cells were allowed to progress for another 22 h and 4 h respectively.
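The bacterial input for each dish follows directly from the MOI and the assumed number of host cells at the time of infection. The short sketch below uses the values stated above; the function name is illustrative. Converting the result into a culture volume would additionally require an OD-to-cell-number calibration, which is not given here.

```python
def bacteria_for_moi(moi, host_cells):
    """Number of bacteria to add per dish for a given MOI."""
    if moi <= 0 or host_cells <= 0:
        raise ValueError("MOI and cell number must be positive")
    return moi * host_cells


# Values from the protocol: cells per 10 cm dish assumed at infection.
a549_input = bacteria_for_moi(moi=15, host_cells=5e6)
thp1_input = bacteria_for_moi(moi=1, host_cells=1e7)
print(f"A549-BirA: {a549_input:.1e} bacteria per dish")
print(f"THP-1-BirA: {thp1_input:.1e} bacteria per dish")
```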

2.3.7 Immunofluorescence (IF) preparation

Cells were washed with 3 x 1 ml PBS and fixed with 4% formaldehyde in PBS for 15 min. The cells were washed with 3 x 1 ml PBS and any excess formaldehyde quenched by the addition of 1 ml 50 mM NH4Cl in PBS for 10 min. Cells were permeabilised with 0.1% Triton X-100 in PBS for 5 min, washed with 3 x 1 ml PBS and incubated in blocking buffer (2% bovine serum albumin (BSA) and 2% donkey serum in PBS). After a minimum of 1 h blocking at RT or overnight at 4°C, primary antibodies (Table 2.4) diluted in blocking buffer were incubated for 1 h. Coverslips were washed 3x with PBS and incubated with secondary antibodies, followed by 3x washes in PBS and a final wash in water. The coverslips were mounted onto glass slides using ProLong Gold antifade reagent (Invitrogen).

2.3.8 Microscopy

Slides were visualised using an Axio Z1 Observer microscope (Zeiss). Images were processed using AxioVision software (Zeiss) and Z-stacks deconvoluted using a built-in nearest neighbour algorithm.


| Antibody or reagent | Species | IF/WB dilution | Reference |
|---|---|---|---|
| Primary antibodies | | | |
| HA-tag (HA.11 clone 16B12) | Mouse | 1:1000 | Cambridge Bioscience (MMs-101P-1000) |
| Legionella LPS | Rabbit | 1:900 | Affinity BioReagents (Pa1-7227) |
| Myc-tag (clone 4A6) | Mouse | 1:200 | Millipore (05-724) |
| PolyHistidine-peroxidase (clone HIS-1) | Mouse | 1:10000 | Sigma (A7058) |
| HA-peroxidase (clone HA-7) | Mouse | 1:4000 | Sigma (H6533) |
| GFP | Rabbit | 1:2000 | Abcam (ab290) |
| Ubiquitin (FK2) | Mouse | 1:1000 | Enzo (BML-PW8810) |
| Secondary antibodies and reagents | | | |
| Anti-mouse IgG Rhodamine Red-X | Donkey | 1:200 | Jackson ImmunoResearch (715-295-150) |
| Anti-mouse IgG AlexaFluor488 | Donkey | 1:200 | Jackson ImmunoResearch (715-545-150) |
| Anti-rabbit IgG AlexaFluor488 | Donkey | 1:200 | Jackson ImmunoResearch (711-545-152) |
| Anti-mouse IgG peroxidase | Goat | 1:10000 | Jackson ImmunoResearch (115-035-008) |
| Anti-rabbit IgG peroxidase | Goat | 1:10000 | Jackson ImmunoResearch (111-035-008) |
| Phalloidin AlexaFluor 647 | | 1:100 | Stratech (23127-AAT) |
| Phalloidin TRITC | | 1:500 | Sigma (P1951) |
| DAPI (4',6-diamidino-2-phenylindole) | | 1:1000 | Invitrogen (D3571) |
| Streptavidin HRP | | 1:5000 | Dako (P0397) |

Table 2.4. Table of antibodies.


2.3.9 Co-immunoprecipitation from infected cells

A549-GFP-Rab2A/5C/10 cells were seeded at a density of 4 x 10^6 cells per 10 cm dish for 14 h. The media were replaced prior to infection and cells were infected with L. pneumophila strains expressing HA-tagged effectors at an MOI of 15, assuming 5 x 10^6 cells per 10 cm dish at the time of infection. Cells were washed with 3 x 5 ml PBS 2 h post infection. The media were replaced and the infection allowed to progress for a further 22 h. Cells were washed with 3 x 5 ml PBS and 1 ml of Triton lysis buffer (50 mM Na2HPO4 pH 7.3, 150 mM NaCl, 1% Triton X-100), supplemented with protease inhibitors and Benzonase, was added. Cells were incubated with lysis buffer for 30 min at 4°C and scraped into 1.5 ml tubes. The insoluble fraction was removed by centrifugation at 20,000 x g for 15 min at 4°C.
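The inoculum implied by the stated MOI can be worked out as a short calculation. The sketch below is illustrative only; the function name is hypothetical, and only the MOI and the assumed cell number are taken from the protocol.

```python
def bacteria_for_moi(moi: float, host_cells: float) -> float:
    """Number of bacteria needed to reach a given multiplicity of infection."""
    return moi * host_cells


# Values from the protocol: MOI of 15, 5 x 10^6 cells assumed per dish at infection.
bacteria_per_dish = bacteria_for_moi(15, 5e6)
print(f"{bacteria_per_dish:.2e} bacteria per 10 cm dish")
```

Under these assumptions each 10 cm dish receives 7.5 x 10^7 bacteria.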

Protein G-coupled Dynabeads (50 µl per sample) (Thermo Fisher) were washed with PBS and resuspended in 80 µl PBS with 0.02% Tween20. Washes were done using a magnetic rack with beads allowed to settle on the rack for at least 1 min prior to removal of supernatant. 16.6 µl were taken for pre-clearing and the remaining 63.4 µl coated with 1.53 µl anti-GFP antibody (Abcam ab1218) by incubation at RT for 10 min with end-to-end rotation. Coated beads were washed with PBS with 0.02% Tween20 and resuspended in 50 µl Triton lysis buffer.

Soluble lysates were pre-cleared by addition of 16.6 µl washed unbound beads for 15 min at 4°C with end-to-end rotation. The pre-cleared lysate was added to the anti-GFP coated beads and incubated for 1 h at 4°C with end-to-end rotation. Beads were washed sequentially with 1 x Triton lysis buffer, 1 x PBS with 0.5% Triton X-100, 1 x PBS with 0.05% Tween20, 1 x 20 mM Tris with 200 mM NaCl and 1 x PBS. Beads were incubated with each wash buffer for 5 min at 4°C with end-to-end rotation. Bound proteins were eluted by addition of 30 µl 1 x SDS loading buffer and boiling for 5 min. Proteins were separated by SDS-PAGE and analysed by Western Blot using anti-GFP and anti-HA antibodies.


2.4 Biochemical techniques

2.4.1 Protein purification

Overnight cultures of BL21 Star (DE3) carrying the expression plasmid were diluted 1:100 into 1 L LB and grown to an OD of 0.4-0.6. Protein production was induced by addition of 1 mM IPTG and cells grown for either 4 h at 37°C or overnight at 18°C. Cells were harvested by centrifugation and pellets frozen at -20°C.

Bacterial cell pellets were resuspended in lysis buffer (50 mM NaH2PO4 pH7.4, 500 mM NaCl, 20 mM imidazole) with EDTA-free complete protease inhibitors (Roche) and lysed by passaging three times in an EmulsiFlex B15 cell disruptor. The insoluble fraction was removed by centrifugation (15 min, 10,000 x g).

His-tagged recombinant protein was purified from clarified lysate using a 1 ml Ni2+-NTA column on an ÄKTA prime (buffer A: 50 mM NaH2PO4 pH7.4, 500 mM NaCl, 20 mM imidazole; buffer B: 50 mM NaH2PO4 pH7.4, 500 mM NaCl, 500 mM imidazole). Bound proteins were eluted using a gradient of 0-100% buffer B over 15 ml. Elution fractions were analysed by SDS-PAGE and Coomassie staining.

Fractions containing sufficiently pure protein were pooled and dialysed into 25 mM Tris pH8.0, 25 mM NaCl using either Slide-A-Lyzer dialysis cassettes or SnakeSkin dialysis tubing (both ThermoFisher).

Dialysed protein was subjected to anion exchange using a SOURCE-15Q column (GE Healthcare Life Sciences) on an ÄKTA prime (buffer A: 25 mM Tris pH8.4, 25 mM NaCl; buffer B: 25 mM Tris pH8.4, 1000 mM NaCl). Bound proteins were eluted using a gradient of 0-100% buffer B over 35 ml. Elution fractions were analysed by SDS-PAGE and Coomassie staining. Fractions containing sufficiently pure protein were pooled, concentrated to < 2 ml and further purified by size-exclusion chromatography using a Superdex 200 pg (16/600) column on an ÄKTA prime with 20 mM Tris pH7.5, 200 mM NaCl, 10% glycerol as buffer. Elution fractions were analysed by SDS-PAGE and Coomassie staining. Fractions containing sufficiently pure protein were pooled and concentrated if required (10 mg/ml by A280 for crystallisation).
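For orientation, the eluent composition at any point of the two linear gradients can be estimated from the buffer A and buffer B concentrations. The following is a minimal sketch that assumes an ideal linear gradient and ignores system dead volume and mixer delay; the function name is hypothetical.

```python
def gradient_concentration(volume_ml, gradient_ml, conc_a_mM, conc_b_mM):
    """Concentration delivered at a given point of a linear 0-100% B gradient."""
    # Clamp the fraction of buffer B to the 0-1 range outside the gradient window.
    fraction_b = min(max(volume_ml / gradient_ml, 0.0), 1.0)
    return conc_a_mM + fraction_b * (conc_b_mM - conc_a_mM)


# Ni2+-NTA step: imidazole from 20 mM (buffer A) to 500 mM (buffer B) over 15 ml.
imidazole_mid = gradient_concentration(7.5, 15, 20, 500)

# Anion exchange step: NaCl from 25 mM (buffer A) to 1000 mM (buffer B) over 35 ml.
nacl_mid = gradient_concentration(17.5, 35, 25, 1000)

print(imidazole_mid, nacl_mid)
```

Such an estimate can help relate the fraction in which a protein elutes to the approximate imidazole or NaCl concentration required for elution.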


2.4.2 SDS polyacrylamide gel electrophoresis (SDS-PAGE)

Protein samples were diluted using 5x SDS loading buffer (313 mM Tris pH 6.8, 50% (v/v) glycerol, 0.5% (w/v) bromophenol blue, 10% (w/v) SDS, 12.5% (v/v) β-mercaptoethanol) and boiled for 10 min. Samples were separated on 8-15% Tris acrylamide gels in Tris-glycine-SDS (TGS) buffer (25 mM Tris, 192 mM glycine, 0.1% (w/v) SDS) (Geneflow) at 180 V.

2.4.3 Coomassie staining

Polyacrylamide gels were immersed in Coomassie blue solution (45% (v/v) water, 45% (v/v) ethanol, 10% (v/v) acetic acid, 0.1% (w/v) Coomassie Blue R250) for 1 h and destained in 45% (v/v) water, 45% (v/v) ethanol, 10% (v/v) acetic acid until bands became visible. The destain solution was removed and the gel kept in water.

2.4.4 Silver staining

Silver staining was performed using SilverQuest silver staining kit (Thermo Fisher Scientific) according to the manufacturer’s protocol.

2.4.5 Western Blot (WB)

Proteins separated by SDS-PAGE were transferred onto PVDF membrane (manufacturer) at 400 mA for 2 h using wet blotting apparatus. The membrane was blocked in either 5% (w/v) skim milk powder in PBST (PBS with 0.1% (w/v) Tween-20) or 3% BSA in PBST for at least 1 h. The membrane was incubated with primary antibodies (Table 2.4) at RT for 1 h, washed 3 x 5 min with PBST, then incubated with secondary antibodies at RT for 1 h. The membrane was washed 3 x 5 min with PBST and visualised using EZ-ECL (Geneflow) reagent and a Fuji LAS3000 imager.

2.4.6 In vitro AMPylation assays

Adapted from [Grammel et al., 2011]. Differentiated THP-1 cells (10^7 in a 10 cm dish) were harvested into 500 µl lysis buffer (20 mM Tris pH7.4, 100 mM NaCl, 5 mM MgCl2 and 1 mM DTT) with cOmplete mini EDTA-free protease inhibitors (Roche) using a cell scraper. Cells were lysed by sonication (0.2 s pulses, 30% amplitude, 2 min, 4°C) (SONICS Vibra-Cell) and insoluble debris removed by centrifugation (20,000 x g, 15 min, 4°C). Protein concentration was determined for the soluble lysate by BCA assay (Thermo) in 96-well plate format.

50 µg cell lysate was incubated with 2 µg recombinant LtpG or LtpG H263A and 100 µM ATP or N6pATP in 20 mM Tris pH7.4, 100 mM NaCl, 5 mM MgCl2 and 1 mM DTT for 1 h at 30°C (100 µl total reaction volume).

AMPylation reactions were stopped by methanol-chloroform precipitation. To each reaction was added 400 µl methanol, 100 µl chloroform and 300 µl water. The mixture was centrifuged at 16,000 x g for 5 min at RT and the upper layer removed. 850 µl ice-cold acetone was added to the remaining bottom layer and centrifuged at 16,000 x g for 15 min at 4°C. The supernatant was discarded and the protein pellet air dried for 5 min to remove residual acetone. The pellet was resuspended in 15 µl click buffer (4% SDS, 150 mM NaCl, 50 mM triethanolamine, pH7.5) for the click reaction.

9 µl click master mix (0.25 µl 10 mM AzTB capture reagent, 0.5 µl 50 mM CuSO4, 0.5 µl 50 mM tris(2-carboxyethyl)phosphine (TCEP), 0.25 µl 10 mM Tris[(1-benzyl-1H-1,2,3-triazol-4-yl)methyl]amine (TBTA) and 7.5 µl click buffer) was added to each sample and vortex mixed at RT for 1 h.
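The final reagent concentrations in the click reaction follow from the listed stock concentrations and volumes. The sketch below is illustrative and assumes the 9 µl master mix is added to the full 15 µl of resuspended pellet (24 µl total); the variable names are hypothetical.

```python
SAMPLE_UL = 15.0  # protein pellet resuspended in click buffer

# (volume in µl, stock concentration in mM) per sample, as listed in the protocol
master_mix = {
    "AzTB": (0.25, 10.0),
    "CuSO4": (0.5, 50.0),
    "TCEP": (0.5, 50.0),
    "TBTA": (0.25, 10.0),
    "click buffer": (7.5, None),  # diluent only, no stock concentration
}

mix_volume = sum(volume for volume, _ in master_mix.values())
total_volume = SAMPLE_UL + mix_volume

# C1 x V1 = C2 x V2 for every reagent that has a stock concentration
final_mM = {
    name: volume * stock / total_volume
    for name, (volume, stock) in master_mix.items()
    if stock is not None
}

for name, conc in final_mM.items():
    print(f"{name}: {conc:.3f} mM")
```

Under these assumptions the reaction contains roughly 0.1 mM AzTB, 1 mM CuSO4, 1 mM TCEP and 0.1 mM TBTA.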

5x SDS loading buffer was added to the samples and boiled for 10 min prior to analysis by SDS-PAGE. In-gel fluorescence was visualised using an Ettan DIGE Imager (GE Healthcare Life Sciences) or Typhoon Imager (GE Healthcare Life Sciences). Gels were further processed by either Coomassie staining or Western Blot.

2.4.7 Differential scanning fluorimetry (DSF)

Samples were prepared in MicroAMP Fast Optical 96-well reaction plates (Applied Biosystems). The experiment was performed in technical triplicate with each well containing 2 µM recombinant protein, 10x SYPRO Orange, 5 mM MgCl2 and 200 µM ligand in 20 µl total volume. Plates were sealed with MicroAMP Optical Adhesive Film and centrifuged for 5 min at 500 x g prior to analysis. Differential scanning fluorimetry was performed on a Step One Plus Real-Time PCR System (Applied Biosystems). Fluorescence was monitored from 25°C to 95°C with measurements taken at every 1°C temperature increment using the ROX reporter setting. Data were analysed using Excel and GraphPad Prism.


2.4.8 Radioactive NMPylation

100 µg lysate, 1 µg recombinant enzyme, 5 mM MgCl2, 20 µM NTP and 5 µCi 32P-labelled NTP (Perkin Elmer) in 20 mM Tris pH7.4, 100 mM NaCl were incubated for 1 h at 30°C (50 µl total reaction volume). Reactions were quenched by addition of 5x SDS loading buffer and boiling for 5 min. Proteins were separated by SDS-PAGE and analysed by autoradiography using a Typhoon Imager (GE Healthcare Life Sciences).

2.4.9 Setting up crystal trays

For commercial crystallisation screens (ICL1-14), protein drops were spotted using a Mosquito liquid handling robot (TTP Labtech). Two drop ratios were used for each commercial screen: 100 nl protein:100 nl reservoir and 100 nl protein:200 nl reservoir. Plates were sealed using Crystal Clear Sealing Tape (Hampton Research) and incubated at 20°C. The commercially available crystallisation screens are listed in Table 2.5.

Plate ID | Name | Supplier
ICL1 | Crystal Screens 1 & 2 | Hampton Research
ICL2 | Wizard 1 and 2 | Rigaku Reagents/Molecular Dimensions
ICL3 | PEG/Ion 1 and Natrix 1 | Hampton Research
ICL4 | Index | Hampton Research
ICL5 | SaltRx 1 and 2 | Hampton Research
ICL6 | MemStart and MemSys | Molecular Dimensions
ICL7 | PACT premier | Molecular Dimensions
ICL8 | JCSG+ | Molecular Dimensions
ICL9 | MemGold | Molecular Dimensions
ICL10 | Wizard 3 and PEG/Ion 2 | Molecular Dimensions/Hampton Research
ICL11 | JBS Cryo | Jena Biosciences
ICL12 | Proplex | Molecular Dimensions
ICL13 | Morpheus | Molecular Dimensions
ICL14 | PGA screen | Molecular Dimensions

Table 2.5. Table of commercially available crystallisation screens.

Manual optimisation screens were performed in either MRC Maxi 48-well sitting drop crystallisation plates (Hampton Research) or VDX 24-well hanging drop plates (Hampton Research). Trays were set up using a drop ratio of 1 µl protein:2 µl reservoir with 200 µl and 500 µl buffer reservoirs for 48- and 24-well plates respectively.


2.4.10 X-ray diffraction

X-ray diffraction was performed by Dr. David Charles (Imperial College London) using either the in-house Rigaku MicroMax-007 source or Diamond Light Source Facility (Oxford).

2.4.11 Thrombin cleavage

Recombinant His-tagged LtpG was incubated with 100 U thrombin (GE Healthcare) in 20 mM NaH2PO4 pH 7.4, 500 mM NaCl buffer at 4°C overnight. The resulting solution was subjected to Ni2+-NTA purification on an ÄKTA Prime. Presence of cleaved product was monitored using absorbance at 280 nm on the ÄKTA Prime and Coomassie stained SDS-PAGE gels.

2.5 Yeast techniques

2.5.1 Yeast cytotoxicity screen

S. cerevisiae strain BY4741 was transformed with pYES2-LtpG or pYES2-LtpG H263A. 5-6 colonies were pooled and subcultured overnight at 30°C at 200 rpm in 2% glucose containing SD (–uracil) broth. Overnight cultures were pelleted and washed 2 x with sterile water. They were normalised to an OD600 of 0.1 in either 2% glucose or galactose containing SD broth and 100 µl added into a well of a 96-well plate. Each condition was plated in triplicate. Growth was followed by measuring OD600 at 30 min intervals over 72 h using a FLUOstar Omega plate reader (BMG LABTECH).

2.5.2 Yeast-2-Hybrid screen

The yeast-2-hybrid screen was performed according to the Mate and Plate (Clontech) manufacturer’s protocol. Briefly, a single colony of AH109 pGBKT7-LtpG H263A was subcultured overnight in SD (–tryptophan) media to an OD600 of 0.8. The culture was pelleted and resuspended in SD (–tryptophan) to a cell density of 10^8 cells per ml. The bait and prey strains were mated by addition of 1 ml HeLa S3 Mate and Plate Library to 5 ml concentrated bait with 50 ml 2x YPDA broth and incubated at 30°C for 24 h at 30 rpm. The cells were pelleted, resuspended in 10 ml 0.5x YPDA and 200 µl plated onto square 150 mm QDO (SD –tryptophan, -leucine, -histidine, -adenine, + X-α-gal) agar plates. The plates were grown at 30°C for 17 days. Colonies were restreaked onto QDO before colony PCR and sequencing to determine the prey insert sequence. If colony PCR did not produce a band, the plasmid was extracted from yeast, passaged through E. coli TOP10 and sequenced.

2.5.3 Direct Yeast-2-Hybrid

S. cerevisiae strain AH109 was co-transformed with a pGBKT7-bait and pGADT7-prey plasmid as above and plated onto DDO (SD –tryptophan, -leucine) plates. After 3 days at 30°C, 5-6 colonies were pooled and resuspended in sterile water. Pooled colonies were either patched or spotted onto DDO and QDO plates and allowed to grow at 30°C for 3 days.

2.6 Mass spectrometry-based techniques

2.6.1 In vitro pulldown

StrepTactin Superflow Plus resin (Qiagen) (100 µl 50% slurry per sample) was washed 4 x 1 ml lysis buffer (50 mM Tris pH 8.0, 150 mM NaCl, 10% glycerol, 1% NP-40). Strep-tagged recombinant bait proteins (225 µg) were loaded onto equilibrated resin for 1 h at 4°C with end-to-end rotation. The resin was washed with 3 x 1 ml lysis buffer.

THP-1 cell lysate (5 ml at 5 mg/ml) was precleared using 150 µl equilibrated resin for 90 min at 4°C with end-to-end rotation. 1 ml of 5 mg/ml precleared lysate was added to StrepTactin resin preloaded with bait protein and proteins allowed to complex for 4 h at 4°C with end-to-end rotation.

Resin was washed with 3 x 1 ml lysis buffer, 2 x 1 ml 50 mM Tris pH 8.0, 150 mM NaCl, 10% glycerol and 2 x 500 µl 50 mM ammonium bicarbonate (AMBIC). Proteins were reduced and alkylated on-bead. 3 µg trypsin was added to the resin and incubated at 37°C overnight at 1200 rpm. Peptide mixtures were StageTipped and analysed by mass spectrometry.

2.6.2 Competition in vitro pulldown

THP-1 lysates were prepared by addition of ice cold cell lysis buffer (50 mM Tris pH 8.0, 150 mM NaCl, 10% glycerol, 1% NP-40) with protease inhibitors and sonication for 5 min (30% amplitude, 0.2 s pulses). The soluble lysate was precleared by addition of 250 µl pre-equilibrated StrepTactin Superflow Plus resin and incubation for 90 min at 4°C with end-to-end rotation. The precleared lysate was split into 500 µl 2 mg/ml aliquots and His-tagged only recombinant LtpG (0, 0.1, 0.2, 0.5 and 1 mg) added. Proteins were allowed to complex for 90 min at 4°C with end-to-end rotation. His-tagged only recombinant LtpG was removed by addition of Ni2+-IDA (150 µl per sample) and incubation for 30 min at 4°C with end-to-end rotation. The depleted lysate was transferred to StrepTactin resin prebound with NHis CStrep recombinant LtpG (100 µg NHis CStrep LtpG was prebound onto 10 µl pre-equilibrated StrepTactin resin by incubation for 90 min at 4°C with end-to-end rotation) and incubated for 2 h at 4°C with end-to-end rotation. The resin was washed with 3 x 500 µl cell lysis buffer, 2 x 1 ml wash buffer (50 mM Tris pH 8.0, 150 mM NaCl, 10% glycerol) and 2 x 50 mM AMBIC. Proteins were reduced and alkylated on-bead. 0.8 µg trypsin was added to the resin and incubated at 37°C overnight at 1200 rpm. Peptide mixtures were StageTipped and dimethyl labelled. A heavy dimethyl labelled non-competed sample was spiked into every light dimethyl labelled sample at a 1:1 ratio. Samples were analysed by mass spectrometry.

2.6.3 On-bead reduction and alkylation

The resin was resuspended in 50 µl 50 mM AMBIC and reduced by addition of 5 µl 100 mM DTT. After incubation at 56°C for 30 min, the resin was washed 2 x 200 µl 50 mM AMBIC, resuspended in 95 µl 50 mM AMBIC and alkylated by addition of 5 µl 100 mM iodoacetamide. After incubation at RT for 30 min in the dark, the resin was washed 2 x 200 µl 50 mM AMBIC and resuspended in 100 µl 50 mM AMBIC for tryptic digestion.

2.6.4 Purification of effectors from infected cells for MS (Bio/BirA)

MS samples were prepared at a scale of 1 x 10 cm dish of infected cells. Typically, each condition was performed in technical triplicate (1 x 10 cm per replicate). Certain experiments were performed as technical duplicates of biological duplicates. Buffer compositions are indicated in Table 2.7. All steps were performed at the temperature indicated for the lysis buffer used (Table 2.7) unless otherwise stated. Infected cells were washed with 2 x 5 ml Dulbecco’s PBS and, if required, crosslinked with 5 ml crosslinking solution (Table 2.6) for 30 min at RT. Crosslinking was quenched by addition of 500 µl 1.25 M glycine/500 mM cysteine in PBS to each dish for 10 min at RT. The cells were washed with 3 x 5 ml PBS and lysed by addition of 1 ml lysis buffer. Cells were lysed for 30 min before scraping into 1.5 ml tubes.


The insoluble fraction was removed by centrifugation at 20,000 x g for 20 min. The soluble fraction was added to pre-equilibrated 30 µl settled Ni2+-NTA resin (Qiagen) (equilibrated by washing 2 x 1 ml lysis buffer) and incubated for 1 h with end-to-end mixing. The resin was washed 5 x 1 ml His wash buffer with 1,000 x g 1 min spins at 4°C to pellet the resin between each wash. Bound proteins were eluted 3 x 200 µl His elution buffer for 10 min at RT with vortex shaking. The elution fractions were combined and incubated with pre-equilibrated 25 µl settled Neutravidin resin (Thermo) (equilibrated by washing 2 x 1 ml His wash buffer) at 4°C for 2 h with end-to-end mixing. The resin was washed 4 x 1 ml lysis buffer and 4 x 1 ml 50 mM ammonium bicarbonate (AMBIC). 50 µl AMBIC was left on the resin and 1 µg sequencing grade modified trypsin (Promega) was added. Proteins were tryptically digested overnight at 37°C at 1200 rpm. Peptides were StageTipped and dimethyl labelled and stored dry at -80°C prior to analysis by mass spectrometry. pBio samples were light dimethyl labelled and pBio K/A samples heavy dimethyl labelled.


Crosslinking solution | Composition
1/3% formaldehyde | 1/3% formaldehyde (from 16% solution) (Agar Scientific), Dulbecco’s PBS (Sigma)
DSP | 1 mM DSP (from 40 mM stock in DMSO) (Pierce), Dulbecco’s PBS
DTME | 0.5 mM DTME (from 20 mM stock in DMSO) (Pierce), Dulbecco’s PBS
SMCC | 1 mM SMCC (from 20 mM stock in DMSO) (Pierce), Dulbecco’s PBS
DSP+DTME | 1 mM DSP (from 40 mM stock in DMSO), 0.5 mM DTME (from 20 mM stock in DMSO), Dulbecco’s PBS

Table 2.6. Table of crosslinking solutions for Bio/BirA pulldowns.

Buffer | Temperature | Composition
Lysis buffer, GnCl/Triton X-100 | RT | 6 M guanidinium chloride, 1% Triton X-100, 50 mM Na2HPO4, 150 mM NaCl, pH 7.3
Lysis buffer, Triton X-100 | 4°C | 1% Triton X-100, 50 mM Na2HPO4, 150 mM NaCl, pH 7.3
Lysis buffer, CHAPS | 4°C | 1% (w/v) 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate, 50 mM Na2HPO4, 150 mM NaCl, pH 7.3
Lysis buffer, SDS | RT | 0.5% (w/v) sodium dodecyl sulphate, Dulbecco’s PBS
His wash buffer | - | 1% Triton X-100, 50 mM Na2HPO4, 150 mM NaCl, 20 mM imidazole, pH 7.3
Elution buffer | - | 1% Triton X-100, 50 mM Na2HPO4, 150 mM NaCl, 250 mM imidazole, pH 7.3
AMBIC | - | 50 mM ammonium bicarbonate

Table 2.7. Table of buffers for Bio/BirA pulldowns.


2.6.5 StageTipping and dimethyl labelling

Following overnight tryptic digest, the supernatant containing the tryptic peptides was collected by addition of 1 x 80 µl AMBIC followed by 1 x 80 µl 0.1% formic acid with vortex mixing for 10 min and centrifugation at 3,000 x g for 2 min.

Peptide mixtures were desalted by Stage-Tip method [Broncel et al., 2015] and dimethyl labelled on Stage-Tip [Boersema et al., 2009]. Stage-Tips were made using 3 layers of SDB-XC membranes (3M, Empore) in a 200 µl pipette tip. They were conditioned by passing 150 µl MeOH through the membranes by 1,000 x g centrifugation for 1 min and subsequently washed with 150 µl water. The peptide mixtures were loaded onto the Stage-Tip by centrifugation at 2,000 x g for typically 2 min and desalted with 150 µl water. If dimethyl labelling was required, 5 x 20 µl dimethyl labelling solution was passed over each sample at 200 x g for 5 min (25 min total labelling time). Composition of dimethyl labelling buffers can be found in Table 2.8. After labelling, the Stage-Tips were washed with 150 µl water. Bound peptides were eluted into 1.5 ml tubes with 60 µl 79% acetonitrile. Samples were multiplexed by mixing a light dimethyl labelled sample and a heavy dimethyl labelled sample at a 1:1 ratio. Peptides were dried by vacuum and stored dry at -80°C.

Dimethyl labelling solution | Composition
PB 7.5 | 2 ml 50 mM Na2HPO4, 7 ml 50 mM NaH2PO4
Light labelling solution | 90 µl PB 7.5, 5 µl 4% CH2O in water, 5 µl 0.6 M NaBH3CN in water
Heavy labelling solution | 90 µl PB 7.5, 5 µl 4% CD2O in water, 5 µl 0.6 M NaBH3CN in water

Table 2.8. Table of dimethyl labelling solutions.

2.6.6 MS sample preparation

Dried peptides were resuspended in 20 µl 0.5% trifluoroacetic acid, 2% acetonitrile and 97.5% water by vortex mixing (10 min) and sonication (ultrasonic water bath for 15 min). Insoluble debris was pelleted by centrifugation at 16,000 x g for 10 min and the soluble fraction (top 15 µl) transferred into sample vials. 2 µl of the sample was injected into the mass spectrometer per run.


2.6.7 Mass spectrometry

Mass spectrometric analysis was performed using an Acclaim PepMap RSLC column (50 cm × 75 μm inner diameter, Thermo Fisher Scientific) with a 2 h acetonitrile gradient in 0.1% aqueous formic acid at a flow rate of 250 nl/min. An EASY-nLC 1000 system was connected to a Q Exactive mass spectrometer via an easy-spray source (all Thermo Fisher Scientific). The mass spectrometer was run in data-dependent mode with survey scans acquired at a resolution of 75,000 at m/z 200 (transient time 256 ms). Up to ten of the most abundant isotope patterns with charge ≥+2 from the survey scan were selected with an isolation window of 3.0 m/z and fragmented by higher-energy collisional dissociation with normalised collision energies of 25. The maximum ion injection times for the survey scan and the MS/MS scans (acquired with a resolution of 17,500 at m/z 200) were 20 and 120 ms respectively. The ion target value for MS and MS/MS was set to 10^6 and 10^5 respectively and the intensity threshold was set to 8.3 × 10^2.

2.6.8 MS data processing – MaxQuant

The raw MS data were processed using MaxQuant (version 1.5.0.25) and peptides were identified by matching MS/MS spectra with reference human (Uniprot, downloaded on 19/01/2015) and L. pneumophila strain 130b (ORF extraction from draft genome, [Schroeder et al., 2010]) proteomes using the Andromeda search engine [Cox et al., 2008; Cox et al., 2011]. For in vitro pulldowns, an E. coli strain K12 reference proteome was also used. N-terminal acetylation and methionine oxidation were selected as variable modifications. No fixed modifications were set, except that carbamidomethylation of cysteine residues was set as a fixed modification when reduction/alkylation was used (only for in vitro pulldowns). Reference proteomes were digested in silico using the built-in trypsin/P setting in which cleavages were allowed after arginine and lysine residues. Light (+28 Da) and heavy (+32 Da) dimethyl labelled lysines and N-termini were selected as required and used for quantification by a built-in algorithm in MaxQuant. Up to two missed cleavages were allowed. The false discovery rate was set to 1% for peptides, proteins and sites. All other parameters were as pre-set for the software.
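The nominal +28 Da and +32 Da shifts can be related to the labelling chemistry with a short calculation. The sketch below is illustrative and is not part of the MaxQuant workflow: it assumes each labelled amine gains two CH3 groups (light, CH2O) or two CHD2 groups (heavy, CD2O) with NaBH3CN as reductant, uses standard monoisotopic atomic masses, and applies the result to a hypothetical peptide sequence.

```python
# Monoisotopic atomic masses (Da)
C, H, D = 12.0, 1.00782503, 2.01410178

# Each labelled amine (peptide N-terminus or lysine side chain) gains two
# methyl groups, each replacing one hydrogen.
LIGHT_SHIFT = 2 * (C + 3 * H - H)      # two CH3 groups
HEAVY_SHIFT = 2 * (C + H + 2 * D - H)  # two CHD2 groups


def pair_spacing_mz(sequence: str, charge: int) -> float:
    """m/z spacing between heavy and light forms of a fully labelled peptide."""
    sites = 1 + sequence.count("K")  # free N-terminus plus every lysine
    return sites * (HEAVY_SHIFT - LIGHT_SHIFT) / charge


# Hypothetical tryptic peptide with one C-terminal lysine, doubly charged
spacing = pair_spacing_mz("AEFVEVTK", 2)

print(f"light +{LIGHT_SHIFT:.4f} Da, heavy +{HEAVY_SHIFT:.4f} Da, spacing {spacing:.4f} m/z")
```

For a peptide with two labelled sites at charge 2+, the light and heavy forms are therefore expected roughly 4 m/z apart. Peptides with an N-terminal proline, which accepts only one methyl group, are not handled by this sketch.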


2.6.9 MS data processing – Perseus (in vitro pulldown)

The MaxQuant data were further processed using Perseus (version 1.5.0.9). The protein groups file was uploaded into the software. Reverse hits, hits identified only by site and contaminants were removed. Based on MS/MS spectral counts, any proteins identified in the control samples (lysate only or bait protein only samples) were filtered out. The remaining proteins were ranked according to spectral counts and analysed using the STRING database (http://string-db.org/) [Szklarczyk et al., 2015].
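The control-based filtering and ranking described above can be sketched as follows. This is an illustration of the logic only and not the Perseus implementation; the function name, protein names and spectral counts are hypothetical placeholders.

```python
def filter_and_rank(bait_counts, control_counts):
    """Drop proteins seen in any control, then rank the rest by spectral count."""
    seen_in_controls = {
        protein
        for counts in control_counts.values()
        for protein, n in counts.items()
        if n > 0
    }
    kept = {
        protein: n
        for protein, n in bait_counts.items()
        if n > 0 and protein not in seen_in_controls
    }
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


# Hypothetical MS/MS spectral counts
bait = {"PROT_A": 12, "PROT_B": 30, "PROT_C": 4, "PROT_D": 7}
controls = {
    "lysate_only": {"PROT_B": 25, "PROT_C": 0},
    "bait_only": {"PROT_D": 3},
}

ranked = filter_and_rank(bait, controls)
print(ranked)
```

In this toy example PROT_B and PROT_D are discarded because they appear in a control, leaving PROT_A ranked above PROT_C.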

2.6.10 MS data processing – Perseus (competition pulldown)

The MaxQuant data were further processed using Perseus (version 1.5.0.9) [Tyanova et al., 2016]. The protein groups file was uploaded into the software. Reverse and identified by site hits were removed. All assigned E. coli proteins were manually removed. L/H ratios were normalised by division by the non-competed sample. Proteins were filtered for 3 valid L/H ratio values in at least one replicate (A, B or C). The L/H ratios were log2 transformed and dose response curves plotted using the in-built profile plot function.
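The normalisation and log2 transformation of the competition series can be illustrated with a few lines of code. The sketch assumes that the first entry of each series is the non-competed (0 mg) sample; the function name and the example ratios are hypothetical.

```python
import math

COMPETITOR_MG = [0, 0.1, 0.2, 0.5, 1]  # His-tagged only LtpG added per aliquot


def dose_response(lh_ratios):
    """Normalise L/H ratios to the non-competed (0 mg) sample and log2 transform."""
    reference = lh_ratios[0]
    return [math.log2(ratio / reference) for ratio in lh_ratios]


# Hypothetical L/H ratios for one protein across the competitor series
example_ratios = [1.0, 0.8, 0.5, 0.25, 0.125]
profile = dose_response(example_ratios)

for amount, value in zip(COMPETITOR_MG, profile):
    print(f"{amount} mg competitor: log2 normalised L/H = {value:.2f}")
```

A protein that binds LtpG specifically would be expected to show a progressively more negative profile as the amount of competitor increases, whereas a nonspecific binder would stay close to zero.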

2.6.11 MS data processing – Perseus (Bio/BirA experiments)

The MaxQuant data were further processed using Perseus (version 1.5.0.9). The protein groups file was uploaded into the software. Reverse and identified by site hits were removed. Proteins identified by at least 1 unique and 1 razor peptide were included for further analysis. Light (pBio) and heavy (pBio K/A) intensities were logarithmised (log2). Replicates were grouped together and at least two valid log2 intensities across replicates (either three technical replicates or technical duplicate of biological duplicates) were required for at least one group as a threshold of positive protein identification. No unique peptide threshold was applied per sample. Missing log2 intensities were imputed using a downshifted normal distribution (1.8 downshift, 0.3 width) for each sample individually as an estimation of intensity detection limits for each sample. Enrichment factors (the difference in average log2 intensity between pBio and pBio K/A samples) were calculated for each protein using imputed values if required. Proteins were ranked according to enrichment factors, generating Top10 ranked enriched proteins. Heat maps were generated using log2 light and heavy intensities with imputed values removed. Proteins were classified into five categories: Bio-specific (proteins only identified in the pBio samples), Bio-enriched (proteins with an enrichment factor ≥2), nonspecific (-2 ≤ enrichment factor ≤ 2), K/A-enriched (proteins with an enrichment factor ≤-2) and K/A-specific (proteins only identified in pBio K/A samples). Bio-specific and Bio-enriched proteins were grouped together as potential interacting partners of the bait protein whilst nonspecific, K/A-enriched and K/A-specific proteins were treated as non-interacting proteins.
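The filtering, imputation and classification steps can be summarised in a short sketch. It is not the Perseus implementation: it assumes that the downshift and width are expressed in units of the standard deviation of the valid values in each sample, it assigns an enrichment factor of exactly 2 (or -2) to the enriched category, and all function names and intensities are hypothetical.

```python
import random
import statistics


def impute_column(column, downshift=1.8, width=0.3, rng=random):
    """Fill missing (None) log2 intensities of one sample from a downshifted normal."""
    valid = [v for v in column if v is not None]
    mean, sd = statistics.mean(valid), statistics.stdev(valid)
    return [v if v is not None else rng.gauss(mean - downshift * sd, width * sd)
            for v in column]


def impute_matrix(matrix, rng):
    """Impute each sample (column) individually; rows are proteins."""
    columns = [impute_column(list(col), rng=rng) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


def passes_valid_filter(light_raw, heavy_raw, minimum=2):
    """Require at least `minimum` valid values in at least one group."""
    def count(values):
        return sum(v is not None for v in values)
    return count(light_raw) >= minimum or count(heavy_raw) >= minimum


def categorise(light_raw, heavy_raw, light_imp, heavy_imp, threshold=2.0):
    """Assign one of the five categories to a single protein."""
    if all(v is None for v in heavy_raw):
        return "Bio-specific"
    if all(v is None for v in light_raw):
        return "K/A-specific"
    enrichment = statistics.mean(light_imp) - statistics.mean(heavy_imp)
    if enrichment >= threshold:   # boundary value counted as enriched
        return "Bio-enriched"
    if enrichment <= -threshold:
        return "K/A-enriched"
    return "nonspecific"


# Hypothetical log2 intensities; rows are proteins, columns are replicates 1-3
proteins = ["P1", "P2", "P3", "P4"]
light = [[30.0, 30.2, 29.8], [25.0, 25.1, 24.9], [29.0, 29.1, 28.9], [None, None, None]]
heavy = [[None, None, None], [25.2, 24.8, 25.0], [24.0, 24.1, None], [26.0, 26.3, 26.1]]

rng = random.Random(0)  # fixed seed so the illustration is reproducible
light_imp = impute_matrix(light, rng)
heavy_imp = impute_matrix(heavy, rng)

categories = {
    name: categorise(light[i], heavy[i], light_imp[i], heavy_imp[i])
    for i, name in enumerate(proteins)
    if passes_valid_filter(light[i], heavy[i])
}
print(categories)
```

The specific categories are decided from the raw values before imputation, so that a protein never detected in one condition is not reclassified by its imputed intensities.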


Chapter 3: Results: The novel Legionella pneumophila Dot/Icm Fic domain effector LtpG

3.1. Introduction

Bioinformatic analysis of the L. pneumophila strain 130b genome revealed a new Fic domain containing protein, LtpG [Schroeder et al., 2010]. LtpG was found in 30 out of 87 L. pneumophila isolates tested (34%). BLAST searches also revealed that LtpG homologues exist in other Legionella species such as Legionella brunensis and Legionella hackeliae. However, LtpG does not appear to be encoded in any sequenced L. longbeachae genomes. Although homology modelling using Phyre2 suggests with high confidence that LtpG is a Fic domain containing protein, it shares little similarity with other known proteins [Kelley et al., 2015]. Its closest relative with a determined structure is an uncharacterised Fic domain containing protein from Bacteroides thetaiotaomicron (BT_2513) (Figure 3.1).

Figure 3.1. Homology structural model of LtpG by Phyre2. The helical Fic core is highlighted in red with the canonical Fic motif highlighted in blue. N-terminal and C-terminal extensions are in green and yellow respectively. β-hairpin flaps are in orange. (A) BT_2513 from Bacteroides thetaiotaomicron (PDB ID: 3CUC). (B) Homology structural model of LtpG by Phyre2. [Kelley et al., 2015]


However, only 120 amino acid residues of LtpG’s 526 residue primary sequence aligned with 25% identity (Figure 3.2). Interestingly, LtpG does not encode an acidic residue at position 5 in the Fic motif; this is instead replaced by a glutamine residue. As the acidic residue typically coordinates an Mg2+ ion in the Fic active site, this may have implications for the ability of LtpG to bind diphosphate containing substrates. LtpG also did not encode an auto-inhibitory SxxxE motif in its primary sequence, suggesting that any regulatory mechanisms are encoded on other proteins.

Overexpression of LtpG in L. pneumophila cultures resulted in a strong growth defect, suggesting that endogenous expression of LtpG is tightly regulated [Schroeder et al., 2010]. The critical Fic motif histidine is residue 263. A translocation assay involving a TEM-fusion with LtpG revealed that LtpG is translocated into host cells in a Dot/Icm dependent manner [Schroeder et al., 2010]. As a Dot/Icm effector protein, LtpG may play a role in L. pneumophila’s virulence strategy.


Figure 3.2. Primary sequence alignment of LtpG with BT_2513 by Clustal Omega. Identical amino acids are indicated with an asterisk (*). Amino acids with strongly and weakly similar properties are indicated by a colon (:) and full stop (.) respectively. The Fic motif is framed in the box. [Sievers et al., 2011]


3.2. Contribution of LtpG to virulence and intracellular growth

As the Dot/Icm T4SS is crucial to Legionella’s intracellular lifestyle, assaying the ability of Legionella to replicate intracellularly after removal of individual effectors through chromosomal deletion is a simple method to probe effector function. Replacement of ltpG with a kanamycin resistance cassette within the genome was performed by Dr. Gunnar Schroeder (Imperial College London, UK) using natural transformation of L. pneumophila. To determine whether LtpG played a role in ensuring efficient intracellular replication, growth curves were performed using Legionella wild-type (WT), ΔdotA (T4SS-null mutant) and ΔltpG strains in a range of hosts: A549 human lung epithelial cells, THP-1 monocyte-like cells, Dictyostelium discoideum, Acanthamoeba castellanii, Galleria mellonella and mice (Figure 3.3 and data not shown). Intracellular replication was measured by either CFU (colony forming unit) counting or fluorescence measurements on a plate reader using GFP or mCherry expressing L. pneumophila strains. Growth curve experiments were performed by Dr. Gunnar Schroeder and Corinna Mattheis (Imperial College London, UK) for cell culture models, Dr. Clare Harding (Imperial College London, UK) for the G. mellonella model and in collaboration with Prof. Liz Hartland (University of Melbourne, Australia) for the mouse model.

Figure 3.3. Deletion of ltpG does not affect L. pneumophila intracellular replication in amoebae. (A) Dictyostelium discoideum and (B) Acanthamoeba castellanii were infected with mCherry expressing L. pneumophila strains WT, ΔdotA or ΔltpG and replication of L. pneumophila followed by fluorescence over 72 and 94 h for D. discoideum and A. castellanii respectively.


In all infection models tested, the ΔltpG strain behaved as WT, suggesting that LtpG either does not play a role in intracellular replication or that there is a functionally redundant effector. This is a common outcome for Legionella effectors, as L. pneumophila encodes over 300 effectors. In fact, very few effectors (SdhA and IroT/DimB) have been shown to be crucial for L. pneumophila’s ability to replicate intracellularly [Isaac et al., 2015; Laguna et al., 2006].

3.3. Localisation of LtpG in infection

To determine the localisation of LtpG during infection, A549 cells were infected with Legionella strains expressing 4HA-tagged LtpG. Cells were washed and fixed at 5 h and 24 h post infection and effector stained using αHA antibodies (Figure 3.4). A ΔdotA strain was used as control for Dot/Icm dependent translocation.

After 5 h, LtpG could not be detected. However, at 24 h post-infection LtpG localised to the nucleus and fibrous structures as well as diffusely in the cytoplasm; translocation was Dot/Icm dependent. This, along with the previously published translocation assay, confirms that LtpG is a Dot/Icm effector [Schroeder et al., 2010]. HA positive signal can also be seen inside the bacteria, suggesting that not all LtpG is translocated into the host cell. This is likely a consequence of overexpression using an IPTG inducible promoter. Overexpression of LtpG during infection did not result in any morphological changes to the host cell cytoskeleton.


Figure 3.4. LtpG localises to the nucleus 24 h post-infection. A549 cells were infected with L. pneumophila WT, ΔltpG or ΔdotA expressing HA-tagged LtpG. Cells were fixed at 5 h and 24 h post-infection and analysed by immunofluorescence. Cells were stained with anti-Legionella LPS antibodies (green), anti-HA antibodies (red), phalloidin (cyan) and DAPI (blue). Data is representative of 3 biological replicates. Scale bars represent 10 µm.


3.4. Localisation of ectopically-expressed LtpG

To determine whether ectopic expression of LtpG shared a similar localisation to translocated LtpG during infection, HeLa cells were transfected with Myc-tagged LtpG for 24 h before fixation and immunofluorescence analysis (Figure 3.5).

Figure 3.5. Ectopically expressed LtpG localises to the nucleus. HeLa cells were transfected with pRK5-Myc-LtpG for 24 h. Cells were fixed and stained with anti-Myc antibodies (green), phalloidin (red) and DAPI (blue). Data is representative of 3 biological replicates. Scale bars represent 10 µm.

Localisation of ectopically expressed Myc-LtpG resembled that of 4HA-LtpG during infection, including accumulation in the nucleus as well as in fibrous structures in the cytoplasm. As with infection, no obvious effects on cell morphology could be seen.

3.5. LtpG causes Fic-dependent cytotoxicity in yeast

Yeast has proved a useful model for discovering effector functions, as it shares many pathways with humans [Curak et al., 2009; Popa et al., 2016]. The ability of SidD to counteract the cytotoxic effect of SidM was initially demonstrated using a yeast model [Tan et al., 2011b]. As yeast replicate faster than mammalian cells, they provide a faster readout for cytotoxicity. To this end, BY4741 yeast was transformed with plasmids encoding LtpG or LtpG H263A (inactive mutant) under the control of a glucose-repressed, galactose-inducible promoter. Yeast was grown overnight in glucose-containing media and then subcultured in either galactose- or glucose-containing media for 72 hours. To follow growth, OD600 was measured at 30 min intervals using a plate reader (Figure 3.6).


Figure 3.6. Ectopic expression of LtpG in yeast causes Fic-dependent cytotoxicity. BY4741 yeast was transformed with pYES2-LtpG or –LtpG H263A. Two clones for each transformation were taken and growth in repressing glucose or inducing galactose containing media was monitored over 72 h by OD600 in technical triplicate. Data is representative of 3 biological replicates.

This revealed that LtpG inhibits growth in a Fic domain dependent manner, with the LtpG H263A mutant providing an intermediate phenotype. Whilst yeast expressing LtpG H263A slowly recovered to resemble the empty vector control, no growth was observed over the entire 72 h time course when LtpG WT was expressed. The intermediate phenotype exhibited by LtpG H263A suggests either residual Fic activity or alternative Fic-independent functions of LtpG. The data indicates that LtpG targets a protein that is essential in yeast but not in mammalian cells. Furthermore, the Fic domain of LtpG is functional and its activity is at least partially dependent on the conserved histidine at the beginning of the Fic motif.
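The growth curves described above (OD600 read every 30 min for 72 h) could be summarised numerically, for example by a maximum specific growth rate. The sketch below is only an illustration of one common approach (sliding-window linear regression on ln(OD600)); the thesis does not state that the data were analysed this way. The function name, the window size and the synthetic logistic curve are all assumptions introduced for the example, not measured data.

```python
import math


def max_growth_rate(times_h, od600, window=6):
    """Largest slope of ln(OD600) versus time (per hour) over any run of
    `window` consecutive readings. Assumes blank-corrected, positive ODs."""
    logs = [math.log(v) for v in od600]
    best = float("-inf")
    for start in range(len(times_h) - window + 1):
        t = times_h[start:start + window]
        y = logs[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope for this window
        num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
        den = sum((ti - t_mean) ** 2 for ti in t)
        best = max(best, num / den)
    return best


# 72 h sampled every 0.5 h, including t = 0 (145 readings)
times = [0.5 * i for i in range(145)]

# Hypothetical logistic growth curve (illustrative parameters only)
carrying_capacity, od_start, rate = 1.5, 0.05, 0.35
a = carrying_capacity / od_start - 1.0
od = [carrying_capacity / (1.0 + a * math.exp(-rate * t)) for t in times]

mu_max = max_growth_rate(times, od)
doubling_time_h = math.log(2) / mu_max
print(f"mu_max = {mu_max:.3f} per h, doubling time = {doubling_time_h:.2f} h")
```

A strain that does not grow, as seen for LtpG WT under inducing conditions, would give a maximum rate close to zero, so in practice a threshold on the final OD600 may be a more robust readout than the doubling time.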

3.6. Purification of recombinant LtpG

In order to characterise the biochemical function of LtpG, we produced recombinant LtpG to test in in vitro assays. The ltpG gene was cloned into the pET28a plasmid, enabling inducible production of recombinant N-terminally hexahistidine tagged (His-tagged) LtpG in E. coli. Three induction temperatures (37°C, 30°C and 18°C) and three IPTG concentrations (0.1, 0.5 and 1 mM) were tested.

Cells were subcultured to an OD of 0.6 before induction. Cells grown at 37°C and 30°C were harvested 4 h post-induction, whilst cells grown at 18°C were cultured overnight (~16 h) before harvest. Whole cell extracts were analysed by SDS-PAGE and Coomassie staining (Figure 3.7). A novel band just below the 58 kDa marker band, which was absent in the uninduced samples and approximately corresponded to the expected size of His-tagged LtpG (~63 kDa), was observed. Induction with 1 mM IPTG at 18°C overnight appeared to give the highest yield of His-LtpG and was hence chosen as the induction condition.

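[Figure 3.7 gel image. Lanes: marker (M), uninduced control, and inductions at 37°C, 30°C and 18°C with 1, 0.5 and 0.1 mM IPTG. Marker bands: 175, 80, 58, 46, 30 and 23 kDa. The LtpG band is indicated just below the 58 kDa marker.]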

Figure 3.7. Recombinant His-LtpG expression tests. Expression of N-terminally His-tagged LtpG in E. coli BL21Star was tested at 18°C, 30°C and 37°C using 0.1, 0.5 and 1 mM IPTG. Whole cell lysates were separated by SDS-PAGE and stained with Coomassie. LtpG is indicated by the black arrowhead.

To determine whether this His-LtpG was soluble and could be purified using metal affinity chromatography, clarified lysates of cells expressing His-LtpG were subjected to batch purification using Ni2+-IDA resin (Figure 3.8). Three different wash buffers of increasing imidazole concentration (5 mM, 20 mM and 60 mM) were used to maximise His-LtpG purity in the elution fractions. Whilst His-LtpG was retained on the resin at the two lower imidazole concentrations, a band corresponding to His-LtpG can be seen in the third wash (60 mM imidazole) fraction, suggesting that His-LtpG already begins to elute at 60 mM imidazole.


[Figure 3.8 gel images. Lanes: marker (M), WCL, insoluble, soluble, flowthrough, Wash 1 to Wash 3 and Elution 1 to Elution 9. Marker bands: 175, 80, 58, 46 and 30 kDa. The LtpG band is indicated near the 58 kDa marker.]

Figure 3.8. Batch purification of recombinant His-LtpG. Recombinant N-terminally His-tagged LtpG was expressed in E. coli BL21Star (DE3) at 18°C overnight. Cells were harvested and mechanically lysed. Clarified lysates were subjected to Ni2+-IDA batch purification. The presence of LtpG in fractions was determined by Coomassie staining of SDS-PAGE gels. Abbreviations: whole cell lysate (WCL), 5 mM imidazole wash (Wash 1), 20 mM imidazole wash (Wash 2), 60 mM imidazole wash (Wash 3).

Although some protein was lost in the third wash, the elution fractions (1 ml each) yielded ~4.5 mg of purified LtpG from an initial 300 ml culture (~15 mg/L). This was a sufficient yield of recombinant protein for further assays and likely gave LtpG of higher purity than if the 60 mM imidazole wash had been omitted.

3.7. Recombinant LtpG exhibits auto-AMPylation activity in vitro

As AMPylation is the most studied enzymatic activity of Fic domains, we determined whether LtpG possessed AMPylation capabilities. A clickable analogue of ATP (Yn-6-TP) was used as a probe for AMPylation activity (Figure 3.9) [Grammel et al., 2011].

Figure 3.9. Chemical structures of click reagents for in vitro AMPylation. The clickable ATP probe Yn-6-TP resembles ATP with only a small alkyne functionality attached to the amino group on the 6-position of the adenine ring. The azide-containing capture reagent (AzTB) is functionalised with both a TAMRA fluorophore and biotin for visualisation and enrichment.


Yn-6-TP contains an alkyne motif on the nitrogen attached to the 6-position of the adenine ring. Using this small tag, additional functionalities can be attached to the probe using an azide-containing capture reagent and a copper-catalysed cycloaddition reaction (click chemistry) [Rostovtsev et al., 2002]. This small alkyne tag enables Yn-6-TP to mimic ATP with minimal steric hindrance whilst allowing functional groups such as fluorophores and affinity handles to be added later, when steric hindrance is no longer an issue for recognition. Using a trifunctional capture reagent (AzTB) consisting of an azide, TAMRA fluorophore and biotin, proteins modified with Yn-6-TP can be visualised and enriched [Heal et al., 2011; Heal et al., 2012]. The ability to enrich post-translationally modified proteins gives clickable analogues a substantial advantage for studying PTMs over more traditional techniques such as radioactive substrates.

Figure 3.10. In vitro AMPylation schematic. Mammalian cell lysates were incubated with the ATP analogue Yn-6-TP and recombinant LtpG. The enzymatic reaction was quenched by protein precipitation and modified proteins reacted with capture reagent using click chemistry. Modified proteins were visualised by in-gel fluorescence following separation by SDS-PAGE.

Recombinant LtpG was incubated with Yn-6-TP and reacted with AzTB to visualise any potential auto-AMPylation using in-gel fluorescence (Figure 3.10). A fluorescent signal was observed at the appropriate size for recombinant LtpG but not LtpG H263A, suggesting that LtpG is able to use ATP as a substrate for auto-AMPylation and that LtpG has enzymatic activity in vitro (Figure 3.11).


[Figure 3.11 gel images. Panels: in-gel fluorescence and Coomassie. Lanes in each panel: LtpG WT and LtpG H263A. Marker bands: 250, 150, 100, 75, 50, 37, 25 and 20 kDa. The LtpG band is indicated.]

Figure 3.11. LtpG auto-AMPylates in a Fic-dependent manner. Recombinant LtpG and LtpG H263A were incubated with Yn-6-TP and subsequently clicked with capture reagent. Proteins were separated by SDS-PAGE and analysed by in-gel fluorescence.

To determine whether LtpG could modify any host cell targets with AMP, LtpG was incubated with Yn-6-TP and THP-1 cell lysate. After a click reaction with AzTB, the only additional fluorescent band compared with the negative controls corresponded to the size of LtpG (Figure 3.12). Hence no obvious additional AMPylated targets were detected by this method. In contrast, strong fluorescent signals were observed around the 20 kDa region when cell lysate was incubated with recombinant VopS, corresponding to the expected AMPylation of Rho GTPases [Yarbrough et al., 2009]. Multiple fluorescent bands were observed when no recombinant protein was added to the lysate or when the catalytically inactive (H263A) mutant was used, suggesting that proteins within the cell lysate have AMPylation activity or become modified by the probe by other means. These fluorescent bands are specific to the Yn-6-TP probe, as replacing the probe with ATP completely abolishes all fluorescent signal.


[Figure 3.12 gel images. Panels: in-gel fluorescence and anti-His Western blot (WB). Lanes: no enzyme, LtpG, LtpG H263A, VopS and VopS H348A, each incubated with ATP or Yn-6-TP. Marker bands: 250, 150, 100, 75, 50, 37, 25 and 20 kDa. Labelled bands: LtpG, VopS and small GTPases (around 20 kDa).]

Figure 3.12. No additional AMPylation targets can be seen when LtpG is incubated with THP-1 cell lysate whilst VopS AMPylates Rho GTPases. Recombinant His-LtpG, -LtpG H263A or GST-VopS, -VopS H348A were incubated with either Yn-6-TP or ATP and THP-1 cell lysate. Modified proteins were clicked with capture reagent and visualised by in-gel fluorescence following SDS-PAGE. Recombinant His-LtpG loading was monitored by anti-His Western Blot.

Although LtpG showed auto-AMPylation activity, the lack of any additional AMPylated substrates suggests that ATP may not be its preferred substrate. This hypothesis is strengthened by the observation that more diphosphate-containing metabolites are being discovered as Fic domain substrates. However, it is also possible that the physiological AMPylation target of LtpG is either absent from the cell lysate used or not abundant. Alternatively, as Legionella has evolved such complex virulence mechanisms, LtpG may AMPylate its substrate protein only in the presence of another effector.


3.8. LtpG preferentially binds diphosphate-containing metabolites

To determine whether LtpG could preferentially bind other phosphate-based metabolite substrates, a differential scanning fluorimetry (DSF) assay was performed. The DSF assay involves slowly heating a protein of interest in the presence of SYPRO Orange, a dye which fluoresces upon binding to hydrophobic regions [Niesen et al., 2007]. Thermal denaturation of proteins exposes more hydrophobic regions to the solvent and hence protein unfolding can be followed using fluorescence (Figure 3.13). Protein stability can be measured by the midpoint of this thermal denaturation transition, known as the melting temperature (Tm), at which 50% of the protein is unfolded.

Figure 3.13. Schematic of the differential scanning fluorimetry (DSF) assay. Thermal denaturation of a protein is monitored by fluorescence using SYPRO Orange, a dye which only fluoresces upon binding to hydrophobic regions. Binding of ligands stabilises the protein and results in a positive shift in protein melting temperature (ΔTm).

When incubated with small molecule binders such as substrates, the protein should be stabilised and hence more resistant to thermal denaturation, giving a positive shift in the melting temperature (ΔTm) relative to a no-ligand control. Greater positive thermal shifts are correlated with increased binding affinities. However, the absolute temperature shift caused by binding of a small molecule to each protein is unique, as the extra stability exhibited will largely depend on the innate stability and structure of the protein. Furthermore, the reference melting temperature of a protein will show significant batch-to-batch variation due to slight differences in impurities with every purification and the number of freeze-thaw cycles undergone by the protein. Therefore, the DSF data should mainly be used for qualitative purposes and determination of trends.
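The Tm and ΔTm described above can be illustrated with a short calculation. The sketch below estimates Tm as the temperature of the steepest fluorescence increase (the maximum of dF/dT), which is one common way of reading a melt curve; it is not necessarily the analysis used in this work. The Boltzmann sigmoid, the temperature grid and the 45°C and 48°C midpoints are assumed, synthetic values chosen only to show how a ΔTm is derived.

```python
import math


def boltzmann(temp, f_min, f_max, tm, slope):
    """Two-state sigmoid often used to model the rising part of a DSF melt curve."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm - temp) / slope))


def estimate_tm(temps, signal):
    """Return the temperature at which the central-difference slope dF/dT is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic melt curves sampled every 0.5 degrees C from 25 to 70 degrees C
temps = [25.0 + 0.5 * i for i in range(91)]
no_ligand = [boltzmann(t, 0.0, 1.0, tm=45.0, slope=1.5) for t in temps]    # hypothetical control
with_ligand = [boltzmann(t, 0.0, 1.0, tm=48.0, slope=1.5) for t in temps]  # hypothetical ligand

tm_control = estimate_tm(temps, no_ligand)
tm_ligand = estimate_tm(temps, with_ligand)
delta_tm = tm_ligand - tm_control  # positive values indicate stabilisation
print(f"Tm control = {tm_control} C, Tm ligand = {tm_ligand} C, dTm = {delta_tm} C")
```

Real melt curves typically lose fluorescence after the transition as the protein aggregates, so the derivative is normally evaluated only over the rising part of the curve.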

To determine whether small molecules could shift the melting temperature of LtpG and hence imply binding, LtpG was incubated with a panel of phosphate-based metabolites and the Tm determined. All currently known Fic domain substrates (NTPs and CDP-choline) were included as well as the various phosphorylation states of each nucleotide. Pyrophosphate was also included as it represents the minimal diphosphate containing molecule.

GDP provided LtpG with the largest ΔTm followed by pyrophosphate (Figure 3.14). This suggested that rather than an adenosine-metabolite substrate, LtpG may prefer a guanosine-based substrate. All NMPs provided very small thermal shifts, further suggesting that Fic domains bind metabolites with at least a diphosphate moiety. As CDP-choline does not provide a positive ΔTm, LtpG is also unlikely to be a phosphocholine transferase like AnkX.

Figure 3.14. GDP causes the largest shift in LtpG melting temperature. Recombinant LtpG was incubated with a panel of phosphate containing small molecules and its melting temperature determined by the DSF assay. Melting temperature shifts (ΔTm) were determined relative to the no substrate negative control.


To determine whether these nucleotide binding preferences could indicate the preferred enzymatic activity of LtpG, they were compared with those of the known AMPylator VopS as a control. As the γ-phosphate of NTPs is typically labile, NTP-γ-S may be more representative of NTPs in this assay and was hence used for subsequent assays. NTP-γ-S replaces an oxygen on the γ-phosphate with a sulphur atom, preventing hydrolysis between the β and γ phosphate groups, and hence acts as a non-hydrolysable analogue of NTP.

Whilst VopS shows strongest binding to ATP-γ-S and GTP-γ-S, it shows relatively weak binding to ADP, GDP, ATP and GTP (Figure 3.15). In contrast, LtpG binds GDP and GTP-γ-S with similar affinity, and GTP with slightly lower affinity. The adenosine-containing metabolites ADP, ATP and ATP-γ-S all give LtpG similar thermal shifts. The distinct preference of VopS for NTP-γ-S binding in the DSF assay correlates well with the fact that it preferentially uses ATP and GTP as substrates [Mattoo et al., 2011]. As LtpG appears to bind NDPs with similar affinity to NTP-γ-S, this suggests that only the diphosphate moiety is required for efficient binding. Whilst triphosphate-containing metabolites are mainly NTPs, there are many nucleotide diphosphate-containing molecules such as CDP-choline, GDP-mannose and NAD. A comparison of triphosphate- and diphosphate-containing metabolites on the Human Metabolome Database (www.hmdb.ca) revealed that 53 metabolites contained triphosphate moieties whilst 488 had a diphosphate motif [Wishart et al., 2013]. Hence our phosphate-containing metabolite library might not contain the physiological small molecule substrate of LtpG, although the data suggests that a guanine base is preferred.

The exclusive NTP-γ-S binding of VopS may also be explained by a high catalytic activity of VopS, whereby any hydrolysable substrate is hydrolysed and subsequently released from the active site. Therefore hydrolysable substrates do not show large ΔTm for VopS. Although VopS appears to exclusively bind triphosphate-containing metabolites, it is still able to bind pyrophosphate. This may be because pyrophosphate is smaller than a nucleotide and can therefore adopt alternative binding modes within the active site.


[Figure 3.15 bar chart. Y-axis: ΔTm / °C (-1 to 5). X-axis ligands as recovered: GTP, GDP-β-S, GTP-γ-S, GDP, AMP, ADP, ATP, ATP-γ-S, Yn-6-TP and PPi. Series: LtpG and VopS.]

Figure 3.15. Comparison of nucleotide binding preferences of LtpG and VopS. Recombinant LtpG and VopS were incubated with various nucleotides and their melting temperatures determined by DSF. Melting temperature shifts (ΔTm) are relative to the no substrate negative control for each protein.

To determine whether the mutation of the catalytic motif changed the binding affinity of LtpG to the phosphate containing metabolites, the DSF assay was performed on His-LtpG and His-LtpG H263A.

Similar trends were observed for the WT and H263A mutant with GTP-γ-S providing the strongest stabilisation (Figure 3.16). This suggests that LtpG is not hydrolysing any of the putative substrates in the absence of a target protein. A notable exception was that the WT protein appeared to have slightly higher affinity to UTP than the H263A mutant.


[Figure 3.16 bar chart. Y-axis: ΔTm / °C (-1 to 3.5). Series: LtpG WT and LtpG H263A.]

Figure 3.16. Mutation of the catalytic histidine does not alter the nucleotide binding preferences of LtpG. Recombinant LtpG and LtpG H263A were incubated with a panel of nucleotide substrates and their melting temperatures determined by DSF. Melting temperature shifts (ΔTm) are relative to the no substrate negative control for each protein.

As the current structures of Fic domains all indicate that a metal ion such as Mg2+ is required to bind the diphosphate-containing small molecule, we determined whether Mg2+ was required for GDP binding using DSF. A positive control of His-LtpG with MgCl2 and GDP was compared with a sample of His-LtpG and GDP. As some residual Mg2+ may remain bound to the active site after purification, a final sample containing EDTA was included to sequester any such pre-bound Mg2+. Whilst addition of MgCl2 to the assay buffer increased the Tm of GDP-bound LtpG by ~5°C, removal of Mg2+ using EDTA drastically destabilised LtpG, with a -8°C shift in melting temperature (Figure 3.17). The data suggests that metal binding is important for LtpG stability. Furthermore, it indicates that the lack of an acidic residue in position 5 of the Fic motif, which has been implicated in metal binding, does not appear to hinder the ability of LtpG to bind Mg2+. However, the DSF assay does not prove that Mg2+ is bound to the Fic active site; it only suggests that metal binding is important for GDP binding and protein stability.


[Figure 3.17 plots. Left: melt curves, normalised RFU against temperature / °C (25 to 55), for -MgCl2, +MgCl2 and +EDTA. Right: bar chart of Tm / °C for -MgCl2, +MgCl2 and +EDTA.]

Figure 3.17. MgCl2 is critical for LtpG stability. Recombinant LtpG was incubated with ±MgCl2 or EDTA in the presence of GDP and its melting temperature monitored using DSF.


3.9. LtpG does not have alternative NMPylation or kinase activity

To determine whether LtpG could utilise GTP as a substrate, in vitro GMPylation assays were performed using α-32P GTP. α-32P ATP and VopS were used as positive controls. Recombinant Fic domain enzymes were incubated with cell lysate and radioactive NTPs, and modified proteins were visualised by SDS-PAGE and autoradiography.

[Figure 3.18 autoradiographs. Panels: α-32P ATP, α-32P GTP and γ-32P ATP. Lanes: LtpG WT and LtpG H263A (batches A and B) and VopS. Marker bands: 100, 80, 58, 46, 32, 25 and 22 kDa. Labelled bands: small GTPases (around 22 kDa).]

Figure 3.18. LtpG does not appear to have GMPylation or kinase activity. Recombinant LtpG, LtpG H263A and VopS were incubated with THP-1 cell lysate and radiolabelled nucleotides (α-32P ATP, γ-32P ATP or α-32P GTP). Proteins were separated by SDS-PAGE and analysed by autoradiography.

Whilst AMPylation and GMPylation of substrates around 20 kDa (presumably Rho GTPases) could be seen when lysates were treated with VopS, no signal was observed in samples treated with LtpG or LtpG H263A (Figure 3.18). Additionally, neither auto-AMPylation nor auto-GMPylation signal could be detected for any of the recombinant enzymes. The data suggests that the α-32P-containing portion of GTP is not transferred onto substrate proteins by LtpG and therefore GTP may not be its endogenous substrate. γ-32P ATP was also included in the experiments to assay for potential kinase activity of LtpG. Many bands generated by endogenous kinases were detected. However, no obvious additional bands were detected upon addition of LtpG. The high number of kinases within the cell likely masks any potential kinase activity LtpG may have. Although UTP also gave strong positive melting temperature shifts in the DSF assay, the radioactivity-based assay did not appear suitable to study the enzymatic function of LtpG, as even the auto-AMPylation of LtpG that was detected using Yn-6-TP could not be detected using α-32P ATP.

3.10. LtpG crystallisation trials

Structural studies were undertaken to provide information on the substrate binding pocket of LtpG.

Protein crystallisation trials of recombinant LtpG were attempted using vapour diffusion. A droplet containing concentrated soluble protein and buffer containing precipitant is incubated in a closed environment next to a larger buffer reservoir (Figure 3.19). Initially, the droplet contains lower precipitant concentration than the larger reservoir. As the system equilibrates by diffusion between the drop and reservoir, the protein and precipitant concentration slowly increase, allowing the protein to reach a critical nucleation point for crystallisation.

Figure 3.19. Schematic of sitting drop and hanging drop vapour diffusion methodology for protein crystallisation. A protein drop is incubated next to a buffer reservoir in a closed environment. As the system equilibrates by diffusion between the protein drop and buffer reservoir, protein and precipitant concentration slowly increase in the drop. This causes the protein to come out of solution which may be in the form of a protein crystal.
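The concentrating effect of vapour diffusion can be made concrete with a simple idealised calculation. The sketch below assumes that only water is exchanged through the vapour phase, that the reservoir is large enough for its precipitant concentration to stay constant, and that the protein buffer contributes no solutes, so the drop shrinks until its precipitant concentration matches the reservoir. The function name, the 10 mg/ml stock and the two drop ratios are illustrative inputs; real drops deviate from this ideal.

```python
def drop_concentrations(stock_mg_ml, protein_nl, reservoir_nl):
    """Idealised vapour diffusion: return (initial protein, final protein,
    initial precipitant fraction) for a drop mixed from protein stock and
    reservoir solution. Precipitant is given as a fraction of the reservoir
    concentration."""
    total_nl = protein_nl + reservoir_nl
    initial_protein = stock_mg_ml * protein_nl / total_nl
    initial_precipitant = reservoir_nl / total_nl
    # At equilibrium the drop precipitant equals the reservoir concentration,
    # so the drop has shrunk to the volume of reservoir solution added.
    final_volume_nl = reservoir_nl
    final_protein = stock_mg_ml * protein_nl / final_volume_nl
    return initial_protein, final_protein, initial_precipitant


one_to_one = drop_concentrations(10.0, 100.0, 100.0)
one_to_two = drop_concentrations(10.0, 100.0, 200.0)
for label, (p0, p1, c0) in (("100:100 nl", one_to_one), ("100:200 nl", one_to_two)):
    print(f"{label}: protein {p0:.2f} -> {p1:.2f} mg/ml, "
          f"precipitant starts at {c0:.0%} of reservoir")
```

Under these assumptions a 1:1 drop doubles in protein and precipitant concentration as it equilibrates, whereas a 1:2 drop starts closer to the reservoir condition and concentrates only 1.5-fold, which is why varying the drop ratio samples different paths towards nucleation.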

As high protein purity is critical for successful crystallisation, His-LtpG was purified further after an initial Ni2+-NTA purification using anion exchange chromatography and size exclusion chromatography (Figure 3.20).


[Figure 3.20 gel images. (A) Ni2+-NTA: lanes M, Pre, Post, Insol, Sol, F3, F8 and F12 to F19. (B) Anion exchange: lanes M and F13 to F18. (C) Gel filtration: lanes M and F17 to F30. Marker bands in each panel: 175, 80, 58, 46, 30 and 23 kDa. The LtpG band is indicated.]

Figure 3.20. Purification of recombinant LtpG for crystallography. ÄKTA FPLC traces showing absorption at 280 nm and Coomassie stained SDS-PAGE gels for selected fractions. E. coli BL21Star expressing His-tagged LtpG were lysed and clarified lysates subjected to sequential (A) Ni2+-NTA, (B) anion exchange and (C) gel filtration purifications on an ÄKTA prime. Fractions highlighted in yellow were pooled and taken forward for subsequent purification steps.

This purification procedure typically yielded 1 ml of 10 mg/ml homogeneous recombinant His-LtpG from a 1 L culture. Drops of recombinant LtpG were set up as sitting drops across 14 commercially available crystallisation plates, each consisting of 96 different crystallisation conditions, enabling a total of 1344 conditions to be tested. Two drop ratios (protein:reservoir buffer, 100:100 nl and 100:200 nl) were employed to maximise the chance of crystallisation. Trays were incubated at 20°C and crystal formation was checked by microscopy daily.


Figure 3.21. LtpG crystals formed in 100 mM BisTris, 3 M NaCl. (A) pH 5.5 and (B) pH 6.5. Pictures were taken after two days of incubation.

Two conditions yielded crystals within a day of tray setup: 100 mM BisTris with 3 M NaCl at pH 5.5 and pH 6.5. No other conditions from the 14 trays yielded crystals over a two-week period (Figure 3.21).

As the BisTris/NaCl conditions only yielded small crystals of 20-30 µm in length, the conditions were further optimised. A broad BisTris pH (0.5 pH intervals) and NaCl concentration (0.5 M intervals) screen was performed (Table 3.1).

Columns, [NaCl] / M: 0.1, 0.5, 1.0, 2.0, 3.0, 4.0
Rows, 100 mM BisTris: pH 4.5, pH 5.0, pH 5.5, pH 6.0, pH 6.5, pH 7.0, pH 7.5, pH 8.0

Table 3.1. Table of broad optimisation conditions for LtpG crystallisation based on BisTris and NaCl base condition. Successful crystallisation conditions are highlighted in green.


Only conditions within a similar range of pH and NaCl concentration (highlighted in green) as the initial condition yielded crystals (pH 6.0-6.5, 3 M NaCl). As such, a more focused optimisation tray was set up with pH units increasing at 0.2 pH intervals and 0.2 M NaCl concentration intervals (Table 3.2).

100 mM BisTris; columns 1 to 6 are NaCl concentrations, rows A to H are pH values.

Row (pH) | 1 (2.4 M) | 2 (2.6 M) | 3 (2.8 M) | 4 (3.0 M) | 5 (3.2 M) | 6 (3.4 M)
A (pH 5.7) | Small crystal | Small crystal | Precipitate | Medium crystal | Precipitate | Precipitate
B (pH 5.9) | Medium crystal | Large crystal | Medium crystal | Medium crystal | Precipitate | Precipitate
C (pH 6.1) | Large crystal | Large crystal | Small crystal | Small crystal | Precipitate | Precipitate
D (pH 6.3) | Clear drop | Large crystal | Medium crystal | Precipitate | Precipitate | Precipitate
E (pH 6.5) | Clear drop | Medium crystal | Medium crystal | Medium crystal | Precipitate | Precipitate
F (pH 6.7) | Clear drop | Clear drop | Medium crystal | Small crystal | Precipitate | Precipitate
G (pH 6.9) | Clear drop | Clear drop | Clear drop | Small crystal | Precipitate | Precipitate
H (pH 7.1) | Clear drop | Clear drop | Clear drop | Clear drop | Precipitate | Precipitate

Table 3.2. Table of fine optimisation conditions for LtpG crystallisation based on BisTris and NaCl base condition. Each condition was graded based on crystal size.
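The layout of the fine optimisation tray can be generated programmatically, which helps when mapping well positions to conditions. The sketch below is a minimal illustration, assuming the well naming used in Table 3.2 (rows A to H for pH 5.7 to 7.1, columns 1 to 6 for 2.4 to 3.4 M NaCl, both in steps of 0.2); the function name and its parameters are introduced here for the example only.

```python
from itertools import product
from string import ascii_uppercase


def fine_grid(ph_start=5.7, ph_step=0.2, n_ph=8,
              nacl_start=2.4, nacl_step=0.2, n_nacl=6):
    """Map well labels (e.g. 'C2') to (pH, NaCl molarity) for a pH x NaCl grid."""
    # Round to one decimal place to avoid floating-point drift in the labels
    ph_values = [round(ph_start + i * ph_step, 1) for i in range(n_ph)]
    nacl_values = [round(nacl_start + j * nacl_step, 1) for j in range(n_nacl)]
    grid = {}
    for (row, ph), (col, nacl) in product(enumerate(ph_values), enumerate(nacl_values)):
        grid[f"{ascii_uppercase[row]}{col + 1}"] = (ph, nacl)
    return grid


grid = fine_grid()
print(len(grid), "conditions")
print("C2 ->", grid["C2"])  # pH and NaCl molarity for well C2
```

The same helper could describe the broad screen if its NaCl series were passed in explicitly, since those concentrations are not evenly spaced.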

This fine optimisation tray gave an optimal condition of 100 mM BisTris with 2.6 M NaCl at pH 6.1. After 23 days of incubation, crystals grew to ~100 µm (Figure 3.22).

Figure 3.22. LtpG crystals grown in 100 mM BisTris, 2.6 M NaCl, pH 6.1. Picture taken after 23 days incubation.


The crystals were subjected to X-ray diffraction analysis using both an in-house Rigaku MicroMax-007 X-ray source and a synchrotron source at the Oxford Diamond Light Source Facility in collaboration with Dr. David Charles (Imperial College London). Crystals obtained in this condition gave 7-8 Å diffraction patterns on both X-ray sources tested (Figure 3.23A and B). In addition, no salt spots (high resolution diffraction spots typically associated with the tight packing of salt crystals) were detected, suggesting that the observed diffraction pattern originated from protein crystals. This was further confirmed by SDS-PAGE analysis of a single crystal (Figure 3.23D). Silver staining revealed that the crystal comprised a single polypeptide which corresponded to the size of LtpG. Although these diffraction patterns indicate that the crystals obtained were indeed LtpG crystals, they were not of sufficient quality to provide structural insights. Ice rings (indicative of ice crystals forming during the freezing process of the crystal) were observed in the diffraction pattern obtained on the home X-ray source. However, they were not present on the synchrotron source. This suggests that cryoprotectant is likely required in the buffer condition to minimise the chances of ice forming.


Figure 3.23. X-ray diffraction patterns of His-LtpG crystals. X-ray diffraction patterns of His-LtpG crystals obtained from (A) in-house Rigaku MicroMax-007 source and (B) synchrotron source. (C) Picture of LtpG crystal in a crystal loop. (D) Silver-stained SDS-PAGE gel of a single LtpG crystal (black arrowhead).


Crystals typically grew on the bottom of the well due to gravity and required physical dislodgement to be transferred into a crystal loop, leading to crystal damage. To minimise manual handling of crystals, hanging drops were prepared to prevent crystal formation on the bottom of the well. Although crystals were obtained under hanging drop conditions, a “skin” formed on the surface of the drop, to which the crystals attached. Hence the hanging drop methodology did not help prevent crystal damage during transfer of the crystal from drop to loop.

To determine whether better crystals could be obtained, a 96-well additive screen was performed using the BisTris/NaCl condition as the base condition. The additives tested are listed in Table 3.3.

Additive Type Score 1. (A1) 0.1 M Barium chloride dihydrate Multivalent Precipitate 2. (A2) 0.1 M Cadmium chloride hydrate Multivalent Precipitate 3. (A3) 0.1 M Calcium chloride dihydrate Multivalent Precipitate 4. (A4) 0.1 M Cobalt(II) chloride hexahydrate Multivalent Precipitate 5. (A5) 0.1 M Copper(II) chloride dihydrate Multivalent Precipitate 6. (A6) 0.1 M Magnesium chloride hexahydrate Multivalent Precipitate 7. (A7) 0.1 M Manganese(II) chloride tetrahydrate Multivalent Precipitate 8. (A8) 0.1 M Strontium chloride hexahydrate Multivalent Precipitate 9. (A9) 0.1 M Yttrium(III) chloride hexahydrate Multivalent Precipitate 10. (A10) 0.1 M Zinc chloride Multivalent Precipitate 11. (A11) 0.1 M Iron(III) chloride hexahydrate Multivalent Precipitate 12. (A12) 0.1 M Nickel(II) chloride hexahydrate Multivalent Precipitate 13. (B1) 0.1 M Chromium(III) chloride hexahydrate Multivalent Precipitate 14. (B2) 0.1 M Praseodymium(III) acetate hydrate Multivalent Precipitate 15. (B3) 1.0 M Ammonium sulfate Salt Precipitate 16. (B4) 1.0 M Potassium chloride Salt Precipitate 17. (B5) 1.0 M Lithium chloride Salt Precipitate 18. (B6) 2.0 M Sodium chloride Salt Precipitate 19. (B7) 0.5 M Sodium fluoride Salt Precipitate/Micro crystals 20. (B8) 1.0 M Sodium iodide Salt Precipitate 21. (B9) 2.0 M Sodium thiocyanate Salt Precipitate/Micro crystals 22. (B10) 1.0 M Potassium sodium tartrate tetrahydrate Salt Precipitate 23. (B11) 1.0 M Sodium citrate tribasic dihydrate Salt Precipitate 24. (B12) 1.0 M Cesium chloride Salt Precipitate 25. (C1) 1.0 M Sodium malonate pH 7.0 Salt Precipitate 26. (C2) 0.1 M L-Proline Amino Acid Precipitate 27. (C3) 0.1 M Phenol Dissociating Agent Precipitate 28. (C4) 30% v/v Dimethyl sulfoxide Dissociating Agent Precipitate/Micro crystals 29. (C5) 0.1 M Sodium bromide Dissociating Agent Precipitate/Micro crystals 30. (C6) 30% w/v 6-Aminohexanoic acid Linker Precipitate/Micro crystals 31. (C7) 30% w/v 1,5-Diaminopentane dihydrochloride Linker Precipitate 32. 
(C8) 30% w/v 1,6-Diaminohexane Linker Micro crystals 33. (C9) 30% w/v 1,8-Diaminooctane Linker Clear drop 34. (C10) 1.0 M Glycine Linker Precipitate/Micro crystals 35. (C11) 0.3 M Glycyl-glycyl-glycine Linker Precipitate

122

36. (C12) | 0.1 M Taurine | Linker | Precipitate/Micro crystals
37. (D1) | 0.1 M Betaine hydrochloride | Linker | Precipitate
38. (D2) | 0.1 M Spermidine | Polyamine | Precipitate
39. (D3) | 0.1 M Spermine tetrahydrochloride | Polyamine | Precipitate/Micro crystals
40. (D4) | 0.1 M Hexammine cobalt(III) chloride | Polyamine | Precipitate/Micro crystals
41. (D5) | 0.1 M Sarcosine | Polyamine | Precipitate/Micro crystals
42. (D6) | 0.1 M Trimethylamine hydrochloride | Chaotrope | Micro crystals
43. (D7) | 1.0 M Guanidine hydrochloride | Chaotrope | Micro crystals
44. (D8) | 0.1 M Urea | Chaotrope | Small crystals (many)
45. (D9) | 0.1 M β-Nicotinamide adenine dinucleotide hydrate | Co-factor | Micro crystals
46. (D10) | 0.1 M Adenosine-5’-triphosphate disodium salt hydrate | Co-factor | Precipitate
47. (D11) | 0.1 M TCEP hydrochloride | Reducing Agent | Precipitate
48. (D12) | 0.01 M GSH (L-Glutathione reduced), 0.01 M GSSG (L-Glutathione oxidized) | Reducing Agent | Micro crystals
49. (E1) | 0.1 M Ethylenediaminetetraacetic acid disodium salt dihydrate | Chelating Agent | Precipitate
50. (E2) | 5% w/v Polyvinylpyrrolidone K15 | Polymer | Precipitate
51. (E3) | 30% w/v Dextran sulfate sodium salt | Polymer | Precipitate
52. (E4) | 40% v/v Pentaerythritol ethoxylate (3/4 EO/OH) | Polymer | Clear drop
53. (E5) | 10% w/v Polyethylene glycol 3,350 | Polymer | Micro crystals
54. (E6) | 30% w/v D-(+)-Glucose monohydrate | Carbohydrate | Micro crystals
55. (E7) | 30% w/v Sucrose | Carbohydrate | Small crystals (many)
56. (E8) | 30% w/v Xylitol | Carbohydrate | Small crystals (moderate)
57. (E9) | 30% w/v D-Sorbitol | Carbohydrate | Small crystals (moderate)
58. (E10) | 12% w/v myo-Inositol | Carbohydrate | Small crystals (moderate)
59. (E11) | 30% w/v D-(+)-Trehalose dihydrate | Carbohydrate | Small crystals (many)
60. (E12) | 30% w/v D-(+)-Galactose | Carbohydrate | Small crystals (many)
61. (F1) | 30% v/v Ethylene glycol | Polyol | Precipitate
62. (F2) | 30% v/v Glycerol | Polyol | Precipitate/Micro crystals
63. (F3) | 3.0 M NDSB-195 | Non-detergent | Micro crystals
64. (F4) | 2.0 M NDSB-201 | Non-detergent | Precipitate/Micro crystals
65. (F5) | 2.0 M NDSB-211 | Non-detergent | Precipitate/Micro crystals
66. (F6) | 2.0 M NDSB-221 | Non-detergent | Small crystals (many)
67. (F7) | 1.0 M NDSB-256 | Non-detergent | Clear drop
68. (F8) | 0.15 mM CYMAL®-7 | Amphiphile | Small crystals (few)
69. (F9) | 20% w/v Benzamidine hydrochloride | Amphiphile | Clear drop
70. (F10) | 5% w/v n-Dodecyl-N,N-dimethylamine-N-oxide | Detergent | Precipitate
71. (F11) | 5% w/v n-Octyl-β-D-glucoside | Detergent | Precipitate
72. (F12) | 5% w/v n-Dodecyl-β-D-maltoside | Osmolyte | Clear drop
73. (G1) | 30% w/v Trimethylamine N-oxide dihydrate | Organic, Non-volatile | Precipitate
74. (G2) | 30% w/v 1,6-Hexanediol | Organic, Non-volatile | Precipitate
75. (G3) | 30% v/v (+/-)-2-Methyl-2,4-pentanediol | Organic, Non-volatile | Clear drop
76. (G4) | 50% v/v Polyethylene glycol 400 | Organic, Non-volatile | Clear drop
77. (G5) | 50% v/v Jeffamine M-600 pH 7.0 | Organic, Non-volatile | Precipitate
78. (G6) | 40% v/v 2,5-Hexanediol | Organic, Non-volatile | Small crystals (few)
79. (G7) | 40% v/v (±)-1,3-Butanediol | Organic, Non-volatile | Clear drop
80. (G8) | 40% v/v Polypropylene glycol P 400 | Organic, Non-volatile | Clear drop
81. (G9) | 30% v/v 1,4-Dioxane | Organic, Volatile | Clear drop
82. (G10) | 30% v/v Ethanol | Organic, Volatile | Small crystals (many)
83. (G11) | 30% v/v 2-Propanol | Organic, Volatile | Clear drop
84. (G12) | 30% v/v Methanol | Organic, Volatile | Clear drop
85. (H1) | 10% v/v 1,2-Butanediol | Organic, Volatile | Precipitate
86. (H2) | 40% v/v tert-Butanol | Organic, Volatile | Clear drop
87. (H3) | 40% v/v 1,3-Propanediol | Organic, Volatile | Clear drop
88. (H4) | 40% v/v Acetonitrile | Organic, Volatile | Micro crystals
89. (H5) | 40% v/v Formamide | Organic, Volatile | Micro crystals
90. (H6) | 40% v/v 1-Propanol | Organic, Volatile | Clear drop
91. (H7) | 5% v/v Ethyl acetate | Organic, Volatile | Small crystals (few)
92. (H8) | 40% v/v Acetone | Organic, Volatile | Micro crystals
93. (H9) | 0.25% v/v Dichloromethane | Organic, Volatile | Clear drop
94. (H10) | 7% v/v 1-Butanol | Organic, Volatile | Clear drop
95. (H11) | 40% v/v 2,2,2-Trifluoroethanol | Organic, Volatile | Clear drop
96. (H12) | 40% v/v 1,1,1,3,3,3-Hexafluoro-2-propanol | Organic, Volatile | Small crystals (few)

Table 3.3. Table of additives in the additive screen on the base BisTris and NaCl crystallisation condition. Protein drops were graded based on crystal size.

Most conditions produced precipitates. However, crystals were obtained in conditions containing sugar-based additives. To determine whether these sugar-based additives bound to LtpG, DSF was performed with a panel of sugars as small molecule substrates. None of the sugars tested produced a shift in melting temperature, indicating that the sugars did not bind (Figure 3.24). Although crystals formed with sugar additives, crystal size was not improved and hence sugars were not added to subsequent trays.

[Figure 3.24: bar chart. Y-axis: ΔTm (scale -1 to 3). X-axis: no substrate (-), PPi, GDP, urea, xylose, dulcitol, sucrose, sorbitol, glucose, maltose, mannitol, mannose, raffinose, galactose, arabinose, myo-inositol.]

Figure 3.24. LtpG does not appear to bind to sugars in the DSF assay. Recombinant LtpG was incubated with a panel of carbohydrates and its melting temperature determined by DSF. Melting temperature shifts (ΔTm) were calculated relative to the no substrate negative control.
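The ΔTm calculation described in the caption can be sketched as follows. This is a minimal illustration, assuming that melting temperatures are held in a dictionary keyed by condition. The function name `delta_tm`, the control label and the example values are hypothetical and are not measured data from this work.

```python
def delta_tm(tm_by_condition, control="no substrate"):
    """Return melting temperature shifts relative to the control condition."""
    reference = tm_by_condition[control]
    return {
        name: round(tm - reference, 2)
        for name, tm in tm_by_condition.items()
        if name != control
    }


# Hypothetical Tm values in degrees Celsius, for illustration only.
example_tm = {"no substrate": 45.0, "GDP": 47.5, "sucrose": 45.1}
shifts = delta_tm(example_tm)
```

In this sketch a ligand that stabilises the protein gives a clearly positive shift, whereas a value close to zero is read as no detectable binding.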


Crystal seeding was attempted to increase crystal size. By providing an initial nucleation site, crystals could grow around the existing nucleus rather than forming new small crystals. A seed stock was created by crushing LtpG crystals obtained from the optimal BisTris/NaCl condition. Trays were set up using a serial dilution of seed stock along with soluble protein and reservoir buffer, with the optimal BisTris/NaCl condition as the base condition. Although crystals were produced, seeding did not result in larger crystals.

LtpG appears to form many small crystals rapidly, but these never grow to a sufficient size. This may be due to an excess of nucleation sites, which prevents large crystals from forming. To reduce nucleation, lower protein concentrations and incubation temperatures were tested. Protein drops containing 5 mg/ml and 10 mg/ml were used and incubated at 4°C and 12°C using the BisTris/NaCl buffer condition. No crystals were formed at 4°C at either protein concentration. Crystals were obtained at 12°C for both protein concentrations. However, there was no improvement in crystal size.

To determine whether tighter crystal packing could be obtained, and hence potentially produce better crystals for diffraction, cleavage of the N-terminal His tag was attempted using the encoded thrombin cleavage site. As the His tag is likely to be disordered, its presence may have a negative effect on the crystal lattice. Recombinant LtpG was incubated with thrombin overnight at 4°C. Cleaved and uncleaved populations were separated by Ni2+-NTA chromatography and analysed by SDS-PAGE and Coomassie staining (Figure 3.25).

However, incubation of LtpG with thrombin did not yield a cleavage product. Efficient cleavage of the His-tag should result in the protein no longer binding to the Ni2+-NTA resin. Although some protein was recovered in the flowthrough (highlighted by the red arrow), the majority remained on the resin and had to be eluted using imidazole (highlighted by the blue arrow). This suggests that the thrombin cleavage site may be inaccessible.


[Figure 3.25: SDS-PAGE gel of flowthrough and elution fractions. Lane labels: F23, M, F1-F9, F24, F25, F26. Marker: 175, 80, 58, 46, 30 and 23 kDa. LtpG is labelled near the 46 kDa marker.]

Figure 3.25. The His-tag cannot be efficiently cleaved from recombinant His-LtpG. N-terminally His-tagged LtpG was incubated with thrombin at 4°C overnight. The protein was subjected to a Ni2+-NTA purification on an ÄKTA Prime to determine efficiency of tag cleavage. The purification was monitored using absorbance at 280 nm and selected fractions analysed by SDS-PAGE and Coomassie staining.

As an alternative to cleaving off the His-tag to remove flexible regions and enhance crystal packing, a novel LtpG construct (His-LtpG SL) was created (Figure 3.26). The initial construct contained a long 15 amino acid linker between the thrombin site and the starting ATG of the endogenous sequence. In the new construct, this linker was reduced to 2 amino acids.

Figure 3.26. Comparison of linker lengths of N-terminally His-tagged LtpG constructs. (A) Original His-LtpG construct with a 15 amino acid linker between the thrombin cleavage site and the LtpG initiator methionine. (B) New LtpG construct (His-LtpG SL) with a 2 amino acid linker between the thrombin cleavage site and the LtpG initiator methionine. Underlined regions are predicted to be disordered.


A complete crystallisation condition screen using His-LtpG SL was performed. Again, this construct produced crystals under the BisTris and NaCl condition (Figure 3.27A and B). In addition, a novel crystal form was obtained under 100 mM HEPES sodium salt, 2% w/v polyethylene glycol 4000, 25% v/v 2-methyl-2,4-pentanediol, pH 7.5 conditions (Figure 3.27C and D). This new crystal form, however, diffracted to only 25-30 Å resolution and as such was not pursued any further (data not shown).

Figure 3.27. Crystal forms of His-LtpG SL. (A) 100 mM BisTris, 3 M NaCl, pH 6.5. (B) 100 mM BisTris propane, 3.2 M NaCl, pH 7.0. (C) and (D) 100 mM HEPES sodium salt, 2% (w/v) PEG4000, 25% (v/v) MPD, pH 7.5.

As the DSF data suggested that GDP increased LtpG’s melting temperature, we attempted to improve crystal diffraction by co-crystallisation of LtpG with small molecules. In addition to varying concentrations of GDP for co-crystallisation, a concentration gradient of MgCl2 was trialled. Again, addition of GDP and MgCl2 neither altered crystal size nor improved diffraction pattern resolution (data not shown). As the crystals could not be improved substantially from the initial conditions, crystallisation trials were halted.


3.11. Identifying effector-host protein interactions

As Fic domain proteins interact with and modify other proteins, we set out to determine host cell binding partners of LtpG to elucidate its function. Identification of binding partners may not only reveal a putative substrate protein for the Fic domain of LtpG, and therefore aid discovery of the PTM which LtpG catalyses, but may also help identify any signalling pathways which LtpG modulates independently of the Fic domain. For most novel Fic domain activities (VopS, AnkX, Doc), the biochemical function was only elucidated upon finding their binding partners. Two methods were used to screen for potential interaction partners: yeast-2-hybrid and in vitro pulldowns using recombinant protein.

3.12. Yeast-2-hybrid screen

In a yeast-2-hybrid screen, the yeast transcription factor GAL4 is split into a DNA binding domain (BD) and an activation domain (AD) (Figure 3.28). Each domain is fused to the N-terminus of one of two putative interacting partners: the bait and prey proteins. Typically a prey library is screened against a specific bait protein to sample as many potential interacting partners as possible. If the bait and prey proteins interact, the GAL4 transcription factor is reconstituted and hence active. By using specific reporter yeast strains in which genes essential for nutrient production are under the control of GAL4-dependent promoters, growth on selective media provides a readout for interaction. To increase the stringency of the screen, a quadruple dropout (QDO) (-Trp, -Leu, -His, -Ade) medium is used along with X-α-Gal. The BD-fused bait is provided on a plasmid (pGBKT7) which encodes TRP1, restoring the tryptophan biosynthesis pathway. Likewise, the AD-fused prey genes are provided on a plasmid encoding LEU2 to allow leucine synthesis. Reconstituted GAL4 can bind multiple GAL promoters which control histidine (HIS3) and adenine (ADE2) synthesis. HIS3 is under the control of a G1 promoter whilst ADE2 is controlled by a G2 promoter. In addition, an M1 promoter controls the transcription of MEL1, an α-galactosidase, allowing the yeast to metabolise X-α-Gal (5-bromo-4-chloro-3-indolyl-α-D-galactopyranoside) to create the insoluble blue compound 5,5’-dibromo-4,4’-dichloro-indigo, a dimer formed from the cleaved 5-bromo-4-chloro-indoxyl product. These three promoters can all be bound by the DNA-BD of GAL4 using a consensus upstream activating sequence but are otherwise unrelated, providing a more stringent readout for GAL4 reconstitution. Blue colonies detected on QDO/X-α-Gal plates therefore suggest that GAL4 was reconstituted through an interaction between bait and prey proteins. To determine the identity of the prey proteins, the prey DNA insert is amplified by colony PCR and analysed by DNA sequencing.

Figure 3.28. Schematic of the yeast-2-hybrid (Y2H) assay. Plasmid constructs of bait protein fused to the DNA binding domain (BD) of the GAL4 transcription factor and prey protein fused to the activation domain (AD) of the GAL4 transcription factor are co-transformed into yeast. If the bait and prey interact, the GAL4 BD and AD will be in close enough proximity for the GAL4 transcription factor to be reconstituted. This causes transcription of reporter amino acid and nucleobase biosynthetic genes which enable the yeast to grow on synthetically defined media lacking these critical amino acids and nucleobases (QDO).
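The selection logic of the assay can be summarised as a small truth-table model. The sketch below is illustrative only: the class and method names are hypothetical, and DDO is assumed to be the double dropout (-Trp, -Leu) medium that selects for both plasmids without requiring an interaction.

```python
from dataclasses import dataclass

# Nutrients missing from each selective medium.
# QDO is defined in the text; the DDO composition is an assumption.
MEDIA = {
    "DDO": ("Trp", "Leu"),
    "QDO": ("Trp", "Leu", "His", "Ade"),
}


@dataclass
class CoTransformant:
    has_bait_plasmid: bool  # pGBKT7, encodes TRP1
    has_prey_plasmid: bool  # prey plasmid, encodes LEU2
    bait_binds_prey: bool   # whether the fused proteins interact

    def gal4_reconstituted(self) -> bool:
        # BD and AD are only brought together if both fusions are
        # present and the bait and prey interact.
        return (self.has_bait_plasmid and self.has_prey_plasmid
                and self.bait_binds_prey)

    def grows_on(self, medium: str) -> bool:
        gal4 = self.gal4_reconstituted()
        can_make = {
            "Trp": self.has_bait_plasmid,  # TRP1 marker
            "Leu": self.has_prey_plasmid,  # LEU2 marker
            "His": gal4,                   # HIS3 reporter
            "Ade": gal4,                   # ADE2 reporter
        }
        return all(can_make[nutrient] for nutrient in MEDIA[medium])

    def turns_blue(self) -> bool:
        # The MEL1 reporter is also GAL4-dependent.
        return self.gal4_reconstituted()


interacting = CoTransformant(True, True, True)
non_interacting = CoTransformant(True, True, False)
```

In this model a co-transformant with non-interacting fusions grows on DDO but not on QDO, which is why DDO plates serve as a transformation control in the direct Y2H experiments.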


As the Fic domain was toxic when ectopically expressed in BY4741 yeast, we determined whether BD-fused LtpG also caused cytotoxicity and hence whether LtpG would be amenable to a Y2H screen, which uses the AH109 S. cerevisiae strain. BD-fused LtpG H263A was tested in parallel to determine whether any cytotoxicity was due to Fic domain activity. AH109 was transformed with either pGBKT7, pGBKT7-LtpG or pGBKT7-LtpG H263A and growth over 24 h measured (Figure 3.29).

Both LtpG and LtpG H263A showed reduced growth compared to the control AH109 strain transformed with pGBKT7. However, the pGBKT7-LtpG H263A strain showed better recovery at 24 h than WT LtpG (89% vs 61% of the control OD). As such, LtpG H263A was used for the Y2H screen.
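The percentages above express each strain's optical density relative to the vector control at the same time point. A minimal sketch of that normalisation is given below; the function name and the OD readings are hypothetical, chosen only so that the output reproduces the quoted 89% and 61%.

```python
def percent_of_control(od_sample, od_control):
    """Express a culture's optical density as a percentage of the control."""
    return 100.0 * od_sample / od_control


# Hypothetical 24 h OD readings (not measured values from this work).
od_24h = {"pGBKT7": 1.00, "pGBKT7-LtpG": 0.61, "pGBKT7-LtpG H263A": 0.89}
recovery = {
    strain: round(percent_of_control(od, od_24h["pGBKT7"]))
    for strain, od in od_24h.items()
}
```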

Figure 3.29. Yeast expressing BD-fused LtpG exhibits a growth defect in a Fic-dependent manner. Yeast AH109 were transformed with pGBKT7-LtpG or –LtpG H263A and their growth monitored by OD600 over 24 h.


To this end, a MATa strain of AH109 yeast encoding BD-fused LtpG H263A was mated with a MATα strain library of Y187 yeast encoding AD-fused human open reading frames. The resulting progeny were plated onto QDO/X-α-Gal plates. A mating efficiency of 0.85% resulted in 8 × 10^5 colonies being screened for interaction. Over 17 days, 35 colonies grew on QDO media. DNA inserts from 30 of the 35 colonies were purified. Subsequent DNA sequencing and BLAST searches revealed 20 coding inserts, with 13 putative interactors of LtpG being identified in the screen (Table 3.4).

Table 3.4. Table of putative LtpG interactors from the Y2H screen.


Four of the hits had nuclear localisations, although one of these putative interactors, HSPC291, appears to be poorly annotated and was hence excluded from subsequent experiments. The remaining three nuclear hits (CDK7, phospholipid scramblase 1, UBE2T) were taken forward as putative interactors and direct Y2H was performed to validate these interactions. AH109 yeast was co-transformed with pGBKT7-baits (LtpG WT and LtpG H263A) and pGADT7-preys (CDK7, phospholipid scramblase 1 and UBE2T Y2H inserts) and plated onto DDO and QDO media agar (Figure 3.30). pGADT7-Rab1a was included as an additional negative control.

Figure 3.30. LtpG interacts with the CDK7 fragment from the Y2H screen. Direct Y2H to confirm putative nuclear localising interaction partners of LtpG. Yeast AH109 were co-transformed with pGBKT7-bait (empty vector, LtpG or LtpG H263A) and pGADT7-prey Y2H hits (empty vector, CDK7 [c2b], phospholipid scramblase 1 [c9b], UBE2T [c4c] or Rab1A). Multiple colonies were pooled and patched on DDO and QDO plates. Plates were incubated at 30°C. Pictures were taken 3 days post-patching.

Only the co-transformed yeast expressing the CDK7-prey and LtpG H263A-bait plasmids was able to grow on QDO. None of the three putative hits grew with LtpG WT. As production of the yeast prey library inserts random ORF fragments into the pGADT7 vector, many of the hits contain partial proteins or also encode non-coding regions. The CDK7 Y2H fragment encoded amino acids 157-346 of CDK7 in addition to 300 bases downstream of the coding mRNA. The UBE2T fragment encoded all 197 amino acids of the full length protein but also included 100 and 200 non-coding bases upstream and downstream of the coding sequence respectively.

To determine whether LtpG could interact with full length CDK7 and ensure that the 300 non-coding bases did not cause an interaction artefact, full length CDK7 was cloned into pGADT7. Similarly, the open reading frame of UBE2T without the non-coding regions was cloned into pGADT7 to confirm that these non-coding regions did not interfere with any potential binding. A direct Y2H was performed using these clean full length CDK7 and UBE2T AD-fused constructs with BD-fused LtpG WT and H263A (Figure 3.31). The Y2H CDK7 fragment was included as a positive control. Yeast expressing either full length CDK7 or UBE2T did not grow on QDO when co-transformed with LtpG WT/H263A. However, the CDK7 fragment showed interaction with both LtpG WT and H263A.

Figure 3.31. LtpG does not interact with full length CDK7 or UBE2T. Yeast AH109 were co-transformed with pGBKT7-baits (empty vector, LtpG or LtpG H263A) and pGADT7-preys (empty vector, full length CDK7, Y2H CDK7 hit or full length UBE2T). Multiple colonies were pooled and patched onto DDO and QDO plates. Plates were incubated at 30°C. Pictures were taken 3 days post-patching.


Yeast also encodes a CDK7 homologue, KIN28. As CDK7 is an important regulator of the cell cycle, its overexpression may be detrimental to yeast growth given its similarity to KIN28. Transformations involving full length CDK7 generally resulted in fewer colonies, suggesting that it may be toxic (data not shown). To determine whether the kinase activity of CDK7 was preventing interaction with LtpG, two individual point mutations were introduced into CDK7 constructs. Lysine 41 is required for chelating an Mg2+ ion in the active site of CDK7, which in turn is required for ATP binding [Lolli et al., 2004]. A K41A mutation therefore renders CDK7 catalytically inactive. Threonine 170 is a key regulatory residue and phosphorylation of this residue is required for CDK7 to be active [Fisher et al., 1994]. A T170A mutation prevents phosphorylation at this site and hence disables CDK7. Furthermore, a fragment of CDK7 consisting of amino acids 157-346 was subcloned from the Y2H insert to ensure that the interaction between the CDK7 fragment and LtpG was not due to the 300 downstream non-coding base pairs encoded in the Y2H prey plasmid.
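When subcloning a fragment defined by residue numbers, the corresponding nucleotide positions within the coding sequence follow from simple codon arithmetic. The sketch below assumes that residue 1 is the initiator methionine and that coordinates are 1-based and inclusive, with the A of the ATG at position 1; the function name is hypothetical and the calculation is illustrative rather than a description of the cloning strategy actually used.

```python
def residue_range_to_cds(first_residue, last_residue):
    """Map an inclusive residue range to 1-based inclusive CDS coordinates."""
    start = 3 * (first_residue - 1) + 1  # first base of the first codon
    end = 3 * last_residue               # last base of the last codon
    return start, end


# CDK7 fragment AA157-346 from the Y2H hit.
start, end = residue_range_to_cds(157, 346)
length = end - start + 1
```

Under these assumptions the fragment spans 570 bases encoding 190 residues, and excludes the downstream non-coding sequence present in the original prey insert.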

Direct Y2H was performed using these point mutants in both full length CDK7 and the AA157-346 fragment. None of the yeast co-transformed with full length CDK7 constructs grew on QDO whilst all AA157-346 fragments showed interaction with both LtpG WT and H263A (Figure 3.32). This suggests that the 300 non-coding bases in the Y2H CDK7 fragment did not alter its ability to bind to LtpG. The reverse fusion constructs were also included (AD-fused baits and BD-fused preys) to minimise the chance of any artefacts. As with the BD-fused baits and AD-fused preys, an interaction between LtpG WT/H263A and CDK7 AA157-346 was observed. Again this interaction was independent of the T170A mutation.


Figure 3.32. The non-coding regions of the Y2H CDK7 hit do not affect its ability to bind LtpG. (A) Yeast AH109 were co-transformed with pGBKT7-baits (empty vector, LtpG or LtpG H263A) and pGADT7-preys (empty vector, Y2H CDK7 hit, CDK7 AA157-346, CDK7 AA157-347 T170A, CDK7, CDK7 K41A or CDK7 T170A). (B) The reciprocal co-transformations were performed using pGBKT7-preys and pGADT7-baits. Multiple colonies were pooled and patched onto DDO and QDO plates. Plates were incubated at 30°C. Pictures were taken 3 days post-patching.


To ensure that the activity of CDK7 did not mask any potential interaction with LtpG, a double point mutant CDK7 K41A T170A was cloned into pGADT7 and a direct Y2H performed. As with the WT and single point mutants, yeast expressing the CDK7 double mutant (K41A T170A) did not grow on QDO with either LtpG WT or H263A, suggesting that full length CDK7 is unable to interact with LtpG in a Y2H system (Figure 3.33). As the readout of interaction requires reconstitution of the GAL4 transcription factor, it is plausible that interaction between full length CDK7 and LtpG positions the AD and BD in such a way that they are unable to complex together. Using the shorter AA157-346 construct may alleviate these constraints and hence enable formation of the GAL4 transcription factor.

Figure 3.33. LtpG is unable to interact with full length CDK7 regardless of activation state. Yeast AH109 were co-transformed with pGBKT7-baits (empty vector, LtpG or LtpG H263A) and pGADT7-preys (empty vector, CDK7 AA157-346, CDK7 AA157-347 T170A, CDK7, CDK7 K41A, CDK7 T170A or CDK7 K41A T170A). Multiple colonies were pooled and patched onto DDO and QDO plates. Plates were incubated at 30°C. Pictures were taken 3 days post-patching.


3.13. In vitro pulldowns

To provide a complementary screen for protein-protein interactions and also validate the putative interaction of LtpG with CDK7, an in vitro pulldown was performed using recombinant immobilised LtpG and THP-1 cell lysate. A dual N-terminal His6 and C-terminal Strep II tagged construct of LtpG (His-LtpG-Strep) was used for these experiments. After an initial His/Ni2+-NTA purification, recombinant His-LtpG-Strep was incubated with mammalian cell lysate and allowed to complex with its binding partners. Bait complexes were then pulled down using Streptactin agarose resin. Background binders were removed by washing and bound proteins identified by liquid chromatography tandem mass spectrometry (LC-MS/MS). Tandem affinity purification reduces the number of background bacterial proteins carried over from the initial E. coli expression and ensures that only full length LtpG and its interacting partners are purified.


3.13.1. In-gel digest proteomics

[Figure 3.34: SDS-PAGE gels, panels A and B. Lanes are annotated by lysate (+/-) and bait (none, LtpG WT or LtpG H263A). Marker: 175, 80, 58, 46, 30, 23, 17 and 7 kDa. LtpG is labelled between the 46 and 58 kDa markers; excised bands are numbered 1-4.]

Figure 3.34. Novel protein bands appear in LtpG pulldowns. Recombinant dual His- and Strep-tagged LtpG and LtpG H263A were incubated with mammalian cell lysate and pulled down using StrepTactin resin. Pulled-down proteins were separated by SDS-PAGE and analysed by (A) silver and (B) Coomassie staining. Lysate only and recombinant LtpG only samples were used as negative controls. Recombinant LtpG is indicated by the black arrowhead. Novel bands are highlighted by red arrowheads. Numbered bands (1-4) were excised for MS analysis.

An initial experiment used both His-LtpG WT-Strep and His-LtpG H263A-Strep as bait proteins. Bait protein alone and lysate only samples were included as negative controls. After the pulldown, the samples were separated by SDS-PAGE and visualised by silver and Coomassie staining (Figure 3.34).

Although recombinant LtpG concentrations were normalised according to Nanodrop concentration measurements (based on A280 readings and protein extinction coefficients), the amount of LtpG H263A was much lower than that of LtpG WT, as evidenced by a markedly fainter band in the silver and Coomassie stained gels (black arrowheads). In addition, there appear to be significant impurities in the recombinant protein preps, as shown by the presence of additional bands in the bait protein only lanes in the silver stain. This suggests that an initial His/Ni2+-NTA purification followed by a Strep II/Streptactin pulldown is insufficient to obtain pure protein. Subsequent purifications of recombinant baits included a gel filtration step to further purify bait proteins after the Ni2+-NTA purification before incubation with cell lysate. Furthermore, a large number of proteins found in cell lysates bind to the resin alone (without any bait protein), as shown by the presence of bands in the cell lysate only lanes. This suggests that preclearing of the lysate with unloaded resin is important to reduce the number of background binders detected by MS. Even though there were many background/unspecific bands, specific bands found only in the bait + lysate lanes could be observed in both the silver and Coomassie stains (highlighted with red arrowheads). To determine the identity of these bands, they were excised and tryptically digested in-gel. The peptide mixtures were analysed by LC-MS/MS in collaboration with Dr. Jyoti Choudhary (Sanger Institute, Cambridge, UK).


Gel slice 4 Peptides (MS/MS count) Gel slice 3 Peptides (MS/MS count) LtpG H263A Control LtpG WT LtpG H263A Control LtpG WT HSP90 17 (28) #N/A 34 (79) 78kDa glucose-regulated protein 7 (9) 7 (7) 32 (85) Myosin-9 6 (6) 4 (4) 7 (7) Myosin-9 3 (3) 5 (5) 6 (6) Alpha-actinin-4 5 (6) 7 (8) 17 (20) Wiskott-Aldrich syndrome protein family member 2 1 (1) #N/A 1 (1) Actin 1 (5) 4 (10) 6 (7) Keratin, type I cytoskeletal 10 1 (1) 2 (2) 5 (7) Keratin, type II cytoskeletal 1 3 (4) 1 (2) 5 (7) Keratin, type II cytoskeletal 79 1 (1) #N/A #N/A Keratin, type I cytoskeletal 10 2 (2) 1 (1) 5 (7) Prelamin 1 (1) 1 (1) 7 (7) Tubulin alpha 1 (1) #N/A 1 (1) Actin 1 (1) 4 (9) 3 (3) Gelsolin 2 (2) 1 (1) 11 (16) Hematopoietic lineage cell-specific protein 1 (1) #N/A 2 (2) Keratin, type II cytoskeletal 79 1 (1) #N/A #N/A Vesicle-fusing ATPase 1 (1) #N/A #N/A Far upstream element-binding protein 2 1 (1) #N/A 2 (2) Keratin, type II cytoskeletal 1 1 (1) 2 (2) 8 (10) SH3 domain-containing kinase-binding protein 1 1 (1) #N/A 5 (5) Propionyl-CoA carboxylase alpha chain, mitochondrial #N/A 8 (9) 1 (1) Propionyl-CoA carboxylase alpha chain, mitochondrial #N/A 1 (1) #N/A Alpha-actinin-4 #N/A 5 (5) 7 (8) Propionyl-CoA carboxylase beta chain, mitochondrial #N/A 1 (1) #N/A Keratin, type II cytoskeletal 2 #N/A 3 (3) #N/A Keratin, type II cytoskeletal 2 epidermal #N/A 1 (1) #N/A Procollagen galactosyltransferase 1 #N/A 2 (2) #N/A Tropomyosin alpha-3 chain #N/A 1 (1) 1 (1) Thioredoxin #N/A 1 (1) #N/A Ras GTPase-activating-like protein IQGAP #N/A #N/A 4 (4) Acetyl-CoA carboxylase 1 #N/A 1 (1) #N/A Keratin, type I cytoskeletal 9 #N/A #N/A 2 (2) Myosin regulatory light chain 12A #N/A 1 (1) #N/A Elongation factor 2 #N/A #N/A 2 (2) Heat shock cognate 71 kDa protein #N/A 1 (1) 3 (3) Apolipoprotein B receptor #N/A #N/A 1 (1) Methylcrotonoyl-CoA carboxylase subunit alpha, mitochondrial #N/A 1 (1) #N/A Pericentriolar material 1 protein #N/A #N/A 1 (1) Trifunctional enzyme subunit alpha, mitochondrial #N/A 1 (1) #N/A 
Protein phosphatase 1 regulatory subunit 12A #N/A #N/A 1 (1) HSP90 #N/A #N/A 10 (11) Importin subunit beta-1 #N/A #N/A 1 (1) Pro-IL16 #N/A #N/A 3 (3) Glucosidase 2 subunit beta #N/A #N/A 1 (1) Stress-70 protein, mitochondrial #N/A #N/A 3 (4) U4/U6.U5 tri-snRNP-associated protein 1 #N/A #N/A 1 (1) Zyxin #N/A #N/A 4 (4) Disks large-associated protein 5 #N/A #N/A 1 (1) Tubulin alpha #N/A #N/A 1 (2) 26S proteasome non-ATPase regulatory subunit 2 #N/A #N/A 1 (1) Far upstream element-binding protein 2 #N/A #N/A 2 (2) La-related protein 4B #N/A #N/A 1 (1) Vesicle fusing ATPase #N/A #N/A 1 (1) CTP synthase 1 #N/A #N/A 1 (1) Serine/threonine-protein phosphatase 6 regulatory subunit 1 #N/A #N/A 1 (1) CD2-associated protein #N/A #N/A 1 (1) CTP synthase 1 #N/A #N/A 2 (2) 78 kDa glucose-regulated protein #N/A #N/A 1 (1) Alpha taxilin #N/A #N/A 1 (1) Integrin beta #N/A #N/A 1 (1) Protein phosphatase 1G #N/A #N/A 1 (1) Differentially expressed in FDCP 6 homolog #N/A #N/A 1 (1) Tubulin beta #N/A #N/A 1 (1) Heat shock 70kDa protein 6 #N/A #N/A 1 (1) La-related protein 4B #N/A #N/A 1 (1) Rab GTPase-binding effector protein 2 #N/A #N/A 1 (1) Keratin, type I cytoskeletal 9 #N/A #N/A 1 (1) Protein phosphatase 1 regulatory subunit 12A #N/A #N/A 1 (1) SH3 domain-binding glutamic acid-rich-like protein 3 #N/A #N/A 1 (1) Gelsolin #N/A #N/A 2 (2) LIM domain and actin-binding protein 1 #N/A #N/A 1 (1) CapZ-interacting protein #N/A #N/A 1 (1) Probable ATP-dependent RNA helicase DDX17 #N/A #N/A 1 (1) Protein transport protein Sec16A #N/A #N/A 1 (1) Gel slice 2 Peptides (MS/MS count) LtpG H263A Control LtpG WT Tubulin alpha 2 (5) 2 (8) Gel slice 1 Peptides (MS/MS count) Keratin, type I cytoskeletal 10 3 (3) 2 (2) LtpG H263A Control LtpG WT S100-A4 4 (4) 8 (17) Keratin, type II cytoskeletal 1 6 (9) 2 (3) Keratin, type II cytoskeletal 1 2 (2) 3 (3) Keratin, type I cytoskeletal 10 7 (8) 2 (2) NADH dehydrogenase 1 (1) 1 (1) S100-A6 3 (7) 1 (1) Keratin, type II cytoskeletal 79 1 (1) #N/A Stretavidin 
1 (1) #N/A Myosin regulatory light chain 12A 1 (1) #N/A Tubulin beta 1 (1) #N/A Trypsin-1 1 (2) #N/A Tubulin alpha 1 (1) #N/A Streptavidin 1 (1) 2 (2) Small nuclear ribonucleoprotein G 1 (1) #N/A Actin 1 (1) 3 (3) 40s ribosomal protein S29 1 (1) #N/A SH3 domain-binding glutamic acid-rich-like protein 3 2 (2) 5 (9) 40s ribosomal protein S28 1 (1) #N/A 40S ribosomal protein S21 #N/A 2 (4) REST corepressor 2 1 (1) #N/A Tubulin beta #N/A 1 (1) ATP synthase subunit e 1 (1) #N/A Thioredoxin #N/A 1 (1) Keratin, type I cytoskeletal 9 #N/A 1 (1)

Table 3.5. Table of MS protein identifications for each gel slice. Number of unique peptides and spectral counts for each protein are shown. #N/A = protein not identified.

In general the samples provided very weak signals and few proteins were identified from the gel slices. Of the proteins identified, many were inferred from a single unique peptide match, which does not provide strong evidence of their presence in the sample. The maximal number of matched MS/MS spectra in a single sample was 229 whilst most samples had fewer than 100 identified MS/MS spectra. This is a poor return given that current mass spectrometers are capable of ~25000 MS/MS scans over a typical 2 h LC run.

However, HSP90 was identified as a putative interactor (Table 3.5). It was most robustly identified in gel slice 4 with a few peptides also found in gel slice 3. 34 and 17 HSP90 peptides were identified from gel slice 4 for LtpG WT and H263A respectively whilst no peptides were found in the control sample. Gel slices 1 and 2 were difficult to analyse confidently as there was no negative control for these samples due to limited MS time. Most proteins identified in gel slices 1 and 2 are common contaminants such as keratin, tubulin and actin. S100A4 was also identified, with 8 and 4 peptides from LtpG WT and H263A respectively in gel slice 2. However, without a negative control sample for gel slice 2, there is insufficient evidence to suggest that S100A4 is an interacting partner.

3.13.2. On-bead digest proteomics

Due to the weak MS signals obtained from in-gel tryptic digest, the pulldown was repeated with enriched proteins digested on-bead. This eliminates additional manipulation steps (such as SDS-PAGE) in which proteins/peptides could be lost. Lem28 (an uncharacterised Legionella effector) was included as an additional negative control to control for unspecific binding to bait proteins. All subsequent MS experiments were performed in-house.

On-bead digest yielded more protein IDs than in-gel digest. Across the seven MS runs, 1345 proteins were identified by 54833 MS/MS spectra (~8000 matched MS/MS spectra per sample). Using spectral counts as a quantitative measure of protein abundance, stringent filtering was performed in which proteins with spectral counts in any of the four controls (lysate and baits alone) were filtered out. Additionally, any E. coli proteins were filtered out from the dataset. The remaining proteins were ranked according to the number of MS/MS spectra.
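The filtering and ranking steps described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis script used in this work: the sample keys, the `is_ecoli` flag and the function names are hypothetical, and the E. coli entry is invented to show the species filter. The remaining spectral counts are taken from Tables 3.6 and 3.7.

```python
# Column order used for the spectral count tables.
SAMPLES = (
    "lysate_only", "Lem28", "LtpG_WT", "LtpG_H263A",
    "Lem28+lysate", "LtpG_WT+lysate", "LtpG_H263A+lysate",
)
# The four controls: lysate alone and each bait alone.
CONTROLS = SAMPLES[:4]


def make_protein(gene, counts, is_ecoli=False):
    """Bundle a gene name with its per-sample spectral counts."""
    return {"gene": gene, "counts": dict(zip(SAMPLES, counts)),
            "is_ecoli": is_ecoli}


def filter_and_rank(proteins, rank_by="LtpG_WT+lysate"):
    """Drop E. coli proteins and any protein seen in a control, then rank."""
    kept = [
        p for p in proteins
        if not p["is_ecoli"]
        and all(p["counts"][c] == 0 for c in CONTROLS)
    ]
    return sorted(kept, key=lambda p: p["counts"][rank_by], reverse=True)


proteins = [
    make_protein("HSP90AA1", (1, 0, 0, 0, 1, 111, 147)),
    make_protein("KDM3B", (0, 0, 0, 0, 0, 42, 45)),
    make_protein("AHNAK", (0, 0, 0, 0, 0, 92, 111)),
    # Invented entry, included only to exercise the E. coli filter.
    make_protein("hypothetical_ecoli_protein", (0, 0, 0, 0, 0, 5, 5),
                 is_ecoli=True),
]
ranked = [p["gene"] for p in filter_and_rank(proteins)]
```

With this strict rule HSP90AA1 is removed because of a single spectrum in the lysate only control, which illustrates how a zero-tolerance filter can discard an abundant candidate.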

The bait proteins were identified in the appropriate samples (Table 3.6). However, occasional MS/MS spectra were assigned to bait proteins in samples where they should have been absent. This may be due to residual peptides left on the column from the previous MS run or misassignment of the spectra by the software. Nevertheless, ~300 spectra were assigned to Lem28 in Lem28 containing samples whilst a maximum of 13 spectra were assigned in Lem28 absent samples. Approximately 800 spectra were assigned to LtpG (WT/H263A) in LtpG containing samples. Only 6 spectra were assigned to LtpG in the lysate only sample with no peptides matching in the Lem28 alone sample. However, 217 spectra were assigned to LtpG in the Lem28+lysate sample.


Spectral counts
lysate only | Lem28 | LtpG WT | LtpG H263A | Lem28 + lysate | LtpG WT + lysate | LtpG H263A + lysate | Protein names | Gene names
0 | 283 | 1 | 13 | 304 | 1 | 11 | Lem28 | Lem28
6 | 0 | 891 | 827 | 217 | 833 | 767 | LtpG | LtpG
1 | 0 | 0 | 0 | 1 | 111 | 147 | Heat shock protein HSP 90-alpha | HSP90AA1
3 | 0 | 0 | 0 | 4 | 75 | 81 | Heat shock protein HSP 90-beta | HSP90AB1

Table 3.6. Table showing spectral counts of the bait proteins Lem28, LtpG and the putative interactor HSP90 for each sample.

HSP90 was originally filtered from the list of putative interactors as there were a few spectra found in the negative controls. However, upon manual inspection, only 1 and 3 spectra were found in the lysate only negative control for HSP90AA1 and HSP90AB1 respectively. In contrast, 111 and 147 spectra were identified as HSP90AA1 peptides for LtpG WT and H263A samples. 75 and 81 spectra were assigned to HSP90AB1 for LtpG WT and H263A samples respectively. The results suggest that HSP90 remains a prominent putative interaction partner of LtpG. However, as bait proteins were produced recombinantly, it is plausible that chaperones are detected as potential interacting partners due to their ability to bind misfolded protein.
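One way to make the manual inspection above more systematic is to compare bait and control spectral counts as a ratio instead of requiring zero counts in every control. The sketch below is an assumption offered for illustration and is not the method used in this work; the function name and the pseudocount of 1 (used to avoid division by zero) are hypothetical choices. The counts are those of Table 3.6.

```python
def fold_over_control(bait_count, control_count, pseudocount=1):
    """Ratio of bait to control spectral counts, with a floor on the control."""
    return bait_count / max(control_count, pseudocount)


# Spectral counts from Table 3.6: lysate only, LtpG WT + lysate,
# LtpG H263A + lysate.
hsp90_counts = {
    "HSP90AA1": (1, 111, 147),
    "HSP90AB1": (3, 75, 81),
}
enrichment = {
    gene: (fold_over_control(wt, lysate), fold_over_control(h263a, lysate))
    for gene, (lysate, wt, h263a) in hsp90_counts.items()
}
```

Both HSP90 isoforms are enriched at least 25-fold over the lysate only control by this measure, consistent with retaining them as putative interactors despite the caveat about chaperones binding recombinant bait.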

There were not many proteins specific for Lem28 after the initial filtering. Only one protein (sulphide:quinone oxidoreductase, mitochondrial) had at least 2 MS/MS spectra and was specific to the Lem28+lysate sample compared with LtpG WT+lysate and LtpG H263A+lysate. As such, the remaining proteins were essentially all putative LtpG interactors. Proteins were ranked according to the number of spectral counts in the LtpG WT+lysate sample and the top 25 filtered proteins are shown in Table 3.7. HSP90AA1 and HSP90AB1 were also included in this table. The data appear consistent between LtpG WT and H263A, suggesting that Fic domain activity did not alter the binding partners of LtpG and, by implication, that the mutation does not alter the structure of LtpG.
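The filtering and ranking described above can be illustrated with a minimal Python sketch. The spectral counts are copied from Tables 3.6 and 3.7; the definition of a negative control (any sample lacking either LtpG bait or lysate) and the `max_control_spectra` parameter are assumptions made for illustration and are not taken from the analysis software used in this work.

```python
# Column order of Tables 3.6 and 3.7.
SAMPLES = ["lysate only", "Lem28", "LtpG WT", "LtpG H263A",
           "Lem28 + lysate", "LtpG WT + lysate", "LtpG H263A + lysate"]

# Spectral counts copied from the tables (a subset of rows).
COUNTS = {
    "HSP90AA1": [1, 0, 0, 0, 1, 111, 147],
    "AHNAK":    [0, 0, 0, 0, 0, 92, 111],
    "HSP90AB1": [3, 0, 0, 0, 4, 75, 81],
    "KDM3B":    [0, 0, 0, 0, 0, 42, 45],
    "COPA":     [0, 0, 0, 0, 0, 18, 21],
}

# Assumption: samples lacking LtpG bait or lacking lysate act as negative controls.
CONTROLS = ["lysate only", "Lem28", "LtpG WT", "LtpG H263A", "Lem28 + lysate"]
RANK_BY = "LtpG WT + lysate"


def rank_candidates(counts, max_control_spectra=0):
    """Keep proteins with at most `max_control_spectra` in every control,
    then rank them by spectral count in the LtpG WT + lysate sample."""
    idx = {name: i for i, name in enumerate(SAMPLES)}
    kept = []
    for gene, row in counts.items():
        if all(row[idx[c]] <= max_control_spectra for c in CONTROLS):
            kept.append((gene, row[idx[RANK_BY]]))
    return sorted(kept, key=lambda item: item[1], reverse=True)


strict = rank_candidates(COUNTS)                          # HSP90 isoforms are filtered out
relaxed = rank_candidates(COUNTS, max_control_spectra=4)  # HSP90 isoforms are retained
print(strict)
print(relaxed)
```

With a strict threshold the two HSP90 isoforms are removed because of the few spectra in the controls, which mirrors why they had to be re-included after manual inspection.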

Spectral counts

lysate only | Lem28 | LtpG WT | LtpG H263A | Lem28 + lysate | LtpG WT + lysate | LtpG H263A + lysate | Protein names | Gene names
1 | 0 | 0 | 0 | 1 | 111 | 147 | Heat shock protein HSP 90-alpha | HSP90AA1
0 | 0 | 0 | 0 | 0 | 92 | 111 | Neuroblast differentiation-associated protein AHNAK | AHNAK
3 | 0 | 0 | 0 | 4 | 75 | 81 | Heat shock protein HSP 90-beta | HSP90AB1
0 | 0 | 0 | 0 | 0 | 42 | 45 | Lysine-specific demethylase 3B | KDM3B
0 | 0 | 0 | 0 | 0 | 25 | 43 | Baculoviral IAP repeat-containing protein 6 | BIRC6
0 | 0 | 0 | 0 | 0 | 18 | 21 | Coatomer subunit alpha;Xenin;Proxenin | COPA
0 | 0 | 0 | 0 | 0 | 16 | 10 | Nucleolar GTP-binding protein 2 | GNL2
0 | 0 | 0 | 0 | 0 | 16 | 7 | Guanine nucleotide-binding protein-like 3-like protein | GNL3L
0 | 0 | 0 | 0 | 0 | 13 | 5 | Unconventional myosin-XVIIIa | MYO18A
0 | 0 | 0 | 0 | 0 | 12 | 13 | Nascent polypeptide-associated complex subunit alpha | NACA
0 | 0 | 0 | 0 | 0 | 12 | 24 | Structural maintenance of chromosomes protein 1A | SMC1A
0 | 0 | 0 | 0 | 0 | 11 | 4 | Transcription factor ETV6 | ETV6
0 | 0 | 0 | 0 | 0 | 11 | 10 | La-related protein 4B | LARP4B
0 | 0 | 0 | 0 | 0 | 11 | 15 | Structural maintenance of chromosomes protein 3 | SMC3
0 | 0 | 0 | 0 | 0 | 9 | 6 | | RAVER1
0 | 0 | 0 | 0 | 0 | 9 | 13 | Double-strand-break repair protein rad21 homolog | RAD21
0 | 0 | 0 | 0 | 0 | 8 | 11 | Protein disulfide-isomerase | P4HB
0 | 0 | 0 | 0 | 0 | 8 | 16 | Myotubularin-related protein 14 | MTMR14
0 | 0 | 0 | 0 | 0 | 8 | 14 | COBW domain-containing protein 1;COBW domain-containing protein 3;COBW domain-containing protein 5;Putative COBW domain-containing protein 7;COBW domain-containing protein 6 | CBWD1;CBWD3;CBWD5;CBWD7;CBWD6
0 | 0 | 0 | 0 | 0 | 7 | 7 | Dynactin subunit 1 | DCTN1;DKFZp686E0752
0 | 0 | 0 | 0 | 0 | 7 | 11 | Transcription factor BTF3 | BTF3
0 | 0 | 0 | 0 | 0 | 7 | 5 | Mothers against decapentaplegic homolog 2;Mothers against decapentaplegic homolog 3 | SMAD2;SMAD3
0 | 0 | 0 | 0 | 0 | 7 | 8 | CREB-regulated transcription coactivator 2 | CRTC2
0 | 0 | 0 | 0 | 0 | 7 | 3 | Guanine nucleotide-binding protein-like 3 | GNL3
0 | 0 | 0 | 0 | 0 | 7 | 6 | ATPase family AAA domain-containing protein 3A;ATPase family AAA domain-containing protein 3B;ATPase family AAA domain-containing protein 3C | ATAD3A;ATAD3B;ATAD3C

Table 3.7. Table showing spectral counts of the top 25 putative LtpG interaction partners after stringent filtering. Proteins were ranked according to spectral counts.

To narrow the list of putative interactors of LtpG, the proteins were grouped and clustered using the STRING database (http://string-db.org/) to identify whether known protein complexes were detected (Figure 3.35) [Szklarczyk et al., 2015]. The 220 putative interacting proteins were submitted for this analysis.

Figure 3.35. Putative LtpG interactors cluster into three main groups. Network of the 220 putative interaction partners using the STRING database. Proteins were clustered using the in-built algorithm. Each node represents an individual protein and each edge represents a relationship between two proteins. [Szklarczyk et al., 2015]

Many proteins did not associate with any other protein from the list and are represented as individual nodes with no edges. However, some clusters could be seen. In particular, there are two major clusters: ribosomal proteins and the coatomer COPI complex. A third cluster of mainly chaperones can also be seen but this is less distinct. Ribosomes are highly abundant structures in the cell and hence are likely to be unspecific binders. Furthermore, closer inspection of the data reveals that most of these proteins had few spectral counts (~1-2). Additionally, some were detected in the Lem28+lysate sample, providing further evidence that they may bind promiscuously to any bait protein.
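The distinction between clustered proteins and unconnected single nodes can be illustrated with a minimal Python sketch that groups proteins into connected components of an interaction graph. This is a simplified stand-in and not the in-built STRING clustering algorithm; the edge list is hypothetical and was chosen only to resemble the COPI and NAC groupings discussed here.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes into connected components using a depth-first search."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        components.append(group)
    return components


# Hypothetical edge list for illustration only (not STRING output).
nodes = ["COPA", "COPB1", "COPB2", "ARCN1", "NACA", "BTF3", "AHNAK", "KDM3B"]
edges = [("COPA", "COPB1"), ("COPB1", "COPB2"), ("COPB2", "ARCN1"), ("NACA", "BTF3")]

components = connected_components(nodes, edges)
clusters = [c for c in components if len(c) > 1]
singletons = [next(iter(c)) for c in components if len(c) == 1]
print(clusters)
print(singletons)
```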

The COPI complex, involved in coating vesicles for retrograde (cis-Golgi to ER) trafficking, consists of 7 subunits: COPA, COPB1, COPB2, COPG, ARCN1, COPE and COPZ. The proteomics results identified all but the COPZ subunit in both LtpG WT and H263A samples. In addition, no MS/MS spectra from any of the control samples were assigned to any of these components. The COPA subunit had the most spectral counts with 18 and 21 matched for WT and H263A respectively, suggesting that LtpG may bind to the complex via the COPA subunit.

A direct Y2H assay was performed to confirm the interaction between LtpG and the coatomer complex as well as to determine which component LtpG directly interacted with. To this end, all full-length COPI subunits were cloned into the prey vector pGADT7. COPG was included as an additional bait protein and CDK7 AA157-346 as an additional prey as positive controls. Although the positive controls grew on QDO, no growth was seen for any of the coatomer COPI subunits with either LtpG WT or LtpG H263A (Figure 3.36). This suggests that either the interaction is not amenable to analysis by Y2H or that LtpG does not directly interact with the COPI complex. It is also possible that the hit was a false positive from the proteomics workflow.

Figure 3.36. LtpG does not interact with any of the COPI subunits in a Y2H assay. Yeast AH109 were co-transformed with pGBKT7-baits (empty vector, LtpG, LtpG H263A or COPG) and pGADT7-preys (empty vector, COPA, COPB1, COPB2, COPG, ARCN1, COPE, COPZ or CDK7 AA157-346). Multiple colonies were pooled and patched onto DDO and QDO plates. Plates were incubated at 30°C. Pictures were taken 3 days post-patching.

3.13.3. Competition pulldown and quantitative proteomics by dimethyl labelling

To try and distinguish between true interactors and background binders, a competition experiment was attempted (Figure 3.37).

Figure 3.37. Schematic of the competition pulldown experiment. Increasing concentrations of recombinant His-LtpG were used to deplete mammalian cell lysates of any specific LtpG binders. His-LtpG and its interaction partners are removed from the lysate using Ni2+-NTA. The depleted lysate is then added onto StrepTactin resin pre-loaded with His-LtpG-Strep. Bound proteins are digested on-bead and peptides light dimethyl labelled. Each peptide sample is spiked with a heavy dimethyl labelled non-competed sample to enable comparison between samples. Peptide mixtures are analysed by LC-MS/MS. Dose response curves of protein intensity relative to the non-competed spike against increasing concentration of competing LtpG enable LtpG-specific interactors to be determined.

Various concentrations of His-LtpG were pre-incubated with the cell lysate. His-LtpG and its binding partners were removed using Ni2+-NTA resin. The resulting precleared lysate (depleted in LtpG-specific binding partners depending on the concentration of His-LtpG used) was added to StrepTactin resin pre-loaded with a fixed concentration of His-LtpG-Strep. As the His-LtpG will compete with His-LtpG-Strep for interacting partners, an increase in concentration of His-LtpG should result in a decrease in abundance of interactors being pulled down with the bait His-LtpG-Strep. Using quantitative MS with dimethyl labelling, interactors should exhibit a dose-response curve with increasing His-LtpG whilst unspecific binders should be unperturbed by increasing His-LtpG concentrations. Five amounts of His-LtpG were used (0, 0.1, 0.2, 0.5 and 1.0 mg) whilst the amount of His-LtpG-Strep was constant at 0.1 mg for each sample. Quantification was done using a spike-in dimethyl label. Tryptic peptides from each sample were labelled with light dimethyl reagents resulting in a mass shift of +28 Da for each lysine or N-terminal amine. Each of these samples was spiked at a 1:1 ratio with a heavy dimethyl labelled (+32 Da) control sample consisting of a non-competed LtpG pulldown with lysate. As the heavy label is constant across all samples, comparison of the L/H ratios between each sample (ratio of ratios) enables the samples to be compared across different MS runs. The experiment was performed in triplicate.
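As a worked example of the labelling arithmetic, the following minimal Python sketch computes the nominal mass shifts for a peptide using the +28 Da (light) and +32 Da (heavy) values stated above. The peptide sequences are hypothetical, and the sketch assumes a free N-terminal amine and complete labelling of every lysine.

```python
LIGHT_SHIFT_DA = 28  # nominal shift per labelled amine (light label), as stated in the text
HEAVY_SHIFT_DA = 32  # nominal shift per labelled amine (heavy spike-in), as stated in the text


def labelled_amines(peptide):
    """Count the N-terminal amine plus one amine per lysine side chain."""
    return 1 + peptide.upper().count("K")


def label_shifts(peptide):
    """Return (light shift, heavy shift, heavy - light spacing) in Da."""
    n = labelled_amines(peptide)
    light = n * LIGHT_SHIFT_DA
    heavy = n * HEAVY_SHIFT_DA
    return light, heavy, heavy - light


# Hypothetical tryptic peptides: one ending in lysine, one ending in arginine.
print(label_shifts("AGLSEK"))  # two labelled amines
print(label_shifts("AGLSER"))  # one labelled amine
```

Each labelled amine therefore separates the light and heavy forms of a peptide by a nominal 4 Da, which is what allows the L/H ratio to be read from a single MS run.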

This experiment proved technically challenging owing to the multiple manipulations of the cell lysate during the preclearing steps with His-LtpG. This resulted in rather inconsistent MS datasets, with many samples providing insufficient data points for analysis (Table 3.8). Six of the 15 samples (indicated in red) did not quantify enough proteins to justify their further analysis within the dataset. However, removal of these samples still enabled a 3-point concentration series to be plotted for each of the replicates to generate simplistic dose response curves.

Table 3.8. Number of quantifiable proteins for each MS sample. Samples highlighted in red were removed from subsequent analysis due to lack of data points.

All detected E. coli proteins were manually removed from the dataset. L/H ratios at each concentration were normalised to the non-competed sample of each replicate and dose-response curves of normalised log2 L/H ratio against concentration of His-LtpG plotted. A box plot at each concentration reveals that the data do not follow a normal distribution centred on 0 (Figure 3.38A). Additionally, the distribution of the data varies between the replicates, suggesting that the replicate datasets are not very reproducible. However, the log2 ratio of LtpG (in red) does seem homogeneous and is around 0 across all the samples as expected (Figure 3.38A). Due to the variability of the data, caution should be taken in drawing any substantial conclusions from this dataset.
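The normalisation to the non-competed sample can be sketched in Python as follows. The L/H values are hypothetical and chosen only to contrast a competed protein with an unperturbed background binder; the function name and data layout are assumptions for illustration and do not reproduce the software used for the analysis.

```python
import math


def normalised_log2_ratios(lh_by_dose):
    """Normalise L/H ratios to the non-competed (0 mg) sample and log2-transform.

    `lh_by_dose` maps the amount of competing His-LtpG (mg) to the L/H ratio
    of one protein in one replicate.
    """
    reference = lh_by_dose[0.0]  # non-competed sample of the same replicate
    return {dose: math.log2(ratio / reference)
            for dose, ratio in sorted(lh_by_dose.items())}


# Hypothetical 3-point series for one replicate.
specific_binder = {0.0: 1.0, 0.1: 0.5, 1.0: 0.25}  # depleted by the competitor
background = {0.0: 0.8, 0.1: 0.8, 1.0: 0.8}        # unaffected by the competitor

print(normalised_log2_ratios(specific_binder))
print(normalised_log2_ratios(background))
```

A specific interactor gives increasingly negative normalised log2 ratios as the amount of competitor rises, whereas a background binder stays at 0.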

Figure 3.38. Dose response curves of normalised log2 L/H (competed/non-competed) ratios of all quantifiable proteins. Each protein is normalised relative to its non-competed L/H ratio. (A) Box plots showing the distribution of protein L/H ratios at each competed concentration. LtpG is highlighted in red. (B) NACA, BTF3 and BTF3L4 (highlighted in red) are depleted with increasing concentration of competitive LtpG. (C) COPI subunits (highlighted in red) do not show a dose response. (D) HSP90AA1 and HSP90AB1 (highlighted in red) show a moderate dose response to increasing concentration of competitive LtpG.

Across the three replicates, a few proteins showed dose-dependent characteristics. Multiple components of the nascent polypeptide-associated complex (NAC), including NACA (nascent polypeptide-associated complex subunit alpha), BTF3 and BTF3L4, showed the strongest competition depletion (Figure 3.38B). NAC components were also identified in the previous MS results specifically for LtpG (WT and H263A). They are clustered with the ribosomal proteins as NAC binds emerging newly synthesised polypeptides and shields them from unwanted interactions with cytosolic proteins. As with chaperones, the ability of NAC to bind unfolded peptide sequences makes it difficult to conclude whether it is a genuine interaction partner or an artefact of recombinant protein assays.

COPI complex components such as COPA, COPB1, COPB2, ARCN1 and COPG1 were also identified. However, they did not show competition dependency, suggesting that they were not genuine binding partners of LtpG and supporting the direct Y2H data (Figure 3.38C).

HSP90 showed some competition dependency but to a lesser extent than NAC components (Figure 3.38D). This could potentially be rationalised by the high abundance of HSP90 in the cell: even a 10-fold excess of competitor (1 mg of His-LtpG) may be unable to completely deplete HSP90 from the lysate.

3.14. Discussion

The experimental data suggest that LtpG is a T4SS effector protein which does not play a role in intracellular replication of L. pneumophila or is functionally redundant with another effector. During infection, it localises to both the nucleus and cytoplasm where it is mainly diffuse but can also be seen to associate with network structures. Currently only the effector RomA has been shown to target a nuclear protein, histone H3 [Rolando et al., 2013]. LtpG is a 59 kDa protein and is therefore unlikely to be able to pass into the nucleus without an active mechanism. As such, LtpG may require host importin machinery or L. pneumophila may translocate unknown effectors to promote nuclear localisation of LtpG [Miyamoto et al., 2016]. However, as ectopically expressed LtpG also localises to the nucleus, it suggests that its localisation may not require additional L. pneumophila proteins.

Although LtpG does not provide a strong phenotype in mammalian cells, ectopic expression of LtpG in yeast is toxic and dependent on its active Fic domain.

The observation that mutation of the critical Fic motif histidine to alanine does not completely restore yeast growth back to empty vector control levels can be rationalised by the fact that mutant LtpG may still bind its substrate protein and prevent its normal function by a steric hindrance mechanism. This intermediate phenotype may only be visible due to overexpression of LtpG whereby there is sufficient LtpG to inhibit its target protein at stoichiometric levels. However, as the H263A mutant has no catalytic Fic activity in vitro, this defect in growth may only be possible as long as LtpG is still binding to its partner protein and therefore the Fic mutant exhibits a weaker phenotype compared with the WT protein. Alternatively, LtpG may have additional functions beyond its Fic-domain which restrict yeast growth.

The observed Fic-dependent cytotoxicity in yeast was surprising given that no cytotoxicity was observed upon ectopic expression in mammalian cells. However, transfection was only over a 24 h period, equating to approximately the doubling time of HeLa cells, whilst the growth of yeast was followed over a 72 h time course (approximately 36 replication cycles given a 2 h doubling time). Therefore, the cytotoxic effects of LtpG on mammalian cells may have been missed at the experimental time scale. Alternatively, the target of LtpG may not be essential in mammalian cells. However, the H263-dependent cytotoxicity in yeast clearly shows that LtpG has enzymatic activity.

The auto-AMPylation of LtpG further supports this conclusion. Although LtpG readily modifies itself, no targets could be seen by in-gel fluorescence when recombinant LtpG was incubated with lysates and Yn-6-TP. Although Yn-6-TP is a useful probe to study AMPylation modifications, it is limited by its inability to cross the cell membrane due to the negative charge of the triphosphate. As such, it can only be used in in vitro assays with recombinant proteins or cell lysates. As certain proteins such as membrane proteins are difficult to solubilise in an extracellular setting, the target of LtpG may not be present in in vitro reaction mixtures. Furthermore, AMPylation is a difficult PTM to study due to the wide use of ATP as both a substrate and an energy source by various proteins and hence Yn-6-TP is likely to be utilised and hydrolysed by other enzymes. However, only proteins which retain the alkyne-containing adenine ring will be visualised, enabling Yn-6-TP to remain a specific probe.

DSF assays suggest that whilst LtpG is able to utilise ATP as a substrate, it has higher affinity for guanosine-containing nucleotides. Although GDP showed the strongest binding amongst the metabolites tested in the DSF assay, the natural abundances of these metabolites in a cellular environment may govern LtpG's binding to them rather than affinity alone. Furthermore, using radioactive GTP as a substrate, LtpG was unable to modify itself or any substrates in lysate. The notion that Fic domains bind exclusively diphosphate-containing substrates is further supported by the ability of pyrophosphate to produce a positive melting temperature shift of LtpG similar to that of GDP (the best binder tested) [Mukherjee et al., 2011]. In addition, all NMPs tested resulted in very minor melting temperature shifts, suggesting that monophosphate-containing metabolites do not bind strongly in the Fic active site. As no obvious target of the Fic enzymatic activity could be detected in either AMPylation or GMPylation assays, and with only a small subset of diphosphate-containing metabolites being tested in the DSF assay, it is highly possible that LtpG utilises a novel diphosphate-containing small molecule substrate not screened in our DSF panel.

Crystallisation of LtpG was attempted and although multiple crystal forms were obtained, the diffraction resolution was insufficient for the structure to be solved. As there were no guarantees that a structure could be solved, this aspect of the project was abandoned. However, structural data would likely be a breakthrough in understanding the small molecule substrate specificity of LtpG using in silico docking of putative substrates as well as identifying additional domains beyond the catalytic Fic-domain core [Campanacci et al., 2013; Castro-Roa et al., 2013].

Multiple assays were attempted to determine the interaction partners of LtpG. A yeast-2-hybrid screen identified CDK7 as a putative hit. Although a C-terminal fragment consisting of amino acids 157-346 of CDK7 robustly interacted with LtpG, full length CDK7 did not. Variations of in vitro pulldowns were tried as an alternative assay. These experiments initially identified multiple components of the COPI complex. These interactions, however, could not be confirmed by a direct Y2H. Furthermore, in additional competition MS experiments, COP subunits were not depleted with increasing competing LtpG. The nascent polypeptide-associated complex consisting of NACA and BTF3 showed competition dependency, adding confidence that it may be a true interacting partner of LtpG. HSP90 was also frequently detected in all MS experiments and appears to bind LtpG well. However, due to the nature of in vitro pulldowns in which the bait protein is produced recombinantly, binding of chaperones to misfolded bait protein remains a distinct possibility. Additionally, it is impossible to determine in this assay whether chaperone binding is due to misfolding of the recombinant protein or is potentially a mechanism for LtpG to fold properly once injected into the host cell through the T4SS. A third possibility exists in which LtpG binds HSP90 to alter its function. Although putative interactors were determined using Y2H and MS-based pulldown assays, there were no overlapping hits between the two methodologies. Although the competition experiment may aid in distinguishing genuine binders from unspecific background using dose-response curves, the technical challenges of the competition pulldown assay introduced extra variability in the results and hence reproducibility of replicates became a serious issue.

Furthermore, as these assays determine interactions in artificial environments, they do not recapitulate the unique environment within host cells during infection. To this end, we decided to no longer pursue identifying putative binding partners of LtpG by recombinant protein assays.

Chapter 4: Results – Developing mass spectrometric methods to determine effector binding proteins

4.1. Introduction

As most effectors share little-to-no sequence homology with known proteins, it is often difficult to predict function. Even for the case of LtpG for which a functional Fic domain was identified, its role during infection could still not be elucidated. Key to the functional characterisation of effectors is the identification of protein binding partners during infection. By determining their target proteins, functional analyses of the affected signalling pathways can be examined.

Although in vitro techniques, such as Y2H and pulldowns, have provided useful information, they are also likely to produce false positive hits due to the inability of the experimental design to account for the unique cellular changes during infection. The unique microenvironment created during infection cannot be replicated by studying effectors ectopically. In particular, L. pneumophila resides in the LCV inside host cells, a membranous compartment which is only created during infection.

Recent technological advances in protein arrays such as the Nucleic Acid Programmable Protein Array (NAPPA) have made them more amenable to the study of protein-protein interactions [Ramachandran et al., 2004]. Whilst traditional protein arrays require purified proteins to be spotted onto a slide, matrix or membrane, NAPPA only requires DNA to be spotted, which is subsequently translated in vitro. The translated protein is directly bound adjacent to the DNA using a pre-spotted antibody. Currently over 10,000 different proteins can be spotted across five slides. Importantly, the traditional problem of protein stability is less prominent as proteins are translated in situ and hence do not require storage. Additionally, NAPPA enables equimolar concentrations of proteins to be used rather than complex mixtures such as those found in cellular lysates. Although this does not represent physiological concentrations of proteins or a physiological environment, it may aid in identifying typically low abundance proteins and weak interacting partners. NAPPA has already been used to study SidM and LidA protein-protein interactions [Yu et al., 2015a]. Furthermore, the clickable AMPylation assay has been modified for the NAPPA array format and AMPylated targets of SidM identified using this methodology [Yu et al., 2015b].

Co-immunoprecipitation experiments using ectopically expressed effectors provide a small improvement over recombinant protein assays in that organelle compartmentalisation remains intact during the experiment and hence protein-protein interactions are discovered in a cellular environment.

Although transfection allows effectors to be studied in cellular models individually, many effectors have been shown to act in concert and hence a transfection system may be too simplistic to understand effector functions. In addition, effectors are translocated into host cells through the Dot/Icm machinery. In contrast, upon ectopic expression the effector enters the host cell through the normal protein translation pathway using host ribosomes and ER. Hence, subcellular targeting and the surrounding proteome may be vastly different compared with Dot/Icm translocation. Furthermore, effector-specific chaperones will be missing from the host cell in a transfection system, which may prevent correct protein folding. However, in some cases, for example effectors which exist as toxin-antitoxin pairs such as SidM-SidD and AnkX-Lem3, only ectopic expression systems, in which the antitoxin is absent, make phenotypes apparent and enable functional characterisation.

Figure 4.1. Schematic of the BirA/Bio-tag translocation dependent biotinylation methodology. Host cells expressing the biotin ligase BirA are infected with L. pneumophila strains expressing His6-Bio-tagged effectors. Upon translocation in the host cell through the Dot/Icm T4SS, the His6-Bio-tagged effector comes into contact with BirA and becomes biotinylated. As the effectors perform their functions and bind physiological host targets, these interactions are stabilised at the infection time point of interest using chemical crosslinking. The effector complexes are isolated using tandem-affinity purification (TAP) and their composition determined by liquid chromatography tandem mass spectrometry (LC-MS/MS). Taken from [So et al., 2016].

To overcome these limitations, we developed a novel mass spectrometry based method to determine effector interactomes during infection (Figure 4.1) [Mousnier et al., 2014]. Using a tandem-affinity purification (TAP) handle of hexahistidine and a BirA-specific biotinylation sequence fused to the N-terminus of an effector of interest, a two-step purification process using Ni2+-NTA and streptavidin sequentially enables effector complexes to be isolated. His-tags are typically only 6 amino acids and the BirA-specific biotinylation sequence only 15 amino acids in length. These short tags are beneficial as they are less likely to interfere with the protein's endogenous function when compared with larger protein tags such as GFP and GST. In addition, both the His/Ni2+-NTA and biotin/streptavidin interactions are amenable to a wide range of buffer conditions. In particular, they tolerate harsh denaturing conditions such as SDS and urea, enabling robust purification of tagged proteins.

As many effectors have been shown to be enzymes, their binding affinities to target proteins may be weaker to minimise product inhibition after the enzymatic reaction. To maximise the probability of detecting these transient interactions, chemical crosslinking using formaldehyde was utilised to stabilise complexes covalently prior to cell lysis. Formaldehyde enables crosslinks to be formed between lysine residues and N-terminal amines within a 2-3 Å radius.

A particular advantage of the method is the ability to remove background bacterial binders such as chaperones by only providing the E. coli biotin ligase BirA in the host cell. Therefore, only after translocation into the host cell does the TAP-tagged effector become biotinylated, enabling intrabacterial (untranslocated) and translocated effector populations to be distinguished. Effector complexes isolated by TAP are then subjected to tryptic digest and analysed by LC-MS/MS.

This methodology was first employed to identify the interactome of the effector PieE [Mousnier et al., 2014]. Multiple Rab GTPases were identified and confirmed as interacting partners by Y2H, suggesting that PieE acted as a Rab GTPase binding hub. However, although the proof-of-principle of the method was confirmed, the experimental scale (42 x 15 cm plates per sample) meant that the methodology was low throughput. Given that there are over 300 effectors, most of which have no predictable function, a low throughput technique was not sufficient to further our understanding of Legionella pathogenesis.

To this end, we further developed the technique to provide semi-quantitative analysis as well as increase throughput. To do so, we focused on the effector SidM as it is one of the best characterised effectors, with known interaction partners which can serve as a control for optimisation. A mutation of the target lysine residue in the biotinylation sequence to alanine (K/A), rendering the tag unable to be biotinylated, was used as a negative control. This enabled the proteomic changes due to overexpression of the effector to be accounted for in the negative control. Isotopic labelling in the form of dimethyl labelling was used to facilitate direct quantitative comparisons between samples. This allowed putative interactors to be ranked according to enrichment factors as an indicator of confidence in a particular potential binding partner. Enrichment factors were determined as the difference in average log2 intensities between the sample (Bio) and negative control (Bio K/A). Higher enrichment factors, therefore, correlate with higher confidence potential interactors. For analytical purposes, proteins either found exclusively in the Bio sample or with an enrichment factor ≥2 were considered potential interactors. In addition, enrichment reproducibility was assessed over technical triplicates (separately infected and processed samples on the same day) or biological duplicates of technical duplicates. A number of key parameters (cell type, number of purification steps, lysis method and crosslinking) were examined as part of the methodology optimisation process.
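The enrichment factor calculation can be illustrated with a minimal Python sketch. The log2 intensities below are hypothetical, and representing missing values as `None` and ignoring them when averaging is an assumption made for illustration rather than a description of the data processing actually used.

```python
from statistics import mean


def enrichment(bio_log2, ka_log2, threshold=2.0):
    """Return (enrichment factor, is_potential_interactor).

    The enrichment factor is the difference in average log2 intensity between
    the Bio sample and the Bio K/A negative control. Replicates with missing
    values (None) are ignored. Proteins seen only in Bio have no defined
    factor but are flagged as potential interactors.
    """
    bio = [v for v in bio_log2 if v is not None]
    ka = [v for v in ka_log2 if v is not None]
    if not bio:
        return None, False
    if not ka:
        return None, True  # found exclusively in the Bio sample
    factor = mean(bio) - mean(ka)
    return factor, factor >= threshold


# Hypothetical triplicate log2 intensities.
print(enrichment([24.0, 25.0, 26.0], [21.0, 22.0, 23.0]))  # enriched in Bio
print(enrichment([20.0, 20.0, 20.0], [19.5, 19.5, 19.5]))  # background binder
print(enrichment([22.0, None, 23.0], [None, None, None]))  # exclusive to Bio
```

Because intensities are on a log2 scale, an enrichment factor of 2 corresponds to a 4-fold higher average intensity in the Bio sample than in the K/A control.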

4.2. TAP aids interactor identification in addition to reducing background binders compared to SAP

Although TAP undeniably enables tagged proteins to be enriched to a higher purity than single-step purification techniques, it may not be as beneficial when attempting to identify interaction partners, as weaker interaction partners may be lost during the extra manipulation steps. In light of recent advances in quantitative MS, background proteins can be filtered out more efficiently given appropriate controls, hence enabling transient and weaker interactors to be identified.

To determine whether increased sensitivity could be gained from performing a single Neutravidin affinity purification (SAP) over a TAP or whether the increased background from SAP would convolute the data analysis, A549 and THP-1 cells expressing BirA were infected with L. pneumophila strains expressing either His6-Bio-SidM or His6-Bio K/A-SidM. Complexes were crosslinked using 1% formaldehyde at 24 h and 6 h post infection for A549 and THP-1 cells respectively. Infection of THP-1 cells could not be pursued for longer time points due to the toxicity exerted onto the cells. Crosslinked cells were lysed with a 1% Triton X-100 phosphate-based buffer and effector complexes purified by either SAP or TAP. Enriched complexes were digested into peptides by trypsin and their composition deciphered by LC-MS/MS. Each stage of the purification process was monitored by Western Blot analysis for both the His-tag and the biotinylated Bio-tagged bait protein.

[Figure 4.2 image: Streptavidin and Anti-His Western blots of SAP and TAP purification fractions from pBio and pBio K/A samples. Lane labels include M, Pellet, Soluble, His FT, His Elu, His Beads, Neu FT and Neu Beads. The SidM band is indicated; size markers at 100, 80, 56, 48, 32 and 25 kDa.]

Figure 4.2. His6-Bio-SidM is specifically biotinylated in A549-BirA cells and both tags can be used for enrichment. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with 1% formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched using either a single Neutravidin purification (SAP) or sequential Ni2+-NTA and Neutravidin purifications (TAP). The presence of tagged-effector at each stage of the purification process was probed using Western Blot against the biotinylation of the effector and His-tag. FT = flowthrough, Elu = elution, His Beads = post-elution boiled Ni2+-NTA beads, Neu Beads = post-tryptic digest boiled Neutravidin beads.

The pellet contains insoluble bait protein. Protein complexes in the pellet cannot be specifically enriched and hence are not suitable for analysis by LC-MS/MS. The soluble fraction represents the input for the affinity purifications. The flowthrough (FT) shows the fraction of proteins which do not bind to the affinity matrix. The “His beads” fractions show the bait protein retained on the His resin after elution and therefore show the efficiency of His elution. The His elution (His Elu) fraction shows the amount of bait complex eluted from the His resin and also acts as a reference for the input onto the Neutravidin resin. The “Neu beads” are post-on-bead tryptic digest samples and therefore show the efficiency of the tryptic digest. Incomplete digestion is indicated by a signal in the “Neu beads” sample.

As expected, only pBio and not the pBio K/A samples contain biotinylated bait proteins whilst both still contain His-tags (Figure 4.2). A significant proportion of the bait protein is found in the insoluble fraction. This is likely due to overcrosslinking which can result in unspecific aggregation of proteins.

Neither the His nor the Neutravidin (Neu) purification was quantitative, as a signal for the bait protein can be detected in both the His FT and Neu FT fractions. Incomplete binding on the affinity matrices will inevitably result in weaker MS signals as there is less protein to analyse. Longer incubations may aid in binding of bait protein to the resin. However, this would likely also increase the amount of unspecific background proteins binding to the matrix. Therefore, although there is a loss of signal from incomplete binding, the reduction in background could facilitate the data analysis and identification of true interactors. The elution of bait protein from the Ni2+-NTA resin appears efficient, with no signal being observed on the His beads post-elution. Notably, there is only a loss of His signal between input (Soluble for SAP and His Elu for TAP) and the Neu FT lanes for pBio but not K/A samples. This suggests that, as expected, Bio-SidM but not the control Bio K/A-SidM was biotinylated and bound to the Neutravidin resin. The tryptic digest also appears efficient, with no undigested bait protein being observed in the Neu beads fractions.


[Figure 4.3 image: Western blots of the SAP and TAP purification fractions for pBio and pBio K/A samples, probed with streptavidin and anti-His. Lanes: M (marker), Pellet, Soluble, His FT, His Beads, His Elu, Neu FT and Neu Beads (His lanes for TAP only). Molecular weight markers from 25 to 245 kDa; the SidM band is indicated.]

Figure 4.3. His6-Bio-SidM is specifically biotinylated in THP-1-BirA cells and both tags can be used for enrichment. THP-1-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with 1% formaldehyde and lysed 6 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched using either a single Neutravidin purification (SAP) or sequential Ni2+-NTA and Neutravidin purifications (TAP). The presence of tagged effector at each stage of the purification process was probed by Western blot against the biotinylation of the effector and the His-tag. FT = flowthrough, Elu = elution, His Beads = post-elution boiled Ni2+-NTA beads, Neu Beads = post-tryptic digest boiled Neutravidin beads.

The blots for the THP-1 samples were more difficult to analyse as there was more background signal (Figure 4.3). This is likely because less bait protein was present in these samples than in the A549 samples, a consequence of the lower multiplicity of infection (MOI) and shorter infection time. However, Bio-SidM could be enriched, as evidenced by a specific band of the correct size in the His elution which becomes depleted in the Neu FT fraction.


Figure 4.4. TAP enhances the detectable SidM interactome. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked using 1% formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched either by a single Neutravidin purification (SAP) or sequential Ni2+-NTA and Neutravidin purifications (TAP). Isolated effector complexes were digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. (A) Heat map showing log2 intensities of all identified proteins across both SAP and TAP conditions. Proteins were ordered based on SAP enrichment factors (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual replicate. (B) Zoom of log2 intensity heat maps indicating Top10 ranked enriched proteins according to SAP and TAP enrichment factors. Legionella proteins are in bold. (C) Comparison of protein enrichment factors obtained between SAP and TAP conditions. Common interactors are depicted as green triangles. SAP- and TAP-specific interactors are represented as red squares and blue circles respectively. (D) Log2 intensity heat map of the 15 common interactors identified between SAP and TAP. Proteins identified as a Top10 ranked hit are indicated with a cross (X). Taken from [So et al., 2016].


Figure 4.5. Rab1A is found in SidM effector complexes during infection of THP-1 cells. THP-1-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked using 1% formaldehyde and lysed 6 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched either by a single Neutravidin purification (SAP) or sequential Ni2+-NTA and Neutravidin purifications (TAP). Isolated effector complexes were digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. (A) Heat map showing log2 intensities of all identified proteins across both SAP and TAP conditions. Proteins were ordered based on SAP enrichment factors (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual replicate. Proteins classified as an interactor in either SAP or TAP are in bold. (B) Comparison of protein enrichment factors obtained between SAP and TAP conditions. SidM and Rab1A are represented as blue and green circles respectively. Taken from [So et al., 2016].


The resulting MS data were visualised as log2 intensity heat maps of each replicate and ranked according to enrichment factors. The bait protein SidM was identified in Bio-SidM samples from both cell lines and under both purification conditions (Figure 4.4A and B and Figure 4.5A). However, SidM intensity in THP-1 samples was 30-fold lower than in A549 samples. In addition, the enrichment factor of SidM was 13 in A549 samples but only 10 in THP-1 samples (Figure 4.4C and Figure 4.5B). This suggests that less bait protein was recovered from THP-1 cells during the purification process, which can be attributed to the shorter infection time and lower MOI used for THP-1 cells as a result of the cytotoxicity observed. Although a low bait intensity does not necessarily mean that interactors cannot be determined, few proteins were identified as potential interactors of SidM in THP-1 samples. Only 7 and 4 proteins passed the selection criteria for SAP and TAP respectively (Figure 4.5A). This suggests that THP-1 samples approached the limits of detection and that, although THP-1 cells are a more physiologically relevant infection model than A549 cells, their infection dynamics render them unsuitable for this analysis on this scale. For this reason, all subsequent data sets were obtained from infection of A549-BirA cells. Nevertheless, proof-of-concept for the use of THP-1 cells was provided by the identification of Rab1A as a potential interactor under both SAP and TAP conditions (Figure 4.5B).

For A549-BirA cells, 20 of the 147 (13.6%) proteins identified under SAP conditions were classified as potential interactors (enrichment factor >2) (Figure 4.4A). By comparison, 38 of the 83 (45.8%) proteins identified under TAP conditions were potential interactors. 15 common interactors were identified between the two purification methods (Figure 4.4C). Ranking the interactors according to enrichment factors revealed that 8 of the top 10 ranked hits (Top10) were shared between SAP and TAP, indicating that both methods identified similar high confidence potential interactors (Figure 4.4D). However, although many of the Top10 interactors were shared, the enrichment factors obtained from TAP were higher than those from SAP, suggesting that the extra enrichment step helped distinguish potential interactors from unspecific background (Figure 4.4C).
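The proportions quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch: the counts are taken from the text, whereas the function and variable names are assumptions introduced here and are not part of the original analysis pipeline.

```python
# Counts reported in the text for A549-BirA cells (Figure 4.4A).
counts = {
    "SAP": {"identified": 147, "interactors": 20},
    "TAP": {"identified": 83, "interactors": 38},
}


def interactor_percentage(identified: int, interactors: int) -> float:
    """Percentage of identified proteins classified as potential interactors."""
    return 100.0 * interactors / identified


# Hypothetical helper output: one rounded percentage per purification method.
percentages = {
    method: round(interactor_percentage(**c), 1) for method, c in counts.items()
}

for method, pct in percentages.items():
    print(f"{method}: {pct}% of identified proteins are potential interactors")
```

Expressed this way, the comparison makes clear that TAP identifies fewer proteins overall but that a much larger share of them pass the interactor threshold.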

Multiple Rab GTPases were detected, with Rab1A, 1B, 6 and 10 being identified as Top10 ranked interactors under both purification conditions (Figure 4.4B). Rab2, 8A and 14 were also detected as potential interactors but ranked lower. The host proteins annexin A1 and ubiquitin were also consistently found as putative Top10 interactors. In addition to host proteins, the T4SS Legionella effectors Lpw_31531 (MavP), Lpw_17241 (PpeB) and Lpw_25181 (Lpg2327) were identified as potential SidM interactors. Whilst MavP was a Top10 hit across SAP and TAP, Lpg2327 was ranked lower and PpeB was only a Top10 hit under SAP conditions (Figure 4.4B).

4.3. Denaturing conditions are detrimental to the SidM-Rab1 interaction

Only soluble proteins can be subjected to purification and hence analysis by mass spectrometry. However, there is no guarantee that a protein is stable outside of a cellular context. This is further complicated by the addition of chemical crosslinkers. As formaldehyde is able to react with any lysine or N-terminal amine, crosslinking will inevitably lead to protein aggregates, thus decreasing their solubility. To determine whether different lysis buffers could aid interactor determination through enhanced proteome solubilisation, four different lysis conditions were employed: 1% Triton X-100, 1% 3-((3-cholamidopropyl) dimethylammonio)-1-propanesulfonate (CHAPS), 0.5% sodium dodecyl sulphate (SDS) and 1% Triton X-100/6M guanidinium chloride (GnCl). This panel represents a non-ionic (Triton X-100), a zwitterionic (CHAPS) and an ionic (SDS) detergent in addition to the chaotropic agent guanidinium chloride. As SDS and guanidinium chloride are strongly denaturing agents, they are unlikely to allow native non-crosslinked interactions to be detected but provide the most solubilising buffer conditions. Triton X-100 and CHAPS, in contrast, are milder detergents under which native interactions may be preserved during the purification process but which are less likely to solubilise large crosslinked complexes.


[Figure 4.6 image: Western blots of the TAP purification fractions, probed with streptavidin and anti-His, for four lysis conditions: (A) 6M GnCl + 1% Triton X-100, (B) 1% Triton X-100, (C) 1% CHAPS, (D) 0.5% SDS. Lanes for each pBio and pBio K/A sample: M (marker), Pellet, Soluble, His FT, His Beads, His Elu, Neu FT and Neu Beads. Molecular weight markers from 32 to 190 kDa; the SidM band is indicated.]

Figure 4.6. Guanidinium chloride aids solubilisation of crosslinked effector complexes. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with 1% formaldehyde and lysed 24 h post-infection using phosphate buffers containing different detergents and chaotropic agents. Effector complexes were enriched using TAP. The presence of tagged effector at each stage of the purification process was probed by Western blot against the biotinylation of the effector and the His-tag. FT = flowthrough, Elu = elution, His Beads = post-elution boiled Ni2+-NTA beads, Neu Beads = post-tryptic digest boiled Neutravidin beads. (A) 6M guanidinium chloride and 1% Triton X-100. (B) 1% Triton X-100. (C) 1% CHAPS. (D) 0.5% SDS.

Western blot analysis revealed that the 6M GnCl + 1% Triton X-100 buffer did help solubilise the crosslinked complexes, as there was proportionally less bait protein in the pellet than in the soluble fraction (Figure 4.6A). In contrast, the bait protein was evenly distributed between pellet and soluble fractions under Triton and CHAPS conditions (Figure 4.6B and C). Unfortunately, the samples from the SDS condition were not amenable to Western blot analysis (Figure 4.6D). Although signal could be detected in the His Elu and Neu FT fractions, the bait protein could not be detected in either the pellet or soluble fractions. However, Ponceau staining revealed that there was protein in those fractions (data not shown). Hence it was not possible to determine how SDS performed relative to the other detergents in aiding protein complex solubilisation.


Figure 4.7. Denaturing conditions hinder detection of the SidM/Rab1 interaction. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked using 1% formaldehyde and lysed 24 h post-infection using phosphate buffers containing different detergents and chaotropic agents (6M GnCl + 1% Triton X-100, 1% Triton X-100, 1% CHAPS and 0.5% SDS). Effector complexes were enriched by TAP, digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. (A) Heat map showing log2 intensities of all identified proteins across all lysis conditions. Proteins were ranked based on 1% Triton X-100 lysis condition enrichment factors (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual replicate. (B) Numerical breakdown of the identified proteins into interaction partners and unspecific background for each lysis condition. (C) Venn diagram depicting the number of shared SidM interaction partners identified between lysis conditions. (D) Log2 intensity heat map of the 13 common interaction partners found across all lysis conditions. Top10 ranked hits for each condition are marked with a cross (X). Taken from [So et al., 2016].

The GnCl/Triton lysis condition identified the most proteins (91) compared with the 70, 73 and 66 IDs from the Triton, CHAPS and SDS conditions respectively (Figure 4.7A). However, the overall increase in protein IDs did not result in more interactor identifications: the GnCl/Triton condition identified the fewest potential interactors (25) relative to the 34, 33 and 34 putative interactors from the Triton, CHAPS and SDS conditions respectively (Figure 4.7B). Surprisingly, whilst the GnCl/Triton condition gave the most IDs and the fewest potential interactors, the SDS lysis buffer identified the fewest proteins yet produced the joint highest number of potential interactors, suggesting that the ways in which GnCl and SDS aid solubilisation and disrupt interactions differ substantially.

The four lysis conditions identified 61 unique putative interactors, of which only 13 were common to all lysis buffers (Figure 4.7C). In particular, Rab1A, Rab6, annexin A1, annexin A2 and MavP were Top10 hits across all conditions (Figure 4.7D). Although Rab1A was identified under all four lysis conditions, its intensity under denaturing conditions (GnCl/Triton and SDS) was lower than under non-denaturing conditions (Triton and CHAPS), whilst the intensity of the bait SidM remained consistent across all conditions (Table 4.1). The Rab1A intensity from the GnCl/Triton condition was 40- and 25-fold lower than those found under Triton and CHAPS conditions respectively. Although the Rab1A intensity for SDS was higher than for GnCl/Triton, it was still 12- and 7-fold lower than those from Triton and CHAPS conditions respectively. Furthermore, Rab1B was ranked outside the Top10 hits under denaturing conditions. Its enrichment factor was approximately 2 for both denaturing conditions, whilst it was 10 and 8 for Triton and CHAPS respectively. This suggests that the Rab1-SidM interaction is hindered under denaturing conditions. As denaturing conditions did not appear beneficial for studying the SidM interactome, all subsequent experiments were performed using the Triton lysis condition.
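As a worked check of the fold differences quoted for Rab1A, a difference d between two log2 intensities corresponds to a linear fold difference of 2^d. The sketch below applies this to the Rab1A average log2 intensities listed in Table 4.1; the function and variable names are illustrative assumptions, not part of the original analysis.

```python
# Average log2 intensities of Rab1A per lysis condition, copied from Table 4.1.
rab1a_log2 = {
    "GnCl/Triton": 25.64,
    "Triton": 30.95,
    "CHAPS": 30.29,
    "SDS": 27.40,
}


def fold_difference(log2_high: float, log2_low: float) -> float:
    """Convert a difference of two log2 intensities into a linear fold difference."""
    return 2 ** (log2_high - log2_low)


# Compare each non-denaturing condition against each denaturing condition.
fold = {
    (mild, denaturing): fold_difference(rab1a_log2[mild], rab1a_log2[denaturing])
    for denaturing in ("GnCl/Triton", "SDS")
    for mild in ("Triton", "CHAPS")
}

for (mild, denaturing), value in fold.items():
    print(f"Rab1A, {mild} vs {denaturing}: {value:.1f}-fold")
```

Rounded to whole numbers, these values agree with the 40-, 25-, 12- and 7-fold differences stated in the text.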

              Average log2 intensity                   Enrichment factor
Gene names    GnCl/Triton  Triton  CHAPS  SDS          GnCl/Triton  Triton  CHAPS  SDS
SidM          36.30        35.39   35.41  36.36        15.97        14.78   13.84  10.93
RAB1A         25.64        30.95   30.29  27.40        5.32         12.35   11.29  9.95
RAB1B         21.90        28.60   27.67  21.84        2.21         10.33   8.42   2.65

Table 4.1. Table of average log2 intensities and enrichment factors of SidM, Rab1A and Rab1B across all lysis conditions. Chaotropic agents decrease both Rab1 intensity and enrichment factor.

4.4. Moderate crosslinking increases the detectable SidM interactome

Although the probability of stabilising an interaction increases with increasing concentrations of formaldehyde, higher concentrations also promote protein aggregation and increase the number of covalent linkages between the bait and any background proteins. To determine whether formaldehyde crosslinking enhanced the detectable SidM interactome, two formaldehyde concentrations (1% and 3%) were used along with an uncrosslinked control.

[Figure 4.8 image: Western blots of the TAP purification fractions, probed with streptavidin and anti-His, for three crosslinking conditions: (A) 0% CH2O, (B) 1% CH2O, (C) 3% CH2O. Lanes for each pBio and pBio K/A sample: M (marker), Pellet, Soluble, His FT, His Beads, His Elu, Neu FT and Neu Beads. Molecular weight markers from 25 to 190 kDa; the SidM band is indicated.]

Figure 4.8. Increasing formaldehyde concentration decreases effector complex solubility. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with various concentrations of formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched using TAP. The presence of tagged-effector at each stage of the purification process was probed using Western Blot against the biotinylation of the effector and His-tag. FT = flowthrough, Elu = elution, His Beads = post-elution boiled Ni2+-NTA beads, Neu Beads = post-tryptic digest boiled Neutravidin beads. (A) 0% formaldehyde crosslinking. (B) 1% formaldehyde crosslinking. (C) 3% formaldehyde crosslinking.

Western blot analysis revealed that SidM is largely soluble in the absence of crosslinking, with only a weak signal corresponding to SidM appearing in the pellet fraction (Figure 4.8A). Crosslinking with 1% formaldehyde renders approximately half of the SidM population insoluble (Figure 4.8B). Under 3% formaldehyde crosslinking conditions, however, there is a lack of signal in the soluble fraction whilst there is a strong distinct band in the pellet (Figure 4.8C). This suggests that 3% formaldehyde forces essentially all SidM to form protein aggregates which cannot be solubilised in the lysis buffer. Nevertheless, a small proportion of SidM remains soluble, as evidenced by a faint band in the His elution fraction.


Figure 4.9. Moderate formaldehyde crosslinking enhances the detectable SidM interactome. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with 0%, 1% or 3% formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched by TAP, digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. (A) Heat map showing log2 intensities of all identified proteins across all formaldehyde crosslinking concentrations. Proteins were ranked based on 1% formaldehyde crosslinking condition enrichment factors (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual replicate. (B) Numerical breakdown of the identified proteins into interaction partners and unspecific background for each formaldehyde concentration. (C) Log2 intensity heat map showing the Top10 ranked hits based on the 1% formaldehyde crosslinking concentration. Adapted from [So et al., 2016].

Whilst 3% formaldehyde drastically decreased the number of protein IDs (25), 1% formaldehyde slightly increased the number of identified proteins to 83 from the 73 identified with no crosslinking (Figure 4.9A). In addition to a higher number of total protein identifications, the 1% formaldehyde condition revealed the highest number of potential interactors, with 38 compared with the 10 found in each of the uncrosslinked and 3% formaldehyde conditions (Figure 4.9B).


              Average log2 intensity           Δlog2 intensity (SidM-X)
Gene names    0% CH2O  1% CH2O  3% CH2O        0% CH2O  1% CH2O  3% CH2O
SidM          36.46    34.69    31.28          0.00     0.00     0.00
RAB1A         31.85    30.33    26.84          4.61     4.36     4.44
RAB1B         29.54    27.65    24.21          6.92     7.04     7.07
Ubiquitin     26.53    26.06    23.54          9.93     8.63     7.74

Table 4.2. Table of average log2 intensities and difference in log2 intensities with respect to SidM for SidM, Rab1A, Rab1B and ubiquitin for each formaldehyde concentration. Detection of ubiquitin is enhanced with crosslinking whilst identification of Rab1 is independent of crosslinking concentration.

The bait log2 intensity was highest in the uncrosslinked condition (36.5), as expected, and decreased with increasing formaldehyde concentration (34.7 and 31.3 for 1% and 3% CH2O respectively) (Table 4.2). Other than the bait protein, only three proteins were found within the Top10 across all conditions: Rab1A, Rab1B and ubiquitin (Figure 4.9C). Whilst the intensities of Rab1A and Rab1B remained proportional to that of SidM at all three crosslinking concentrations, the relative intensity of ubiquitin increased with increasing crosslinking concentration (Table 4.2). This suggests that crosslinking did not aid the identification of Rab1A and Rab1B as interacting partners of SidM, and therefore that the binding interface between these proteins is not amenable to short 2-3Å lysine-lysine crosslinks. In contrast, the interaction between SidM and ubiquitin appears to be enhanced by crosslinking (Table 4.2). Furthermore, a number of proteins identified as Top10 interactors for the 1% formaldehyde condition were only identified in the presence of crosslinking: annexin A2, annexin A1, Rab10, MavP and ALDH1A1 (Figure 4.9C). With the exception of MavP, these proteins were also identified under the 3% formaldehyde condition.
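The Δlog2 values in Table 4.2 are simply the bait log2 intensity minus the log2 intensity of each protein at the same formaldehyde concentration. The following sketch recomputes them from the tabulated averages and summarises how much each difference varies across concentrations; the variable names and the use of the range (maximum minus minimum) as a summary are assumptions made here for illustration.

```python
# Average log2 intensities at 0%, 1% and 3% formaldehyde, copied from Table 4.2.
conditions = ("0% CH2O", "1% CH2O", "3% CH2O")
log2_intensity = {
    "SidM": (36.46, 34.69, 31.28),
    "RAB1A": (31.85, 30.33, 26.84),
    "RAB1B": (29.54, 27.65, 24.21),
    "Ubiquitin": (26.53, 26.06, 23.54),
}

bait = log2_intensity["SidM"]

# Difference in log2 intensity relative to the bait (SidM - X) per condition.
delta = {
    protein: tuple(round(b - x, 2) for b, x in zip(bait, values))
    for protein, values in log2_intensity.items()
}

# Range of the difference across concentrations: a small range means the
# protein tracks the bait regardless of crosslinking.
spread = {protein: round(max(d) - min(d), 2) for protein, d in delta.items()}

for protein in log2_intensity:
    print(protein, dict(zip(conditions, delta[protein])), "range:", spread[protein])
```

For Rab1A and Rab1B the difference to the bait varies by no more than about 0.25 log2 units, whereas for ubiquitin it shrinks by more than 2 log2 units between 0% and 3% formaldehyde, which is the pattern described in the text.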

The data suggest that moderate crosslinking (1% formaldehyde) increased not only the total number of proteins identified but also the number of potential interactors. In contrast, at 3% formaldehyde, proteins are overcrosslinked and the resulting solubility issues prevent complex composition from being deciphered by MS.


4.5. The SidM interactome is dependent on both crosslinker length and reactivity

As the Rab1A/B-SidM interaction was not stabilised by short lysine-lysine crosslinks, not all protein-protein interaction interfaces appear amenable to formaldehyde crosslinking. This is unsurprising given that formaldehyde crosslinking stringently requires two lysine residues/N-terminal amines within 2-3Å of one another at the binding interface. To test whether a more complete interactome could be obtained using alternative crosslinkers, four additional conditions were included: dithiobis(succinimidyl propionate) (DSP), dithiobismaleimidoethane (DTME), succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC) and DSP+DTME. Uncrosslinked and formaldehyde crosslinked samples acted as controls.

This panel of crosslinkers enabled sampling of both different chemical reactivities and various crosslinking distances. DSP is a homobifunctional NHS-ester crosslinker enabling lysines/N-terminal amines within 12Å to be covalently linked. DTME, on the other hand, is a homobifunctional maleimide crosslinker able to form crosslinks between cysteine residues within 13.3Å of each other. SMCC is an 8.3Å heterobifunctional crosslinker with NHS-ester and maleimide moieties, which forms crosslinks between a lysine/N-terminal amine and a cysteine residue. A combination of DSP and DTME was used to determine whether any synergistic effects could be observed.
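The crosslinker panel can be summarised as a small data structure recording the residue type targeted by each reactive end and the maximum crosslinking distance. This is a minimal sketch for orientation only: the class and field names are hypothetical, the distances are those quoted in the text, and formaldehyde is entered with the upper bound of its stated 2-3Å range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crosslinker:
    """Illustrative record of one crosslinker (names are assumptions)."""
    name: str
    targets: tuple  # residue type targeted by each reactive end
    max_span_angstrom: float  # maximum distance bridged, as quoted in the text


panel = [
    Crosslinker("formaldehyde", ("amine", "amine"), 3.0),
    Crosslinker("DSP", ("amine", "amine"), 12.0),
    Crosslinker("DTME", ("cysteine", "cysteine"), 13.3),
    Crosslinker("SMCC", ("amine", "cysteine"), 8.3),
]

# Crosslinkers with at least one amine-reactive (lysine/N-terminus) end.
amine_reactive = [c.name for c in panel if "amine" in c.targets]

# Crosslinkers that only link cysteine residues.
cysteine_only = [c.name for c in panel if set(c.targets) == {"cysteine"}]

# Panel ordered from the shortest to the longest crosslinking distance.
by_span = [c.name for c in sorted(panel, key=lambda c: c.max_span_angstrom)]

print("Amine-reactive:", amine_reactive)
print("Cysteine-only:", cysteine_only)
print("Shortest to longest:", by_span)
```

Grouping the reagents in this way separates the two variables sampled by the panel: formaldehyde and DSP differ mainly in length, whereas DSP, DTME and SMCC differ mainly in reactivity.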


[Figure 4.10 image: Western blots of the TAP purification fractions, probed with streptavidin and anti-His, for six crosslinking conditions: (A) no crosslinker, (B) CH2O, (C) DSP, (D) DTME, (E) SMCC, (F) DSP+DTME. Lanes for each pBio and pBio K/A sample: M (marker), Pellet, Soluble, His FT, His Beads, His Elu, Neu FT and Neu Beads. Molecular weight markers from 32 to 190 kDa; the SidM band is indicated.]

Figure 4.10. Alternative crosslinkers do not alter the ability to purify effector complexes using TAP. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with various crosslinkers and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched using TAP. The presence of tagged effector at each stage of the purification process was probed by Western blot against the biotinylation of the effector and the His-tag. FT = flowthrough, Elu = elution, His Beads = post-elution boiled Ni2+-NTA beads, Neu Beads = post-tryptic digest boiled Neutravidin beads. (A) Uncrosslinked. (B) 1% formaldehyde crosslinking. (C) 1 mM DSP. (D) 0.5 mM DTME. (E) 1 mM SMCC. (F) 1 mM DSP + 0.5 mM DTME.

Western blot analysis showed that none of the crosslinkers excessively aggregated the bait protein SidM; in fact, 1% formaldehyde produced the strongest signal in the pellet relative to the soluble fraction (Figure 4.10). Furthermore, crosslinking with the new reagents did not affect binding to either resin, with signal for SidM observed in the His elution fraction and depleted in the Neu FT under all conditions.


Figure 4.11. The detectable SidM interactome is dependent on both crosslinker reactivity and crosslinker length. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-SidM or His6-Bio K/A-SidM. Cells were crosslinked with various crosslinkers (uncrosslinked, formaldehyde, DSP, DTME, SMCC and DSP + DTME) and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched by TAP, digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. (A) Heat map showing log2 intensities of all identified proteins across all crosslinking conditions. Proteins were ranked based on DSP crosslinking condition enrichment factors (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual replicate. (B) Zoom of the log2 intensity heat map showing the Top10 ranked hits according to DSP enrichment factors. Top10 ranked proteins for their respective crosslinking condition are indicated with a cross (X). (C) Venn diagram of the interactors identified as a function of the crosslinking length of lysine-lysine crosslinkers. (D) Venn diagram of interactors identified based on different crosslinker reactivities. Taken from [So et al., 2016].


Whilst all six conditions identified similar numbers of proteins (~50), the number of interactors varied, with the uncrosslinked and DTME conditions yielding only 13 and 9 potential interactors respectively in contrast to the 27 identified under DSP conditions (Figure 4.11A). Five proteins were found within the Top10 ranked proteins across all six conditions: SidM, Rab1A, Rab1B, Rab6 and ubiquitin (Figure 4.11B). Formaldehyde shared a further two Top10 targets with DSP whilst SMCC had an additional three Top10 proteins in common with DSP. The interactor profile of DTME largely resembled that of the uncrosslinked sample, although it did have one additional shared Top10 hit with DSP. The DSP+DTME condition shared eight Top10 hits with the DSP only condition, suggesting that its interactome profile was largely dominated by DSP reactivity alone.

Although the Top10 ranked proteins were relatively consistent across most conditions, the lower ranked proteins differed depending on both crosslinker reactivity and length. Comparison of Lys-Lys crosslinking length revealed that five proteins were common between the uncrosslinked control, formaldehyde and DSP treated samples (Figure 4.11C). 13 of the 24 Lys-Lys crosslink-dependent interactors were found in both formaldehyde and DSP samples. However, 3 of the 24 required the shorter formaldehyde crosslinks to be detected, whilst 8 putative interactors were only detected using the 12Å DSP crosslinker. Additionally, 7 putative interactors were detected only in the uncrosslinked control, that is, when neither Lys-Lys crosslinker was used. Whilst the DTME-treated interactome largely resembled the control, DSP and SMCC shared 12 potential interactors (Figure 4.11D). This suggests that the binding interface between these proteins and SidM likely has multiple crosslinkable functionalities. 9 additional interactors were DSP specific whilst 7 were only detected using SMCC.

In particular, a number of frequent Top10 hits (Rab10, MavP and annexin A2) were only identified under specific crosslinker conditions (Figure 4.11B). Rab10 could only be identified if the crosslinker used had at least one amine-reactive functionality (formaldehyde, DSP, SMCC and DSP+DTME). MavP showed a stronger dependency on amine reactivity, being identified only in the presence of an amine-amine crosslinker (formaldehyde and DSP). In contrast, annexin A2 showed little reactivity dependence as it was identified under all conditions where a crosslinker was used.


A summary of the most frequent Top10 hits identified across the experimental conditions tested is shown in Table 4.3. A green box indicates a Top10 ranked identification in the given experiment.

Table 4.3. Table of the high confidence interaction partners of SidM. SidM interaction partners identified as a Top10 ranked protein in at least 50% of all experimental conditions tested. Green boxes indicate a Top10 hit. Taken from [So et al., 2016].

Only SidM and Rab1A were ranked within the Top10 hits under every experimental condition. Rab6, ubiquitin and Rab1B were robustly identified as Top10 putative interactors in >85% of samples. Annexin A2, MavP, annexin A1 and Rab10 were identified as Top10 targets in at least 50% of conditions tested. These 9 proteins, comprising the bait and 8 putative interactors, represent the high confidence hits of the infection-dependent SidM interactome using this methodology. Furthermore, we have identified Rab1A, 1B, 6 and 10 as the Rab GTPases which SidM targets during infection.


4.6. The effector LidA binds a specific subset of Rab GTPases during infection

Having optimised the system to identify the interactome of SidM, we determined whether the methodology could be applied to other Legionella effectors. LidA has been reported to bind a multitude of Rab GTPases including Rab1A, 1B, 2, 3B, 4B, 5, 6, 7, 8A/B, 9, 10, 11, 13, 14, 18, 20, 22, 27A/B, 30, 31, 32 and 35 [Cheng et al., 2012; Yu et al., 2015a]. Some of these interactions (Rab1, Rab6 and Rab8) have been reported to be in the picomolar range [Schoebel et al., 2011]. However, other than those with Rab1 and Rab6, all these interactions have only been identified under in vitro conditions [Chen et al., 2013; Machner et al., 2006]. Bio-tagged LidA complexes were purified from infected A549-BirA cells to determine the infection-dependent Rab GTPase binding profile of LidA and any additional novel binding partners. The samples were processed using 1% formaldehyde crosslinking followed by Triton X-100 lysis and TAP.

The top hits for LidA were predominantly Rab GTPases (Figure 4.12A). Across two biological replicates, Rab1A, 1B, 3D, 6, 8A, 10, 14 and 18 were consistently identified. Rab3B, 8B and 13 were also identified but only in one biological repeat. Enrichment factors obtained from each biological replicate differed with higher enrichment factors being obtained in replicate 1 (Figure 4.12B). However, the overall ranking of Rab GTPases was similar between the replicates. The only non-Rab GTPase target within the Top10 ranked proteins was annexin A2.

Whilst Rab1A, 1B, 6 and 10 appear to be targeted by both SidM and LidA, Rab14 and 18 were also consistently identified as Top10 ranked interactors in LidA interactomes. Of the 24 reported LidA-binding Rab GTPases, only 7 were found in LidA interactomes in both biological replicates. In addition, a novel interaction between LidA and Rab3D was identified. As the previously reported promiscuous Rab binding capabilities of LidA were demonstrated by in vitro recombinant protein binding assays, these 8 Rab GTPases are likely more representative of LidA’s infection-dependent Rab GTPase binding profile.
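The overlap between the previously reported LidA-binding Rab GTPases and those identified here can be expressed as simple set operations. The sketch below uses only the Rab GTPases named in the text, with the paired entries 8A/B and 27A/B expanded into separate isoforms; the variable names are illustrative assumptions.

```python
# Rab GTPases previously reported to bind LidA, as listed in the text.
reported = {
    "Rab1A", "Rab1B", "Rab2", "Rab3B", "Rab4B", "Rab5", "Rab6", "Rab7",
    "Rab8A", "Rab8B", "Rab9", "Rab10", "Rab11", "Rab13", "Rab14", "Rab18",
    "Rab20", "Rab22", "Rab27A", "Rab27B", "Rab30", "Rab31", "Rab32", "Rab35",
}

# Rab GTPases identified in LidA interactomes in both biological replicates.
both_replicates = {
    "Rab1A", "Rab1B", "Rab3D", "Rab6", "Rab8A", "Rab10", "Rab14", "Rab18",
}

# Reported binders that were also found during infection.
confirmed = reported & both_replicates

# Rab GTPases found during infection that had not been reported before.
novel = both_replicates - reported

print(f"{len(confirmed)} of {len(reported)} reported binders confirmed:", sorted(confirmed))
print("Novel:", sorted(novel))
```

This reproduces the counts given above: 7 of the 24 reported binders are recovered in both replicates, and Rab3D is the single Rab GTPase not previously reported.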


Figure 4.12. The LidA interactome reveals Rab GTPase binding preferences during infection. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-LidA or His6-Bio K/A-LidA. Cells were crosslinked with 1% formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched by TAP, digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. The experiment was performed as a biological duplicate of technical triplicates. (A) Heat map showing log2 intensities of all identified proteins across two biological replicates. Proteins were ranked based on enrichment factors from replicate 1 (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual technical replicate. Top10 ranked hits are marked with a cross (X). (B) Comparison of protein enrichment factors obtained between biological replicates. LidA is represented as a blue circle. Rab GTPase interactors identified in both biological replicates are indicated as green triangles. Rab GTPase interactors specific for replicate 1 are represented as red squares. Taken from [So et al., 2016].

4.7. Co-immunoprecipitations confirm Rab10 as a genuine interactor of both SidM and LidA

To confirm the Rab GTPase binding profiles of SidM and LidA obtained from the BirA/Bio-tag methodology, the reciprocal complex purification was performed. GFP-Rab2A/5C/10 expressing cells were infected with L. pneumophila strains expressing 4HA-tagged SidM, LidA or the PI4P-binding domain of the effector SidC. Cells were lysed and immunoprecipitated using an anti-GFP antibody. Immunoprecipitated proteins were separated by SDS-PAGE and analysed by Western blot using anti-GFP and anti-HA antibodies. Whilst Rab10 was identified as an interactor for both SidM and LidA in the proteomics, Rab2A was only occasionally found in SidM interactomes with a low enrichment factor and Rab5C was never identified in either effector interactome. The PI4P-binding domain of SidC was used as a negative control since it also localises to the LCV but has never been reported to interact with Rab GTPases.

Figure 4.13. SidM and LidA co-immunoprecipitate with Rab10. A549 cells transduced with GFP-Rab2A, -Rab5C or -Rab10 were infected with L. pneumophila strains expressing HA-tagged LidA, SidC (PI4P binding domain) or SidM. Cells were lysed 24 h post infection and GFP-Rab GTPases immunoprecipitated using anti-GFP beads. Immunoprecipitated proteins were separated by SDS-PAGE and analysed by Western Blot using anti-GFP and anti-HA antibodies. Adapted from [So et al., 2016].

The co-immunoprecipitation data complemented the interactome data with both SidM and LidA being co-immunoprecipitated with GFP-Rab10 (Figure 4.13). Immunoprecipitation of GFP-Rab2A revealed the presence of SidM but to a much weaker extent than GFP-Rab10, reflecting the enrichment factors obtained for Rab2 and Rab10 in SidM interactomes. Immunoprecipitation of GFP-Rab5C did not pull down SidM or LidA. As expected, the PI4P-binding domain of SidC was not co-immunoprecipitated with any of the GFP-Rab GTPases. The data confirm that Rab10 interacts with both SidM and LidA during infection whilst Rab2A is able to bind SidM with lower affinity.

4.8. SidM and LidA are not ubiquitinated during infection

Ubiquitin was frequently identified as a Top10 ranked protein for SidM. Although it was also identified in LidA interactomes, it was less reproducible with only one of the two biological repeats identifying it as a putative interactor. To determine whether the presence of ubiquitin in the interactomes was due to non-covalent binding to bait protein or post-translational modification of the bait protein, A549-BirA cells were infected with L. pneumophila strains expressing pBio (K/A)-SidM or –LidA. The tagged effectors were immunoprecipitated using TAP, analysed by SDS-PAGE and Western Blot against ubiquitin.

Figure 4.14. SidM and LidA are not ubiquitinated during infection. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio (K/A)-SidM or –LidA. Cells were lysed 24 h post infection and tagged-effectors enriched by TAP. Enriched proteins were separated by SDS-PAGE and analysed by Western Blot using Streptavidin and anti-ubiquitin antibodies.

Although both Bio-SidM and LidA could be enriched, a ubiquitin-positive band was not observed in any of the pulled down samples (Figure 4.14). Ubiquitinated proteins could be seen in the input as high molecular weight smears in the WB. This suggests that SidM and LidA are not modified with ubiquitin during infection. This, however, does not rule out the ability of either protein to bind ubiquitin in a non-covalent manner.

4.9. The MavP interactome

The effector MavP was consistently detected in the SidM interactome and was frequently within the Top10 ranked proteins (9 out of the 14 experimental conditions tested). To determine if this novel putative effector-effector interaction could be validated, the reciprocal BirA/Bio-tag experiment was performed using Bio-tagged MavP as bait. Although MavP was identified and had a high enrichment factor of 10, SidM was not detected in the interactome (Figure 4.15). However, four other effectors (MavN, Lpg2327, PpeB and Lem14) were ranked within the Top10, suggesting that MavP may act as an effector binding hub.

Figure 4.15. The MavP interactome identifies multiple Legionella effectors. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-MavP or His6-Bio K/A-MavP. Cells were crosslinked with 1% formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched by TAP, digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. Heat map showing log2 intensities of identified proteins ranked based on enrichment factor (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual technical replicate.

4.10. Direct Y2H to confirm SidM interactions

To determine whether the MavP-SidM interaction could be reproduced in an alternative assay, MavP and SidM were cloned into Y2H vectors pGADT7 and pGBKT7. In addition, a library of Rab GTPases was tested for interaction with SidM and its inactive AMPylation mutant SidM DD102/104AA. AH109 yeast were co-transformed with pairs of pGADT7 and pGBKT7 vectors and spotted onto DDO and QDO plates.

Figure 4.16. SidM is not amenable to Y2H analysis. AH109 yeast were co-transformed with pGBKT7-baits (empty, SidM, SidM DD102/104AA or MavP) and pGADT7-preys (empty, Rab1A, 1B, 2A, 5C, 6A, 7, 10, MavP, SidM or SidM DD102/104AA) and spotted onto DDO and QDO plates. Plates were incubated for 3 days.

No growth on selective QDO media was observed for any interacting pairs except for the positive control (pGBKT7-LtpG with pGADT7-CDK7 (157-346)) (Figure 4.16). The observation that the literature-confirmed SidM-Rab1A interaction could not be reproduced in this assay suggests that SidM is not amenable to Y2H analysis. An alternative assay is likely needed to confirm the interaction between SidM and MavP, as both the reciprocal Bio/BirA pulldown and Y2H failed to recapitulate the SidM interactome data.

4.11. The LtpG interactome

As the Bio/BirA methodology enabled SidM and LidA interactomes to be determined effectively and identified a stringent subset of Rab GTPase targets relevant to infection, LtpG was subjected to the same assay. A549-BirA cells were infected with L. pneumophila expressing His6-Bio-tagged LtpG H263A. LtpG WT (active Fic domain) could not be cloned into the pBio vector. We hypothesise that this is due to the toxicity of the Fic domain activity exerted on the bacteria. However, inactivation of the Fic domain through mutation of the catalytic histidine is unlikely to affect the structure of LtpG or its binding partners. This assumption is supported by the observation that LtpG WT and H263A behave similarly in the DSF assay.

Figure 4.17. The LtpG interactome during infection. A549-BirA cells were infected with L. pneumophila strains expressing His6-Bio-LtpG or His6-Bio K/A-LtpG. Cells were crosslinked with 1% formaldehyde and lysed 24 h post-infection using a 1% Triton X-100 phosphate buffer. Effector complexes were enriched by TAP, digested using trypsin and the resulting peptide mixture analysed by LC-MS/MS. Heat map showing log2 intensities of identified proteins ranked based on enrichment factor (Bio over K/A). Missing values are indicated as grey boxes. Each column represents an individual technical replicate.

HSP90 is again the most likely candidate as a putative interaction partner of LtpG, with an enrichment factor of 10 for both HSP90AA1 and HSP90AB1 (Figure 4.17). An additional heat shock protein, HSPA8, was also found within the Top10 hits. Three redox enzymes, retinal dehydrogenase 1 (ALDH1A1), 6-phosphogluconate dehydrogenase (PGD) and aldo-keto reductase family 1 member C3 (AKR1C3), were identified in the Top10 ranked hits. The remaining Top10 hits consisted of phosphoglycerate kinase 1 (PGK1), tubulin beta and elongation factor 2 (EEF2).

As HSPA8 and ALDH1A1 were also frequently classified as potential interactors in SidM, LidA and MavP interactomes, they likely represent false positive hits. Although they are specifically enriched in Bio samples over the K/A controls, their presence may be due to non-specific crosslinking to the bait protein through proximity rather than to specific binding interactions.

The presence of PGD and PGK1 in the Top10 ranked proteins suggests that LtpG may interfere with glucose metabolism. AKR1C3 is involved in alcohol metabolism with ALDH1A1. However, as ALDH1A1 is likely to be a false positive, it is difficult to interpret whether the presence of both in the Top10 LtpG hits indicates a potential pathway which LtpG may influence.

Tubulin is an essential component of the host cell cytoskeleton. As LtpG can be seen to form network structures in both infection and transfection by immunofluorescence, it may be possible that LtpG is co-localising with tubulin.

EEF2 is an essential factor involved in protein synthesis and translocates the nascent polypeptide from the A-site to the P-site of the ribosome during translation. It was also identified in the second in vitro pulldown experiment with 4 spectral counts for both LtpG WT and H263A (Figure 4.18A). However, it was less robustly identified in the competition pulldown experiment with only 4 of the 9 samples identifying it (Figure 4.18B). Furthermore, it did not show strong dose-dependency.

Figure 4.18. Elongation factor 2 is a potential interaction partner of LtpG. (A) Spectral counts of elongation factor 2 from second in vitro pulldown experiment. (B) Dose response curves of elongation factor 2 from competition in vitro pulldown experiment.

Manual inspection of the data revealed that NACA was identified in the LtpG interactome with an enrichment factor of 4.6. However, as it was only identified by a single unique peptide, it was initially filtered out. The same unique peptide was identified in all three technical repeats. As NACA is only a 215 amino acid protein, this single unique peptide covers 7% of the entire protein. Inspection of the MS/MS spectrum of the peptide reveals a strong identification with many b and y ions assigned (Figure 4.19). Hence, although NACA did not pass initial data filters, it remains a strong candidate as a potential binding partner, which is further strengthened by the in vitro pulldown data. Biological replicates also revealed that NACA is frequently identified in LtpG interactomes (data not shown). However, in each case it was identified with few unique peptide hits.
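
The coverage figure quoted above can be checked with a short calculation. The sketch below assumes a hypothetical peptide of 15 residues (the length implied by 7% of 215 residues); the peptide positions are placeholders, not the actual location of the identified NACA peptide.

```python
def sequence_coverage(protein_length, peptide_spans):
    """Percentage of residues covered by at least one identified peptide.

    peptide_spans holds 1-based, inclusive (start, end) positions.
    Overlapping peptides are only counted once.
    """
    covered = set()
    for start, end in peptide_spans:
        covered.update(range(start, end + 1))
    return 100.0 * len(covered) / protein_length


# Hypothetical 15-residue peptide in a 215-residue protein.
coverage = sequence_coverage(215, [(50, 64)])
print(round(coverage, 1))  # 7.0
```

For a short protein, a single peptide therefore already accounts for a noticeable fraction of the sequence, which is why a one-peptide filter penalises small proteins more than large ones.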

Figure 4.19. MS/MS spectrum of NACA peptide.

4.12. Discussion

The Bio-tag/BirA strategy was further optimised using SidM and LidA as bait proteins whilst testing the number of purifications, lysis conditions and crosslinking conditions. Importantly, the scale of the experiment was reduced from 42 x 15 cm plates to 1 x 10 cm plate per condition, a 116-fold reduction in cell surface area. This enabled higher throughput for the method, allowing multiple conditions and replicates to be performed simultaneously.
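
The fold reduction quoted above can be reproduced with a short calculation. The growth areas used here (about 152 cm² for a 15 cm plate and 55 cm² for a 10 cm plate) are assumed nominal values that are not stated in the text; they are chosen because they reproduce the reported figure.

```python
# Assumed nominal growth areas per plate (not stated in the text).
AREA_15CM_PLATE_CM2 = 152.0
AREA_10CM_PLATE_CM2 = 55.0


def fold_reduction(n_before, area_before, n_after, area_after):
    """Ratio of total culture surface area before and after downscaling."""
    return (n_before * area_before) / (n_after * area_after)


fold = fold_reduction(42, AREA_15CM_PLATE_CM2, 1, AREA_10CM_PLATE_CM2)
print(round(fold))  # 116
```

Using the nominal plate diameters alone (42 x 7.5² / 5²) would give roughly 95-fold, so the reported value is consistent with growth areas rather than outer dimensions having been used.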

Encouragingly, only known Legionella effectors were found in every effector interactome tested. No bacterial housekeeping proteins were detected, suggesting that the translocation-dependent biotinylation of the bait protein is very stringent. This further highlights the power of the approach to study host-pathogen protein-protein interactions in a physiological infection environment.

However, although the method gave satisfactory results for the effectors SidM and LidA and helped identify a more stringent subset of Rab GTPase interaction partners for both effectors in the context of infection, the MS data for the effectors MavP and LtpG were more difficult to analyse. This is likely due to MavP and LtpG being essentially uncharacterised effectors and hence any putative binding partner must be confirmed by other means.

Although the data suggests that 1% formaldehyde crosslinking followed by mild lysis using Triton X-100 and TAP is a good initial reference condition, it is difficult to completely generalise a workflow as each protein-protein interaction is unique. Although crosslinking may help stabilise more transient interactions, specific amino acids with the correct reactivity at the protein-protein interface are a prerequisite. Whether chemical crosslinking could help identify novel interaction partners can be quickly evaluated by inspecting the protein of interest’s primary sequence for the number and location of reactive amino acids such as lysine and cysteine. If such amino acids are not readily accessible, chemical crosslinking likely hinders potential interactor detection due to solubility issues arising from protein aggregates. More promiscuous crosslinkers which utilise reactive radical chemistries are now commercially available. They typically contain photoactivatable groups which, when irradiated with light of the correct wavelength, decompose to form a radical. Due to their reactive nature, radicals are quickly quenched by most functional groups nearby. This allows any amino acid to be crosslinked, removing the need for specific residues to be present at the protein-protein interface for the interaction to be stabilised by crosslinking.
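
The sequence inspection suggested above can be sketched in a few lines. The function name and the example sequence are hypothetical; the sketch only reports where lysine and cysteine occur in the primary sequence and says nothing about whether those residues are surface-exposed or located at an interaction interface.

```python
def reactive_residues(sequence, residues="KC"):
    """Map each residue of interest to its 1-based positions in the sequence."""
    positions = {residue: [] for residue in residues}
    for index, amino_acid in enumerate(sequence.upper(), start=1):
        if amino_acid in positions:
            positions[amino_acid].append(index)
    return positions


# Hypothetical sequence for illustration only.
hits = reactive_residues("MKTACLLKGDEC")
print(hits)  # {'K': [2, 8], 'C': [5, 12]}
```

A protein with few or clustered reactive residues would be a candidate for testing a different crosslinker chemistry before committing to a full pulldown experiment.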

Furthermore, crosslinking not only helps to stabilise interactions; detecting crosslinked peptides also provides direct evidence that two proteins were interacting as well as information on the binding interface. However, this is not a trivial task as the search space required to accommodate crosslinked peptides scales quadratically. In a given database, any peptide could theoretically be crosslinked to any other peptide, resulting in x² potential peptide pairs, where x is the number of peptides in the database. Even with these difficulties, several groups have developed software packages (xProphet, pLink) to study crosslinked peptides, generating the field of chemical crosslinking mass spectrometry (CXMS) [Fan et al., 2015; Leitner et al., 2014]. Such significant progress has been made in the field of CXMS that structural MS is now a growing field. As each crosslinker has a defined maximal crosslinking length based on its structure, identified crosslinked peptides must be within a certain proximity of each other. Therefore, the crosslinker length constrains where two crosslinked peptides could physically be within a protein structure. Using such strategies, the Rappsilber group has recently determined the structure of human serum albumin within its native environment of blood serum [Belsom et al., 2016].
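
The quadratic growth of the crosslink search space described above can be illustrated numerically. This is a minimal sketch: the function name and database sizes are arbitrary illustrations, and real search engines reduce the space further with mass and chemistry filters that are not modelled here.

```python
def crosslink_search_space(n_peptides):
    """Number of candidate peptide pairs for a database of n_peptides.

    Returns (ordered, unordered): ordered pairs count A-B and B-A
    separately (x squared); unordered pairs count each combination once
    and include a peptide linked to another copy of itself.
    """
    ordered = n_peptides ** 2
    unordered = n_peptides * (n_peptides + 1) // 2
    return ordered, unordered


for size in (1_000, 10_000, 100_000):
    ordered, unordered = crosslink_search_space(size)
    print(f"{size:>7} peptides -> {ordered:.1e} ordered, {unordered:.1e} unordered pairs")
```

Either way of counting grows with the square of the database size, so a tenfold larger database means roughly a hundredfold more candidate pairs to score.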

Although advanced software packages can help analyse these complex datasets, it is also possible to simplify the problem from the mass spectrometry perspective. Typically, proteomics is performed with MS1 and MS2; however, it is possible to perform MS3 on certain mass spectrometers. By detecting the intact crosslinked peptide m/z in MS1 and subsequently fragmenting just the crosslinker, the m/z of each individual peptide can be detected in MS2. Upon further fragmentation, the amino acid sequence of each peptide can be inferred at the MS3 level [Kao et al., 2011]. This, however, requires specific MS-cleavable crosslinkers which generate characteristic fragment ions to aid the analysis.

The current Bio/BirA pulldown method is also an end-point assay in which only interactions occurring at the end point of infection are detected. This again skews detection of interactors to those which have strong binding constants. In particular, detection of interaction partners of enzymatic effectors may be hindered as enzymes typically have weaker binding affinities to prevent product inhibition.

Two novel methods to detect protein-protein interactions which overcome this obstacle have recently emerged: BioID and APEX. Both of these methods rely on bait proteins fused to an enzyme. BioID utilises a bait-BirA fusion in which the key residue R118 is mutated to a glycine [Roux et al., 2012]. Typically, biotin is activated by reaction of its carboxylic acid group with ATP, generating a reactive biotin-AMP product. This is normally retained in the active site of BirA until it comes into contact with a substrate protein containing a biotinylation sequence. The mutation, however, renders BirA unable to trap the reactive biotin-AMP intermediate, which is instead allowed to diffuse out of the active site. Due to its reactivity, primary amines within the vicinity react, enabling proximity-based biotinylation. The labelling radius has been shown to be approximately 10 nm. This enables all proteins which have been near the bait protein to be labelled, allowing transient interactions to be detected. Importantly, interacting proteins do not need to be bound to the bait protein at the time of cell lysis as they will already be modified with biotin as an affinity purification handle. BioID has been shown to be able to identify substrate proteins of enzymes, with ubiquitination targets of the SCF E3 ligase complex being determined using this method [Coyaud et al., 2015]. However, the incorporation times are currently rather long at 24 h and hence may not be suitable to study dynamic interactions at shorter timepoints.

APEX is a monomeric ascorbate peroxidase enzyme which can be used for both electron microscopy (EM) staining and proteomic mapping [Lam et al., 2015]. For EM, APEX is able to polymerise diaminobenzidine in the presence of hydrogen peroxide. This subsequently recruits electron dense osmium as a contrast agent for EM. For proteomics, biotin-phenol is used as a substrate. The peroxidase activity of APEX activates the phenol moiety of biotin-phenol in the presence of hydrogen peroxide, forming the reactive oxygen radical biotin-phenoxyl species. This is quickly quenched by nearby reactive groups, allowing for proximity biotinylation of nearby proteins.

The optimised Bio/BirA methodology enabled the infection-dependent Rab GTPase binding profiles of SidM and LidA to be determined. The role of Rab1 in Legionella’s virulence strategy has been well studied and is involved in recruitment of ER-derived vesicles onto the LCV. Activated Rab6 has been implicated in promoting efficient intracellular growth of Legionella in macrophages [Chen et al., 2013]; yet the mechanisms are not defined. Although LidA prevents deactivation of Rab6, whether SidM cooperates with LidA to activate Rab6 or promotes an alternative signalling pathway currently remains unknown. Rab10 is involved in plasma membrane recycling including the transport of glucose transporter type 4 (GLUT4) vesicles [Sano et al., 2008]; however, it has also been implicated in phagolysosome formation [Cardoso et al., 2010]. Although Legionella effectively evades phagolysosomal degradation, Rab10 depletion results in a reduction of intracellular growth [Hoffmann et al., 2014]. This suggests that Legionella may need certain components of early phagolysosomes to efficiently replicate before actively avoiding further phagolysosomal maturation. Indeed, the effect of Rab10 is most likely not due to its role in GLUT4 vesicle traffic as depletion of Rab14, also involved in GLUT4 vesicular trafficking, had an opposing effect and instead increased intracellular replication. In addition, Rab14 is involved in maturation of early phagosomes [Seto et al., 2011] and has also been implicated in lipid manipulation on the LCV as its depletion reduced the amount of the PI4P binding effector SidC on the LCV [Hoffmann et al., 2014]. Rab18 has not been implicated in bacterial pathogenesis but has been shown to be important for ER structure [Gerondopoulos et al., 2014]. As the LCV matures by recruiting ER-derived material, Rab18 may equally promote structural integrity of the LCV; however, as its depletion does not affect intracellular replication, its function may also be reproduced through alternative mechanisms. It will hence be interesting to determine how SidM and LidA modulate the function of these Rab GTPases during infection.

Annexin A1 and A2 are consistently found in SidM and LidA complexes. Annexins, Ca2+-dependent membrane phospholipid binding proteins, are involved in a large number of membrane dynamics processes including actin cytoskeletal rearrangements. The Legionella effector Ceg14 has already been shown to target components of the actin cytoskeleton [Guo et al., 2014b]. However, nothing is currently known about the role of annexins during Legionella pathogenesis, but their recruitment could play a role in maintaining LCV integrity as annexins have been implicated in membrane repair following invasion of macrophages by Listeria monocytogenes [Czuczman et al., 2014]. Moreover, annexins can directly interact with Rab GTPases, such as Rab14, allowing us to speculate that SidM and LidA might exploit annexins in combination with Rab GTPases to manipulate membrane dynamics.

MavP was identified as a putative interaction partner of SidM. Currently, the only known effector-effector interactions are SidJ-SidE and LubX-SidH [Jeong et al., 2015a; Kubori et al., 2010]. Both of these interactions have regulatory roles, with LubX targeting SidH for proteasomal degradation via its E3 ubiquitin ligase activity, whilst SidJ is responsible for the removal of SidE from the LCV. Whether the SidM-MavP interaction may have a similar relationship remains to be determined. However, this interaction could not be further confirmed by either the reciprocal pulldown using MavP as bait or Y2H.

The absence of SidM from the MavP interactome can be rationalised by temporal control of effector translocation. SidM has been shown to be an effector which is translocated early in the infection process and can no longer be detected on the LCV 7 hours post-infection of U937 cells and mouse bone marrow derived macrophages [Ingmundson et al., 2007; Neunuebel et al., 2011]. Although the infection dynamics of A549 cells are slower, SidM is unlikely to be on the LCV after 24 h. We hypothesise that MavP may remain on the LCV for longer during the infection and hence, when Bio-SidM is constitutively expressed, MavP is found in Bio-SidM complexes after 24 h infection.

Little is known about the effector MavP but it is predicted to be a 25 kDa transmembrane protein. Although it is largely uncharacterised, it appears to be in a complementary effector grouping to LidA [O'Connor et al., 2012]. Deletions of MavP and LidA individually do not result in a growth defect of L. pneumophila. However, a double deletion mutant exhibits reduced intracellular replication, indicating that MavP and LidA likely act on two functionally redundant pathways. The presence of four other effectors (MavN, Lpg2327, PpeB and Lem14) as Top10 hits in the MavP interactome suggests that MavP may function as an effector binding hub.

Although the MS data for LtpG were more difficult to interpret than the SidM and LidA interactomes, in which Rab GTPases were identified, NACA was found in the LtpG interactome. Although NACA was only detected with 1 unique peptide in each of the technical replicates, it was not found in the negative controls. In addition, MS2 spectra of the single unique peptide provide strong evidence for a positive peptide match with many b and y ions assignable. Furthermore, NAC, consisting of NACA and BTF3, showed the best dose-response in the competition in vitro pulldown experiment.

NAC is involved in ensuring proper targeting of nascent polypeptide chains. It binds to polypeptides without a signal peptide and therefore blocks their interaction with the signal recognition particle. This in turn prevents mistranslocation of these signal peptide-lacking proteins to the ER. NAC has recently been shown to be essential for autophagic flux [Guo et al., 2014a]. Knockdown of NAC in mammalian cells results in an accumulation of enlarged lysosomes, suggesting that NAC is critical for the degradative function of lysosomes. Other L. pneumophila effectors have been shown to target the autophagic pathway, with the effector RavZ inhibiting autophagy by removing Atg8 from the LCV and LegA9 promoting recognition of the LCV by autophagy proteins [Choy et al., 2012; Khweek et al., 2013]. However, as RavZ and LegA9 are not conserved effectors and are specific to only a handful of strains, other effectors likely compensate for their function. Further experiments need to be conducted to determine whether LtpG is able to manipulate the autophagic pathway during Legionella infection in conjunction with the conserved effector LpSpl [Rolando et al., 2016].

HSP90 remains an interesting putative target of LtpG. HSP90 is one of the most abundant proteins in mammalian cells. Due to its ability to aid protein folding, it has been implicated in many cellular pathways including ER stress, apoptosis and maintenance of proteasome function [Imai et al., 2003; Marcu et al., 2002; Mori et al., 2015]. However, due to the potential number of pathways it may have a role in, it becomes difficult to determine which aspect of HSP90 function LtpG may manipulate.

EEF2 has already been shown to be a target of bacterial virulence factors, with both diphtheria toxin from Corynebacterium diphtheriae and exotoxin A from Pseudomonas aeruginosa targeting this essential protein [Honjo et al., 1968; Iglewski et al., 1977]. Both toxins inactivate EEF2 by modifying it with an ADP-ribosyl group [Iglewski et al., 1975]. Other Legionella effectors (Lgt1, 2, 3, SidI and SidL) have all been implicated in shutting down host protein synthesis [Belyi et al., 2008; Fontana et al., 2011; Shen et al., 2009]. With the exception of SidL, whose mechanism is yet to be determined, these effectors all target eEF1A to inhibit protein synthesis. LtpG may therefore provide an alternative mechanism to halt host protein synthesis through targeting EEF2. Additionally, bacteria encode elongation factor G (EF-G), a protein analogous to eukaryotic EEF2. If LtpG does indeed target EEF2, it may also prevent the function of EF-G, leading to bacterial cytotoxicity if LtpG expression is not controlled.

Chapter 5: General discussion

This study focused on characterisation of the Fic domain L. pneumophila effector LtpG and development of techniques for determination of effector-host protein-protein interactions. A summary of the results can be found in Table 5.1, Table 5.2 and Table 5.3.

LtpG summary

Property | Comments
L. pneumophila intracellular growth | Deletion of ltpG did not cause an intracellular growth defect in any infection model tested.
Subcellular localisation | Nucleus and fibrous cytoplasmic structures in infection and transfection.
Cytotoxicity | Toxic in bacteria and yeast in a Fic-dependent manner. No toxicity observed in mammalian cells.
Enzymatic activity | Auto-AMPylation activity observed but no additional targets identified from mammalian cell lysates. Small molecule binding assays suggest GDP is a preferred substrate but no GMPylation activity was observed.
Structural properties | Protein crystals were obtained but of insufficient quality to provide structural insights.
Interaction partners | From Y2H, in vitro pulldowns and Bio/BirA pulldowns, the most promising putative interaction partners are HSP90, NACA and EEF2.

Table 5.1. Summary of LtpG results.

Bio/BirA effector interactomes summary

Effector | High confidence interaction partners (Top 10 ranked enriched proteins)
SidM | Rab1A, Rab1B, Rab6, Rab10, annexin A1, annexin A2, ubiquitin, MavP
LidA | Rab1A, Rab1B, Rab3D, Rab6, Rab8A, Rab10, Rab14, Rab18, annexin A2
MavP | MavN, Lpg2327, PpeB, Lem14, annexin A2, ubiquitin, EEF1A1, HSPA8, ARF1
LtpG | HSP90AA1, HSP90AB1, ALDH1A1, PGK1, PGD, tubulin beta, AKR1C3, EEF2, HSPA8

Table 5.2. Summary of high confidence interaction partners (Top 10 ranked enriched proteins) of SidM, LidA, MavP and LtpG as determined by the Bio/BirA methodology. L. pneumophila effectors are highlighted in bold.
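
One simple way to flag candidate background binders is to ask which proteins in Table 5.2 recur across baits. The sketch below uses the Top 10 lists from the table; the function name and the threshold of two baits are illustrative assumptions, and recurrence alone does not prove non-specific binding (the Rab GTPases shared by SidM and LidA, for example, are expected from their known biology).

```python
from collections import defaultdict

# Top 10 ranked enriched proteins per bait, as listed in Table 5.2.
TOP10 = {
    "SidM": ["Rab1A", "Rab1B", "Rab6", "Rab10", "annexin A1",
             "annexin A2", "ubiquitin", "MavP"],
    "LidA": ["Rab1A", "Rab1B", "Rab3D", "Rab6", "Rab8A", "Rab10",
             "Rab14", "Rab18", "annexin A2"],
    "MavP": ["MavN", "Lpg2327", "PpeB", "Lem14", "annexin A2",
             "ubiquitin", "EEF1A1", "HSPA8", "ARF1"],
    "LtpG": ["HSP90AA1", "HSP90AB1", "ALDH1A1", "PGK1", "PGD",
             "tubulin beta", "AKR1C3", "EEF2", "HSPA8"],
}


def recurring_hits(top_hits, min_baits=2):
    """Return {protein: sorted baits} for proteins found with >= min_baits baits."""
    baits_by_protein = defaultdict(set)
    for bait, proteins in top_hits.items():
        for protein in proteins:
            baits_by_protein[protein].add(bait)
    return {
        protein: sorted(baits)
        for protein, baits in baits_by_protein.items()
        if len(baits) >= min_baits
    }


shared = recurring_hits(TOP10)
for protein, baits in sorted(shared.items()):
    print(f"{protein}: {', '.join(baits)}")
```

Proteins that recur with functionally unrelated baits, such as HSPA8 with MavP and LtpG, are the ones that most warrant orthogonal validation before being treated as specific interactors.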

Bio/BirA conditions summary

Number of purifications | Single affinity purification (Neutravidin); tandem affinity purification (Ni2+-NTA and Neutravidin)
Lysis buffers | All lysis buffers tested were PBS-based with various detergents: 1% Triton X-100; 1% CHAPS; 0.5% SDS; 1% Triton X-100 + 6 M guanidinium chloride
Crosslinking method | No crosslinker; 1% formaldehyde (Lys-Lys); 3% formaldehyde (Lys-Lys); 1 mM DSP (long Lys-Lys); 0.5 mM DTME (Cys-Cys); 1 mM SMCC (Lys-Cys); 1 mM DSP + 0.5 mM DTME (Lys-Lys + Cys-Cys)

Table 5.3. Summary of parameters tested in the Bio/BirA pulldown optimisation. Suggested starting conditions are indicated in bold.

LtpG is translocated into host cells in a Dot/Icm-dependent manner and localises to the nucleus and cytoplasmic network structures during both infection and transfection. Its Fic domain appears active as cell-based assays showed that LtpG exhibits cytotoxicity in both bacteria and yeast in a Fic-dependent manner. However, no cytotoxicity was observed in mammalian cells. Biochemical analysis of LtpG further confirmed Fic activity by detection of auto-AMPylation using the clickable AMPylation probe Yn-6-TP. Although auto-AMPylation was robust, no host cell targets could be identified. Alternative small molecule substrates were examined using a DSF assay, which revealed that a guanosine diphosphate-containing metabolite is more likely to be the endogenous substrate of LtpG. Although GMPylation activity could not be confirmed using radiolabelled GTP, there are many alternative GDP-containing metabolites within host cells. In parallel to small molecule binding assays, protein crystallisation was trialled to understand the binding pocket of LtpG. Although multiple crystal forms were obtained, none produced diffraction patterns of sufficient quality to provide structural insights.

An alternative approach was taken to determine LtpG function through identification of host cell binding partners. Yeast-2-hybrid and recombinant protein pulldown methodologies were used as initial screening techniques. Although some putative interactions were identified, such as CDK7, HSP90, the COPI complex and NAC, these interactions could not be confirmed by the counterpart method. As these techniques identified interaction partners in very artificial conditions, we believed that any interaction found in these assays may not be representative of the physiological interactions during infection.

To tackle the shortcomings of these in vitro assays, our group had recently developed a mass spectrometry-based method to enrich crosslinked effector complexes from infected cells. However, although proof-of-principle was possible using the effector PieE, the experimental scale was not feasible to efficiently study all 300 L. pneumophila effectors. To this end, we aimed to optimise the technique further to increase throughput whilst maximising the detectable interactome. SidM and LidA were chosen as the ideal effectors to optimise the method around. Using the enhanced methodology, the experimental scale was reduced by 116-fold whilst providing semi-quantitative data. Stringent infection-dependent Rab GTPase binding profiles were obtained for both SidM and LidA and revealed that previous literature likely reported on some non-physiological Rab binding partners of both effectors. Additional non-Rab GTPase targets were also identified. However, due to time constraints, it was not possible to confirm or explore the functional consequences of these novel targets. Having optimised the Bio/BirA pulldown methodology, it was used to identify the LtpG interactome during infection. Some overlap was observed between the in vitro pulldown methodology and the Bio/BirA technique. In particular, HSP90, EEF2 and NAC were all identified with both methods. However, there were no similar hits with the Y2H system. Again, due to time constraints, it was not possible to determine the effects of LtpG binding to these proteins.

Although the Y2H system has identified genuine effector-host protein interactions in the past, it was not suitable to study interaction partners of LtpG or SidM. For SidM, not even known interactions with Rab GTPases could be recapitulated using the Y2H system. Whilst the CDK7 fragment consisting of AA157-346 showed robust interaction with LtpG, an interaction with full-length CDK7 could never be confirmed by the Y2H method. This can be partially attributed to the design of Y2H libraries, in which random 1 kb ORF fragments are inserted in the prey plasmid, leading to many screened clones encoding nonsense sequences in addition to frameshifted constructs. Furthermore, a full-length protein is rarely encoded in the clone as the inserts are restricted in length during construction. Finally, both bait and prey proteins are fused to large GAL4 domains, which may hinder their structure and interaction interfaces in addition to enforcing a nuclear localisation. Consequently, the identification of CDK7 as an interaction partner of LtpG was likely a false positive result, as the interaction could not be confirmed using multiple MS-based techniques. Although the Y2H is a quick and simple screen which does not require specialist equipment, the time and resources required to follow up and confirm any potential hits quickly outweigh its initial benefits. Furthermore, alternative techniques which use less artificial environments to detect protein-protein interactions are now more accessible, making Y2H assays less attractive as a screening method.

Recombinant protein pulldowns aim to identify protein-protein interactions outside the cellular context. In particular, they require the effector of interest to be soluble, which is not always trivial to achieve. However, unlike the Y2H, small affinity purification tags which are less likely to interfere with protein-protein interaction interfaces can be used. Nevertheless, without cellular compartmentalisation and with large excesses of bait protein, this technique is also prone to false positives. HSP90 represents a hit that is difficult to analyse. Although it was robustly identified in all three in vitro pulldown assays, it was impossible to determine whether it bound misfolded bait protein or was a genuine interaction partner. The competition pulldown is conceptually an intriguing way to weed out the non-specific binders. This type of experiment is frequently used with success when assaying small molecule probes in activity-based protein profiling [Nomura et al., 2010]. However, the technical challenges of competing out protein-protein interactions limit its reproducibility. Further optimisation of this method may make it a more viable solution.

Whilst ectopic expression of effectors in host cells through transfection enables effector-host protein complexes to be immunoprecipitated from a cellular environment, the unique setting during infection still cannot be recapitulated. In contrast, the translocation-dependent biotinylation of effectors using the biotin ligase BirA and the 15 amino acid biotinylation sequence enables effector interactomes to be determined during infection. This is currently a substantial improvement on alternative techniques, as interacting partners are identified in a more physiological environment. The ability to detect infection-dependent interactions reduces the likelihood of false positives. Additionally, the translocation-dependent biotinylation ensures that bacterial background proteins are kept to a minimum.

However, the system can be further improved. In particular, the tagged effector of interest is currently encoded on a plasmid and its expression is controlled by an IPTG-inducible promoter. This inevitably overexpresses the effector relative to expression from its chromosomal promoter. Although this helps in downsizing the experimental scale, it also leads to a non-physiological amount of effector being translocated. This can be rectified by inserting the tag sequence into the chromosome. The system also requires the presence of BirA in the host cell for the biotinylation reaction to occur. As certain cell lines are less amenable to genetic manipulation, it may not be possible to introduce BirA. However, BirA-expressing mice have been developed and hence the Bio/BirA system may be applicable in a mouse model [Driegen et al., 2005]. Furthermore, the amount of bait protein pulled down is essential for effective interactome detection. As evidenced by the THP-1 data, the infection dynamics render them an unsuitable cell line for the system, even though THP-1 cells represent the most relevant cell type to study Legionella infections. It will also be interesting to explore the possibility of using the system in L. pneumophila’s environmental protozoan hosts. As L. pneumophila requires a different subset of effectors to ensure proliferation in different hosts, effectors need to be characterised in multiple host models to fully decipher their physiological functions [O'Connor et al., 2011]. Although the Bio/BirA methodology was developed to study T4SS effector proteins, it is likely possible to utilise the technique to decipher effectors secreted by other secretion systems. A potential pitfall, however, is that the bacterium of interest must not itself encode BirA, as the biotinylation reaction would then no longer be translocation specific. As such, the system may not be directly applicable to bacterial pathogens such as pathogenic E. coli and Salmonella species. However, there are many uncharacterised biotin ligases from other species which may be able to replace BirA [Kim et al., 2016].

This study focused on identifying effector function from an effector-centric point of view. However, as most effectors have little homology to characterised proteins, there is often no sensible starting point for characterisation. Legionella effectors are even more difficult to study due to the vast arsenal of virulence factors encoded in the bacterial genome. Functional redundancies between effectors result in few effector deletions producing an obvious phenotype with regards to virulence. Therefore, the molecular Koch’s postulates are difficult to satisfy. An alternative approach to understand effector function is from the point of view of the host. As host cells have evolved extensive immune responses to combat infectious diseases, successful pathogens must have co-evolved to either disable these defence mechanisms or hide from them [Pallett et al., 2016]. By first identifying an important host pathway that the bacterium likely needs to manipulate for its survival and proliferation, a phenotypic readout can be established. Screens such as transposon mutagenesis then enable relevant effector proteins involved in the pathway of interest to be identified [O'Connor et al., 2011]. Given the functional redundancy of Legionella effectors, this approach may be more suitable to identify effector function [O'Connor et al., 2012]. Alternatively, host genes may be knocked down or deleted to determine not only important immune signalling pathways but also pathways L. pneumophila hijacks to ensure proliferation [Hoffmann et al., 2014]. With the recent development of CRISPR-Cas9 methodologies, genomic manipulation of the host has become more accessible and large screens more feasible [Cong et al., 2013; Jinek et al., 2013; Mali et al., 2013]. Such screens have already been utilised to understand V. parahaemolyticus pathogenesis [Blondel et al., 2016]. Furthermore, effectors can be used as tools to understand cell biology signalling pathways. Modification of Rho GTPases by effectors has already been used to study the Pyrin inflammasome [Xu et al., 2014]. As modification of Rab1 by SidM and AnkX restricts its interaction partners to a particular subset, these effectors could be used as tools to further elucidate Rab GTPase function [Goody et al., 2012; Muller et al., 2010].

As we further our understanding of effector biology, it becomes increasingly apparent that many effectors act in concert with other effectors [Jeong et al., 2015a; Kubori et al., 2010]. The manipulation of Rab1 is a prime example, in which at least six effectors have been implicated [Ingmundson et al., 2007; Machner et al., 2006; Mukherjee et al., 2011; Neunuebel et al., 2011; Qiu et al., 2016; Tan et al., 2011b]. Whilst inhibition of a particular pathway may be beneficial to the bacterium at a specific stage of its life cycle, the same inhibition may be deleterious for bacterial growth at other stages. The early recruitment and activation of Rab1 on the LCV, in contrast to its deactivation and extraction at later stages of infection, again highlights the dynamism of host-pathogen interactions. Although the complex mechanisms that bacterial pathogens have evolved to manipulate the host are remarkable, they also further complicate the discovery of effector functions when effectors are studied individually.

Recent technological advances have enabled us to enter the age of omics [Yugi et al., 2016]. With large genomic, transcriptomic and proteomic datasets being generated with relative ease, host-pathogen interactions can be studied at a global level [Didelot et al., 2012; Hein et al., 2015; Westermann et al., 2016]. Efficient data mining and systems biology are likely to play major roles in our understanding of bacterial pathogenesis. However, molecular understanding of these pathways and interactions is still vital, as it is at the molecular scale that drug development occurs. As antibiotic resistance becomes an increasing threat to our livelihood, fully understanding bacterial virulence strategies may aid in reducing unnecessary antibiotic use as well as provide alternative and specific targets for drug development [Brown et al., 2016].

With ever-increasing numbers of publications in effector biology, contradictions between reported effector functions are appearing. In particular, many phenotypes and functions which were identified either in ectopic expression cell models or recombinant protein assays are coming under increasing scrutiny. For example, LidA was reported to have picomolar affinities for Rab1, Rab6 and Rab8 using in vitro assays [Schoebel et al., 2011]. However, although Rab1 and Rab6 were the most highly enriched interactors in our MS-based assay, Rab8 ranked lower in comparison. This suggests that physical binding affinities and in vitro assays are insufficient to fully understand effector function. Such techniques appear to be prone to false positives and convolute our understanding of host-pathogen interactions, highlighting the need to study these interactions in a physiologically relevant model to minimise the chances of studying artefacts.

As technologies advance, it becomes possible to understand effector function in increasingly physiological environments. This is of paramount importance, as the ultimate goal of our work is to fully understand bacterial pathogenesis and hence enable humans to finally gain a foothold against pathogens in the molecular arms race.

Chapter 6: References

Abdelhady, H. & Garduno, R. A. The progeny of Legionella pneumophila in human macrophages shows unique developmental traits. FEMS microbiology letters 349, 99-107, doi:10.1111/1574-6968.12300 (2013).

Abu-Zant, A., Jones, S., Asare, R. et al. Anti-apoptotic signalling by the Dot/Icm secretion system of L. pneumophila. Cellular microbiology 9, 246-264, doi:10.1111/j.1462-5822.2006.00785.x (2007).

Adeleke, A. A., Fields, B. S., Benson, R. F. et al. Legionella drozanskii sp. nov., Legionella rowbothamii sp. nov. and Legionella fallonii sp. nov.: three unusual new Legionella species. International journal of systematic and evolutionary microbiology 51, 1151-1160, doi:10.1099/00207713-51-3-1151 (2001).

Aebersold, R. & Mann, M. Mass-spectrometric exploration of proteome structure and function. Nature 537, 347-355, doi:10.1038/nature19949 (2016).

Al-Bana, B. H., Haddad, M. T. & Garduno, R. A. Stationary phase and mature infectious forms of Legionella pneumophila produce distinct viable but non-culturable cells. Environmental microbiology 16, 382-395, doi:10.1111/1462-2920.12219 (2014).

Alix, E., Chesnel, L., Bowzard, B. J. et al. The capping domain in RalF regulates effector functions. PLoS pathogens 8, e1003012, doi:10.1371/journal.ppat.1003012 (2012).

Allison, C., Emody, L., Coleman, N. & Hughes, C. The role of swarm cell differentiation and multicellular migration in the uropathogenicity of Proteus mirabilis. The Journal of infectious diseases 169, 1155-1158 (1994).

Amer, A. O. & Swanson, M. S. Autophagy is an immediate macrophage response to Legionella pneumophila. Cellular microbiology 7, 765-778, doi:10.1111/j.1462-5822.2005.00509.x (2005).

Amor, J. C., Swails, J., Zhu, X. et al. The structure of RalF, an ADP-ribosylation factor guanine nucleotide exchange factor from Legionella pneumophila, reveals the presence of a cap over the active site. The Journal of biological chemistry 280, 1392-1400, doi:10.1074/jbc.M410820200 (2005).

Amyot, W. M., deJesus, D. & Isberg, R. R. Poison domains block transit of translocated substrates via the Legionella pneumophila Icm/Dot system. Infection and immunity 81, 3239-3252, doi:10.1128/IAI.00552-13 (2013).

Aragon, V., Kurtz, S. & Cianciotto, N. P. Legionella pneumophila major acid phosphatase and its role in intracellular infection. Infection and immunity 69, 177-185, doi:10.1128/IAI.69.1.177-185.2001 (2001).

Aragon, V., Kurtz, S., Flieger, A., Neumeister, B. & Cianciotto, N. P. Secreted enzymatic activities of wild-type and pilD-deficient Legionella pneumophila. Infection and immunity 68, 1855-1863 (2000).

Aragon, V., Rossier, O. & Cianciotto, N. P. Legionella pneumophila genes that encode lipase and phospholipase C activities. Microbiology 148, 2223-2231, doi:10.1099/00221287-148-7-2223 (2002).

Arasaki, K., Toomre, D. K. & Roy, C. R. The Legionella pneumophila effector DrrA is sufficient to stimulate SNARE-dependent membrane fusion. Cell host & microbe 11, 46-57, doi:10.1016/j.chom.2011.11.009 (2012).

Arnold, F. W., Summersgill, J. T., Lajoie, A. S. et al. A worldwide perspective of atypical pathogens in community-acquired pneumonia. American journal of respiratory and critical care medicine 175, 1086-1093, doi:10.1164/rccm.200603-350OC (2007).

Asrat, S., Dugan, A. S. & Isberg, R. R. The frustrated host response to Legionella pneumophila is bypassed by MyD88-dependent translation of pro-inflammatory cytokines. PLoS pathogens 10, e1004229, doi:10.1371/journal.ppat.1004229 (2014).

Aurass, P., Schlegel, M., Metwally, O. et al. The Legionella pneumophila Dot/Icm-secreted effector PlcC/CegC1 together with PlcA and PlcB promotes virulence and belongs to a novel zinc metallophospholipase C family present in bacteria and fungi. The Journal of biological chemistry 288, 11080-11092, doi:10.1074/jbc.M112.426049 (2013).

Bandyopadhyay, P., Liu, S., Gabbai, C. B., Venitelli, Z. & Steinman, H. M. Environmental mimics and the Lvh type IVA secretion system contribute to virulence-related phenotypes of Legionella pneumophila. Infection and immunity 75, 723-735, doi:10.1128/IAI.00956-06 (2007).

Banerji, S., Bewersdorff, M., Hermes, B., Cianciotto, N. P. & Flieger, A. Characterization of the major secreted zinc metalloprotease-dependent glycerophospholipid:cholesterol acyltransferase, PlaC, of Legionella pneumophila. Infection and immunity 73, 2899-2909, doi:10.1128/IAI.73.5.2899-2909.2005 (2005).

Bantscheff, M., Lemeer, S., Savitski, M. M. & Kuster, B. Quantitative mass spectrometry in proteomics: critical review update from 2007 to the present. Analytical and bioanalytical chemistry 404, 939-965, doi:10.1007/s00216-012-6203-4 (2012).

Bartfeld, S., Engels, C., Bauer, B. et al. Temporal resolution of two-tracked NF-kappaB activation by Legionella pneumophila. Cellular microbiology 11, 1638-1651, doi:10.1111/j.1462-5822.2009.01354.x (2009).

Bartram, J. Legionella and the prevention of legionellosis. (World Health Organization, 2007).

Basler, M., Pilhofer, M., Henderson, G. P., Jensen, G. J. & Mekalanos, J. J. Type VI secretion requires a dynamic contractile phage tail-like structure. Nature 483, 182-186, doi:10.1038/nature10846 (2012).

Beaute, J., Zucs, P., de Jong, B. & European Legionnaires' Disease Surveillance, N. Legionnaires disease in Europe, 2009-2010. Euro surveillance : bulletin Europeen sur les maladies transmissibles = European communicable disease bulletin 18, 20417 (2013).

Behnia, R. & Munro, S. Organelle identity and the signposts for membrane traffic. Nature 438, 597-604, doi:10.1038/nature04397 (2005).

Belsom, A., Schneider, M., Fischer, L., Brock, O. & Rappsilber, J. Serum Albumin Domain Structures in Human Blood Serum by Mass Spectrometry and Computational Biology. Molecular & cellular proteomics : MCP 15, 1105-1116, doi:10.1074/mcp.M115.048504 (2016).

Belyi, Y., Tabakova, I., Stahl, M. & Aktories, K. Lgt: a family of cytotoxic glucosyltransferases produced by Legionella pneumophila. Journal of bacteriology 190, 3026-3035, doi:10.1128/JB.01798-07 (2008).

Benin, A. L., Benson, R. F. & Besser, R. E. Trends in legionnaires disease, 1980-1998: declining mortality and new patterns of diagnosis. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 35, 1039-1046, doi:10.1086/342903 (2002).

Bennett, T. L., Kraft, S. M., Reaves, B. J. et al. LegC3, an effector protein from Legionella pneumophila, inhibits homotypic yeast vacuole fusion in vivo and in vitro. PloS one 8, e56798, doi:10.1371/journal.pone.0056798 (2013).

Berk, S. G., Faulkner, G., Garduno, E. et al. Packaging of live Legionella pneumophila into pellets expelled by Tetrahymena spp. does not require bacterial replication and depends on a Dot/Icm-mediated survival mechanism. Applied and environmental microbiology 74, 2187-2199, doi:10.1128/AEM.01214-07 (2008).

Berks, B. C. The twin-arginine protein translocation pathway. Annual review of biochemistry 84, 843-864, doi:10.1146/annurev-biochem-060614-034251 (2015).

Blondel, C. J., Park, J. S., Hubbard, T. P. et al. CRISPR/Cas9 Screens Reveal Requirements for Host Cell Sulfation and Fucosylation in Bacterial Type III Secretion System-Mediated Cytotoxicity. Cell host & microbe 20, 226-237, doi:10.1016/j.chom.2016.06.010 (2016).

Boersema, P. J., Raijmakers, R., Lemeer, S., Mohammed, S. & Heck, A. J. Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nature protocols 4, 484-494, doi:10.1038/nprot.2009.21 (2009).

Borges, V., Nunes, A., Sampaio, D. A. et al. Legionella pneumophila strain associated with the first evidence of person-to-person transmission of Legionnaires' disease: a unique mosaic genetic backbone. Scientific reports 6, 26261, doi:10.1038/srep26261 (2016).

Brand, B. C., Sadosky, A. B. & Shuman, H. A. The Legionella pneumophila icm locus: a set of genes required for intracellular multiplication in human macrophages. Molecular microbiology 14, 797-808 (1994).

Brombacher, E., Urwyler, S., Ragaz, C. et al. Rab1 guanine nucleotide exchange factor SidM is a major phosphatidylinositol 4-phosphate-binding effector protein of Legionella pneumophila. The Journal of biological chemistry 284, 4846-4856, doi:10.1074/jbc.M807505200 (2009).

Broncel, M., Serwa, R. A., Bunney, T. D., Katan, M. & Tate, E. W. Global Profiling of Huntingtin-associated protein E (HYPE)-Mediated AMPylation through a Chemical Proteomic Approach. Molecular & cellular proteomics : MCP 15, 715-725, doi:10.1074/mcp.O115.054429 (2016).

Broncel, M., Serwa, R. A., Ciepla, P. et al. Multifunctional reagents for quantitative proteome-wide analysis of protein modification in human cells and dynamic profiling of protein lipidation during vertebrate development. Angewandte Chemie 54, 5948-5951, doi:10.1002/anie.201500342 (2015).

Brown, E. D. & Wright, G. D. Antibacterial drug discovery in the resistance era. Nature 529, 336-343, doi:10.1038/nature17042 (2016).

Brunet, Y. R., Henin, J., Celia, H. & Cascales, E. Type VI secretion and bacteriophage tail tubes share a common assembly pathway. EMBO reports 15, 315-321, doi:10.1002/embr.201337936 (2014).

Bunney, T. D., Cole, A. R., Broncel, M. et al. Crystal structure of the human, FIC-domain containing protein HYPE and implications for its functions. Structure 22, 1831-1843, doi:10.1016/j.str.2014.10.007 (2014).

Burnside, D. M., Wu, Y., Shafaie, S. & Cianciotto, N. P. The Legionella pneumophila Siderophore Legiobactin Is a Polycarboxylate That Is Identical in Structure to Rhizoferrin. Infection and immunity 83, 3937-3945, doi:10.1128/IAI.00808-15 (2015).

Burstein, D., Zusman, T., Degtyar, E. et al. Genome-scale identification of Legionella pneumophila effectors using a machine learning approach. PLoS pathogens 5, e1000508, doi:10.1371/journal.ppat.1000508 (2009).

Buscher, B. A., Conover, G. M., Miller, J. L. et al. The DotL protein, a member of the TraG-coupling protein family, is essential for Viability of Legionella pneumophila strain Lp02. Journal of bacteriology 187, 2927-2938, doi:10.1128/JB.187.9.2927-2938.2005 (2005).

Cambronne, E. D. & Roy, C. R. The Legionella pneumophila IcmSW complex interacts with multiple Dot/Icm effectors to facilitate type IV translocation. PLoS pathogens 3, e188, doi:10.1371/journal.ppat.0030188 (2007).

Campanacci, V., Mukherjee, S., Roy, C. R. & Cherfils, J. Structure of the Legionella effector AnkX reveals the mechanism of phosphocholine transfer by the FIC domain. The EMBO journal 32, 1469-1477, doi:10.1038/emboj.2013.82 (2013).

Campodonico, E. M., Chesnel, L. & Roy, C. R. A yeast genetic system for the identification and characterization of substrate proteins transferred into host cells by the Legionella pneumophila Dot/Icm system. Molecular microbiology 56, 918-933, doi:10.1111/j.1365-2958.2005.04595.x (2005).

Campodonico, E. M., Roy, C. R. & Ninio, S. Legionella pneumophila Type IV Effectors YlfA and YlfB Are SNARE-Like Proteins that Form Homo- and Heteromeric Complexes and Enhance the Efficiency of Vacuole Remodeling. PloS one 11, e0159698, doi:10.1371/journal.pone.0159698 (2016).

Cardoso, C. M., Jordao, L. & Vieira, O. V. Rab10 regulates phagosome maturation and its overexpression rescues Mycobacterium-containing phagosomes maturation. Traffic 11, 221-235, doi:10.1111/j.1600-0854.2009.01013.x (2010).

Castro-Roa, D., Garcia-Pino, A., De Gieter, S. et al. The Fic protein Doc uses an inverted substrate to phosphorylate and inactivate EF-Tu. Nature chemical biology 9, 811-817, doi:10.1038/nchembio.1364 (2013).

Cazalet, C., Rusniok, C., Bruggemann, H. et al. Evidence in the Legionella pneumophila genome for exploitation of host cell functions and high genome plasticity. Nature genetics 36, 1165-1173, doi:10.1038/ng1447 (2004).

Charpentier, X., Kay, E., Schneider, D. & Shuman, H. A. Antibiotics and UV radiation induce competence for natural transformation in Legionella pneumophila. Journal of bacteriology 193, 1114-1121, doi:10.1128/JB.01146-10 (2011).

Chen, J., de Felipe, K. S., Clarke, M. et al. Legionella effectors that promote nonlytic release from protozoa. Science 303, 1358-1361, doi:10.1126/science.1094226 (2004).

Chen, Y. & Machner, M. P. Targeting of the small GTPase Rab6A' by the Legionella pneumophila effector LidA. Infection and immunity 81, 2226-2235, doi:10.1128/IAI.00157-13 (2013).

Cheng, W., Yin, K., Lu, D. et al. Structural insights into a unique Legionella pneumophila effector LidA recognizing both GDP and GTP bound Rab1 in their active state. PLoS pathogens 8, e1002528, doi:10.1371/journal.ppat.1002528 (2012).

Choe, L., D'Ascenzo, M., Relkin, N. R. et al. 8-plex quantitation of changes in cerebrospinal fluid protein expression in subjects undergoing intravenous immunoglobulin treatment for Alzheimer's disease. Proteomics 7, 3651-3660, doi:10.1002/pmic.200700316 (2007).

Choy, A., Dancourt, J., Mugo, B. et al. The Legionella effector RavZ inhibits host autophagy through irreversible Atg8 deconjugation. Science 338, 1072-1076, doi:10.1126/science.1227026 (2012).

Cianciotto, N. P. Many substrates and functions of type II secretion: lessons learned from Legionella pneumophila. Future microbiology 4, 797-805, doi:10.2217/fmb.09.53 (2009).

Cirillo, J. D., Falkow, S. & Tompkins, L. S. Growth of Legionella pneumophila in Acanthamoeba castellanii enhances invasion. Infection and immunity 62, 3254-3261 (1994).

Cirillo, S. L., Bermudez, L. E., El-Etr, S. H., Duhamel, G. E. & Cirillo, J. D. Legionella pneumophila entry gene rtxA is involved in virulence. Infection and immunity 69, 508-517, doi:10.1128/IAI.69.1.508-517.2001 (2001).

Cirillo, S. L., Yan, L., Littman, M., Samrakandi, M. M. & Cirillo, J. D. Role of the Legionella pneumophila rtxA gene in amoebae. Microbiology 148, 1667-1677, doi:10.1099/00221287-148-6-1667 (2002).

Coers, J., Kagan, J. C., Matthews, M. et al. Identification of Icm protein complexes that play distinct roles in the biogenesis of an organelle permissive for Legionella pneumophila intracellular growth. Molecular microbiology 38, 719-736 (2000).

Cong, L., Ran, F. A., Cox, D. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823, doi:10.1126/science.1231143 (2013).

Correia, A. M., Ferreira, J. S., Borges, V. et al. Probable Person-to-Person Transmission of Legionnaires' Disease. The New England journal of medicine 374, 497-498, doi:10.1056/NEJMc1505356 (2016).

Costa, T. R., Felisberto-Rodrigues, C., Meir, A. et al. Secretion systems in Gram-negative bacteria: structural and mechanistic insights. Nature reviews. Microbiology 13, 343-359, doi:10.1038/nrmicro3456 (2015).

Cox, J., Hein, M. Y., Luber, C. A. et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Molecular & cellular proteomics : MCP 13, 2513-2526, doi:10.1074/mcp.M113.031591 (2014).

Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature biotechnology 26, 1367-1372, doi:10.1038/nbt.1511 (2008).

Cox, J., Neuhauser, N., Michalski, A. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. Journal of proteome research 10, 1794-1805, doi:10.1021/pr101065j (2011).

Coyaud, E., Mis, M., Laurent, E. M. et al. BioID-based Identification of Skp Cullin F-box (SCF)beta-TrCP1/2 E3 Ligase Substrates. Molecular & cellular proteomics : MCP 14, 1781-1795, doi:10.1074/mcp.M114.045658 (2015).

Creasey, E. A. & Isberg, R. R. The protein SdhA maintains the integrity of the Legionella-containing vacuole. Proceedings of the National Academy of Sciences of the United States of America 109, 3481-3486, doi:10.1073/pnas.1121286109 (2012).

Cruz, J. W., Rothenbacher, F. P., Maehigashi, T. et al. Doc toxin is a kinase that inactivates elongation factor Tu. The Journal of biological chemistry 289, 7788-7798, doi:10.1074/jbc.M113.544429 (2014).

Curak, J., Rohde, J. & Stagljar, I. Yeast as a tool to study bacterial effectors. Current opinion in microbiology 12, 18-23, doi:10.1016/j.mib.2008.11.004 (2009).

Czuczman, M. A., Fattouh, R., van Rijn, J. M. et al. Listeria monocytogenes exploits efferocytosis to promote cell-to-cell spread. Nature 509, 230-234, doi:10.1038/nature13168 (2014).

De Buck, E., Anne, J. & Lammertyn, E. The role of protein secretion systems in the virulence of the intracellular pathogen Legionella pneumophila. Microbiology 153, 3948-3953, doi:10.1099/mic.0.2007/012039-0 (2007).

De Buck, E., Lebeau, I., Maes, L. et al. A putative twin-arginine translocation pathway in Legionella pneumophila. Biochemical and biophysical research communications 317, 654-661, doi:10.1016/j.bbrc.2004.03.091 (2004).

De Buck, E., Maes, L., Meyen, E. et al. Legionella pneumophila Philadelphia-1 tatB and tatC affect intracellular replication and biofilm formation. Biochemical and biophysical research communications 331, 1413-1420, doi:10.1016/j.bbrc.2005.04.060 (2005).

de Felipe, K. S., Glover, R. T., Charpentier, X. et al. Legionella eukaryotic-like type IV substrates interfere with organelle trafficking. PLoS pathogens 4, e1000117, doi:10.1371/journal.ppat.1000117 (2008).

DebRoy, S., Dao, J., Soderberg, M., Rossier, O. & Cianciotto, N. P. Legionella pneumophila type II secretome reveals unique exoproteins and a chitinase that promotes bacterial persistence in the lung. Proceedings of the National Academy of Sciences of the United States of America 103, 19146-19151, doi:10.1073/pnas.0608279103 (2006).

Dedic, E., Alsarraf, H., Welner, D. H. et al. A Novel Fic (Filamentation Induced by cAMP) Protein from Clostridium difficile Reveals an Inhibitory Motif-independent Adenylylation/AMPylation Mechanism. The Journal of biological chemistry 291, 13286-13300, doi:10.1074/jbc.M115.705491 (2016).

Del Campo, C. M., Mishra, A. K., Wang, Y. H. et al. Structural basis for PI(4)P-specific membrane recruitment of the Legionella pneumophila effector DrrA/SidM. Structure 22, 397-408, doi:10.1016/j.str.2013.12.018 (2014).

Derre, I. & Isberg, R. R. LidA, a translocated substrate of the Legionella pneumophila type IV secretion system, interferes with the early secretory pathway. Infection and immunity 73, 4370-4380, doi:10.1128/IAI.73.7.4370-4380.2005 (2005).

Desveaux, D., Singer, A. U., Wu, A. J. et al. Type III effector activation via nucleotide binding, phosphorylation, and host target interaction. PLoS pathogens 3, e48, doi:10.1371/journal.ppat.0030048 (2007).

Didelot, X., Bowden, R., Wilson, D. J., Peto, T. E. & Crook, D. W. Transforming clinical microbiology with bacterial genome sequencing. Nature reviews. Genetics 13, 601-612, doi:10.1038/nrg3226 (2012).

Doleans, A., Aurell, H., Reyrolle, M. et al. Clinical and environmental distributions of Legionella strains in France are different. Journal of clinical microbiology 42, 458-460 (2004).

Dolezal, P., Aili, M., Tong, J. et al. Legionella pneumophila secretes a mitochondrial carrier protein during infection. PLoS pathogens 8, e1002459, doi:10.1371/journal.ppat.1002459 (2012).

Dolinsky, S., Haneburger, I., Cichy, A. et al. The Legionella longbeachae Icm/Dot substrate SidC selectively binds phosphatidylinositol 4-phosphate with nanomolar affinity and promotes pathogen vacuole-endoplasmic reticulum interactions. Infection and immunity 82, 4021-4033, doi:10.1128/IAI.01685-14 (2014).

Dooling, K. L., Toews, K. A., Hicks, L. A. et al. Active Bacterial Core Surveillance for Legionellosis - United States, 2011-2013. MMWR. Morbidity and mortality weekly report 64, 1190-1193, doi:10.15585/mmwr.mm6442a2 (2015).

Driegen, S., Ferreira, R., van Zon, A. et al. A generic tool for biotinylation of tagged proteins in transgenic mice. Transgenic research 14, 477-482 (2005).

Dumenil, G. & Isberg, R. R. The Legionella pneumophila IcmR protein exhibits chaperone activity for IcmQ by preventing its participation in high-molecular-weight complexes. Molecular microbiology 40, 1113-1127 (2001).

Dumenil, G., Montminy, T. P., Tang, M. & Isberg, R. R. IcmR-regulated membrane insertion and efflux by the Legionella pneumophila IcmQ protein. The Journal of biological chemistry 279, 4686-4695, doi:10.1074/jbc.M309908200 (2004).

Edelstein, P. H. Control of Legionella in hospitals. The Journal of hospital infection 8, 109-115 (1986).

Edelstein, P. H. Antimicrobial chemotherapy for legionnaires' disease: a review. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 21 Suppl 3, S265-276 (1995).

Elias, J. E. & Gygi, S. P. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nature methods 4, 207-214, doi:10.1038/nmeth1019 (2007).

Engel, P., Goepfert, A., Stanger, F. V. et al. Adenylylation control by intra- or intermolecular active-site obstruction in Fic proteins. Nature 482, 107-110, doi:10.1038/nature10729 (2012).

Engleberg, N. C., Drutz, D. J. & Eisenstein, B. I. Cloning and expression of Legionella pneumophila antigens in Escherichia coli. Infection and immunity 44, 222-227 (1984).

Ensminger, A. W. & Isberg, R. R. E3 ubiquitin ligase activity and targeting of BAT3 by multiple Legionella pneumophila translocated substrates. Infection and immunity 78, 3905-3919, doi:10.1128/IAI.00344-10 (2010).

European Centre for Disease Prevention and Control. Legionnaires' disease in Europe 2011. (ECDC, 2013).

European Centre for Disease Prevention and Control. Annual Epidemiological Report 2016 - Legionnaires' Disease. (2016).

Fan, S. B., Meng, J. M., Lu, S. et al. Using pLink to Analyze Cross-Linked Peptides. Current protocols in bioinformatics 49, 8.21.1-8.21.19, doi:10.1002/0471250953.bi0821s49 (2015).

Faulkner, G., Berk, S. G., Garduno, E., Ortiz-Jimenez, M. A. & Garduno, R. A. Passage through Tetrahymena tropicalis triggers a rapid morphological differentiation in Legionella pneumophila. Journal of bacteriology 190, 7728-7738, doi:10.1128/JB.00751-08 (2008).

Feng, F., Yang, F., Rong, W. et al. A Xanthomonas uridine 5'-monophosphate transferase inhibits plant immune kinases. Nature 485, 114-118, doi:10.1038/nature10962 (2012).

Fields, B. Procedures for the Recovery of Legionella from the Environment. US Department of Health and Human Services. Atlanta, GA: Centers for Disease Control and Prevention (1994).

Finn, R. D., Coggill, P., Eberhardt, R. Y. et al. The Pfam protein families database: towards a more sustainable future. Nucleic acids research 44, D279-285, doi:10.1093/nar/gkv1344 (2016).

Finsel, I., Ragaz, C., Hoffmann, C. et al. The Legionella effector RidL inhibits retrograde trafficking to promote intracellular replication. Cell host & microbe 14, 38-50, doi:10.1016/j.chom.2013.06.001 (2013).

Fisher, R. P. & Morgan, D. O. A novel cyclin associates with MO15/CDK7 to form the CDK-activating kinase. Cell 78, 713-724 (1994).

Flieger, A., Gong, S., Faigle, M. et al. Novel lysophospholipase A secreted by Legionella pneumophila. Journal of bacteriology 183, 2121-2124, doi:10.1128/JB.183.6.2121-2124.2001 (2001).

Flieger, A., Neumeister, B. & Cianciotto, N. P. Characterization of the gene encoding the major secreted lysophospholipase A of Legionella pneumophila and its role in detoxification of lysophosphatidylcholine. Infection and immunity 70, 6094-6106 (2002).

Fontana, M. F., Banga, S., Barry, K. C. et al. Secreted bacterial effectors that inhibit host protein synthesis are critical for induction of the innate immune response to virulent Legionella pneumophila. PLoS pathogens 7, e1001289, doi:10.1371/journal.ppat.1001289 (2011).

Franco, I. S., Shohdy, N. & Shuman, H. A. The Legionella pneumophila effector VipA is an actin nucleator that alters host cell organelle trafficking. PLoS pathogens 8, e1002546, doi:10.1371/journal.ppat.1002546 (2012).

Fraser, D. W., Tsai, T. R., Orenstein, W. et al. Legionnaires' disease: description of an epidemic of pneumonia. The New England journal of medicine 297, 1189-1197, doi:10.1056/NEJM197712012972201 (1977).

Fuche, F., Vianney, A., Andrea, C., Doublet, P. & Gilbert, C. Functional type 1 secretion system involved in Legionella pneumophila virulence. Journal of bacteriology 197, 563-571, doi:10.1128/JB.02164-14 (2015).

Galan, J. E., Lara-Tejero, M., Marlovits, T. C. & Wagner, S. Bacterial type III secretion systems: specialized nanomachines for protein delivery into target cells. Annual review of microbiology 68, 415-438, doi:10.1146/annurev-micro-092412-155725 (2014).

Garcia-Pino, A., Christensen-Dalsgaard, M., Wyns, L. et al. Doc of prophage P1 is inhibited by its antitoxin partner Phd through fold complementation. The Journal of biological chemistry 283, 30821-30827, doi:10.1074/jbc.M805654200 (2008).

Garcia-Pino, A., Zenkin, N. & Loris, R. The many faces of Fic: structural and functional aspects of Fic enzymes. Trends in biochemical sciences 39, 121-129, doi:10.1016/j.tibs.2014.01.001 (2014).

Garduno, R. A., Garduno, E., Hiltz, M. & Hoffman, P. S. Intracellular growth of Legionella pneumophila gives rise to a differentiated form dissimilar to stationary-phase forms. Infection and immunity 70, 6273-6283 (2002).

Garduno, R. A., Quinn, F. D. & Hoffman, P. S. HeLa cells as a model to study the invasiveness and biology of Legionella pneumophila. Canadian journal of microbiology 44, 430-440 (1998).

Gaspar, A. H. & Machner, M. P. VipD is a Rab5-activated phospholipase A1 that protects Legionella pneumophila from endosomal fusion. Proceedings of the National Academy of Sciences of the United States of America 111, 4560-4565, doi:10.1073/pnas.1316376111 (2014).

Ge, J., Gong, Y. N., Xu, Y. & Shao, F. Preventing bacterial DNA release and absent in melanoma 2 inflammasome activation by a Legionella effector functioning in membrane trafficking. Proceedings of the National Academy of Sciences of the United States of America 109, 6193-6198, doi:10.1073/pnas.1117490109 (2012).

Ge, J., Xu, H., Li, T. et al. A Legionella type IV effector activates the NF-kappaB pathway by phosphorylating the IkappaB family of inhibitors. Proceedings of the National Academy of Sciences of the United States of America 106, 13725-13730, doi:10.1073/pnas.0907200106 (2009).

Gerondopoulos, A., Bastos, R. N., Yoshimura, S. et al. Rab18 and a Rab18 GEF complex are required for normal ER structure. The Journal of cell biology 205, 707-720, doi:10.1083/jcb.201403026 (2014).

Glick, T. H., Gregg, M. B., Berman, B. et al. Pontiac fever. An epidemic of unknown etiology in a health department: I. Clinical and epidemiologic aspects. American journal of epidemiology 107, 149-160 (1978).

Gomez-Valero, L., Rusniok, C., Jarraud, S. et al. Extensive recombination events and horizontal gene transfer shaped the Legionella pneumophila genomes. BMC genomics 12, 536, doi:10.1186/1471-2164-12-536 (2011).

Goody, P. R., Heller, K., Oesterlin, L. K. et al. Reversible phosphocholination of Rab proteins by Legionella pneumophila effector proteins. The EMBO journal 31, 1774-1784, doi:10.1038/emboj.2012.16 (2012).

Grammel, M., Luong, P., Orth, K. & Hang, H. C. A chemical reporter for protein AMPylation. Journal of the American Chemical Society 133, 17103-17105, doi:10.1021/ja205137d (2011).

Guo, B., Huang, J., Wu, W. et al. The nascent polypeptide-associated complex is essential for autophagic flux. Autophagy 10, 1738-1748, doi:10.4161/auto.29638 (2014a).

Guo, Z., Stephenson, R., Qiu, J., Zheng, S. & Luo, Z. Q. A Legionella effector modulates host cytoskeletal structure by inhibiting actin polymerization. Microbes and infection / Institut Pasteur 16, 225-236, doi:10.1016/j.micinf.2013.11.007 (2014b).

Hales, L. M. & Shuman, H. A. Legionella pneumophila contains a type II general secretion pathway required for growth in amoebae as well as for secretion of the Msp protease. Infection and immunity 67, 3662-3666 (1999).

Ham, H., Woolery, A. R., Tracy, C. et al. Unfolded protein response-regulated Drosophila Fic (dFic) protein reversibly AMPylates BiP chaperone during endoplasmic reticulum homeostasis. The Journal of biological chemistry 289, 36059-36069, doi:10.1074/jbc.M114.612515 (2014).

Harada, E., Iida, K., Shiota, S., Nakayama, H. & Yoshida, S. Glucose metabolism in Legionella pneumophila: dependence on the Entner-Doudoroff pathway and connection with intracellular bacterial growth. Journal of bacteriology 192, 2892-2899, doi:10.1128/JB.01535-09 (2010).

Hardiman, C. A. & Roy, C. R. AMPylation is critical for Rab1 localization to vacuoles containing Legionella pneumophila. mBio 5, e01035-13, doi:10.1128/mBio.01035-13 (2014).

Harding, C. R., Mattheis, C., Mousnier, A. et al. LtpD is a novel Legionella pneumophila effector that binds phosphatidylinositol 3-phosphate and inositol monophosphatase IMPA1. Infection and immunity 81, 4261-4270, doi:10.1128/IAI.01054-13 (2013a).

Harding, C. R., Stoneham, C. A., Schuelein, R. et al. The Dot/Icm effector SdhA is necessary for virulence of Legionella pneumophila in Galleria mellonella and A/J mice. Infection and immunity 81, 2598-2605, doi:10.1128/IAI.00296-13 (2013b).

Harms, A., Stanger, F. V., Scheu, P. D. et al. Adenylylation of Gyrase and Topo IV by FicT Toxins Disrupts Bacterial DNA Topology. Cell reports 12, 1497-1507, doi:10.1016/j.celrep.2015.07.056 (2015).

Havey, J. C. & Roy, C. R. Toxicity and SidJ-Mediated Suppression of Toxicity Require Distinct Regions in the SidE Family of Legionella pneumophila Effectors. Infection and immunity 83, 3506-3514, doi:10.1128/IAI.00497-15 (2015).

Heal, W. P., Jovanovic, B., Bessin, S. et al. Bioorthogonal chemical tagging of protein cholesterylation in living cells. Chemical communications 47, 4081-4083, doi:10.1039/c0cc04710d (2011).

Heal, W. P., Wright, M. H., Thinon, E. & Tate, E. W. Multifunctional protein labeling via enzymatic N-terminal tagging and elaboration by click chemistry. Nature protocols 7, 105-117, doi:10.1038/nprot.2011.425 (2012).

Heidtman, M., Chen, E. J., Moy, M. Y. & Isberg, R. R. Large-scale identification of Legionella pneumophila Dot/Icm substrates that modulate host cell vesicle trafficking pathways. Cellular microbiology 11, 230-248, doi:10.1111/j.1462-5822.2008.01249.x (2009).

Hein, M. Y., Hubner, N. C., Poser, I. et al. A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell 163, 712-723, doi:10.1016/j.cell.2015.09.053 (2015).

Helbig, J. H., Uldum, S. A., Luck, P. C. & Harrison, T. G. Detection of Legionella pneumophila antigen in urine samples by the BinaxNOW immunochromatographic assay and comparison with both Binax Legionella Urinary Enzyme Immunoassay (EIA) and Biotest Legionella Urin Antigen EIA. Journal of medical microbiology 50, 509-516, doi:10.1099/0022-1317-50-6-509 (2001).

Hervet, E., Charpentier, X., Vianney, A. et al. Protein kinase LegK2 is a type IV secretion system effector involved in endoplasmic reticulum recruitment and intracellular replication of Legionella pneumophila. Infection and immunity 79, 1936-1950, doi:10.1128/IAI.00805-10 (2011).

Hicks, L. A., Garrison, L. E., Nelson, G. E. & Hampton, L. M. Legionellosis--United States, 2000-2009. American journal of transplantation : official journal of the American Society of Transplantation and the American Society of Transplant Surgeons 12, 250-253, doi:10.1111/j.1600-6143.2011.03938.x (2012).

Hilbi, H., Segal, G. & Shuman, H. A. Icm/dot-dependent upregulation of phagocytosis by Legionella pneumophila. Molecular microbiology 42, 603-617 (2001).

Hoffmann, C., Finsel, I., Otto, A. et al. Functional analysis of novel Rab GTPases identified in the proteome of purified Legionella-containing vacuoles from macrophages. Cellular microbiology 16, 1034-1052, doi:10.1111/cmi.12256 (2014).

Honjo, T., Nishizuka, Y. & Hayaishi, O. Diphtheria toxin-dependent adenosine diphosphate ribosylation of aminoacyl transferase II and inhibition of protein synthesis. The Journal of biological chemistry 243, 3553-3555 (1968).

Horenkamp, F. A., Kauffman, K. J., Kohler, L. J. et al. The Legionella Anti-autophagy Effector RavZ Targets the Autophagosome via PI3P- and Curvature-Sensing Motifs. Developmental cell 34, 569-576, doi:10.1016/j.devcel.2015.08.010 (2015).

Horenkamp, F. A., Mukherjee, S., Alix, E. et al. Legionella pneumophila subversion of host vesicular transport by SidC effector proteins. Traffic 15, 488-499, doi:10.1111/tra.12158 (2014).

Horwitz, M. A. Phagocytosis of the Legionnaires' disease bacterium (Legionella pneumophila) occurs by a novel mechanism: engulfment within a pseudopod coil. Cell 36, 27-33 (1984).

Hsu, F., Luo, X., Qiu, J. et al. The Legionella effector SidC defines a unique family of ubiquitin ligases important for bacterial phagosomal remodeling. Proceedings of the National Academy of Sciences of the United States of America 111, 10538-10543, doi:10.1073/pnas.1402605111 (2014).

Hsu, F., Zhu, W., Brennan, L. et al. Structural basis for substrate recognition by a unique Legionella phosphoinositide phosphatase. Proceedings of the National Academy of Sciences of the United States of America 109, 13567-13572, doi:10.1073/pnas.1207903109 (2012).

Huang, J. & Brumell, J. H. Bacteria-autophagy interplay: a battle for survival. Nature reviews. Microbiology 12, 101-114, doi:10.1038/nrmicro3160 (2014).

Huang, L., Boyd, D., Amyot, W. M. et al. The E Block motif is associated with Legionella pneumophila translocated substrates. Cellular microbiology 13, 227-245, doi:10.1111/j.1462-5822.2010.01531.x (2011).

Hubber, A., Arasaki, K., Nakatsu, F. et al. The machinery at endoplasmic reticulum-plasma membrane contact sites contributes to spatial regulation of multiple Legionella effector proteins. PLoS pathogens 10, e1004222, doi:10.1371/journal.ppat.1004222 (2014).

Hyvola, N., Diao, A., McKenzie, E. et al. Membrane targeting and activation of the Lowe syndrome protein OCRL1 by rab GTPases. The EMBO journal 25, 3750-3761, doi:10.1038/sj.emboj.7601274 (2006).

Iglewski, B. H. & Kabat, D. NAD-dependent inhibition of protein synthesis by Pseudomonas aeruginosa toxin. Proceedings of the National Academy of Sciences of the United States of America 72, 2284-2288 (1975).

Iglewski, B. H., Liu, P. V. & Kabat, D. Mechanism of action of Pseudomonas aeruginosa exotoxin A: adenosine diphosphate-ribosylation of mammalian elongation factor 2 in vitro and in vivo. Infection and immunity 15, 138-144 (1977).

Imai, J., Maruya, M., Yashiroda, H., Yahara, I. & Tanaka, K. The molecular chaperone Hsp90 plays a role in the assembly and maintenance of the 26S proteasome. The EMBO journal 22, 3557-3567, doi:10.1093/emboj/cdg349 (2003).

Ingmundson, A., Delprato, A., Lambright, D. G. & Roy, C. R. Legionella pneumophila proteins that regulate Rab1 membrane cycling. Nature 450, 365-369, doi:10.1038/nature06336 (2007).

Isaac, D. T., Laguna, R. K., Valtz, N. & Isberg, R. R. MavN is a Legionella pneumophila vacuole-associated protein required for efficient iron acquisition during intracellular growth. Proceedings of the National Academy of Sciences of the United States of America 112, E5208-5217, doi:10.1073/pnas.1511389112 (2015).

Ishiguro, T., Takayanagi, N., Yamaguchi, S. et al. Etiology and factors contributing to the severity and mortality of community-acquired pneumonia. Internal medicine 52, 317-324 (2013).

Ivanov, S. S., Charron, G., Hang, H. C. & Roy, C. R. Lipidation by the host prenyltransferase machinery facilitates membrane localization of Legionella pneumophila effector proteins. The Journal of biological chemistry 285, 34686-34698, doi:10.1074/jbc.M110.170746 (2010).

Ivanov, S. S. & Roy, C. R. Modulation of ubiquitin dynamics and suppression of DALIS formation by the Legionella pneumophila Dot/Icm system. Cellular microbiology 11, 261-278, doi:10.1111/j.1462-5822.2008.01251.x (2009).

Ivanov, S. S. & Roy, C. R. Pathogen signatures activate a ubiquitination pathway that modulates the function of the metabolic checkpoint kinase mTOR. Nature immunology 14, 1219-1228, doi:10.1038/ni.2740 (2013).

Jacobi, S. & Heuner, K. Description of a putative type I secretion system in Legionella pneumophila. International journal of medical microbiology : IJMM 293, 349-358, doi:10.1078/1438-4221-00276 (2003).

Jahn, R. & Scheller, R. H. SNAREs--engines for membrane fusion. Nature reviews. Molecular cell biology 7, 631-643, doi:10.1038/nrm2002 (2006).

Jank, T., Bohmer, K. E., Tzivelekidis, T. et al. Domain organization of Legionella effector SetA. Cellular microbiology 14, 852-868, doi:10.1111/j.1462-5822.2012.01761.x (2012).

Jensen, O. N. Modification-specific proteomics: characterization of post-translational modifications by mass spectrometry. Current opinion in chemical biology 8, 33-41, doi:10.1016/j.cbpa.2003.12.009 (2004).

Jeong, K. C., Sexton, J. A. & Vogel, J. P. Spatiotemporal regulation of a Legionella pneumophila T4SS substrate by the metaeffector SidJ. PLoS pathogens 11, e1004695, doi:10.1371/journal.ppat.1004695 (2015a).

Jeong, K. C., Sutherland, M. C. & Vogel, J. P. Novel export control of a Legionella Dot/Icm substrate is mediated by dual, independent signal sequences. Molecular microbiology 96, 175-188, doi:10.1111/mmi.12928 (2015b).

Jespersen, S., Sogaard, O. S., Fine, M. J. & Ostergaard, L. The relationship between diagnostic tests and case characteristics in Legionnaires' disease. Scandinavian journal of infectious diseases 41, 425-432, doi:10.1080/00365540902946536 (2009).

Jinek, M., East, A., Cheng, A. et al. RNA-programmed genome editing in human cells. eLife 2, e00471, doi:10.7554/eLife.00471 (2013).

Johannes, L. & Wunder, C. Retrograde transport: two (or more) roads diverged in an endosomal tree? Traffic 12, 956-962, doi:10.1111/j.1600-0854.2011.01200.x (2011).

Joseph, C. A., Ricketts, K. D., Yadav, R., Patel, S. & European Working Group for Legionella Infections. Travel-associated Legionnaires disease in Europe in 2009. Euro surveillance : bulletin Europeen sur les maladies transmissibles = European communicable disease bulletin 15, 19683 (2010).

Juhas, M., Crook, D. W., Dimopoulou, I. D. et al. Novel type IV secretion system involved in propagation of genomic islands. Journal of bacteriology 189, 761-771, doi:10.1128/JB.01327-06 (2007).

Kalogeraki, V. S. & Winans, S. C. Wound-released chemical signals may elicit multiple responses from an Agrobacterium tumefaciens strain containing an octopine-type Ti plasmid. Journal of bacteriology 180, 5660-5667 (1998).

Kao, A., Chiu, C. L., Vellucci, D. et al. Development of a novel cross-linking strategy for fast and accurate identification of cross-linked peptides of protein complexes. Molecular & cellular proteomics : MCP 10, M110.002212, doi:10.1074/mcp.M110.002212 (2011).

Kawahara, H., Minami, R. & Yokota, N. BAG6/BAT3: emerging roles in quality control for nascent polypeptides. Journal of biochemistry 153, 147-160, doi:10.1093/jb/mvs149 (2013).

Kawamukai, M., Matsuda, H., Fujii, W., Utsumi, R. & Komano, T. Nucleotide sequences of fic and fic-1 genes involved in cell filamentation induced by cyclic AMP in Escherichia coli. Journal of bacteriology 171, 4525-4529 (1989).

Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N. & Sternberg, M. J. The Phyre2 web portal for protein modeling, prediction and analysis. Nature protocols 10, 845-858, doi:10.1038/nprot.2015.053 (2015).

Khweek, A. A., Caution, K., Akhter, A. et al. A bacterial protein promotes the recognition of the Legionella pneumophila vacuole by autophagy. European journal of immunology 43, 1333-1344, doi:10.1002/eji.201242835 (2013).

Kim, D. I., Jensen, S. C., Noble, K. A. et al. An improved smaller biotin ligase for BioID proximity labeling. Molecular biology of the cell 27, 1188-1196, doi:10.1091/mbc.E15-12-0844 (2016).

Kinch, L. N., Yarbrough, M. L., Orth, K. & Grishin, N. V. Fido, a novel AMPylation domain common to fic, doc, and AvrB. PloS one 4, e5818, doi:10.1371/journal.pone.0005818 (2009).

King, N. P., Newton, P., Schuelein, R. et al. Soluble NSF attachment protein receptor molecular mimicry by a Legionella pneumophila Dot/Icm effector. Cellular microbiology 17, 767-784, doi:10.1111/cmi.12405 (2015).

Ku, B., Lee, K. H., Park, W. S. et al. VipD of Legionella pneumophila targets activated Rab5 and Rab22 to interfere with endosomal trafficking in macrophages. PLoS pathogens 8, e1003082, doi:10.1371/journal.ppat.1003082 (2012).

Kubori, T., Hyakutake, A. & Nagai, H. Legionella translocates an E3 ubiquitin ligase that has multiple U-boxes with distinct functions. Molecular microbiology 67, 1307-1319, doi:10.1111/j.1365-2958.2008.06124.x (2008).

Kubori, T., Koike, M., Bui, X. T. et al. Native structure of a type IV secretion system core complex essential for Legionella pathogenesis. Proceedings of the National Academy of Sciences of the United States of America 111, 11804-11809, doi:10.1073/pnas.1404506111 (2014).

Kubori, T., Shinzawa, N., Kanuka, H. & Nagai, H. Legionella metaeffector exploits host proteasome to temporally regulate cognate effector. PLoS pathogens 6, e1001216, doi:10.1371/journal.ppat.1001216 (2010).

Kudryashev, M., Wang, R. Y., Brackmann, M. et al. Structure of the type VI secretion system contractile sheath. Cell 160, 952-962, doi:10.1016/j.cell.2015.01.037 (2015).

Kuiper, M. W., Wullings, B. A., Akkermans, A. D., Beumer, R. R. & van der Kooij, D. Intracellular proliferation of Legionella pneumophila in Hartmannella vermiformis in aquatic biofilms grown on plasticized polyvinyl chloride. Applied and environmental microbiology 70, 6826-6833, doi:10.1128/AEM.70.11.6826-6833.2004 (2004).

Kuroda, T., Kubori, T., Thanh Bui, X. et al. Molecular and structural analysis of Legionella DotI gives insights into an inner membrane complex essential for type IV secretion. Scientific reports 5, 10912, doi:10.1038/srep10912 (2015).

La Scola, B., Birtles, R. J., Greub, G. et al. Legionella drancourtii sp. nov., a strictly intracellular amoebal pathogen. International journal of systematic and evolutionary microbiology 54, 699-703, doi:10.1099/ijs.0.02455-0 (2004).

Laguna, R. K., Creasey, E. A., Li, Z., Valtz, N. & Isberg, R. R. A Legionella pneumophila-translocated substrate that is required for growth within macrophages and protection from host cell death. Proceedings of the National Academy of Sciences of the United States of America 103, 18745-18750, doi:10.1073/pnas.0609012103 (2006).

Lam, S. S., Martell, J. D., Kamer, K. J. et al. Directed evolution of APEX2 for electron microscopy and proximity labeling. Nature methods 12, 51-54, doi:10.1038/nmeth.3179 (2015).

Lamoth, F. & Greub, G. Amoebal pathogens as emerging causal agents of pneumonia. FEMS microbiology reviews 34, 260-280, doi:10.1111/j.1574-6976.2009.00207.x (2010).

Lee, C. C., Wood, M. D., Ng, K. et al. Crystal structure of the type III effector AvrB from Pseudomonas syringae. Structure 12, 487-494, doi:10.1016/j.str.2004.02.013 (2004).

Leiman, P. G., Basler, M., Ramagopal, U. A. et al. Type VI secretion apparatus and phage tail-associated protein complexes share a common evolutionary origin. Proceedings of the National Academy of Sciences of the United States of America 106, 4154-4159, doi:10.1073/pnas.0813360106 (2009).

Leitner, A., Walzthoeni, T. & Aebersold, R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nature protocols 9, 120-137, doi:10.1038/nprot.2013.168 (2014).

Lewis, V. J., Thacker, W. L., Shepard, C. C. & McDade, J. E. In vivo susceptibility of the Legionnaires disease bacterium to ten antimicrobial agents. Antimicrobial agents and chemotherapy 13, 419-422 (1978).

Li, J., Wolf, S. G., Elbaum, M. & Tzfira, T. Exploring cargo transport mechanics in the type IV secretion systems. Trends in microbiology 13, 295-298, doi:10.1016/j.tim.2005.05.002 (2005).

Li, T., Lu, Q., Wang, G. et al. SET-domain bacterial effectors target heterochromatin protein 1 to activate host rDNA transcription. EMBO reports 14, 733-740, doi:10.1038/embor.2013.86 (2013).

Lichter-Konecki, U., Farber, L. W., Cronin, J. S., Suchy, S. F. & Nussbaum, R. L. The effect of missense mutations in the RhoGAP-homology domain on ocrl1 function. Molecular genetics and metabolism 89, 121-128, doi:10.1016/j.ymgme.2006.04.005 (2006).

Lifshitz, Z., Burstein, D., Peeri, M. et al. Computational modeling and experimental validation of the Legionella and Coxiella virulence-related type-IVB secretion signal. Proceedings of the National Academy of Sciences of the United States of America 110, E707-715, doi:10.1073/pnas.1215278110 (2013).

Liles, M. R., Edelstein, P. H. & Cianciotto, N. P. The prepilin peptidase is required for protein secretion by and the virulence of the intracellular pathogen Legionella pneumophila. Molecular microbiology 31, 959-970 (1999).

Liles, M. R., Viswanathan, V. K. & Cianciotto, N. P. Identification and temperature regulation of Legionella pneumophila genes involved in type IV pilus biogenesis and type II protein secretion. Infection and immunity 66, 1776-1782 (1998).

Liu, J., Elmore, J. M., Lin, Z. J. & Coaker, G. A receptor-like cytoplasmic kinase phosphorylates the host target RIN4, leading to the activation of a plant innate immune receptor. Cell host & microbe 9, 137-146, doi:10.1016/j.chom.2011.01.010 (2011).

Liu, Y. & Luo, Z. Q. The Legionella pneumophila effector SidJ is required for efficient recruitment of endoplasmic reticulum proteins to the bacterial phagosome. Infection and immunity 75, 592-603, doi:10.1128/IAI.01278-06 (2007).

Lolli, G., Lowe, E. D., Brown, N. R. & Johnson, L. N. The crystal structure of human CDK7 and its protein recognition properties. Structure 12, 2067-2079, doi:10.1016/j.str.2004.08.013 (2004).

Lomma, M., Dervins-Ravault, D., Rolando, M. et al. The Legionella pneumophila F-box protein Lpp2082 (AnkB) modulates ubiquitination of the host protein parvin B and promotes intracellular replication. Cellular microbiology 12, 1272-1291, doi:10.1111/j.1462-5822.2010.01467.x (2010).

Losick, V. P., Haenssler, E., Moy, M. Y. & Isberg, R. R. LnaB: a Legionella pneumophila activator of NF-kappaB. Cellular microbiology 12, 1083-1097, doi:10.1111/j.1462-5822.2010.01452.x (2010).

Losick, V. P. & Isberg, R. R. NF-kappaB translocation prevents host cell death after low-dose challenge by Legionella pneumophila. The Journal of experimental medicine 203, 2177-2189, doi:10.1084/jem.20060766 (2006).

Low, H. H., Gubellini, F., Rivera-Calzada, A. et al. Structure of a type IV secretion system. Nature 508, 550-553, doi:10.1038/nature13081 (2014).

Lu, D., Wu, S., Gao, X. et al. A receptor-like cytoplasmic kinase, BIK1, associates with a flagellin receptor complex to initiate plant innate immunity. Proceedings of the National Academy of Sciences of the United States of America 107, 496-501, doi:10.1073/pnas.0909705107 (2010).

Lu, H. & Clarke, M. Dynamic properties of Legionella-containing phagosomes in Dictyostelium amoebae. Cellular microbiology 7, 995-1007, doi:10.1111/j.1462-5822.2005.00528.x (2005).

Lucas, M., Gaspar, A. H., Pallara, C. et al. Structural basis for the recruitment and activation of the Legionella phospholipase VipD by the host GTPase Rab5. Proceedings of the National Academy of Sciences of the United States of America 111, E3514-3523, doi:10.1073/pnas.1405391111 (2014).

Luke, I., Handford, J. I., Palmer, T. & Sargent, F. Proteolytic processing of Escherichia coli twin-arginine signal peptides by LepB. Archives of microbiology 191, 919-925, doi:10.1007/s00203-009-0516-5 (2009).

Luong, P., Kinch, L. N., Brautigam, C. A. et al. Kinetic and structural insights into the mechanism of AMPylation by VopS Fic domain. The Journal of biological chemistry 285, 20155-20163, doi:10.1074/jbc.M110.114884 (2010).

Lycklama, A. N. J. A. & Driessen, A. J. The bacterial Sec-translocase: structure and mechanism. Philosophical transactions of the Royal Society of London. Series B, Biological sciences 367, 1016-1028, doi:10.1098/rstb.2011.0201 (2012).

Ma, B. & Johnson, R. De novo sequencing and homology searching. Molecular & cellular proteomics : MCP 11, O111.014902, doi:10.1074/mcp.O111.014902 (2012).

Machner, M. P. & Isberg, R. R. Targeting of host Rab GTPase function by the intravacuolar pathogen Legionella pneumophila. Developmental cell 11, 47-56, doi:10.1016/j.devcel.2006.05.013 (2006).

Machner, M. P. & Isberg, R. R. A bifunctional bacterial protein links GDI displacement to Rab1 activation. Science 318, 974-977, doi:10.1126/science.1149121 (2007).

Maier, T., Guell, M. & Serrano, L. Correlation of mRNA and protein in complex biological samples. FEBS letters 583, 3966-3973, doi:10.1016/j.febslet.2009.10.036 (2009).

Mali, P., Yang, L., Esvelt, K. M. et al. RNA-guided human genome engineering via Cas9. Science 339, 823-826, doi:10.1126/science.1232033 (2013).

Mandell, L. A., Wunderink, R. G., Anzueto, A. et al. Infectious Diseases Society of America/American Thoracic Society consensus guidelines on the management of community-acquired pneumonia in adults. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 44 Suppl 2, S27-72, doi:10.1086/511159 (2007).

Marcotte, E. M. How do shotgun proteomics algorithms identify proteins? Nature biotechnology 25, 755-757, doi:10.1038/nbt0707-755 (2007).

Marcu, M. G., Doyle, M., Bertolotti, A. et al. Heat shock protein 90 modulates the unfolded protein response by stabilizing IRE1alpha. Molecular and cellular biology 22, 8506-8513 (2002).

Matthews, M. & Roy, C. R. Identification and subcellular localization of the Legionella pneumophila IcmX protein: a factor essential for establishment of a replicative organelle in eukaryotic host cells. Infection and immunity 68, 3971-3982 (2000).

Mattoo, S., Durrant, E., Chen, M. J. et al. Comparative analysis of Histophilus somni immunoglobulin-binding protein A (IbpA) with other fic domain-containing enzymes reveals differences in substrate and nucleotide specificities. The Journal of biological chemistry 286, 32834-32842, doi:10.1074/jbc.M111.227603 (2011).

McDade, J. E., Shepard, C. C., Fraser, D. W. et al. Legionnaires' disease: isolation of a bacterium and demonstration of its role in other respiratory disease. The New England journal of medicine 297, 1197-1203, doi:10.1056/NEJM197712012972202 (1977).

Mercante, J. W. & Winchell, J. M. Current and emerging Legionella diagnostics for laboratory and outbreak investigations. Clinical microbiology reviews 28, 95-133, doi:10.1128/CMR.00029-14 (2015).

Miyamoto, Y., Yamada, K. & Yoneda, Y. Importin alpha: a key molecule in nuclear transport and non-transport functions. Journal of biochemistry 160, 69-75, doi:10.1093/jb/mvw036 (2016).

Monroe, K. M., McWhirter, S. M. & Vance, R. E. Identification of host cytosolic sensors and bacterial factors regulating the type I interferon response to Legionella pneumophila. PLoS pathogens 5, e1000665, doi:10.1371/journal.ppat.1000665 (2009).

Mori, M., Hitora, T., Nakamura, O. et al. Hsp90 inhibitor induces autophagy and apoptosis in osteosarcoma cells. International journal of oncology 46, 47-54, doi:10.3892/ijo.2014.2727 (2015).

Morozova, I., Qu, X., Shi, S. et al. Comparative sequence analysis of the icm/dot genes in Legionella. Plasmid 51, 127-147, doi:10.1016/j.plasmid.2003.12.004 (2004).

Mousnier, A., Schroeder, G. N., Stoneham, C. A. et al. A new method to determine in vivo interactomes reveals binding of the Legionella pneumophila effector PieE to multiple rab GTPases. mBio 5, doi:10.1128/mBio.01148-14 (2014).

Muder, R. R. & Yu, V. L. Infection due to Legionella species other than L. pneumophila. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 35, 990-998, doi:10.1086/342884 (2002).

Mukherjee, S., Liu, X., Arasaki, K. et al. Modulation of Rab GTPase function by a protein phosphocholine transferase. Nature 477, 103-106, doi:10.1038/nature10335 (2011).

Muller, M. P., Peters, H., Blumer, J. et al. The Legionella effector protein DrrA AMPylates the membrane traffic regulator Rab1b. Science 329, 946-949, doi:10.1126/science.1192276 (2010).

Murata, T., Delprato, A., Ingmundson, A. et al. The Legionella pneumophila effector protein DrrA is a Rab1 guanine nucleotide-exchange factor. Nature cell biology 8, 971-977, doi:10.1038/ncb1463 (2006).

Nagai, H., Cambronne, E. D., Kagan, J. C. et al. A C-terminal translocation signal required for Dot/Icm-dependent delivery of the Legionella RalF protein to host cells. Proceedings of the National Academy of Sciences of the United States of America 102, 826-831, doi:10.1073/pnas.0406239101 (2005).

Nagai, H., Kagan, J. C., Zhu, X., Kahn, R. A. & Roy, C. R. A bacterial guanine nucleotide exchange factor activates ARF on Legionella phagosomes. Science 295, 679-682, doi:10.1126/science.1067025 (2002).

Natale, P., Bruser, T. & Driessen, A. J. Sec- and Tat-mediated protein secretion across the bacterial cytoplasmic membrane--distinct translocases and mechanisms. Biochimica et biophysica acta 1778, 1735-1756, doi:10.1016/j.bbamem.2007.07.015 (2008).

Neunuebel, M. R., Chen, Y., Gaspar, A. H. et al. De-AMPylation of the small GTPase Rab1 by the pathogen Legionella pneumophila. Science 333, 453-456, doi:10.1126/science.1207193 (2011).

Niesen, F. H., Berglund, H. & Vedadi, M. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nature protocols 2, 2212-2221, doi:10.1038/nprot.2007.321 (2007).

Ninio, S., Zuckman-Cholon, D. M., Cambronne, E. D. & Roy, C. R. The Legionella IcmS-IcmW protein complex is important for Dot/Icm-mediated protein translocation. Molecular microbiology 55, 912-926, doi:10.1111/j.1365-2958.2004.04435.x (2005).

Nomura, D. K., Dix, M. M. & Cravatt, B. F. Activity-based protein profiling for biochemical pathway discovery in cancer. Nature reviews. Cancer 10, 630-638, doi:10.1038/nrc2901 (2010).

Nystrom, T. Nonculturable bacteria: programmed survival forms or cells at death's door? BioEssays : news and reviews in molecular, cellular and developmental biology 25, 204-211, doi:10.1002/bies.10233 (2003).

O'Brien, K. M., Lindsay, E. L. & Starai, V. J. The Legionella pneumophila effector protein, LegC7, alters yeast endosomal trafficking. PloS one 10, e0116824, doi:10.1371/journal.pone.0116824 (2015).

O'Connor, T. J., Adepoju, Y., Boyd, D. & Isberg, R. R. Minimization of the Legionella pneumophila genome reveals chromosomal regions involved in host range expansion. Proceedings of the National Academy of Sciences of the United States of America 108, 14733-14740, doi:10.1073/pnas.1111678108 (2011).

O'Connor, T. J., Boyd, D., Dorer, M. S. & Isberg, R. R. Aggravating genetic interactions allow a solution to redundancy in a bacterial pathogen. Science 338, 1440-1444, doi:10.1126/science.1229556 (2012).

O'Connor, T. J., Zheng, H., VanRheenen, S. M. et al. Iron Limitation Triggers Early Egress by the Intracellular Bacterial Pathogen Legionella pneumophila. Infection and immunity 84, 2185-2197, doi:10.1128/IAI.01306-15 (2016).

Ohno, A., Kato, N., Yamada, K. & Yamaguchi, K. Factors influencing survival of Legionella pneumophila serotype 1 in hot spring water and tap water. Applied and environmental microbiology 69, 2540-2547 (2003).

Olsen, C. W., Elverdal, P., Jorgensen, C. S. & Uldum, S. A. Comparison of the sensitivity of the Legionella urinary antigen EIA kits from Binax and Biotest with urine from patients with infections caused by less common serogroups and subgroups of Legionella. European journal of clinical microbiology & infectious diseases : official publication of the European Society of Clinical Microbiology 28, 817-820, doi:10.1007/s10096-008-0697-x (2009).

Ong, S. E., Blagoev, B., Kratchmarova, I. et al. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Molecular & cellular proteomics : MCP 1, 376-386 (2002).

Pallett, M. A., Crepin, V. F., Serafini, N. et al. Bacterial virulence factor inhibits caspase-4/11 activation in intestinal epithelial cells. Mucosal immunology, doi:10.1038/mi.2016.77 (2016).

Palmer, T. & Berks, B. C. The twin-arginine translocation (Tat) protein export pathway. Nature reviews. Microbiology 10, 483-496, doi:10.1038/nrmicro2814 (2012).

Phin, N., Parry-Ford, F., Harrison, T. et al. Epidemiology and clinical management of Legionnaires' disease. The Lancet. Infectious diseases 14, 1011-1021, doi:10.1016/S1473-3099(14)70713-3 (2014).

Piao, Z., Sze, C. C., Barysheva, O., Iida, K. & Yoshida, S. Temperature-regulated formation of mycelial mat-like biofilms by Legionella pneumophila. Applied and environmental microbiology 72, 1613-1622, doi:10.1128/AEM.72.2.1613-1622.2006 (2006).

Pierce, A., Unwin, R. D., Evans, C. A. et al. Eight-channel iTRAQ enables comparison of the activity of six leukemogenic tyrosine kinases. Molecular & cellular proteomics : MCP 7, 853-863, doi:10.1074/mcp.M700251-MCP200 (2008).

Popa, C., Coll, N. S., Valls, M. & Sessa, G. Yeast as a Heterologous Model System to Uncover Type III Effector Function. PLoS pathogens 12, e1005360, doi:10.1371/journal.ppat.1005360 (2016).

Portier, E., Zheng, H., Sahr, T. et al. IroT/mavN, a new iron-regulated gene involved in Legionella pneumophila virulence against amoebae and macrophages. Environmental microbiology 17, 1338-1350, doi:10.1111/1462-2920.12604 (2015).

Prashar, A., Bhatia, S., Gigliozzi, D. et al. Filamentous morphology of bacteria delays the timing of phagosome morphogenesis in macrophages. The Journal of cell biology 203, 1081-1097, doi:10.1083/jcb.201304095 (2013).

Prashar, A., Bhatia, S., Tabatabaeiyazdi, Z. et al. Mechanism of invasion of lung epithelial cells by filamentous Legionella pneumophila. Cellular microbiology 14, 1632-1655, doi:10.1111/j.1462-5822.2012.01828.x (2012).

Preissler, S., Rato, C., Chen, R. et al. AMPylation matches BiP activity to client protein load in the endoplasmic reticulum. eLife 4, e12621, doi:10.7554/eLife.12621 (2015).

Price, C. T., Al-Khodor, S., Al-Quadan, T. et al. Molecular mimicry by an F-box effector of Legionella pneumophila hijacks a conserved polyubiquitination machinery within macrophages and protozoa. PLoS pathogens 5, e1000704, doi:10.1371/journal.ppat.1000704 (2009).

Price, C. T., Al-Quadan, T., Santic, M., Jones, S. C. & Abu Kwaik, Y. Exploitation of conserved eukaryotic host cell farnesylation machinery by an F-box effector of Legionella pneumophila. The Journal of experimental medicine 207, 1713-1726, doi:10.1084/jem.20100771 (2010).

Price, C. T., Al-Quadan, T., Santic, M., Rosenshine, I. & Abu Kwaik, Y. Host proteasomal degradation generates amino acids essential for intracellular bacterial growth. Science 334, 1553-1557, doi:10.1126/science.1212868 (2011).

Qiu, J., Sheedlo, M. J., Yu, K. et al. Ubiquitination independent of E1 and E2 enzymes by bacterial effectors. Nature 533, 120-124, doi:10.1038/nature17657 (2016).

Ragaz, C., Pietsch, H., Urwyler, S. et al. The Legionella pneumophila phosphatidylinositol-4 phosphate-binding type IV substrate SidC recruits endoplasmic reticulum vesicles to a replication-permissive vacuole. Cellular microbiology 10, 2416-2433, doi:10.1111/j.1462-5822.2008.01219.x (2008).

Ramachandran, N., Hainsworth, E., Bhullar, B. et al. Self-assembling protein microarrays. Science 305, 86-90, doi:10.1126/science.1097639 (2004).

Richards, A. L., Hebert, A. S., Ulbrich, A. et al. One-hour proteome analysis in yeast. Nature protocols 10, 701-714, doi:10.1038/nprot.2015.040 (2015).

Ridenour, D. A., Cirillo, S. L., Feng, S., Samrakandi, M. M. & Cirillo, J. D. Identification of a gene that affects the efficiency of host cell infection by Legionella pneumophila in a temperature-dependent fashion. Infection and immunity 71, 6256-6263 (2003).

Robertson, P., Abdelhady, H. & Garduno, R. A. The many forms of a pleomorphic bacterial pathogen- the developmental network of Legionella pneumophila. Frontiers in microbiology 5, 670, doi:10.3389/fmicb.2014.00670 (2014).

Robinson, C. G. & Roy, C. R. Attachment and fusion of endoplasmic reticulum with vacuoles containing Legionella pneumophila. Cellular microbiology 8, 793-805, doi:10.1111/j.1462-5822.2005.00666.x (2006).

Rodgers, F. G., Macrae, A. D. & Lewis, M. J. Electron microscopy of the organism of Legionnaires' disease. Nature 272, 825-826 (1978).

Rolando, M., Escoll, P., Nora, T. et al. Legionella pneumophila S1P-lyase targets host sphingolipid metabolism and restrains autophagy. Proceedings of the National Academy of Sciences of the United States of America 113, 1901-1906, doi:10.1073/pnas.1522067113 (2016).

Rolando, M., Sanulli, S., Rusniok, C. et al. Legionella pneumophila effector RomA uniquely modifies host chromatin to repress gene expression and promote intracellular bacterial replication. Cell host & microbe 13, 395-405, doi:10.1016/j.chom.2013.03.004 (2013).

Rosen, D. A., Hooton, T. M., Stamm, W. E., Humphrey, P. A. & Hultgren, S. J. Detection of intracellular bacterial communities in human urinary tract infection. PLoS medicine 4, e329, doi:10.1371/journal.pmed.0040329 (2007).

Ross, P. L., Huang, Y. N., Marchese, J. N. et al. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Molecular & cellular proteomics : MCP 3, 1154-1169, doi:10.1074/mcp.M400129-MCP200 (2004).

Rossier, O. & Cianciotto, N. P. Type II protein secretion is a subset of the PilD-dependent processes that facilitate intracellular infection by Legionella pneumophila. Infection and immunity 69, 2092-2098, doi:10.1128/IAI.69.4.2092-2098.2001 (2001).

Rossier, O. & Cianciotto, N. P. The Legionella pneumophila tatB gene facilitates secretion of phospholipase C, growth under iron-limiting conditions, and intracellular infection. Infection and immunity 73, 2020-2032, doi:10.1128/IAI.73.4.2020-2032.2005 (2005).

Rossier, O., Dao, J. & Cianciotto, N. P. The type II secretion system of Legionella pneumophila elaborates two aminopeptidases, as well as a metalloprotease that contributes to differential infection among protozoan hosts. Applied and environmental microbiology 74, 753-761, doi:10.1128/AEM.01944-07 (2008).

Rossier, O., Dao, J. & Cianciotto, N. P. A type II secreted RNase of Legionella pneumophila facilitates optimal intracellular infection of Hartmannella vermiformis. Microbiology 155, 882-890, doi:10.1099/mic.0.023218-0 (2009).

Rossier, O., Starkenburg, S. R. & Cianciotto, N. P. Legionella pneumophila type II protein secretion promotes virulence in the A/J mouse model of Legionnaires' disease pneumonia. Infection and immunity 72, 310-321 (2004).

Rostovtsev, V. V., Green, L. G., Fokin, V. V. & Sharpless, K. B. A stepwise Huisgen cycloaddition process: copper(I)-catalyzed regioselective "ligation" of azides and terminal alkynes. Angewandte Chemie 41, 2596-2599, doi:10.1002/1521-3773(20020715)41:14<2596::AID-ANIE2596>3.0.CO;2-4 (2002).

Rothmeier, E., Pfaffinger, G., Hoffmann, C. et al. Activation of Ran GTPase by a Legionella effector promotes microtubule polymerization, pathogen vacuole motility and infection. PLoS pathogens 9, e1003598, doi:10.1371/journal.ppat.1003598 (2013).

Roux, K. J., Kim, D. I., Raida, M. & Burke, B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology 196, 801-810, doi:10.1083/jcb.201112098 (2012).

Roy, C. R. & Cherfils, J. Structure and function of Fic proteins. Nature reviews. Microbiology 13, 631-640, doi:10.1038/nrmicro3520 (2015).

Roy, C. R. & Isberg, R. R. Topology of Legionella pneumophila DotA: an inner membrane protein required for replication in macrophages. Infection and immunity 65, 571-578 (1997).

Sano, H., Roach, W. G., Peck, G. R., Fukuda, M. & Lienhard, G. E. Rab10 in insulin-stimulated GLUT4 translocation. The Biochemical journal 411, 89-95, doi:10.1042/BJ20071318 (2008).

Sansom, F. M., Newton, H. J., Crikis, S. et al. A bacterial ecto-triphosphate diphosphohydrolase similar to human CD39 is essential for intracellular multiplication of Legionella pneumophila. Cellular microbiology 9, 1922-1935, doi:10.1111/j.1462-5822.2007.00924.x (2007).

Sanyal, A., Chen, A. J., Nakayasu, E. S. et al. A novel link between Fic (filamentation induced by cAMP)-mediated adenylylation/AMPylation and the unfolded protein response. The Journal of biological chemistry 290, 8482-8499, doi:10.1074/jbc.M114.618348 (2015).

Schoebel, S., Cichy, A. L., Goody, R. S. & Itzen, A. Protein LidA from Legionella is a Rab GTPase supereffector. Proceedings of the National Academy of Sciences of the United States of America 108, 17945-17950, doi:10.1073/pnas.1113133108 (2011).

Schroeder, G. N., Aurass, P., Oates, C. V. et al. Legionella pneumophila Effector LpdA Is a Palmitoylated Phospholipase D Virulence Factor. Infection and immunity 83, 3989-4002, doi:10.1128/IAI.00785-15 (2015).

Schroeder, G. N., Petty, N. K., Mousnier, A. et al. Legionella pneumophila strain 130b possesses a unique combination of type IV secretion systems and novel Dot/Icm secretion system effector proteins. Journal of bacteriology 192, 6001-6016, doi:10.1128/JB.00778-10 (2010).

Schunder, E., Gillmaier, N., Kutzner, E. et al. Amino Acid Uptake and Metabolism of Legionella pneumophila Hosted by Acanthamoeba castellanii. The Journal of biological chemistry 289, 21040-21054, doi:10.1074/jbc.M114.570085 (2014).

Segal, G., Purcell, M. & Shuman, H. A. Host cell killing and bacterial conjugation require overlapping sets of genes within a 22-kb region of the Legionella pneumophila genome. Proceedings of the National Academy of Sciences of the United States of America 95, 1669-1674 (1998).

Segal, G., Russo, J. J. & Shuman, H. A. Relationships between a new type IV secretion system and the icm/dot virulence system of Legionella pneumophila. Molecular microbiology 34, 799-809 (1999).

Seto, S., Tsujimura, K. & Koide, Y. Rab GTPases regulating phagosome maturation are differentially recruited to mycobacterial phagosomes. Traffic 12, 407-420, doi:10.1111/j.1600-0854.2011.01165.x (2011).

Sexton, J. A., Miller, J. L., Yoneda, A., Kehl-Fie, T. E. & Vogel, J. P. Legionella pneumophila DotU and IcmF are required for stability of the Dot/Icm complex. Infection and immunity 72, 5983-5992, doi:10.1128/IAI.72.10.5983-5992.2004 (2004a).

Sexton, J. A., Pinkner, J. S., Roth, R. et al. The Legionella pneumophila PilT homologue DotB exhibits ATPase activity that is critical for intracellular growth. Journal of bacteriology 186, 1658-1666 (2004b).

Sexton, J. A., Yeo, H. J. & Vogel, J. P. Genetic analysis of the Legionella pneumophila DotB ATPase reveals a role in type IV secretion system protein export. Molecular microbiology 57, 70-84, doi:10.1111/j.1365-2958.2005.04667.x (2005).

Shen, X., Banga, S., Liu, Y. et al. Targeting eEF1A by a Legionella pneumophila effector leads to inhibition of protein synthesis and induction of host stress response. Cellular microbiology 11, 911-926, doi:10.1111/j.1462-5822.2009.01301.x (2009).

Shevchuk, O., Batzilla, C., Hagele, S. et al. Proteomic analysis of Legionella-containing phagosomes isolated from Dictyostelium. International journal of medical microbiology : IJMM 299, 489-508, doi:10.1016/j.ijmm.2009.03.006 (2009).

Shi, X., Halder, P., Yavuz, H., Jahn, R. & Shuman, H. A. Direct targeting of membrane fusion by SNARE mimicry: Convergent evolution of Legionella effectors. Proceedings of the National Academy of Sciences of the United States of America 113, 8807-8812, doi:10.1073/pnas.1608755113 (2016).

Shin, S., Case, C. L., Archer, K. A. et al. Type IV secretion-dependent activation of host MAP kinases induces an increased proinflammatory cytokine response to Legionella pneumophila. PLoS pathogens 4, e1000220, doi:10.1371/journal.ppat.1000220 (2008).

Shohdy, N., Efe, J. A., Emr, S. D. & Shuman, H. A. Pathogen effector protein screening in yeast identifies Legionella factors that interfere with membrane trafficking. Proceedings of the National Academy of Sciences of the United States of America 102, 4866-4871, doi:10.1073/pnas.0501315102 (2005).

Sievers, F., Wilm, A., Dineen, D. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Molecular systems biology 7, 539, doi:10.1038/msb.2011.75 (2011).

Simon, S., Wagner, M. A., Rothmeier, E., Muller-Taubenberger, A. & Hilbi, H. Icm/Dot-dependent inhibition of phagocyte migration by Legionella is antagonized by a translocated Ran GTPase activator. Cellular microbiology 16, 977-992, doi:10.1111/cmi.12258 (2014).

Smalley, D. L., Jaquess, P. A., Ourth, D. D. & Layne, J. S. Antibiotic-induced filament formation of Legionella pneumophila. American journal of clinical pathology 74, 852 (1980).

Smith, A. W., Poyner, D. R., Hughes, H. K. & Lambert, P. A. Siderophore activity of myo-inositol hexakisphosphate in Pseudomonas aeruginosa. Journal of bacteriology 176, 3455-3459 (1994).

Smith, L. M., Kelleher, N. L. & Consortium for Top Down Proteomics. Proteoform: a single term describing protein complexity. Nature methods 10, 186-187, doi:10.1038/nmeth.2369 (2013).

So, E. C., Mattheis, C., Tate, E. W., Frankel, G. & Schroeder, G. N. Creating a customized intracellular niche: subversion of host cell signaling by Legionella type IV secretion system effectors. Canadian journal of microbiology 61, 617-635, doi:10.1139/cjm-2015-0166 (2015).

So, E. C., Schroeder, G. N., Carson, D. et al. The Rab-binding Profiles of Bacterial Virulence Factors during Infection. The Journal of biological chemistry 291, 5832-5843, doi:10.1074/jbc.M115.700930 (2016).

Soderberg, M. A., Dao, J., Starkenburg, S. R. & Cianciotto, N. P. Importance of type II secretion for survival of Legionella pneumophila in tap water and in amoebae at low temperatures. Applied and environmental microbiology 74, 5583-5588, doi:10.1128/AEM.00067-08 (2008).

Soderberg, M. A., Rossier, O. & Cianciotto, N. P. The type II protein secretion system of Legionella pneumophila promotes growth at low temperatures. Journal of bacteriology 186, 3712-3720, doi:10.1128/JB.186.12.3712-3720.2004 (2004).

Sohn, Y. S., Shin, H. C., Park, W. S. et al. Lpg0393 of Legionella pneumophila is a guanine-nucleotide exchange factor for Rab5, Rab21 and Rab22. PloS one 10, e0118683, doi:10.1371/journal.pone.0118683 (2015).

St-Martin, G., Uldum, S. & Mølbak, K. Incidence and Prognostic Factors for Legionnaires' Disease in Denmark 1993–2006. ISRN Epidemiology 2013, 8, doi:10.5402/2013/847283 (2013).

Stanger, F. V., Burmann, B. M., Harms, A. et al. Intrinsic regulation of FIC-domain AMP-transferases by oligomerization and automodification. Proceedings of the National Academy of Sciences of the United States of America 113, E529-537, doi:10.1073/pnas.1516930113 (2016).

Steinert, M., Emody, L., Amann, R. & Hacker, J. Resuscitation of viable but nonculturable Legionella pneumophila Philadelphia JR32 by Acanthamoeba castellanii. Applied and environmental microbiology 63, 2047-2053 (1997).

Stenmark, H. Rab GTPases as coordinators of vesicle traffic. Nature reviews. Molecular cell biology 10, 513-525, doi:10.1038/nrm2728 (2009).

Szklarczyk, D., Franceschini, A., Wyder, S. et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic acids research 43, D447-452, doi:10.1093/nar/gku1003 (2015).

Tan, M. J., Tan, J. S., Hamor, R. H., File, T. M., Jr. & Breiman, R. F. The radiologic manifestations of Legionnaire's disease. The Ohio Community-Based Pneumonia Incidence Study Group. Chest 117, 398-403 (2000).

Tan, Y., Arnold, R. J. & Luo, Z. Q. Legionella pneumophila regulates the small GTPase Rab1 activity by reversible phosphorylcholination. Proceedings of the National Academy of Sciences of the United States of America 108, 21212-21217, doi:10.1073/pnas.1114023109 (2011a).

Tan, Y. & Luo, Z. Q. Legionella pneumophila SidD is a deAMPylase that modifies Rab1. Nature 475, 506-509, doi:10.1038/nature10307 (2011b).

Tate, S., Larsen, B., Bonner, R. & Gingras, A. C. Label-free quantitative proteomics trends for protein-protein interactions. Journal of proteomics 81, 91-101, doi:10.1016/j.jprot.2012.10.027 (2013).

Temmerman, R., Vervaeren, H., Noseda, B., Boon, N. & Verstraete, W. Necrotrophic growth of Legionella pneumophila. Applied and environmental microbiology 72, 4323-4328, doi:10.1128/AEM.00070-06 (2006).

Thompson, A., Schafer, J., Kuhn, K. et al. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Analytical chemistry 75, 1895-1904 (2003).

Tilney, L. G., Harb, O. S., Connelly, P. S., Robinson, C. G. & Roy, C. R. How the parasitic bacterium Legionella pneumophila modifies its phagosome and transforms it into rough ER: implications for conversion of plasma membrane to the ER membrane. Journal of cell science 114, 4637-4650 (2001).

Toby, T. K., Fornelli, L. & Kelleher, N. L. Progress in Top-Down Proteomics and the Analysis of Proteoforms. Annual review of analytical chemistry 9, 499-519, doi:10.1146/annurev-anchem-071015-041550 (2016).

Toulabi, L., Wu, X., Cheng, Y. & Mao, Y. Identification and structural characterization of a Legionella phosphoinositide phosphatase. The Journal of biological chemistry 288, 24518-24527, doi:10.1074/jbc.M113.474239 (2013).

Tyanova, S., Temu, T., Sinitcyn, P. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nature methods 13, 731-740, doi:10.1038/nmeth.3901 (2016).

Urwyler, S., Nyfeler, Y., Ragaz, C. et al. Proteome analysis of Legionella vacuoles purified by magnetic immunoseparation reveals secretory and endosomal GTPases. Traffic 10, 76-87, doi:10.1111/j.1600-0854.2008.00851.x (2009).

Utsumi, R., Nakamoto, Y., Kawamukai, M., Himeno, M. & Komano, T. Involvement of cyclic AMP and its receptor protein in filamentation of an Escherichia coli fic mutant. Journal of bacteriology 151, 807-812 (1982).

VanRheenen, S. M., Luo, Z. Q., O'Connor, T. & Isberg, R. R. Members of a Legionella pneumophila family of proteins with ExoU (phospholipase A) active sites are translocated to target cells. Infection and immunity 74, 3597-3606, doi:10.1128/IAI.02060-05 (2006).

Vincent, C. D., Friedman, J. R., Jeong, K. C. et al. Identification of the core transmembrane complex of the Legionella Dot/Icm type IV secretion system. Molecular microbiology 62, 1278-1291, doi:10.1111/j.1365-2958.2006.05446.x (2006a).

Vincent, C. D. & Vogel, J. P. The Legionella pneumophila IcmS-LvgA protein complex is important for Dot/Icm-dependent intracellular growth. Molecular microbiology 61, 596-613, doi:10.1111/j.1365-2958.2006.05243.x (2006b).

Viner, R., Chetrit, D., Ehrlich, M. & Segal, G. Identification of two Legionella pneumophila effectors that manipulate host phospholipids biosynthesis. PLoS pathogens 8, e1002988, doi:10.1371/journal.ppat.1002988 (2012).

Vogel, J. P., Andrews, H. L., Wong, S. K. & Isberg, R. R. Conjugative transfer by the virulence system of Legionella pneumophila. Science 279, 873-876 (1998).

von Baum, H., Ewig, S., Marre, R. et al. Community-acquired Legionella pneumonia: new insights from the German competence network for community acquired pneumonia. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 46, 1356-1364, doi:10.1086/586741 (2008).

Warren, W. J. & Miller, R. D. Growth of Legionnaires disease bacterium (Legionella pneumophila) in chemically defined medium. Journal of clinical microbiology 10, 50-55 (1979).

Weber, S., Stirnimann, C. U., Wieser, M. et al. A type IV translocated Legionella cysteine phytase counteracts intracellular growth restriction by phytate. The Journal of biological chemistry 289, 34175-34188, doi:10.1074/jbc.M114.592568 (2014a).

Weber, S., Wagner, M. & Hilbi, H. Live-cell imaging of phosphoinositide dynamics and membrane architecture during Legionella infection. mBio 5, e00839-13, doi:10.1128/mBio.00839-13 (2014b).

Weber, S. S., Ragaz, C. & Hilbi, H. The inositol polyphosphate 5-phosphatase OCRL1 restricts intracellular growth of Legionella, localizes to the replicative vacuole and binds to the bacterial effector LpnE. Cellular microbiology 11, 442-460, doi:10.1111/j.1462-5822.2008.01266.x (2009).

Werner, T., Sweetman, G., Savitski, M. F. et al. Ion coalescence of neutron encoded TMT 10-plex reporter ions. Analytical chemistry 86, 3594-3601, doi:10.1021/ac500140s (2014).

Westermann, A. J., Forstner, K. U., Amman, F. et al. Dual RNA-seq unveils noncoding RNA functions in host-pathogen interactions. Nature 529, 496-501, doi:10.1038/nature16547 (2016).

Whiley, H. & Bentham, R. Legionella longbeachae and legionellosis. Emerging infectious diseases 17, 579-583, doi:10.3201/eid1704.100446 (2011).

Wilkinson, H. W. Hospital-laboratory diagnosis of Legionella infections. (US Dept. of Health and Human Services, Public Health Service, Centers for Disease Control, 1987).

Wishart, D. S., Jewison, T., Guo, A. C. et al. HMDB 3.0--The Human Metabolome Database in 2013. Nucleic acids research 41, D801-807, doi:10.1093/nar/gks1065 (2013).

Woolery, A. R., Yu, X., LaBaer, J. & Orth, K. AMPylation of Rho GTPases subverts multiple host signaling processes. The Journal of biological chemistry 289, 32977-32988, doi:10.1074/jbc.M114.601310 (2014).

Worby, C. A., Mattoo, S., Kruger, R. P. et al. The fic domain: regulation of cell signaling by adenylylation. Molecular cell 34, 93-103, doi:10.1016/j.molcel.2009.03.008 (2009).

Xiao, J., Worby, C. A., Mattoo, S., Sankaran, B. & Dixon, J. E. Structural basis of Fic-mediated adenylylation. Nature structural & molecular biology 17, 1004-1010, doi:10.1038/nsmb.1867 (2010).

Xu, H., Yang, J., Gao, W. et al. Innate immune sensing of bacterial modifications of Rho GTPases by the Pyrin inflammasome. Nature 513, 237-241, doi:10.1038/nature13449 (2014).

Xu, L., Shen, X., Bryan, A. et al. Inhibition of host vacuolar H+-ATPase activity by a Legionella pneumophila effector. PLoS pathogens 6, e1000822, doi:10.1371/journal.ppat.1000822 (2010).

Yarbrough, M. L., Li, Y., Kinch, L. N. et al. AMPylation of Rho GTPases by Vibrio VopS disrupts effector binding and downstream signaling. Science 323, 269-272, doi:10.1126/science.1166382 (2009).

Yu, V. L., Plouffe, J. F., Pastoris, M. C. et al. Distribution of Legionella species and serogroups isolated by culture in patients with sporadic community-acquired legionellosis: an international collaborative survey. The Journal of infectious diseases 186, 127-128, doi:10.1086/341087 (2002).

Yu, V. L. & Stout, J. E. Community-acquired legionnaires disease: implications for underdiagnosis and laboratory testing. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 46, 1365-1367, doi:10.1086/586742 (2008).

Yu, X., Decker, K. B., Barker, K. et al. Host-pathogen interaction profiling using self-assembling human protein arrays. Journal of proteome research 14, 1920-1936, doi:10.1021/pr5013015 (2015a).

Yu, X. & LaBaer, J. High-throughput identification of proteins with AMPylation using self-assembled human protein (NAPPA) microarrays. Nature protocols 10, 756-767, doi:10.1038/nprot.2015.044 (2015b).

Yugi, K., Kubota, H., Hatano, A. & Kuroda, S. Trans-Omics: How To Reconstruct Biochemical Networks Across Multiple 'Omic' Layers. Trends in biotechnology 34, 276-290, doi:10.1016/j.tibtech.2015.12.013 (2016).

Zhang, J., Li, W., Xiang, T. et al. Receptor-like cytoplasmic kinases integrate signaling from multiple plant immune receptors and are targeted by a Pseudomonas syringae effector. Cell host & microbe 7, 290-301, doi:10.1016/j.chom.2010.03.007 (2010).

Zhu, W., Banga, S., Tan, Y. et al. Comprehensive identification of protein substrates of the Dot/Icm type IV transporter of Legionella pneumophila. PloS one 6, e17638, doi:10.1371/journal.pone.0017638 (2011).
