Advanced metabolomics for the discrimination of uropathogenic Escherichia coli and their response to antibiotics

A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences

2014

Haitham AlRabiah

School of Chemistry

Table of Contents


Table of Contents ...... 2

List of Figures ...... 8

List of Tables...... 16

Abstract ...... 18

Declaration ...... 19

Copyright Statement ...... 20

Dedication ...... 21

Acknowledgements ...... 22

Abbreviations and Acronyms ...... 23

Preface ...... 26

1 Chapter One ...... 27

General Introduction ...... 27

1.1 Antibiotics ...... 27

1.1.1 Overview ...... 27

1.1.2 Definition ...... 28

1.1.3 Major classes of antibiotics ...... 28

1.1.3.1 Modes of action ...... 29

1.1.3.1.1 Inhibitors of bacterial cell wall synthesis ...... 30

1.1.3.1.2 Inhibitors of bacterial protein synthesis ...... 30

1.1.3.1.3 Inhibitors of nucleic acid synthesis ...... 31

1.1.3.1.4 Agents affecting membrane function ...... 31

1.1.4 resistance ...... 32

1.1.4.1 Mechanisms of drug resistance ...... 35

1.1.4.1.1 Enzymatic inactivation of antibiotics ...... 35

1.1.4.1.2 Alteration of the target site ...... 36

1.1.4.1.3 Prevention of antibiotic access to the target site ...... 36


1.1.4.1.4 Metabolic bypass ...... 37

1.2 Metabolomics ...... 38

1.2.1 Overview ...... 38

1.2.2 Metabolomics experiments ...... 42

1.2.2.1 Sample preparation ...... 42

1.2.2.2 Analytical technology ...... 44

1.2.2.2.1 Fourier transform infrared (FT-IR) spectroscopy ...... 45

1.2.2.2.2 Mass spectrometry...... 47

1.2.2.2.2.1 Direct infusion mass spectrometry (DIMS) ...... 49

1.2.2.2.2.2 Gas chromatography-mass spectrometry (GC-MS) ...... 50

Electron Ionisation (EI) ...... 53

Time of Flight (TOF) ...... 53

1.2.2.2.2.3 Liquid chromatography-mass spectrometry (LC-MS) ...... 56

Electrospray Ionisation (ESI) ...... 57

The Orbitrap ...... 59

1.2.2.3 Data analysis ...... 61

1.2.2.3.1 Unsupervised methods ...... 62

1.2.2.3.1.1 Principal component analysis (PCA) ...... 62

1.2.2.3.2 Supervised methods...... 63

1.2.2.3.2.1 Discriminant function analysis (DFA) ...... 63

1.2.2.3.2.2 Partial least squares (PLS) ...... 63

1.2.2.3.2.3 Cross-validation ...... 64

1.3 Aims and objectives of the thesis ...... 65

1.4 References ...... 66

2 Chapter Two ...... 74

pH plays a role in the mode of action of trimethoprim on Escherichia coli ...... 74

Abstract ...... 75

2.1 Introduction ...... 76


2.2 Materials and Methods ...... 78

2.2.1 Antibiotic perturbation of E. coli ...... 78

2.2.2 FT-IR spectroscopy ...... 78

2.2.3 GC-MS ...... 79

2.2.4 Metabolomics data analysis ...... 80

2.3 Results and Discussion ...... 81

2.3.1 Determination of the optimum growth conditions (media and pH levels) for E. coli K-12 ...... 81

2.3.2 Determination of the MIC of trimethoprim in E. coli K-12 ...... 82

2.3.3 Challenge of E. coli K-12 with different concentrations of trimethoprim at different pH levels ...... 82

2.3.4 Metabolic fingerprinting of E. coli K-12 using FT-IR spectroscopy ...... 86

2.3.5 Metabolic profiling of E. coli K-12 using GC-MS ...... 90

2.3.6 Phenotypic response of E. coli K-12 to varying pH and trimethoprim exposure ...... 91

2.4 Conclusion ...... 97

2.5 References ...... 99

2.6 Supplementary Information ...... 103

3 Chapter Three ...... 122

High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy ...... 122

Abstract ...... 123

3.1 Introduction ...... 124

3.2 Materials and Methods ...... 126

3.2.1 Microorganisms ...... 126

3.2.2 Media ...... 126

3.2.3 Antibiotics ...... 126

3.2.4 Growth conditions ...... 126

3.2.5 FT-IR spectroscopy ...... 127

3.2.5.1 Sample preparation ...... 127

3.2.5.1.1 Method 1 ...... 127

3.2.5.1.2 Method 2 ...... 127


3.2.5.1.3 Method 3 ...... 128

3.2.5.1.4 Method 4 ...... 128

3.2.5.1.5 Method 5 ...... 128

3.2.5.2 Instrumentation ...... 128

3.2.5.3 Data analysis ...... 129

3.3 Results and Discussion ...... 132

3.4 Conclusion ...... 138

3.5 References ...... 139

3.6 Supplementary Information ...... 142

4 Chapter Four ...... 148

A workflow for bacterial metabolic fingerprinting and lipid profiling: application to ciprofloxacin challenged Escherichia coli ...... 148

Abstract ...... 149

4.1 Introduction ...... 150

4.2 Materials and Methods ...... 152

4.2.1 Chemicals ...... 152

4.2.2 Escherichia coli isolates ...... 152

4.2.3 Preparation of E. coli inoculates ...... 152

4.2.4 Determining ciprofloxacin minimal inhibitory concentration (MIC) ...... 153

4.2.5 E. coli culture and antibiotic challenge for metabolic fingerprinting and lipid profiling ...... 153

4.2.6 E. coli sample collection and quenching for metabolic fingerprinting and lipid profiling ...... 154

4.2.7 Fourier transform infrared (FT-IR) spectroscopy ...... 154

4.2.8 Sample extraction for LC-MS lipid profiling ...... 155

4.2.9 LC-MS analysis ...... 155

4.2.10 Processing of raw LC-MS profiles and lipid identification ...... 156

4.2.11 Statistical analyses ...... 157

4.2.11.1 Principal component analysis (PCA) ...... 157

4.2.11.2 Principal component-discriminant function analysis ...... 157

4.2.11.3 Uni- and multivariate feature selection ...... 158


4.3 Results and Discussion ...... 161

4.3.1 Determination of the ciprofloxacin minimal inhibitory concentration (MIC) ...... 161

4.3.2 Fourier transform infrared (FT-IR) spectroscopy based metabolic fingerprinting ...... 162

4.3.3 Lipid profiling of pathogenic E. coli isolates before and after challenging with ciprofloxacin hydrochloride using LC-MS ...... 166

4.3.4 A wide selection of E. coli lipid species are altered by challenge with ciprofloxacin, with many showing differential regulation between sensitive and resistant E. coli isolates ...... 169

4.4 Conclusion ...... 176

4.5 References ...... 177

4.6 Supplementary Information ...... 180

5 Chapter Five ...... 182

Multiple metabolomics of uropathogenic E. coli reveal different information content in terms of metabolic potential compared to virulence factors ...... 182

Abstract ...... 183

5.1 Introduction ...... 185

5.2 Materials and Methods ...... 187

5.2.1 General chemicals ...... 187

5.2.2 Microorganisms ...... 187

5.2.3 Preparation of E. coli inoculates for metabolic fingerprinting and metabolic profiling ...... 187

5.2.4 FT-IR spectroscopy ...... 188

5.2.5 GC-MS ...... 188

5.2.6 LC-MS ...... 188

5.2.7 Data analysis ...... 189

5.3 Results and Discussion ...... 191

5.3.1 Interpretation of FT-IR spectra ...... 193

5.3.2 Interpretation of LC-MS profiles ...... 195

5.3.3 Interpretation of GC-MS profiles ...... 197

5.4 Conclusion ...... 200

5.5 References ...... 202


5.6 Supplementary Information ...... 204

6 Chapter Six ...... 216

General Discussion and Future Work ...... 216

6.1 Discussion and Future Work ...... 216

6.2 References ...... 224

Word Count: 57429 words


List of Figures


Figure 1.1: History of antibiotic development and emergence of antibiotic resistance. Adapted from Saga and Yamaguchi (2009)...... 32

Figure 1.2: How selective antimicrobial pressure affects microbes...... 34

Figure 1.3: Summary of main antibiotics mechanisms of action and bacterial mechanisms of resistance with antibiotic examples...... 37

Figure 1.4: Number of publications in the field of metabolomics since 2000...... 41

Figure 1.5: FT-IR spectrum with the main characteristic regions. (A) represents fatty acid region (3050-2800 cm-1), (B) amide region attributed to peptides and proteins (1750-1500 cm-1), (C) mixed region ascribed to carboxylic group vibrations of polysaccharides, proteins and free amino acids (1500-1250 cm-1), (D) ascribed to polysaccharides (1200-900 cm-1) and (E) fingerprint region...... 46

Figure 1.6: A schematic of the two-stage oxime silylation derivatisation reaction...... 51

Figure 2.1: (a) Growth curves of E. coli K-12 at pH 5 (dashed line) and pH 7 (solid line). For pH 5, the dashed blue line represents control samples, dashed red indicates samples challenged with 0.8 mg/L of trimethoprim added at the beginning of the lag phase (t = 0 h) and dashed green denotes samples challenged with 0.8 mg/L of trimethoprim and added at mid-exponential phase (t = 5 h). For pH 7, the solid purple line represents control samples, solid light blue indicates samples challenged with 0.8 mg/L of trimethoprim added at the beginning of the lag phase (t = 0 h) and solid orange denotes samples challenged with 0.8 mg/L of trimethoprim and added at the exponential phase (t = 5 h). (b) Column chart representing relative E. coli intracellular levels of trimethoprim after challenging with 0.8 mg/L of the drug at pH 5 (red columns) and pH 7 (blue columns) at different growth stages (time = 0 and 5 h) as detected by LC-MS analysis...... 85


Figure 2.2: (a) PCA scores plot of PC1 vs. PC2 after CO2 removal around 2350 cm-1 and EMSC scaling. The total explained variance (TEV) of PC1 is 78.9% and for PC2 is 12.8%. (b) PC-DFA score plots of pH 5 and 7 samples. 20 PCs were extracted from PCA and used as inputs to DFA. These 20 PCs explain 99% of TEV; the legend in the plot shows the 95% confidence interval (CI) for the correct classification of the eight conditions. C, control...... 87

Figure 2.3: FT-IR multi-block PCA score plots showing the relationship between the effect of different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) and that of different pH levels. Block scores plots showing the distribution of samples with different concentrations at (a) pH 5 and (b) pH 7...... 89

Figure 2.4: PC-DFA score plots of GC-MS profiles (25 PCs were extracted from PCA and used as inputs to DFA, explaining 99% of the TEV). The legend in the figure shows the 95% CI for the correct classification of the 8 conditions. Significantly altered metabolites were mined through a combination of PC-DFA loadings and univariate significance testing (Student t-test). C, control...... 90

Figure 2.5: Metabolic effects of trimethoprim challenge on E. coli K-12 at pH 5 and pH 7. When partially ionised at pH 7, trimethoprim is seen to impact on directly associated with the dihydrofolate pathway, as well as off-target effects upon nucleotide, sugar and amino acid metabolism, glycolysis, the TCA cycle, and up-regulation of osmoprotectants. When trimethoprim is in a poorly ionised state (pH 5), it appears to have a profound effect upon the up-regulation of amino acid metabolism...... 96

Figure S 2.1: General scheme of sample preparation including (1) analysis by Bioscreen to determine the MIC of trimethoprim and produce the growth curves of E. coli K-12 at pH 5 and 7 with and without drug challenge. (2) FT-IR analysis of samples after washing with normal saline. (3) GC-MS analysis of samples after quenching and extraction using 60% and 80% cold (-48 ºC) methanol respectively...... 110


Figure S 2.2: General scheme of sample preparation including (1) analysis by Bioscreen to produce the growth curves of E. coli K-12 at pH 5 and 7 after challenge with 0.8 mg/L of trimethoprim added at two time points: (I) at the beginning of the lag phase (time = 0 h) and (II) at the mid-exponential phase (time = 5 h). (2) LC-MS analysis of sample extracts, for relative quantification of the intracellular drug levels after quenching and extraction using 60% and 80% cold (-48 ºC) methanol respectively...... 114

Figure S 2.3: Calibration curve of trimethoprim concentration built from 20 different gradient concentrations of trimethoprim...... 115

Figure S 2.4: Growth curves of E. coli K-12 in three different media. The blue plot indicates LB; red NB and green Ψ...... 116

Figure S 2.5: Growth curves of E. coli K-12 at four different pH values in the same medium (LB). The blue plot indicates pH 3; red pH5; green pH 7 and purple pH 9...... 116

Figure S 2.6: Optical microscopic image (×100 magnification) of E. coli K-12 inoculated in LB medium at different pH levels. (a) pH 5; (b) pH 7; (c) pH 9...... 117

Figure S 2.7: Growth curves of E. coli K-12 (at pH 7 in LB) exposed to different concentrations of trimethoprim. Blue indicates control samples (0 mg/L); red 8 mg/L; green 2 mg/L; purple 0.3 mg/L; turquoise 0.2 mg/L; orange 0.03 mg/L and light blue 0.003 mg/L...... 117

Figure S 2.8: (a) Chemical structure of trimethoprim (blue circles show the main ionisation points on the structure in acidic media). Growth curves of E. coli K-12 exposed to different concentrations of trimethoprim. Blue indicates control samples (0 mg/L); red 0.003 mg/L; green 0.03 mg/L and purple 0.2 mg/L at (b) pH 9, (c) pH 7 and (d) at pH 5...... 118

Figure S 2.9: FT-IR spectra obtained from E. coli K-12. (a) After exposure to four concentrations of trimethoprim (0.2, 0.03, 0.003 and 0 mg/L) at three different pH values (pH 5, 7 and 9). There were six biological replicates for each condition; each replicate was analysed three times, totalling 18 spectra for each condition (total number of spectra = 216). (b) After exposure to different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) at pH 9 (total number of spectra = 72). (c) After exposure to different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) at pH 7 (total number of spectra = 72). (d) After exposure to different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) at pH 5 (total number of spectra = 72)...... 118

Figure S 2.10: (a) FT-IR spectra obtained from E. coli K-12 after exposure to four concentrations of trimethoprim (0.2, 0.03, 0.003 and 0 mg/L) at two different pH values (5 and 7). (b) FT-IR spectra after CO2 removal at ≈ 2350 cm-1 and EMSC scaling (Martens et al., 2003)...... 119

Figure S 2.11: FT-IR multi-block PCA scores plot showing the distribution of samples with different pH levels at different drug concentrations: (a) 0 mg/L, (b) 0.003 mg/L, (c) 0.03 mg/L and (d) 0.2 mg/L...... 119

Figure S 2.12: KEGG metabolic pathway of E. coli K-12 MG1655 highlighting significant metabolites with their relative levels subjected to different concentrations of trimethoprim at different pH levels (depicted in Materials and Methods). For significantly changed metabolites between pH and drug levels see Table S 2.2. .... 120

Figure 3.1: (a) Raw FT-IR spectra obtained from isolates 152, 160, 161, 162, 163, 164, 169, 171, 173 and 191 before washing the bacterial cells; the spectra are offset in the Y-axis for ease of visualisation. (b) PC-DFA scores plot of non-washed samples (30 PCs were extracted from PCA and used as inputs to DFA; these 30 PCs explain 99.97% of the total explained variance (TEV)); the legend beside the plot shows the 95% confidence interval (CI) for the 10 bacterial isolates estimated from the DFA model validation over 1000 independent bootstrap cross-validations...... 134

Figure 3.2: PC-DFA scores plot of non-washed method for (a) ST131 and (b) non-ST131 isolates. In both cases, 30 PCs were extracted from PCA and used as inputs to DFA. Those 30 PCs explain 99.98% of the data variance on both data sets ST131 and non-ST131. CI, confidence interval...... 136


Figure S 3.1: Typical growth curves of 10 pathogenic isolates of E. coli showing different susceptibilities to ciprofloxacin...... 142

Figure S 3.2: Comparison of four different methods of sample preparation for FT-IR analysis: (1) after washing with saline and (2) directly from the flask. From the Bioscreen plate (3) directly and (4) after washing...... 143

Figure S 3.3: A comparison of three different methods of sample preparation for FT-IR analysis: (3) directly from Bioscreen plate, (4) after washing and (5) supernatant...... 144

Figure S 3.4: PCA scores plots of PC1 vs. PC2 after CO2 removal at ca. 2350 cm-1 followed by EMSC scaling. (a) Samples cultured in Bioscreen plate and not washed; the total explained variance (TEV) of PC1 is 82% and of PC2 12.7%. (b) Samples from a Bioscreen and washed (TEV of PC1 69.2% and of PC2 14.9%). (c) Samples cultured in flasks and not washed (TEV of PC1 90.6% and of PC2 5.6%). (d) Samples grown in flasks and washed (TEV of PC1 78.3% and of PC2 11.7%). Circles represent isolate 162; diamonds represent isolate 163...... 145

Figure S 3.5: PCA score plot of the full data of isolates 162 and 163, cultured under different conditions. PC1 vs. PC2 after CO2 removal around 2350 cm-1 and EMSC scaling. TEV of PC1 is 95% and for PC2 2.9%. Closed symbols denote washed samples and open symbols non-washed samples. Isolate 162 is represented by circles and isolate 163 by diamonds. Red symbols, shaking flasks; blue symbols, Bioscreen plates...... 145

Figure S 3.6: (a) Raw FT-IR spectra obtained from supernatants of isolates 152, 160, 161, 162, 163, 164, 169, 171, 173 and 191; the spectra are offset in the Y-axis for ease of visualisation. (b) PC-DFA scores plots of supernatant samples (50 PCs were extracted from PCA and used as inputs to DFA; these 50 PCs explain 99.98% of the total explained variance (TEV)); the legend beside the plot shows the 95% confidence interval (CI) for the 10 bacterial isolates estimated from the DFA model validation over 1000 independent bootstrap cross-validations...... 146


Figure S 3.7: (a) Raw FT-IR spectra obtained from isolates 152, 160, 161, 162, 163, 164, 169, 171, 173 and 191 after washing; the spectra are offset in the Y-axis for ease of visualisation. (b) PC-DFA scores plots of washed samples (50 PCs were extracted from PCA and used as inputs to DFA; these 50 PCs explain 99.73% of the total explained variance (TEV)); the legend beside the plot shows the 95% confidence interval (CI) for the 10 bacterial isolates estimated from the DFA model validation over 1000 independent bootstrap cross-validations...... 147

Figure 4.1: Schematic of sample preparation including: (1) Bioscreen analysis to determine the MIC of ciprofloxacin and produce the growth curves of pathogenic E. coli isolates; (2) FT-IR analysis of samples; (3) LC-MS analysis of samples after quenching with cold (-48 ºC) methanol and extraction with (1:1) methanol:chloroform...... 160

Figure 4.2: OD 600 nm growth curves collected during 0-18 h post antibiotic challenge to establish minimal inhibitory concentrations (MICs) of ciprofloxacin hydrochloride against E. coli ST131 sequence isolates: (a) 160 (resistant) and (b) 173 (sensitive). Non-ST131 sequence isolates include: (c) 161 (resistant) and (d) 171 (intermediate resistant). (e) Table of predicted MIC (mg/L ciprofloxacin hydrochloride)...... 162

Figure 4.3: FT-IR spectra analysed using principal component-discriminant function analysis (PC-DFA) of E. coli challenged for 18 h with ciprofloxacin and control bacteria. (a) PC-DFA scores plot (DF1 vs. DF2). PC-DFA loadings plots derived by comparisons of ciprofloxacin challenged versus control samples for each respective isolate: (b) ST131 sequence isolate 160 (resistant); (c) ST131 sequence isolate 173 (sensitive); (d) non-ST131 sequence isolate 161 (resistant); (e) non-ST131 sequence isolate 171 (intermediate resistance). C, control; CI, confidence interval...... 165

Figure 4.4: PC-DFA scores plot of LC-MS data from E. coli challenged for 18 h with ciprofloxacin and control bacteria: (a) PC-DFA scores plot (DF1 vs. DF2) of LC-MS-ve mode data; (b) PC-DFA scores plot (DF1 vs. DF2) of LC-MS+ve mode data. QC, quality control; C, control; CI, confidence interval...... 168


Figure 4.5: LC-MS trend plots of lipids significantly altered in response to ciprofloxacin. Lipids specifically regulated in susceptible or resistant isolates: (a) significantly up-regulated in susceptible isolates; (b) significantly down-regulated in susceptible isolates; (c) significantly down-regulated in resistant isolates. Values plotted are the mean m/z lipid peak areas and error bars represent the standard error within the non-averaged data for each experimental class. Lipid identification was conducted as in Brown et al. (2011), see Tables S 4.2, S 4.3, S 4.4...... 171

Figure 4.6: LC-MS trend plots of lipids significantly altered in response to ciprofloxacin: lipids differentially regulated between susceptible and resistant isolates: (a) up-regulated in susceptible isolates and down-regulated in resistant isolates; (b) up-regulated in resistant isolates and down-regulated in susceptible isolates. Values plotted are the mean m/z lipid peak areas and error bars represent the standard error within the non-averaged data for each experimental class. Lipid identification was conducted as in Brown et al. (2011), see Tables S 4.2, S 4.3, S 4.4...... 173

Figure S 4.1: LC-MS trend plots of lipids significantly altered in response to ciprofloxacin. Common lipid responses between susceptible and resistant isolates showing lipids that are either (a) up-regulated or (b) down-regulated. Values plotted are the mean m/z lipid peak areas and error bars represent the standard error within the non-averaged data for each experimental class. Lipid identification was conducted as in Brown et al. (2011), see Tables S 4.2, S 4.3, S 4.4...... 181

Figure 5.1: Venn diagram-like plot showing the overall clustering congruence between the four analytical approaches and the two microbiological tests. See text for explanation of its construction...... 193

Figure 5.2: Superimposed scatter plots of PCoA scores of the first two components of the VF tests and Procrustean-transformed FT-IR spectra...... 195


Figure 5.3: Superimposed scatter plots of PCoA scores of the first two components of the VF tests and Procrustean-transformed LC-MS negative mode data...... 197

Figure 5.4: Superimposed scatter plots of PCoA scores of the first two components of the metabolic tests and Procrustean-transformed GC-MS data...... 198

Figure 5.5: Box-whisker plots for each isolate demonstrating the concentration level of candidate intracellular metabolites from (a) variable 17 (leucine), (b) variable 49 (unknown), and (c) phosphate...... 199

Figure S 5.1: General scheme of sample preparation approach used including: (1) FT-IR analysis of samples directly from the culture; (2) LC-MS; and (3) GC-MS analysis of samples after quenching and extraction using 60% and 80% cold (-48 ºC) methanol, respectively...... 209

Figure S 5.2: Superimposed scatter plots of PCoA scores of the first two components of the metabolic test and Procrustean transformed LC-MS positive mode data...... 214

Figure S 5.3: Rotated loading plot of GC-MS data showing the highly correlated metabolites, which were associated with metabolic profiling data...... 214

Figure 6.1: A summary of the development of a workflow (Chapters 3, 4 and 5) for the discrimination between different pathogenic E. coli isolates...... 223


List of Tables


Table 1.1: Classification of antibiotics depending on mechanism of action and chemical structure. Table adapted from Mozayani and Raymon (2003)...... 29

Table 1.2: Effect of the emergence of resistance on prescription of antibiotics. Adapted from Finch (2004)...... 33

Table 1.3: Classification of metabolomic approaches...... 40

Table S 2.1: List of metabolites detected by GC-MS after extraction from control and stressed E. coli K-12 with trimethoprim at two different pH levels (5 and 7). See additional information on the attached CD...... 120

Table S 2.2: ANOVA analysis of the significantly changed metabolites between different conditions (attached CD)...... 120

Table 3.1: MIC of ciprofloxacin hydrochloride for pathogenic isolates of E. coli which have different sequence types (ST131 and non-ST131) with different susceptibilities to quinolones ...... 132

Table S 4.1: Normalised sample reconstitution volumes used prior to LC-MS analysis. C, control...... 180

Table S 4.2: LC-MS lipidomics table of class averages and within class standard error, PLS co-efficient, ANOVA p-value, and putative metabolite identifications for LC-MS+ve and LC-MS-ve mode data. See additional information on the attached CD...... 181

Table S 4.3: LC-MS+ve mode data set of the top 50 PLS coefficients for control compared to ciprofloxacin challenge for each respective isolate. See additional information on the attached CD...... 181


Table S 4.4: LC-MS-ve mode data set of the top 50 PLS coefficients for control compared to ciprofloxacin challenge for each respective isolate. See additional information on the attached CD...... 181

Table S 5.1: Genetic backgrounds mediating quinolone resistance in the ST131 uropathogenic E. coli (UPEC) isolates used in this study...... 210

Table S 5.2: Virulence factors for the Escherichia coli isolates used in this study...... 211

Table S 5.3: Results of metabolic/biochemical profiling of the Escherichia coli isolates used in this study...... 212

Table 5.1: The Procrustes errors with the associated p-values of the pair-wise comparisons...... 193

Table 6.1: A summary of metabolite level changes in E. coli K-12 upon challenge with trimethoprim at either pH 5 or 7...... 219


Abstract

The University of Manchester

Haitham AlRabiah

Doctor of Philosophy

Advanced metabolomics for the discrimination of uropathogenic Escherichia coli and their response to antibiotics

2013

In recent years, the role of metabolomics has become increasingly important in the advancement of many research fields, including medical studies. Owing to the lack of metabolomics research in the area of infectious disease and the rise in antibiotic resistance, there is a need for further studies on the modes of antibiotic action and the mechanisms of resistance of pathogenic microorganisms at the metabolome level. This study aimed to investigate the effects of DNA synthesis inhibitors on the metabolome of E. coli and to develop a workflow for discrimination between E. coli isolates down to the sub-species level using a variety of methods, which can inform the choice of analytical techniques in metabolomics research.

A metabolomics-based approach was used to elucidate metabolic alterations in E. coli K-12 upon challenge with trimethoprim at two pH levels (5 and 7) which mimic human urine acidity. FT-IR spectroscopy was used in a preliminary experiment to produce bacterial fingerprints and GC-MS was applied to generate global metabolic profiles in each condition. At pH 7, as the drug molecules exhibited higher permeability, stronger direct effects of the antibiotic were observed, i.e. decreased levels of nucleotides. Trehalose, an osmoprotectant, was up-regulated in these stress conditions and this up-regulation was mirrored by a decrease in glucose levels. This also correlated with up-regulation of pyruvate-related products (e.g. alanine, citrate and malate). Other off-target related effects were observed, such as alterations in the levels of various amino acids upon trimethoprim challenge. This study offered a wider view of drug action at pH levels similar to healthy human urine.

A high-throughput FT-IR spectroscopy method was developed to discriminate between pathogenic E. coli isolates based on sequence type. This method employed a Bioscreen as a micro-culture incubator instead of traditional sample preparation (shaking flasks), which can be labour intensive and time consuming. Excluding the washing step in the protocol enabled discrimination between isolates of different sequence types. Moreover, a reproducible workflow of lipid analysis based on LC-MS was developed and applied to four pathogenic isolates with different sequence type and susceptibility to ciprofloxacin. This workflow enabled detection of a wide range of lipid classes and determination of significant alterations in lipid levels related to susceptibility to ciprofloxacin. Stressed and control isolates were also analysed using the developed Bioscreen FT-IR approach to assess phenotypic fingerprint differences, which were in line with the LC-MS-ve class distribution.

Further investigation using four analytical platforms (FT-IR, GC-MS, LC-MS-ve and LC-MS+ve) was carried out on E. coli ST131 isolates characterised using classical microbiological tests (virulence factors and metabolic tests). Procrustes transformation was used to compare the analytical methods and the microbiological tests in terms of their capacity to discriminate between the different isolates. As indicated above, the results from FT-IR and LC-MS-ve were comparable and in line with virulence tests, while GC-MS and metabolic tests were in agreement. Complementary information generated by different analytical techniques and microbiological tests may indicate the requirement for careful selection of the method of investigation and may suggest the need to continue using a combination of methods which are applied to study different features of bacterial physiology.


Declaration


No portion of the work referred to in this thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.


Copyright Statement


i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=487), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on Presentation of Theses.


Dedication


To my beloved Mum and Dad, my cherished Grandma, my lovely Sisters, and my delightful fiancee


Acknowledgements


I would like, first and foremost, to thank and express my utmost gratitude to Almighty God for all the blessings He bestowed upon me. I would also like to express my sincere appreciation and immense gratitude to my supervisor, Prof. Roy Goodacre, for his guidance and support throughout this study. Your assistance was indispensable and your patience has made completion of this project possible. Thank you ever so much.

I would like to express my thanks to Dr. Will Allwood for his valuable assistance and support in this project as well as the good times we shared during my stay in the UK. Special thanks go to the members of Roy’s group past and present; Dr. David Ellis, Dr. Yun Xu, Dr. Catherine Winder, Dr. Nicoletta Nicolaou-Markide, Dr. Elon Correa, Dr. Andrew Vaughan, Dr. Yankuba Kassama, Dr. Nik Rattray, Dr. Kat Hollywood and Dr. Drupad Trivedi, and my colleagues, Omar Alharbi, Najla Almasoud, Howbeer Ali, Ali Sayqal, Abdu Subaihi and Piotr Gromski.

I would also like to express my sincere thanks to my best friend and my flat mate who helped me through the hard times and made my stay very pleasant, Dr. Aldulaziz Al-Malik. Thank you very much. Special thanks also go to Dr. Brahim Achour for his friendship and the pleasant times we shared.

Last but not least, I would like to express my appreciation, gratitude and thanks to my family; my mum, Naifah, who has suffered to give me happiness and has supported me with her prayers, my dad, Khalid, who is my support and my role model, my grandmother, Hesah, for her warmth, kindness and continuous prayers despite her illness, my sisters, Rawa, Haifa, Deema and Lama, for their emotional support and prayers. Also, my thanks go to my uncles and aunts, especially Tamim, Khalid, Fahad, Al-Jowharah, Monirah, Azizah, Hadiah and Nadrah, and my cousins, especially Rowaidah and Reem. Also, a special thank you to Ahmed Al-Faris for his immense academic and emotional support for me during my studies.

And lovingly, my immense gratitude and thanks go to my fiancee with love and appreciation.

Abbreviations and Acronyms

χ² Chi-square
-ve Negative
+ve Positive
Ψ Phage broth
ANN Artificial neural network
ANOVA Analysis of variance
APCI Atmospheric pressure chemical ionisation
APPI Atmospheric pressure photoionisation
ATR Attenuated total reflectance
BLNAR β-lactamase negative amoxicillin resistant
CE Capillary electrophoresis
CI Chemical ionisation
CI Confidence interval
CID Collision induced dissociation
CoA Coenzyme A
CPCA Consensus principal component analysis
dCTP Deoxycytidine 5'-triphosphate
DFA Discriminant function analysis
DGs Diacylglycerides
DHF Dihydrofolate
DHFR Dihydrofolate reductase
DIMS Direct infusion mass spectrometry
DNA Deoxyribonucleic acid
DTGS Deuterated triglycine sulphate
dTMP Deoxythymidine 5'-monophosphate
dTTP Deoxythymidine 5'-triphosphate
dUMP Deoxyuridine 5'-monophosphate
dUTP Deoxyuridine 5'-triphosphate
EI Electron ionisation
EMSC Extended multiplicative signal correction
ESBL Extended spectrum β-lactamase
ESI Electrospray ionisation
FP-γ-GS Folylpoly-γ-glutamate synthetase
FT-ICR Fourier transform ion cyclotron resonance
FT-IR Fourier transform infrared
FWHM Full width at half maximum
GC Gas chromatography
GMD Golm Metabolome Database
GTP Guanosine 5'-triphosphate
HCA Hierarchical cluster analysis
HCD Higher-energy collision dissociation

HILIC Hydrophilic interaction liquid chromatography
HPLC High performance liquid chromatography
HTP High-throughput
HTS High-throughput screening
IR Infrared
KEGG Kyoto Encyclopedia of Genes and Genomes
KPC Klebsiella pneumoniae carbapenemase
LB Lysogeny broth
LC Liquid chromatography
LTQ Linear trap quadrupole
m/z Mass-to-charge ratio
MB-PCA Multi-block principal component analysis
MCA Metabolic control analysis
MCP Micro-channel plate
MDR Multidrug resistance
MDRP Multidrug resistant Pseudomonas aeruginosa
MIC Minimum inhibitory concentration
MIR Mid infrared region
mRNA Messenger ribonucleic acid
MRSA Methicillin resistant Staphylococcus aureus
MS Mass spectrometry
MSI Metabolomics standards initiative
MSSA Methicillin sensitive Staphylococcus aureus
MSTFA N-methyl-N-trimethylsilyl-trifluoroacetamide
MT Metabolic test
MVA Multivariate analysis
NA Nutrient agar
NAD Nicotinamide adenine dinucleotide
NADP Nicotinamide adenine dinucleotide phosphate
NB Nutrient broth
NIST National Institute of Standards and Technology
NMR Nuclear magnetic resonance
oaTOF Orthogonal acceleration time of flight
OD Optical density
p Probability
pABA para-aminobenzoic acid
PA Phosphatidic acid
PBP2a Penicillin binding proteins 2a
PC Principal component
PCA Principal component analysis
PC-DFA Principal component-discriminant function analysis
PCoA Principal coordinate analysis
PC Phosphatidylcholine
PE Phosphatidylethanolamine

PG Phosphatidylglycerol
PISP Penicillin intermediately resistant Streptococcus pneumoniae
PLS Partial least squares
ppGpp Guanosine 5'-diphosphate 3'-diphosphate
ppm Parts per million
PRSP Penicillin resistant Streptococcus pneumoniae
psi Pounds per square inch
PS Phosphatidylserine
QC Quality control
QIT Quadrupole ion trap
reTOF Reflectron time of flight
RI Retention index
RP Reversed phase
rpm Revolutions per minute
RSD Relative standard deviation
RT Retention time
S Svedberg
S/N Signal to noise ratio
SDS Sodium dodecyl sulphate
ST Sequence type
TBS T-butyldimethylsilylation
TCA Tricarboxylic acid
TDC Time to digital converter
TEV Total explained variance
THF Tetrahydrofolate
THF (glu)n THF-polyglutamate
TMS Trimethylsilylation
TOF Time of flight
UDP Uridine 5'-diphosphate
UHPLC Ultra high performance liquid chromatography
UPEC Uropathogenic Escherichia coli
UTI Urinary tract infection
VF Virulence factor
VRE Vancomycin resistant enterococci

Preface

Metabolomics is a recently introduced field that has become an important tool in advancing research in multiple areas including environmental science, agriculture and medical research. The study presented in this thesis was carried out to further the application of metabolomics to the field of infectious disease and in particular urinary tract infections. The focus of this research was to advance understanding of trimethoprim's mode of action in environments similar to urine and to develop further methods to discriminate between different pathogenic E. coli isolates based on sequence type and susceptibility to the antibiotic ciprofloxacin. During this investigation, the outcomes of different analytical methods and microbiological tests were directly compared to inform the best choice of methodology. The metabolomics techniques used in this thesis are FT-IR spectroscopy, GC-MS and RPLC-MS in both positive and negative ionisation modes. These methods provided very rich data which were used to address the objectives of this study. While the work presented here focused only on E. coli and the use of only two antibiotics (trimethoprim and ciprofloxacin), it can be readily extended to study other infectious microorganisms and other antimicrobial agents.

This thesis is arranged in six chapters: a general introduction to antibiotics and the field of metabolomics, four results chapters, and a concluding discussion and future work chapter. Chapter 3 has been published, and Chapters 4 and 5 have been submitted. As is the case with any publication, the results chapters constitute the outcome of collaboration with colleagues, and the collaborators are credited for their contribution at the beginning of each chapter.

The research work presented in this thesis was a challenge which was found to be highly educational and most enjoyable. I managed to acquire valuable skills and experience most of the aspects of scientific research which will undoubtedly be employed in my future academic career. Finally, I believe this thesis has contributed, to some extent, to the advancement of metabolomics research and has provided further insights into the field of infectious disease; however, I am hopeful that in the future other researchers can utilise the findings of this work and further expand and build on it.

Haitham AlRabiah

Chapter One

1 Chapter One

General Introduction

1.1 Antibiotics

1.1.1 Overview

The fight against infectious diseases is as old as mankind itself. In the days when etiological agents of infectious diseases had not yet been identified, progress had to rely mostly on chance as well as empirical observation. Natural products were the earliest therapeutic remedies against infectious diseases. For example, infected lesions could be treated by applying honey topically. More recent examples are quinine and emetine, which were introduced into Europe in the 17th century. These two natural compounds are active against protozoa and both remain in use today (Greenwood, 2000).

In 1857, Louis Pasteur, a French chemist, was the first to document cause and effect relationships between certain microorganisms and biochemical processes when he studied ‘diseases of wine’. He demonstrated that the type of microorganism introduced to the grape juice prior to the fermentation process affected the quality of the fermentation product. The first cause and effect relationship between a specific microorganism, Bacillus anthracis, and a specific disease, anthrax, was proved in 1876 by the German physician Robert Koch. Over 20 years, Koch and his students developed laboratory techniques and a variety of materials and methods for the in vitro cultivation of microorganisms, which eventually led to the establishment of the cause and effect relationship between many bacteria and human diseases (Cloutier, 1995).

Accordingly, the main therapeutic strategy was to produce agents that act quickly and specifically and have no serious effect on the host. These agents were referred to as ‘Magic Bullets’ by Paul Ehrlich, a German physician who synthesised the world's first antimicrobial, ‘Salvarsan’, a remedy for syphilis, in 1910 (Saga and Yamaguchi, 2009; Cloutier, 1995). The greatest advance in the history of the fight against infectious disease was the discovery of penicillin by Alexander Fleming in 1928. Whilst studying Staphylococcus aureus culture dishes, he noticed that a fungus grew in the shape of a ring and that the area surrounding the ring appeared to be free of the bacteria. Penicillin entered clinical use in the 1940s and saved the lives of many wounded soldiers in the Second World War. The subsequent two decades were the golden era of antimicrobial chemotherapy, when many agents were developed in succession (e.g. streptomycin, chloramphenicol, tetracycline, macrolide, vancomycin, nalidixic acid and quinolone) (Saga and Yamaguchi, 2009).

1.1.2 Definition

The literal meaning of the term antibiotic is ‘against life’. In 1889, Vuillemin coined the term ‘antibiosis’ to indicate antagonism between creatures (O'Grady and Lambert, 1997). In 1942, Waksman was the first to use the noun antibiotic and he defined it as “chemical substances that are produced by microorganisms and that have the capacity, in dilute solution, to selectively inhibit the growth of or even to destroy other microorganisms” (Waksman and Lechevalier, 1962). The first part of the definition excludes antimicrobial substances produced by sources other than microorganisms, such as gastric juice and antibodies, whilst the last part excludes metabolite products so that the term describes antibiotics precisely (O'Grady and Lambert, 1997).

1.1.3 Major classes of antibiotics

Antibiotics can be natural, semi-synthetic or synthetic products. These products can be divided into two main categories: bactericidal (killing the bacteria, as with penicillin) or bacteriostatic (stopping the growth of bacteria, as with chloramphenicol) (Walsh, 2003). In some instances, the two categories overlap; for example, some antibiotics might start out as bacteriostatic, but as the dose is increased, they could become bactericidal (Walsh, 2003; Schlegel and Zaborosch, 1993).

Antibiotics can also be classified in other ways, such as by economic impact, bacterial disease and clinical effectiveness (Walsh, 2003). The most common classification is either by chemical class (Berdy, 1974) or by mode of action (Cuddy, 1997; Fish et al., 1995), as explained in Table 1.1.

Table 1.1: Classification of antibiotics depending on mechanism of action and chemical structure. Table adapted from Mozayani and Raymon (2003).

Mechanism of action | Chemical structure | Examples
Inhibitors of bacterial cell wall synthesis | β-Lactams | Penicillins, cephalosporins, vancomycin, cycloserine, bacitracin
Inhibitors of bacterial protein synthesis: bacteriostatic. Affect function of 30S and 50S ribosomal subunits, causing a reversible inhibition of protein synthesis | Macrolides, tetracyclines | Macrolides (e.g. erythromycin, azithromycin); tetracyclines (e.g. tetracycline, doxycycline)
Inhibitors of bacterial protein synthesis: bactericidal. Binding to 30S ribosomal subunit, altering protein synthesis and leading to bacterial cell death | Aminoglycosides | Gentamicin, tobramycin, streptomycin
Inhibitors of nucleic acid synthesis: directly, via inhibition of topoisomerases (quinolones) or inhibition of polymerases (rifamycins) | Quinolones, rifamycins | Quinolones (e.g. nalidixic acid, ciprofloxacin); rifamycins (e.g. rifampin, rifabutin)
Inhibitors of nucleic acid synthesis: indirectly, via inhibition of dihydropteroate synthetase (sulfonamides) or inhibition of dihydrofolate reductase (diaminopyrimidines) | Sulfonamides, diaminopyrimidines | Sulfonamides (e.g. ); diaminopyrimidines (e.g. trimethoprim)
Agents affecting membrane function | Polymyxins | Polymyxin B, colistin

1.1.3.1 Modes of action

Characteristic differences between prokaryotic and eukaryotic cells provide bacterial cells with specific targets for antibiotics (Walsh, 2003). In order to understand the mechanism of action of antibiotics, four major targets have been studied: cell wall biosynthesis, protein biosynthesis, nucleic acid synthesis and the bacterial cell membrane (O'Grady and Lambert, 1997).

1.1.3.1.1 Inhibitors of bacterial cell wall synthesis

The cell walls of Gram-negative and Gram-positive bacteria are fundamentally different in many aspects. However, both contain cross-linked chains of peptidoglycan (also known as murein or mucopeptide), which, with other cell wall components, form a rigid, shape-maintaining layer and a structural barrier against osmotic pressure that could otherwise kill the bacteria. Any disturbance of the biosynthesis of these components destroys the rigidity of the bacterial cell wall, resulting in bacterial death. Antibiotics such as penicillins, vancomycin, cephalosporins, fosfomycin, bacitracin and teicoplanin work via this mechanism, mainly by inhibiting the enzymes involved in the cross-linking and assembly of peptidoglycan or by sequestering their substrates (Walsh, 2003; O'Grady and Lambert, 1997).

1.1.3.1.2 Inhibitors of bacterial protein synthesis

A ribosome, which is responsible for protein synthesis, is composed of two subunits, and each subunit is composed of ribosomal RNA and various proteins (O'Grady and Lambert, 1997). The bacterial ribosome differs from the eukaryotic ribosome in both RNA and protein content. When subjected to ultracentrifugation, mammalian (eukaryotic) ribosomes give a sedimentation coefficient of 80S (Svedberg units) and dissociate into 60S and 40S subunits. On the other hand, bacterial ribosomes display a 70S sedimentation coefficient and are composed of 50S and 30S subunits. Many antibiotics inhibit bacterial growth by interfering with ribosomal protein biosynthesis. They bind to specific ribosomal subunits directly or indirectly and inhibit protein synthesis, which eventually leads to either a bacteriostatic (e.g. chloramphenicol, macrolides and tetracycline) or a bactericidal (e.g. aminoglycosides) effect. Although a few members of this class, such as tetracycline, attack eukaryotic ribosomes, most of the others do not bind to 80S mammalian ribosomes, which accounts for their selective toxicity (O'Grady and Lambert, 1997; Greenwood, 2000).

1.1.3.1.3 Inhibitors of nucleic acid synthesis

Antibiotics can inhibit nucleic acid synthesis at several levels, both directly and indirectly. Quinolones, for example, act directly on DNA synthesis by interfering with topoisomerase enzymes, which maintain the DNA helix in its optimal supercoiling state during transcription and replication. Quinolones bind to topoisomerases and cause a conformational change in their structure, which prevents re-ligation of DNA strands during DNA replication. These antibiotics have two targets, topoisomerase IV and topoisomerase II (DNA gyrase). Generally, in Gram-positive bacteria topoisomerase IV is the main target, while in Gram-negative bacteria DNA gyrase is the main site of action (Greenwood, 2000).

Sulfonamides and diaminopyrimidines inhibit DNA synthesis indirectly. They act on the folic acid synthesis pathway at different steps, which allows them to be used synergistically at lower concentrations. Sulfonamides, such as sulfamethoxazole, which is an analogue of para-aminobenzoic acid, act at an early stage of the pathway by inhibiting the condensation of dihydropteridine with para-aminobenzoic acid to produce dihydropteroic acid, while diaminopyrimidines act at a later stage by inhibiting dihydrofolate reductase, the enzyme which reduces dihydrofolate to generate tetrahydrofolate (see Figure 2.5 for details). Trimethoprim is an example of a diaminopyrimidine antibiotic which demonstrates high selectivity in inhibiting bacterial dihydrofolate reductase compared with its mammalian equivalent (Greenwood, 2000; O'Grady and Lambert, 1997).

Rifamycins are another class of antibiotics that inhibit nucleic acid biosynthesis. They bind to DNA-dependent RNA polymerase (β-subunit), and as a result they interfere with mRNA formation. Rifampicin is one of the most effective members of this class, which has high selectivity as it acts on DNA dependent RNA polymerase of bacteria and does not affect mammalian enzymes (Greenwood, 2000; Erlich, 1973).

1.1.3.1.4 Agents affecting membrane function

There are some antibiotics which act on the bacterial cell membrane. Tyrocidine and gramicidin are examples of topically used antibiotics that lack selectivity and attack both bacterial and mammalian cells. Polymyxins, such as colistin, are another example of antibiotics which use this mechanism but with a higher level of selectivity. These antibiotics use the same mechanism as cationic detergents, as they act on the phospholipid membrane, targeting the exposed phosphate group and resulting in disruption of the cell membrane, leakage of cytoplasmic content and ultimately cell death (O'Grady and Lambert, 1997; Greenwood, 2000).

1.1.4 Antibiotic resistance

When antibiotics were first used clinically it became generally believed that bacterial disease was no longer a threat to public health. However, this optimistic view diminished rapidly with the emergence of antibiotic resistance. For major classes of antibiotics, this mostly occurred within four years of clinical use (Wax et al., 2001). Ehrlich stated that “drug resistance follows the drug like a faithful shadow” (van der Goot, 2002) and this can be clearly seen in Figure 1.1.

[Figure 1.1 image: timeline from 1928 to the 2000s with two columns, ‘Development of antimicrobial agents’ and ‘Emergence of drug-resistant bacteria’.]

Figure 1.1: History of antibiotic development and emergence of antibiotic resistance. Adapted from Saga and Yamaguchi (2009). MRSA: Methicillin resistant Staphylococcus aureus, PISP: Penicillin intermediately resistant Streptococcus pneumoniae, BLNAR: β-lactamase negative amoxicillin resistant, PRSP: Penicillin resistant Streptococcus pneumoniae, ESBL: Extended-spectrum β-lactamase, VRE: Vancomycin resistant enterococci, MDRP: multidrug resistant Pseudomonas aeruginosa.

Initially, antibiotic resistance was easily overcome by increasing the dose of the antibiotic used to treat the infection, but that rapidly became an ineffective strategy (Cloutier, 1995). Prescribing alternative antibiotics was the obvious solution; however, resistance appeared again and microorganisms became multidrug resistant (Table 1.2). The first cases of multidrug resistance (MDR) were detected between the late 1950s and the early 1960s in enteric bacteria, namely Escherichia coli, Salmonella and Shigella. One example is Shigella dysenteriae in South America, which developed resistance to tetracycline, chloramphenicol and streptomycin (Levy and Marshall, 2004; Levy, 2001). Some multidrug resistant strains can require up to six antibiotics for effective treatment, which is a real sign of increased risk of fatality in humans (Iseman, 1993).

Table 1.2: Effect of the emergence of resistance on prescription of antibiotics. Adapted from Finch (2004).

Infection/disease | Change in recommended therapy
Urinary tract infections | Sulfonamide → trimethoprim → fluoroquinolone
Meningitis | Chloramphenicol → ceftriaxone → vancomycin + third-generation cephalosporin
Biliary sepsis | Ampicillin → cephalosporins → fluoroquinolone
Shigellosis | Tetracycline → co-trimoxazole → fluoroquinolone
Enteric fever | Chloramphenicol → ampicillin → co-trimoxazole → fluoroquinolone
Gonorrhoea | Penicillin → quinolone → ceftriaxone
Staphylococci | Penicillin → flucloxacillin → vancomycin

From a genetic point of view, antibiotic resistance is either intrinsic or acquired (Neu, 1992). Intrinsic resistance is associated with the inherent genetic makeup of an organism. For example, streptococci have intrinsic resistance to aminoglycosides because their cell walls prevent antimicrobial access to the target site (the bacterial ribosome). On the other hand, acquired resistance arises either from the change, or mutation, in the genetic makeup of an organism or from the acquisition of new genetic materials which encode a mechanism of resistance that was not previously possessed by the organism. The former type of acquired resistance is usually uncommon and not transferable to other organisms (e.g. Mycobacterium tuberculosis resistance to isoniazid), while the latter is of greater concern because it can be transferred from one organism to another as well as from one species to another through transferable segments of DNA (e.g. plasmids), where multiple resistance genes can be transferred at the same time. Plasmid mediated synthesis of β-lactamase enzymes in S. aureus, E. coli and Enterobacter species is a well-known example of this type of resistance. These enzymes bind to β-lactam antibiotics (e.g. penicillin) and hydrolyse the cyclic amide bond of the β-lactam ring which renders the antibiotics ineffective (Mulvey and Simor, 2009; O'Grady and Lambert, 1997).

Upon exposure to antibiotics, the majority of sensitive bacteria will be inhibited or killed. However, antibiotics may not inhibit resistant bacteria, whether resistance is intrinsic or acquired, which ultimately results in proliferation and predominance of the resistant strains. This can be described as a selective pressure of antimicrobial use which improves the living conditions of resistant strains by killing the sensitive strains and hence reduces the competition (McGowan, 1983) (Figure 1.2).

Figure 1.2: How selective antimicrobial pressure affects microbes.
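The selective pressure described above can be illustrated with a toy calculation. The following is a minimal sketch, not a model used in this thesis: the growth factor, the fraction of sensitive cells killed per generation and the starting population sizes are all hypothetical values chosen only to show the trend.

```python
def resistant_fraction_over_time(sensitive, resistant, generations,
                                 growth=2.0, kill_sensitive=0.9):
    """Return the resistant fraction after each generation of antibiotic exposure.

    All parameters are illustrative assumptions:
    growth          multiplication factor per generation for every cell type
    kill_sensitive  fraction of sensitive cells removed by the antibiotic
    """
    fractions = []
    for _ in range(generations):
        # Both subpopulations divide; the antibiotic then removes most
        # sensitive cells and leaves resistant cells untouched.
        sensitive = sensitive * growth * (1.0 - kill_sensitive)
        resistant = resistant * growth
        fractions.append(resistant / (sensitive + resistant))
    return fractions


# One resistant cell among a million sensitive cells (hypothetical start).
fractions = resistant_fraction_over_time(sensitive=1_000_000, resistant=1,
                                         generations=8)
for generation, fraction in enumerate(fractions, start=1):
    print(f"generation {generation}: resistant fraction = {fraction:.6f}")
```

Under these assumed values the resistant subpopulation starts as a negligible minority and comes to dominate within a few generations, purely because its competitors are removed.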

In 2002, Wise estimated that the worldwide consumption of antibiotics was between 100,000 and 200,000 tonnes per year. This intensive use of antibiotics has fuelled resistance of human pathogens to antibiotics and rendered antibiotics ineffective, pushing the world into a post-antibiotic era (Wise, 2002).

Resistance has a negative economic effect by increasing the cost of treatment. Studies comparing the drug-susceptible and resistant infections have concluded that resistant infections double mortality and most likely double both morbidity and cost (Holmberg et al., 1987). In New York, methicillin resistant Staphylococcus aureus (MRSA) has a record of mortality around three times higher than methicillin sensitive Staphylococcus aureus (MSSA) (21% versus 8%) which led to an increase in economic cost of 22% (Rubin et al., 1999). Another study has shown that the US government pays up to 30 billion dollars every year as a result of resistance (Phelps, 1989).
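As a quick check of the mortality comparison quoted above, the two percentages can be turned into a ratio and an absolute difference. This is a minimal sketch using only the 21% and 8% figures cited from Rubin et al. (1999); the variable names are illustrative.

```python
# Mortality figures quoted in the text (Rubin et al., 1999).
mrsa_mortality = 0.21
mssa_mortality = 0.08

# Relative mortality: how many times higher MRSA mortality is than MSSA.
mortality_ratio = mrsa_mortality / mssa_mortality

# Absolute difference, expressed in percentage points.
difference_points = (mrsa_mortality - mssa_mortality) * 100

print(f"MRSA/MSSA mortality ratio: {mortality_ratio:.2f}")
print(f"Absolute difference: {difference_points:.0f} percentage points")
```

The ratio works out at roughly 2.6, which is the basis for describing MRSA mortality as around three times that of MSSA.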

1.1.4.1 Mechanisms of drug resistance

1.1.4.1.1 Enzymatic inactivation of antibiotics

Chemical structures of antibiotics might be destroyed or modified by bacterial enzymes, resulting in inactive compounds. This is one of the most important resistance mechanisms because it affects the most widely used antibiotics, such as penicillins and cephalosporins (Mulvey and Simor, 2009).

β-lactamase enzymes are an example of these enzymes. They hydrolyse the β-lactam ring, which is the critical part of the antibiotic structure. This reaction opens the β-lactam ring, so the compound loses its ability to bind to its target and becomes inactive. Some genes that code for β-lactamases are chromosomal, such as those in resistant Enterobacter species, while others are found on transposons and plasmids, such as the genes of resistant Gram-negative bacilli (O'Grady and Lambert, 1997; Greenwood, 2000).

β-lactamase enzymes are involved in the resistance of many microorganisms, such as S. aureus and Neisseria gonorrhoeae to penicillin, Haemophilus influenzae to ampicillin and Enterobacter species to extended-spectrum cephalosporins (e.g. ceftazidime, cefotaxime, and ceftriaxone) (Mulvey and Simor, 2009).

Carbapenemases are an example of broad spectrum β-lactamase enzymes. They degrade carbapenems and are responsible for Pseudomonas aeruginosa resistance to the antibiotics meropenem and imipenem (Mulvey and Simor, 2009). Klebsiella pneumoniae carbapenemase (KPC) is an emerging group of these enzymes. KPC is largely found in Klebsiella pneumoniae but it has also spread to other Enterobacteriaceae. Genes of this enzyme are located on plasmids which usually carry other resistance genes for different antibiotics (Queenan and Bush, 2007).

1.1.4.1.2 Alteration of the target site

In order to function and have an effect, antibiotics bind to specific target sites on the bacterial cell, which vary between classes of antibiotics. As can be seen from the previous mechanism, changing the structure of the antibiotic can render it inactive. The same result occurs when the structure of the target site is altered. In simple terms, this resembles a lock and key, where any change or modification of either the lock or the key prevents the combination from being effective.

β-lactam antibiotics act on penicillin binding proteins as a target on the cell wall (O'Grady and Lambert, 1997). MRSA, for example, is resistant to all penicillins, carbapenems and cephalosporins because it contains altered penicillin binding proteins (PBP2a) (Katayama et al., 2000). MRSA carries a genetic element called staphylococcal cassette chromosome mec (SCCmec), which contains a gene called mecA which codes the production of PBP2a (Katayama et al., 2000).

Fluoroquinolones are further examples of antibiotics that become inactive when their site of action is altered. As mentioned before, this group of antibiotics inhibits proteins that are essential for bacterial DNA replication, namely DNA gyrase (gyrA and gyrB) and topoisomerase IV (parC and parE). Mutations at specific sites in these genes can result in alteration of the target proteins (Jacoby, 2005).

1.1.4.1.3 Prevention of antibiotic access to the target site

In order to be effective, an antimicrobial agent must be able to reach its target site at an adequate concentration. Therefore, one of the main mechanisms to counteract the activity of antibiotics is to prevent antibiotics from accessing their sites of action. This could be achieved either through a permeability barrier mechanism or an efflux pump mechanism. The Gram-negative bacterial cell wall consists of inner and outer membranes; diffusion of hydrophilic antibiotics into the cytoplasm is through outer membrane proteins known as porins (Nikaido, 1993). Mutations that introduce structural changes to the porins lead to a permeability barrier that perturbs access of an antibiotic to its site of action. Resistance to aminoglycosides and β-lactams in P. aeruginosa is related to this mechanism (Livermore, 2001).

Additionally, some Gram-positive and Gram-negative organisms have evolved an efflux mechanism that pumps out antimicrobial agents from the cytoplasm before they reach their target sites (Webber and Piddock, 2003). Such a mechanism is responsible for resistance to tetracyclines and quinolones (Levy, 1989; Levy, 1992).

1.1.4.1.4 Metabolic bypass

Certain antimicrobial agents (e.g. trimethoprim, sulfamethoxazole) interfere with the metabolic pathway of folic acid by blocking two different enzymes in the same pathway and this ultimately affects the biogenesis of nucleic acids in bacteria. However, many bacteria are intrinsically resistant to these drugs and some are capable of transferring plasmids that encode entirely new, drug resistant enzymes to other non-resistant strains. The resultant bacteria express both resistant and sensitive enzymes which ensure the continued functioning of the targeted metabolic pathways in the presence of the antibiotic (Greenwood, 2000; Cloutier, 1995). A summary of the mechanisms of antibiotic action and the modes of resistance discussed above is presented in Figure 1.3.

Mechanism of action:
- Inhibitors of bacterial cell wall synthesis: penicillins, cephalosporins
- Inhibitors of nucleic acid synthesis: inhibitors of DNA synthesis, directly (fluoroquinolones) or indirectly (trimethoprim); inhibitors of RNA polymerase (e.g. rifamycins)
- Agents affecting membrane function: polymyxins
- Inhibitors of bacterial protein synthesis: aminoglycosides, macrolides

Mechanism of resistance:
- Prevention of antibiotic access to the target sites: permeability barrier, efflux pump
- Alteration of the target site: altered penicillin binding proteins (PBP2a), altered DNA gyrase
- Metabolic bypass: trimethoprim
- Enzymatic inactivation of the antibiotic: β-lactamase

Figure 1.3: Summary of main antibiotics mechanisms of action and bacterial mechanisms of resistance with antibiotic examples.

1.2 Metabolomics

1.2.1 Overview

The earliest studies of small molecules related to health in humans were performed around 2000-1500 B.C., when ants were used by traditional Chinese doctors to estimate glucose levels in urine (Oresic, 2009). Since then, there has been continuous development in techniques for the detection and measurement of small molecules. Across the whole spectrum of such developments, one of the most recent breakthroughs is the emergence of the field of metabolomics. The aim of metabolomics is the comprehensive quantification and identification of all metabolites within an organism, and metabolomics has since come to be regarded as an invaluable tool in functional genomics (Fiehn, 2002; Hall et al., 2002).

When trying to define the term ‘metabolome’, it is important first to understand that this term is derived from ‘metabolism’, which originates from the Greek word metabolè, meaning change, and involves metabolic flux and transformation that is incessant and fast (Villas-Boas et al., 2007). The precise definition of the ‘metabolome’ has been much discussed by researchers over recent years. It was initially defined by Oliver et al. (1998) as the entire set of molecules of low molecular weight in cells, considered quantitatively, existing in a particular physiological state. An alternative definition describes the metabolome as being made up exclusively of native small molecules (definable non-polymeric compounds) found in a cell or organism that are involved in metabolic reactions necessary for normal functioning, growth and maintenance (Harrigan and Goodacre, 2003). The metabolome can be regarded as the end product of gene and protein expression as well as interaction with the environment (Fiehn, 2002).

Metabolomics is a complex science which covers several different disciplines, including both organic and analytical chemistry as well as chemometrics, informatics and bioscience (Fukusaki and Kobayashi, 2005). The fields of applications of metabolomics are widespread and include plant science, medical science, pharmaceutical research, microbiology, food and plant nutrition as well as other applications (Kim et al., 2013; Kondo et al., 2011; Hirai et al., 2004; Bundy et al., 2005; Al Zweiri et al., 2010).

Ideally, any metabolomic experiment would begin by quantifying all of the metabolites in a cellular system, tissue or cell in a specific state at a particular point in time. However, this is not currently possible, as there are no straightforward automated analytical methods that achieve this in a robust and reproducible manner. The chemical complexity and heterogeneity of metabolites, the dynamic range of analytical techniques, the speed and volume of measurements that can be recorded and the extraction protocols all need to be carefully considered, as each can cause problems for researchers (Goodacre et al., 2004).

The chemical and physical properties of the metabolome, including atomic properties, vary more widely than those of either the proteome or the transcriptome. This chemical complexity causes problems in the study of the entire metabolome, as the range of metabolites detectable by any one piece of equipment is limited. A range of chemical species have been the subject of research, including polar volatiles and non-volatiles of low molecular weight, polar glycosides, non-polar lipids and inorganic species. Metabolites can also vary greatly in their physical properties (e.g., volatility) and in their concentrations, which can span nine orders of magnitude (pM-mM). These issues, alongside other technical considerations, have made the full analysis of metabolomes difficult to achieve, and hence there is not yet an established comprehensive method for the study of metabolomics (Dunn et al., 2005). As a result, various strategies are used, and these are detailed in Table 1.3.
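The nine orders of magnitude quoted for the concentration range follow directly from the ratio of its two limits. A minimal sketch of that arithmetic, assuming limits of exactly 1 pM and 1 mM (the function name is illustrative and not part of any published workflow):

```python
import math


def orders_of_magnitude(low_molar: float, high_molar: float) -> float:
    """Return the number of decades between two concentrations (mol/L)."""
    if low_molar <= 0 or high_molar <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(high_molar / low_molar)


# Assumed limits: 1 pM = 1e-12 M and 1 mM = 1e-3 M.
span = orders_of_magnitude(1e-12, 1e-3)
print(f"Dynamic range: about {round(span)} orders of magnitude")
```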

Table 1.3: Classification of metabolomic approaches.

Term: Definition

Metabolomics: Non-biased comprehensive identification and quantification of the entire metabolome under a given set of conditions, using an analytical technique with high selectivity and sensitivity. Currently, there is no single technique or combination of techniques that can successfully determine all metabolites present in mammalian, plant or microbial metabolomes (Dunn et al., 2005; Dunn and Ellis, 2005).

Metabolic profiling: Analysis for identification and quantification of a wide range of metabolites related either through similar chemistries (e.g. amino acids, carbohydrates) or through specific metabolic pathways. It is usually an inductive experimental strategy, also known as untargeted analysis, and it enables identification of metabolites where possible. Relative changes in response/metabolite concentration help to show metabolic variations. Chromatographic separation is usually used before detection, and during sample preparation metabolites are isolated from the other cellular molecules to avoid matrix effects during instrumental analysis (Goodacre et al., 2004; Dunn and Ellis, 2005; Harrigan and Goodacre, 2003).

Metabolic fingerprinting: High-throughput global screening of crude samples to classify and discriminate between them according to their biological status or origin. Minimal sample preparation is needed, and quantification and metabolite identification are also minimal. No chromatographic separation takes place, so analysis times are generally under a minute (Dunn and Ellis, 2005; Dunn, 2008; Goodacre et al., 2004; Fiehn, 2001).

Targeted analyses: A single metabolite, or a small set of closely related metabolites involved in specific or closely related metabolic reactions, is fully quantified and qualitatively identified following thorough sample preparation to separate the metabolites from the sample matrix (Goodacre et al., 2004; Dunn, 2008).

Metabolic footprinting: Analysis of an organism's (exo)metabolome, which consists of secreted and excreted intracellular metabolites as well as unconsumed metabolites from the organism's environment (such as the growth medium in microbial systems). Quenching of metabolism and extraction are unnecessary, making this technique a high-throughput strategy (Ellis and Goodacre, 2006; Dunn, 2008; Allen et al., 2003).

Metabonomics: Quantitative measurement of living system metabolites following biological perturbation, including genetic modification or pathophysiological stimuli (Nicholson et al., 1999).

There are some clear benefits to studying the metabolome compared with studying the genome and proteome. The metabolome is more tractable due to the smaller number of analytes. It is currently estimated that there are approximately 3000 metabolites derived from the human body (Duarte et al., 2007), whilst there are almost 32000 genes found in man (GIL, 2003) and gene expression has shown that there are approximately 10⁶ different human proteins (Oh et al., 2004; Goodacre, 2007).

The metabolome, being the last downstream product of the genome, is the nearest to the function or phenotype of the cell. Even when changes in the concentrations of proteins or transcripts cannot be detected, changes in metabolite concentrations can be seen via the application of metabolic control analysis (MCA) (Cornish-Bowden and Cardenas, 2000). Experiments have also shown that metabolome analysis offers a very high level of discrimination in the study of biological changes (Pope et al., 2007; Urbanczyk-Wochniak et al., 2003). Another benefit of metabolomics compared with proteomics or transcriptomics is that it is a high-throughput (HTP) strategy with a lower cost per analysis than proteomics (Dunn and Ellis, 2005; Fiehn, 2002; Harrigan and Goodacre, 2003). A direct result of these advantages is a marked increase in interest in the field, as highlighted by the growing number of publications (Figure 1.4). However, the low cost per analysis comes with the penalty of a higher equipment cost, because high-specification analytical instruments are required (Dunn, 2008).

[Bar chart: number of publications per year (y-axis, 0 to 2000) against year (x-axis, 2000 to 2012), as listed on ISI Web of Science using the search term "Metabolomics".]

Figure 1.4: Number of publications in the field of metabolomics since 2000.

1.2.2 Metabolomics experiments

In general, a metabolomics experiment is carried out by a multidisciplinary team of scientists working through a workflow or pipeline. The process can be simplified into: experimental design, sample collection and preparation, sample analysis using various analytical instruments, pre-processing of the raw data, chemometric analysis of the processed data, and finally data interpretation (Brown et al., 2005; Harrigan and Goodacre, 2003).

A metabolomics experiment must be carefully designed if the data are to be both valid and reproducible, which is necessary if biological information is to be deduced from them. Close attention needs to be paid to all steps in the pipeline: sample numbers, sample preparation, appropriate selection of analytical technology and data processing strategies, reporting the full details of the experiment and eliminating any bias that might produce invalid data (Broadhurst and Kell, 2006; Ransohoff, 2005).

1.2.2.1 Sample preparation

Sample preparation is an important part of a metabolomics study because of the constant metabolic flux occurring in the biological systems studied. Generally, biological variability is found to have a greater influence on a metabolomic study than analytical variability (Roessner et al., 2000). Experimental errors must nevertheless be minimised before a biological sample is analysed.

As exchange rates are high in metabolic reactions with a constant flux, a quenching step is essential in many experiments to arrest metabolism and gain a snapshot of the metabolome at a single time point (Krastanov, 2010; Dunn, 2008). For quenching, either a high or a low temperature is generally used to arrest enzyme activity. For microbial cells and biofluids, one of the most common quenching methods is spraying the biomass into 60% buffered methanol at temperatures lower than -40 °C (Hollywood et al., 2006; Jensen et al., 1999; Winder et al., 2008). An investigation into E. coli comparing three quenching methods (boiling absolute ethanol, 60% buffered cold methanol and 60% aqueous cold methanol) found that metabolite leakage is greater in boiling ethanol, whereas adding a buffer to the methanol makes no difference compared with the aqueous solvent (Winder et al., 2008). Plunging the cells into 60% aqueous cold methanol (-48 °C) and collecting the pellet from the quenched biomass via centrifugation was therefore considered the most appropriate method, and it is applied as the quenching technique in all experiments in this thesis. Liquid N2 snap freezing is also used for enzymatic quenching of microbial cells (Dominguez et al., 1998), but it is mostly used for samples of animal or plant tissues (Ward et al., 2003; Viant et al., 2005).

The next stage in the process is mechanical disruption, which is used to release the metabolites, followed by extraction of the metabolites (Atherton et al., 2006; Fukusaki and Kobayashi, 2005). It is important to make sure that the metabolite extraction procedures used are the best possible (with respect to minimal extraction steps, limiting metabolite turnover, metabolite coverage permitted by choice of solvent and reproducibility) (Lutz et al., 2013). Targeted analysis also requires the selection of an appropriate extraction buffer depending on the chemical and physical properties of the targeted metabolites (Fukusaki and Kobayashi, 2005). For non-targeted analysis of intracellular metabolites, the extraction strategy must maximise the recovery of many different types of metabolites from different chemical species in a non-biased manner (Dunn et al., 2005). For cell culture and microbial extracts, there are several common methods: (i) acid extraction using perchloric acid (HClO4) with subsequent freeze-thawing and neutralisation with potassium hydroxide (KOH); (ii) alkali extraction using sodium hydroxide (NaOH), with subsequent heating to 80 °C and then neutralisation; (iii) various solvent extraction approaches including boiling ethanolic extraction (90 °C), freeze-thaw extraction in cold methanol (-48 °C) and extraction by methanol/chloroform solution (2:1) (Tweeddale et al., 1998; Buchholz et al., 2001; Villas-Boas et al., 2005a; Hollywood et al., 2006; Winder et al., 2008). For E. coli (which is under investigation in this project), freeze-thaw cold methanol extraction was found to be the most suitable method for global untargeted analysis of central metabolism metabolites, with high reproducibility compared with the other four methods mentioned above. For the analysis of microbial extracts, the next stage is sample analysis, for which a large range of different instrumentation can be applied (Hollywood et al., 2006).

The metabolomics strategy being used dictates, to a certain degree, the pre-analysis preparation of samples. In metabolite fingerprinting and profiling based analyses, samples are analysed either directly (i.e. in the whole cell), or after preliminary preparation, such as protein removal using precipitation (Dunn et al., 2011) or dilution without any further sub-classification of metabolites. In targeted analysis, the metabolome can be divided into different chemical classes. Targeted metabolites can be isolated from the matrix, which can contain other metabolites, usually via extraction methods that apply multiple stages of sample preparation to obtain a high concentration and a relatively purified extract (Dunn et al., 2005).

For all metabolomics techniques, there are solvent restrictions. As an example, in nuclear magnetic resonance (NMR) spectroscopy, solvents that produce multiple resonances, such as ethanol and hexane, are not appropriate as they can overlap significantly with the chemical shifts from metabolites and make the assignment of the NMR resonances difficult. Electrospray ionisation (ESI), which is one of the most commonly used mass spectrometry ion sources in metabolomics, is prone to contamination when solvents containing non-volatile buffers are used. The use of salt solutions in the preparation of samples for LC-MS analysis should also be avoided due to the suppression effects that salts can have in ESI (King et al., 2000). In FT-IR spectroscopy, water absorbs very strongly, so samples must either be dried to remove water or analysed using attenuated total reflectance (ATR) (Dunn et al., 2005).

1.2.2.2 Analytical technology

In order to monitor the metabolome more comprehensively, complementary analytical techniques are necessary due to the metabolome's high level of chemical diversity (Atherton et al., 2006). Another problem which needs to be overcome is sensitivity, as unlike transcriptome analysis there are no means of amplification (Boccard et al., 2010). Metabolic studies have used an extensive number of analytical techniques including Fourier transform infrared (FT-IR) spectroscopy (Kim et al., 2010), NMR spectroscopy (Serkova et al., 2005), direct infusion mass spectrometry (DIMS) (Kaderbhai et al., 2003) and hyphenated separation techniques coupled to mass spectrometry (MS) platforms (specifically gas chromatography (GC) (Jonsson et al., 2004), liquid chromatography (LC) (Takahashi et al., 2011) or capillary electrophoresis (CE) (Hirayama et al., 2009)). Selecting the most appropriate analytical techniques and methods of analysis involves weighing up which aspects are most important (sensitivity, chemical selectivity or analysis time) and reaching a suitable compromise between them. Below is a further discussion of FT-IR, DIMS, GC-MS and LC-MS techniques.

1.2.2.2.1 Fourier transform infrared (FT-IR) spectroscopy

Infrared spectroscopy is considered to be a very useful analytical technique since it can be applied to the identification of pure compounds or to the generation of metabolic fingerprints from complex biological matrices. Instruments first became commercially available in the 1940s, with a prism as the dispersive element, and were modified in the 1950s by the addition of diffraction gratings. During this time, the technique was seldom applied in microbiological studies for bacterial classification (Goulden and Sharpe, 1958). In general, there was little interest in applying IR spectroscopy in medical and biological studies at that time because of its various shortcomings, including low sensitivity, poor reproducibility and long analysis times (Nelson, 1991). A huge development in infrared spectroscopy occurred with the introduction of a mathematical process known as Fourier transformation. The resulting technique is known as Fourier transform infrared (FT-IR) spectroscopy, which has many advantages over previous instruments, such as reducing the time required to analyse samples and enhancing the quality of spectra (Stuart, 1996).

FT-IR spectroscopy is mainly used as a first-round screening technique due to several advantages. It is a HTP analytical technique, so it is rapid and can analyse several hundred to several thousand samples per day, producing large numbers of spectra. It is also non-destructive. In addition, the sample preparation required is simple (fluids or cell suspensions are spotted directly, whereas tissues are homogenised in water prior to spotting) and does not require the use of reagents. Finally, the technique has the key advantage of being relatively inexpensive (Griffin and Shockcor, 2004), and thus it is very appropriate as a HTP screening method prior to more in-depth and costly metabolomic analyses (Dunn et al., 2005).

In FT-IR spectroscopy, samples are interrogated by an infrared beam which is absorbed variably, and at specific wavelengths, by functional groups and polar bonds. Absorption excites vibrations of bonds whose dipole moment changes during the vibration (e.g. C=O, O-H and N-H), such as bending or stretching motions. However, the bonds in some molecules such as O2 and N2 vibrate in a way that causes no change in dipole moment, and so they are not detected (Harrigan and Goodacre, 2003).

The infrared light region is divided into three main spectral regions: far, mid and near infrared. The mid infrared region (MIR), from 4000 to 600 cm⁻¹, is the main region analysed with FT-IR spectroscopy in metabolome studies since it provides rich biochemical information, and it also contains the major wavenumbers of the most important biochemical classes (Figure 1.5) (Mantsch and Chapman, 1996; Harrigan and Goodacre, 2003).
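Because IR spectra are reported in wavenumbers, it can help to see the quoted MIR limits as wavelengths. A minimal sketch of the conversion, using only the definition of the wavenumber as the reciprocal of the wavelength (the function name is illustrative):

```python
def wavenumber_to_wavelength_um(wavenumber_per_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    if wavenumber_per_cm <= 0:
        raise ValueError("wavenumber must be positive")
    # wavelength (cm) = 1 / wavenumber (cm^-1); 1 cm = 1e4 micrometres
    return 1e4 / wavenumber_per_cm


# The two limits of the mid infrared region quoted in the text.
for limit in (4000.0, 600.0):
    wavelength = wavenumber_to_wavelength_um(limit)
    print(f"{limit:.0f} cm^-1 corresponds to {wavelength:.2f} micrometres")
```

The MIR limits of 4000 and 600 cm⁻¹ therefore correspond to wavelengths of 2.5 and roughly 16.7 micrometres.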

[FT-IR spectrum: absorbance (arbitrary units, y-axis) against wavenumber (cm⁻¹, x-axis, 4000 to 500), with regions labelled A to E.]

Figure 1.5: FT-IR spectrum with the main characteristic regions. (A) represents fatty acid region (3050-2800 cm⁻¹), (B) amide region attributed to peptides and proteins (1750-1500 cm⁻¹), (C) mixed region ascribed to carboxylic group vibrations of polysaccharides, proteins and free amino acids (1500-1250 cm⁻¹), (D) ascribed to polysaccharides (1200-900 cm⁻¹) and (E) fingerprint region.
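The region boundaries in the caption can be encoded as a small lookup table, which is convenient when annotating peaks in a spectrum. This is only a sketch: the bounds for regions A to D are those given in the caption, region E is omitted because the caption states no bounds for it, and the names `REGIONS` and `assign_region` are hypothetical.

```python
# Regions A-D as given in the caption of Figure 1.5 (bounds in cm^-1).
# Region E (fingerprint) has no bounds stated there, so it is not encoded.
REGIONS = [
    ("A: fatty acids", 2800.0, 3050.0),
    ("B: amide (peptides and proteins)", 1500.0, 1750.0),
    ("C: mixed region", 1250.0, 1500.0),
    ("D: polysaccharides", 900.0, 1200.0),
]


def assign_region(wavenumber_per_cm: float):
    """Return the first listed region containing the wavenumber, else None."""
    for label, low, high in REGIONS:
        if low <= wavenumber_per_cm <= high:
            return label
    return None


# Illustrative band positions only (not measured data).
for band in (2920.0, 1650.0, 1080.0, 2000.0):
    print(band, assign_region(band))
```

Regions B and C share the boundary at 1500 cm⁻¹; in this sketch a band exactly at the boundary is assigned to whichever region is listed first.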

Water absorbance within the MIR region is one of the serious drawbacks of FT-IR spectroscopy, as it produces wide bands in the spectra that cover the distinctive biochemical fingerprint (Dunn et al., 2005). However, water absorbance can be avoided in several ways. One of the simplest is to desiccate the sample before analysis on silicon (Correa et al., 2012), zinc selenide (ZnSe) (AlRabiah et al., 2013; Chapter 3) or aluminium plates (Johnson et al., 2003). An alternative is to use an attenuated total reflectance (ATR) cell. ATR is an optical accessory that enables non-destructive and direct analysis of aqueous samples (Schmitt and Flemming, 1998).

Whilst FT-IR spectroscopy has limited sensitivity and cannot structurally elucidate large numbers of metabolite features in complex samples, as can be achieved using advanced analytical techniques such as mass spectrometry and NMR, it is still regarded as a valid tool in metabolomics. FT-IR spectroscopy is still the most commonly employed vibrational spectroscopic technique for obtaining snapshot spectral fingerprints of biological samples (Dunn et al., 2005).

This analytical technique has a wide range of applications in different fields. It is used in the clinical field as a metabolic fingerprinting tool for classification and detection of dysfunctions and diseases (Ellis and Goodacre, 2006; Naumann, 2001) such as in cancer diagnosis (Lasch et al., 2002), diabetes diagnosis by analysis of blood serum (Petrich et al., 2000) and for many other applications. FT-IR spectroscopy is also used for bacterial identification and classification at the level of species and sub-species (Naumann et al., 1991; AlRabiah et al., 2013).

1.2.2.2.2 Mass spectrometry

Mass spectrometry (MS) is a powerful analytical technique that is applied in different disciplines due to its ability to provide information about the molecular mass of the analyte under investigation. Through MS/MS and multi-stage MS (MSⁿ), molecular fragments of the analyte can be generated and measured, thereby allowing structural elucidation of the analyte and potentially leading to identification (Watson and Sparkman, 2007).

Mass spectrometry produces a plot called a mass spectrum, which has two axes: the vertical axis represents the relative abundance of each detected ion and the horizontal axis represents the mass-to-charge (m/z) ratio (Williams and Fleming, 1995). In general, mass spectrometry as a technique depends mainly on two requirements: first, transferring the neutral molecule into the gas phase as an ion with single or multiple charges, and second, making the ions traverse a mass analyser under vacuum for subsequent detection (Ekman et al., 2008).
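The vertical axis of a mass spectrum is usually expressed relative to the most intense peak. A minimal sketch of that scaling, assuming the common convention that the most intense (base) peak is set to 100; the function name and the peak list are made up for illustration and are not measured data:

```python
def to_relative_abundance(peaks):
    """Scale (m/z, intensity) pairs so the most intense peak reads 100."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        raise ValueError("at least one peak must have positive intensity")
    return [(mz, 100.0 * intensity / base) for mz, intensity in peaks]


# Made-up peak list: (m/z, raw detector intensity).
example = [(73.0, 1200.0), (147.0, 300.0), (205.0, 600.0)]
for mz, relative in to_relative_abundance(example):
    print(f"m/z {mz:.1f}: {relative:.1f} %")
```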

A mass spectrometer has three main components, the first being the ion source which is the component where gas phase ions are produced either in a vacuum (e.g. electron ionisation (EI), chemical ionisation (CI)) or at atmospheric pressure (e.g. electrospray ionisation (ESI), atmospheric pressure chemical ionisation (APCI), atmospheric pressure photoionisation (APPI)). The second is the mass analyser, which separates the ions in time or space (e.g. time of flight (TOF), ion trap) according to the m/z ratio of the ions. The third part is the detector, which converts the passage of ions with a specific m/z into practical and meaningful signals. Most detectors act either by detecting the image current of orbital frequencies as the circulating ions pass the detection plate or by physically detecting the ion current itself that results when the mass ions strike a detection plate (Liebler, 2002; Watson and Sparkman, 2007; Ekman et al., 2008).

Mass spectrometry has become an essential tool in metabolomics studies, especially after the huge development and enhancement of mass analysers such as the TOF (McFadden et al., 1963; O'Halloran et al., 1964) and the Orbitrap (Hu et al., 2005; Makarov, 2000). As a result, it has become possible to obtain large volumes of biological data of higher accuracy and specificity to match the requirements of the field (Dunn, 2008).

Although MS is a destructive technique that consumes samples (unlike FT-IR and NMR spectroscopy), it has several distinct advantages, including its high sensitivity: in metabolic profiling studies it can detect many metabolites at physiological concentrations. Examples of these metabolites include lipids, amino and organic acids, fatty acids, sugars, and nucleotide bases (Dunn, 2008).

Additionally, mass spectrometric measurements are characterised by a high level of accuracy. The accuracy achieved by MS in metabolomics can be measured in parts per million (ppm) of the molecular mass. The greater propensity for fragmentation is also useful for working out the molecular structure. As a result, MS can not only give reproducible identification of previously characterised metabolites, which depends on mass accuracy alone, but can also help in the reliable identification of new metabolites that are not available in metabolomic databases (Villas-Boas et al., 2005b; Dunn, 2008).
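Mass accuracy in ppm is the difference between the measured and theoretical m/z, divided by the theoretical m/z and multiplied by one million. A minimal sketch of that calculation; the function name is illustrative and the measured value is hypothetical, chosen only to show the arithmetic:

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical m/z must be positive")
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical measurement of an ion with a theoretical m/z of 260.0297.
error = ppm_error(measured_mz=260.0310, theoretical_mz=260.0297)
print(f"Mass error: {error:.1f} ppm")
```

A positive value means the measured m/z is higher than the theoretical value; the absolute value is usually the one compared against an instrument's stated accuracy.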

One of the major limitations in the study of metabolomics is the inability to characterise all detected metabolites extracted from biological samples (i.e. from mammalian, plant or microbial systems), which is mainly due to biological complexity. The integration of chromatographic systems with mass spectrometry enables sensitivity to be increased and allows the detection and quantification of a broad range of metabolites. These advantages, though, come with the penalty of longer analysis times, typically 10–60 min (Dunn, 2008). Direct infusion mass spectrometry (DIMS) can overcome this shortcoming by performing fast and high-capacity screening of a large number of samples, though it has limited capacity for metabolite quantification and identification (Dunn, 2008; Dunn et al., 2005). DIMS and two hyphenated strategies (GC-MS and LC-MS) are discussed in more detail below.

1.2.2.2.2.1 Direct infusion mass spectrometry (DIMS)

DIMS is the most direct and simple approach in mass spectrometry. It is applied mainly with electrospray instruments and used mostly for sample classification and fingerprinting (e.g. in some microbial and plant metabolomics studies) (Kaderbhai et al., 2003; Mauri and Pietta, 2000; Vaidyanathan et al., 2002; Goodacre et al., 2003). This is due to its high throughput and speed (compared with chromatography-MS), and also to the higher sensitivity it offers compared with alternative direct metabolite fingerprinting techniques such as NMR. Despite these advantages, it is rarely used for metabolite identification and quantification, mainly due to matrix effects (ionisation enhancement or suppression), which occur particularly in complex samples containing compounds such as salts, organic and hydrophobic molecules (Bedair and Sumner, 2008; Dunn et al., 2005; Taylor and Linforth, 2003).

DIMS also lacks depth of coverage for complex samples with high numbers of metabolites because of overlapping ions, where several peaks may appear within a very narrow window (i.e. with a small difference between monoisotopic masses), unless a mass spectrometry technique with high accuracy and high resolution (e.g. Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry) is applied. It also cannot resolve isomers of the same elemental composition (for example glucose-6-phosphate and fructose-6-phosphate), although tandem mass analysis, applied as a separate complementary strategy, may overcome this problem (Bedair and Sumner, 2008; Dunn et al., 2005).
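The isomer problem can be illustrated numerically: two compounds with the same elemental composition have exactly the same monoisotopic mass, so no mass resolution can separate them. A minimal sketch, assuming standard monoisotopic atomic masses rounded to eight decimal places and the shared composition C6H13O9P for both sugar phosphates (the helper name is illustrative):

```python
# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "O": 15.99491462,
    "P": 30.97376200,
}


def monoisotopic_mass(composition: dict) -> float:
    """Sum the monoisotopic masses for an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


# Both sugar phosphates share the elemental composition C6H13O9P.
glucose_6_phosphate = {"C": 6, "H": 13, "O": 9, "P": 1}
fructose_6_phosphate = {"C": 6, "H": 13, "O": 9, "P": 1}

mass_g6p = monoisotopic_mass(glucose_6_phosphate)
mass_f6p = monoisotopic_mass(fructose_6_phosphate)
print(f"G6P: {mass_g6p:.4f} u, F6P: {mass_f6p:.4f} u")
print("Distinguishable by mass alone:", mass_g6p != mass_f6p)
```

Since the mass difference is exactly zero, separating the two requires a property other than mass, such as fragmentation pattern or chromatographic retention.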

Mass spectrometry in conjunction with separation techniques can address these issues. Chromatography is one of the most popular separation techniques, and although it increases the time required for sample analysis, it greatly aids in resolving the issues of mass ion overlap and isomeric compounds. Since chromatography separates analytes, it decreases the chemical diversity present during ionisation, which minimises matrix effects (Boccard et al., 2010; Bedair and Sumner, 2008; Dunn et al., 2005). The two chromatographic separation techniques most widely coupled with mass spectrometry in metabolomics are GC and LC.

1.2.2.2.2.2 Gas chromatography-mass spectrometry (GC-MS)

GC-MS is one of the most commonly applied techniques in metabolomics research due to the high resolution, high separation efficiency, high sensitivity and robustness that result from combining capillary GC with a highly sensitive mass spectrometer (Kopka, 2006). The availability of compound libraries for metabolite identification is also one of the main advantages of this hyphenated technique. Generally, this mature technique can be used to analyse volatile and (after derivatisation) non-volatile compounds both quantitatively and qualitatively, with high reproducibility and lower cost compared with other coupled techniques (e.g. LC-NMR) (Schauer et al., 2005; Kopka, 2006).

In GC-MS analysis, thermal stability and volatility are the main requirements of the sample being analysed. These requirements can be found in some metabolites (e.g. small molecular weight hydrocarbons, short chain alcohols and esters) but not in the majority of metabolites (Bedair and Sumner, 2008). Since metabolomics focuses on the analysis of small and largely non-volatile compounds (e.g. carbohydrates, organic acids, amino acids), a chemical derivatisation step is required to increase the volatility of such compounds such that they can be separated by GC.

Oxime/silylation derivatisation is the most common method. It is a two-stage method which gives compounds both of the aforementioned properties by the addition of trimethylsilyl (TMS) groups (Roessner et al., 2000). This method is suitable for a wide range of compounds (e.g. alcohols, amides, amino acids and thiols) (Dunn et al., 2005). In brief, oximes are first formed (using alkoxyamines), followed by replacement of the active hydrogen in the polar functional group with TMS groups (using N-methyl-N-trimethylsilyl-trifluoroacetamide; MSTFA) (Figure 1.6). The TMS group is less polar and reduces dipole-dipole interactions, which increases the compound's volatility (Dunn et al., 2005). Although this derivatisation method covers many metabolites of interest and is in widespread use, it has certain disadvantages, such as being time consuming and requiring an anhydrous environment for preparation and storage, because silylation is sensitive to water (it is a reversible reaction) (Dunn et al., 2005). tert-Butyldimethylsilylation (TBS) is used for amino and organic acid derivatisation and gives derivatives that are more stable than TMS derivatives (Glassop et al., 2007). Other derivatisation methods are more specific, have narrower compound ranges and are mostly used for targeted analysis (Drozd, 1986). It must also be considered that derivatisation introduces an additional sample preparation step and so can introduce further technical errors into an experiment.

[Reaction scheme with the labels: Estrone; MSTFA; Z-isomer and E-isomer of the products.]

Figure 1.6: A schematic of the two-stage oxime/silylation derivatisation reaction.
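The effect of silylation on the mass observed by GC-MS can be estimated from the composition of the TMS group: each active hydrogen replaced by Si(CH3)3 adds a net SiC3H8. A minimal sketch of that arithmetic, assuming standard monoisotopic atomic masses; the function name and the example analyte are hypothetical, and the oxime-forming first stage is not modelled:

```python
# Monoisotopic atomic masses (u), rounded to eight decimal places.
SILICON = 27.97692653
CARBON = 12.0
HYDROGEN = 1.00782503

# Net change when one active hydrogen is replaced by Si(CH3)3: +SiC3H8.
TMS_SHIFT = SILICON + 3 * CARBON + 8 * HYDROGEN


def tms_derivative_mass(analyte_mass: float, n_tms_groups: int) -> float:
    """Monoisotopic mass after n active hydrogens are replaced by TMS."""
    if n_tms_groups < 0:
        raise ValueError("number of TMS groups cannot be negative")
    return analyte_mass + n_tms_groups * TMS_SHIFT


# Hypothetical analyte of mass 100.0000 u carrying two active hydrogens.
print(f"Shift per TMS group: {TMS_SHIFT:.4f} u")
print(f"Derivative mass: {tms_derivative_mass(100.0, 2):.4f} u")
```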

Samples are typically introduced onto the GC column via a split/splitless injector. The split mode is appropriate for highly concentrated samples, whereas the splitless mode is preferable for samples containing lower metabolite concentrations (Watson, 1999). The stationary phase of the GC column is either non-polar, which mainly separates analytes according to boiling point, or polar, which separates analytes according to their polarity as well as volatility. GC has high resolution and a high peak capacity, but for complex samples (such as those analysed in metabolomics) some resolution may be lost. To overcome this, two-dimensional GC (GC × GC) can be used to improve the peak capacity and the resolution of metabolites that co-elute in one-dimensional GC. GC × GC uses two columns with different stationary phases to improve compound selectivity. Typically, the first column is a long non-polar column while the second is a short polar column. The effluent is transferred from the first to the second column by modulators (e.g. cryogenic modulators), which focus the analytes leaving the first column into a series of pulses at the head of the second column (Bedair and Sumner, 2008; Seeley et al., 2007). Although it might be expected that GC × GC-MS will become one of the main analytical tools in metabolomics, to date its applications are limited due to complexities and difficulties in the automation of data processing (Bedair and Sumner, 2008).

As mentioned above, GC requires volatile samples, so it should only be combined with MS ion sources that are specifically suited to gas phase analytes. These include chemical ionisation (CI) and electron ionisation (EI) ion sources. The latter is considered an aggressive ionisation technique that causes fragmentation of the analyte, thereby facilitating its identification and characterisation.

As for mass analysers compatible with GC, the traditional coupling of GC with quadrupole MS gives a large dynamic range and high sensitivity, but at the expense of scanning speed and with only nominal mass accuracy. More recently, GC-TOF/MS has been more commonly employed for metabolite profiling due to its higher mass resolution and accuracy compared with quadrupoles (Bedair and Sumner, 2008). Another advantage of TOF/MS is its high scan speed, which makes the technique well suited to hyphenation with fast, high-resolution chromatographic separations (Davis et al., 1999; Allwood and Goodacre, 2010; Picó, 2008). EI and TOF are discussed in more detail in the next sections.

Electron Ionisation (EI)

EI is one of the earliest ionisation sources, having been developed in 1918 by Dempster (Dempster, 1918) and modified by Nier and Bleakney (Bleakney, 1929; Nier, 1947). This technique is suitable for volatile compounds with a molecular weight of less than 1000 g/mol (Ashcroft, 1997). It requires a high vacuum (10⁻⁶ mbar) and uses a filament made from an electro-emissive element such as tungsten or rhenium, which produces electrons when heated (Ashcroft, 1997). These electrons are accelerated towards a positive target (trap) by a potential difference that gives them sufficient energy for ionisation and fragmentation (70 eV is the standard value used). A magnet makes the electrons follow a spiral path, which increases the path length and gives more chance of interaction with the introduced volatile molecules. The magnetic field affects the emitted electrons but not the analytes and their ions, because of the much higher mass of ions compared with electrons. The analyte is introduced into the electron path (between the heated filament and the target) and bombarded by the accelerated electrons. Electrons with sufficient energy cause ionisation and extensive fragmentation of the analyte (bond strengths 4-7 eV). Typically, these accelerated high-energy electrons produce positive ions, because it is difficult for the analyte to capture the accelerated electrons. After ionisation, a positively charged repeller directs the ions towards the analyser (Ashcroft, 1997; Watson, 1999; Ekman et al., 2008).

Time of Flight (TOF)

The concept of TOF was proposed by Stephens (1946), who developed a linear TOF system. The mechanism of this analyser can be understood from its name. The ions produced in the ion source are accelerated by a high voltage and then pass through a flight tube, which contains a field-free region, to the detector (Ashcroft, 1997; de Hoffmann and Stroobant, 2002; Ekman et al., 2008). This technique measures the time needed for the ions to travel a defined distance through the flight tube from the ion source to the detector (Watson and Sparkman, 2007), which is generally a micro-channel plate (MCP) coupled to a time-to-digital converter (TDC) (Allwood and Goodacre, 2010). The time (tTOF) can be calculated by the equation:

$$t_{TOF} = \frac{L}{v} = L\sqrt{\frac{m}{2qU_a}} \propto \sqrt{\frac{m}{z}}$$

where tTOF, as explained above, is the travel time of the ion from the ion source to the detector, L is the length of the flight tube, v is the ion velocity after acceleration, m is the mass of the ion and z its charge number, Ua is the electric potential difference that accelerates the ion after it leaves the ion source and q is the total charge on the ion (q = ze, where e is the elementary charge). This equation shows that tTOF is directly proportional to the square root of m/z, so that smaller ions with a lower m/z reach the detector sooner than larger ions (Ekman et al., 2008); this is the principle of the original linear TOF. However, one of the greatest disadvantages of linear analysers is poor resolution (de Hoffmann and Stroobant, 2002).
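As a rough numerical illustration of the flight-time relationship described above, the following Python sketch evaluates t = L·sqrt(m/(2·q·Ua)) for singly charged ions. The function name, the 1 m tube length and the 20 kV accelerating voltage are illustrative assumptions and not the settings of any particular instrument.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # one elementary charge, in coulombs
DALTON_KG = 1.66053906660e-27          # one dalton, in kilograms


def flight_time_s(mz, tube_length_m, accel_voltage_v):
    """Return the flight time (s) of an ion with the given m/z (Da per charge).

    Uses t = L * sqrt(m / (2 * q * Ua)); only the ratio m/q matters, so the
    m/z value is first converted to kilograms per coulomb.
    """
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE_C
    return tube_length_m * math.sqrt(mass_per_charge / (2.0 * accel_voltage_v))


# Hypothetical settings: 1 m flight tube, 20 kV accelerating voltage.
for mz in (100.0, 400.0):
    t = flight_time_s(mz, tube_length_m=1.0, accel_voltage_v=20_000.0)
    print(f"m/z {mz:5.0f}: {t * 1e6:.2f} microseconds")
```

Because the flight time scales with the square root of m/z, quadrupling m/z doubles the flight time, which is why the lighter ion arrives first.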

The resolution of TOF mass analysers is restricted by the velocity distribution of ions and proportional to the time of flight. The flight time can be increased by either increasing the length of the flight tube or by reducing the acceleration voltage. However, an extensive increase of tube length might affect the performance of the analyser with more chance of losing ions. On the other hand, decreasing the acceleration voltage causes a reduction in sensitivity, so in order to ensure both high resolution and high sensitivity, the tube must be long (1 to 2 metres) and the voltage must be high (> 20 kV) (de Hoffmann and Stroobant, 2002). Techniques which have been used to increase the resolution of TOF/MS whilst maintaining high sensitivity involve the use of a reflectron (reTOF) and orthogonal acceleration (oaTOF).

a) The reflectron

The reflectron (also called a reflector) was first proposed by Mamyrin and colleagues (1973). It is an ion mirror that is located after the field-free region and consists of a series of grids and ring electrodes (de Hoffmann and Stroobant, 2002). The reflectron increases the length of the flight path and compensates for variations in the kinetic energy of ions which have the same m/z ratio. Ions with higher velocity (and kinetic energy) penetrate more deeply into the field and thus are retained within the reflectron field for longer, while ions with lower kinetic energy penetrate less deeply and so are retained for shorter time periods (de Hoffmann and Stroobant, 2002). This differential retention within the reflectron field, and the subsequent reflection out of it, leads to ion focusing and allows both low and high kinetic energy ions with the same m/z to reach the detector at the same time. This prevents peak broadening and permits greater resolution at the expense of sensitivity, price and mass range (Ekman et al., 2008; de Hoffmann and Stroobant, 2002).

In most TOF/MS instruments, a linear reflector is employed (e.g. LECO TOF/MS). However, a two stage reflector is also commonly used (e.g. Waters LCT), which allows both 'V' mode and 'W' mode analyses (Ekman et al., 2008).

b) Orthogonal acceleration

Another way to improve the resolution of TOF instruments and make them more compatible with continuous ion sources, such as electron ionisation (EI), which is usually used in conjunction with GC, is to extract ions orthogonally. This type of instrument is called an orthogonal acceleration TOF (oaTOF) mass analyser and was initially developed by the O'Halloran group in 1964 (O'Halloran et al., 1964). Orthogonal acceleration TOF differs from other TOF mass analysers in that ions enter the flight tube perpendicular to the ion source axis by means of a pusher plate, which converts the continuous ion beam into a pulsed beam (Guilhaus et al., 2000; Ekman et al., 2008). The beam of ions generated in the ion source is collimated by a set of lenses. The orthogonal pusher then accelerates the ions, which have very limited radial energy dispersion (in the TOF direction), down the flight tube. This reduced energy dispersion improves the resolving power of the instrument (Guilhaus et al., 2000). In addition, orthogonal acceleration can be used in combination with a reflectron TOF analyser to provide even higher resolution (Watson and Sparkman, 2007).

1.2.2.2.2.3 Liquid chromatography-mass spectrometry (LC-MS)

Liquid chromatography (LC), in simple terms, is described as the separation of compounds in a mixture according to the affinity of each compound to one of two different phases: a stationary phase (either solid or liquid) and a liquid mobile phase. Compounds that have higher affinity to the stationary phase chemistry are retained for longer and therefore have longer retention times whilst compounds of lower affinity to the stationary phase are rapidly eluted and thus have shorter retention times (Watson, 1999).

One great advantage of LC-MS in metabolomics is that it can be used to analyse extremely small molecules (which can also be detected by GC-MS) as well as species with molecular weights exceeding 600 Da. It is therefore useful in the analysis of conjugated bile acids, sugars, glycosides and phospholipids (Dunn, 2008).

There are many different types of liquid chromatography, though in the majority of metabolomics experiments reversed phase high performance liquid chromatography (RP-HPLC) is applied (Issaq et al., 2008). More recently, hydrophilic interaction liquid chromatography (HILIC) has become more popular (Buszewski and Noga, 2012). Metabolic studies are increasingly employing liquid chromatography in conjunction with MS (LC-MS), which enables the majority of metabolites to be both separated and characterised. Due to rapid advances in column chemistries and bonded phases, HPLC has become an ideal separation technique for resolving various types of compounds, including both hydrophilic and hydrophobic compounds, salts, acids and bases. GC separation techniques are limited to volatile compounds, whereas HPLC does not have this limitation. As long as the compound dissolves in a liquid, it can be analysed by HPLC; in general, the properties of the solute determine the column type (stationary phase) and the mobile phase required for successful separation (Issaq et al., 2008). In metabolomics, a gradient mobile phase is typically used, which in the case of RP-HPLC starts with a high percentage of water and ends with a high percentage of an organic solvent (e.g. methanol, acetonitrile). This mobile phase system elutes polar compounds very rapidly; the non-polar compounds elute at later retention times once the mobile phase composition becomes more organic than aqueous (Bedair and Sumner, 2008; Allwood and Goodacre, 2010; Watson, 1999).
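The reversed phase gradient described above can be sketched as a simple linear ramp in mobile phase composition. In the Python example below, the function name and the default values (5% to 95% organic solvent over 20 minutes) are hypothetical and chosen only for illustration; real methods usually add hold and re-equilibration steps.

```python
def percent_organic(time_min, start_pct=5.0, end_pct=95.0, gradient_min=20.0):
    """Return the organic-solvent percentage at a given time in a linear gradient."""
    if gradient_min <= 0:
        raise ValueError("gradient_min must be positive")
    # Clamp the time to the gradient window so that the composition is held
    # at the starting value before the ramp and at the final value after it.
    fraction = min(max(time_min / gradient_min, 0.0), 1.0)
    return start_pct + (end_pct - start_pct) * fraction


for minute in (0, 5, 10, 20):
    organic = percent_organic(minute)
    print(f"{minute:2d} min: {organic:.1f}% organic, {100 - organic:.1f}% water")
```

Swapping the start and end percentages (for example `start_pct=95.0, end_pct=50.0`) gives the inverse type of gradient, which starts with a high organic content and becomes progressively more aqueous.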

Hydrophilic interaction liquid chromatography (HILIC) is a normal phase mode that is used as a complementary technique to reversed phase columns (Kind et al., 2007; Idborg et al., 2005). Interest in this technique has grown continuously over the last 10 years (Buszewski and Noga, 2012). It is used mainly for the separation of polar compounds, and the mobile phase used with this type of column is the inverse of the reversed phase mode of elution (i.e. it starts with a high percentage of organic solvent and ends with a high percentage of water). Separation occurs by partitioning of compounds between a stagnant layer of water in the stationary phase and the mobile phase (Hemstrom and Irgum, 2006).

By hyphenating LC with MS, the number of analytes detected is increased due to LC separation of analytes before their introduction to MS. This reduces the ionisation competition/suppression that occurs in DIMS. Two soft ionisation sources are commonly used in LC-MS based metabolomics. These are atmospheric pressure chemical ionisation (APCI) and electrospray ionisation (ESI). ESI is the more commonly applied technique and is ideal for the ionisation of many types of metabolites, including amino and organic acids (Tolstikov and Fiehn, 2002), sugars and sugar alcohols (Tolstikov et al., 2003; Moco et al., 2006), sterols, steroids, phospholipids and fatty acids (Allwood et al., 2006) as well as drug compounds (Chen et al., 2007).

In general, different mass analysers are hyphenated with liquid chromatography. Quadrupoles were the first type of mass spectrometers to be applied to LC-MS (De Vos et al., 2007; Tolstikov and Fiehn, 2002), and TOF analysers are also used. According to Dunn (2008), LC-MS for metabolomics is more sensitive, accurate and robust when an LTQ-Orbitrap is connected to liquid chromatography systems that operate with sub-2 µm particles. Both ESI and the Orbitrap will be discussed in more detail below.

Electrospray Ionisation (ESI)

Dole and his team first introduced the electrospray ionisation (ESI) technique in 1968 (Dole et al., 1968). In 1984, Yamashita and Fenn coupled ESI to MS (Yamashita, 1984), and in 2002 Fenn was awarded a share of the Nobel Prize in Chemistry for 'the development of soft desorption ionisation methods (ESI) for mass spectrometric analysis of biological macromolecules' (Fenn, 2003).

ESI is considered to be a soft ionisation technique, in which the ionisation process occurs under atmospheric pressure (Ashcroft, 1997; Watson, 1999) although gases are applied to sweep the ions through the ionisation source and onto the analyser. The sample droplets are converted into a gaseous state by evaporation with relatively little fragmentation (Ekman et al., 2008).

ESI equipment typically consists of a stainless steel capillary of internal diameter between 75 and 100 µm (Ashcroft, 1997). The solution, which can vary from 100% aqueous to 100% organic, containing the sample, passes through the capillary at a low flow rate. A potential difference of 3–6 kV between this capillary and the counter-electrode, which are separated by a distance of between 0.3 cm and 2 cm, is applied to create an electric field (de Hoffmann and Stroobant, 2002). The electric field causes charged analyte molecules (either positive or negative depending on the mode used) to accumulate in the front of the capillary and form a Taylor cone. This cone is converted to small droplets by the electric force from the capillary, which repels the ions, and from the coaxial gas (usually nitrogen), which helps droplets to form and break the surface tension of the Taylor cone (de Hoffmann and Stroobant, 2002; Ashcroft, 1997). The droplets are exposed to either a heated inert gas, most commonly nitrogen, which acts as a drying gas or to a heated capillary to remove the remaining solvent molecules (de Hoffmann and Stroobant, 2002). Solvent evaporation causes a decrease in the size of droplets, thus increasing the density of charge, and the droplets become unstable due to increased repulsive force between the ions (Watson and Sparkman, 2007).

Before reaching the Rayleigh limit (the point when a cohesive force of a solvent is overcome by repulsive forces between charges of the same polarity in an electrolytic solution), a coulombic repulsion of droplets may occur resulting from asymmetrical distribution of charges on the surface. This repulsion allows droplets to remain, maintaining their spherical shape and reducing the density of the charge (Watson and Sparkman, 2007; Ashcroft, 1997; Dole et al., 1968; Cerda et al., 2001). Further coulombic repulsions occur until the analyte reaches the gaseous phase in ionic form.
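To give a sense of scale for the Rayleigh limit, the Python sketch below uses the standard Rayleigh expression for the maximum charge a spherical droplet can carry, q = 8π·sqrt(ε0·γ·r³), where γ is the surface tension and r the droplet radius. This expression is not given in the text above; it, the function name, the surface tension of water (taken here as roughly 0.072 N/m) and the droplet radii are assumptions made for illustration.

```python
import math

VACUUM_PERMITTIVITY = 8.8541878128e-12  # epsilon_0, in farads per metre
ELEMENTARY_CHARGE_C = 1.602176634e-19   # one elementary charge, in coulombs


def rayleigh_limit_charge_c(radius_m, surface_tension_n_per_m):
    """Return the maximum charge (C) a droplet can hold before it becomes unstable."""
    return 8.0 * math.pi * math.sqrt(
        VACUUM_PERMITTIVITY * surface_tension_n_per_m * radius_m ** 3
    )


WATER_SURFACE_TENSION = 0.072  # N/m, assumed approximate value for pure water

# As solvent evaporates the radius shrinks, and so does the charge the droplet can hold.
for radius_um in (1.0, 0.5, 0.1):
    charge = rayleigh_limit_charge_c(radius_um * 1e-6, WATER_SURFACE_TENSION)
    print(f"radius {radius_um:4.1f} um: about {charge / ELEMENTARY_CHARGE_C:,.0f} elementary charges")
```

The limiting charge falls with the radius to the power 1.5, so a droplet that keeps its charge while shrinking by evaporation moves steadily towards instability.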

It is not fully clear what the exact mechanism of ion formation from charged droplets is and there have been several different theories attempting to explain it (Mora et al., 2000; Dole et al., 1968; Iribarne and Thomson, 1976).

Once formed by the ESI process, these ions pass to a region with an intermediate vacuum (1 mbar) through a conical orifice. In this intermediate region, light ions (from solvent) or gas molecules are not able to pass to the third region, which contains the mass analyser (high vacuum region) and they are pumped out, while analytes (heavier ions) pass through a small hole called the skimmer, which functions to produce a fine stream of ions that are directed into the mass analyser under vacuum (Ashcroft, 1997).

ESI is a recommended ion source for analysing a wide range of molecules (from 100 g/mol to 100 × 10⁶ g/mol in some cases), in both positive (often acetic acid or formic acid is added to enhance protonation of the analyte) and negative (often a volatile amine or ammonia solution is added to help deprotonation of the analyte) ionisation modes (Ekman et al., 2008; de Hoffmann and Stroobant, 2002). It is a very gentle ionisation method and any resultant fragmentation of the molecular ions is minimal, which makes it possible to analyse molecules with weak bonds successfully (Ekman et al., 2008).

Ionisation by this method can generate ionised species from thermally labile non-volatile compounds. Additionally, this ionisation method produces multiply charged ions, which allows analysis of high molecular mass analytes such as peptides within the m/z range of the mass analyser (Ekman et al., 2008; Dole et al., 1968). On the other hand, one of the prime disadvantages of ESI is ion suppression of targeted analytes when other ionic species (e.g. water-soluble salts) are present, which can prevent the formation of gas-phase ions of the analyte (King et al., 2000).
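The effect of multiple charging on the observed m/z can be illustrated with a short calculation. The Python sketch below assumes positive mode ionisation by protonation, giving [M + nH]n+ ions; the function name and the 10 kDa neutral mass are hypothetical.

```python
PROTON_MASS_DA = 1.007276467  # mass of a proton, in daltons


def mz_protonated(neutral_mass_da, n_charges):
    """Return the m/z of the [M + nH]n+ ion of a molecule with the given neutral mass."""
    if n_charges < 1:
        raise ValueError("n_charges must be at least 1")
    return (neutral_mass_da + n_charges * PROTON_MASS_DA) / n_charges


# Hypothetical 10 kDa peptide observed in several charge states.
for n in (1, 5, 10):
    print(f"{n:2d}+ : m/z {mz_protonated(10_000.0, n):.3f}")
```

Each additional charge lowers the m/z at which the same molecule appears, which is how large analytes can be brought within the working m/z range of an analyser.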

The Orbitrap

The Orbitrap is one of the more recently invented mass analysers, devised in the late 1990s by Alexander Makarov and adapted from the Knight/Kingdon trap (Makarov, 2000). It consists of two coaxial, axisymmetric electrodes: a spindle-shaped inner axial electrode and a barrel-shaped outer electrode (Watson and Sparkman, 2007). A constant electric potential is applied between the electrodes. The opposing surfaces of the electrodes are non-parallel, so an inhomogeneous electric field occurs between the surfaces. The electric field is at its minimum in the centre of the analyser, where the separation between the electrode surfaces is greatest, and it increases uniformly away from the centre in both directions along the z axis (Watson and Sparkman, 2007).

Ions are injected perpendicularly to the z axis of the Orbitrap and with a tangential velocity that prevents collision with the spindle electrode (Ekman et al., 2008). Ions orbit in the space between the inner and outer electrodes. The radius of the ion orbit (r) is set by the balance between the centrifugal force resulting from the tangential velocity and the electrostatic centripetal force. This balance has no mass dependence, so ions of all masses are trapped radially within the analyser (see Equation 1.2; eV is the kinetic energy of the ions, eE is the force due to the electric field) (Watson and Sparkman, 2007).

$$r = \frac{2eV}{eE} \qquad (1.2)$$

In addition, ions oscillate harmonically along the z axis, independently of the orbital radius; this results from the purely quadratic axial potential produced by the specific shape of the electrodes (Hu et al., 2005). The frequency (f) of this oscillation is inversely proportional to the square root of the m/z ratio (Ekman et al., 2008).

$$f \propto \frac{1}{\sqrt{m/z}}$$
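The inverse square-root dependence of the axial oscillation frequency on m/z can be checked with a few relative values. The Python sketch below works only with frequency ratios, because the absolute frequency depends on an instrument-specific field constant that is not given here; the function name and the reference ion at m/z 400 are illustrative assumptions.

```python
import math


def relative_axial_frequency(mz, reference_mz):
    """Return f(mz) / f(reference_mz), assuming f is proportional to 1/sqrt(m/z)."""
    if mz <= 0 or reference_mz <= 0:
        raise ValueError("m/z values must be positive")
    return math.sqrt(reference_mz / mz)


# Frequencies relative to a hypothetical reference ion at m/z 400.
for mz in (100.0, 400.0, 1600.0):
    ratio = relative_axial_frequency(mz, 400.0)
    print(f"m/z {mz:6.0f}: {ratio:.2f} x reference frequency")
```

An ion with a quarter of the reference m/z oscillates twice as fast, so each m/z value corresponds to its own frequency in the detected image current.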

The Orbitrap senses the image current induced by the oscillating ions as they pass the detecting electrodes, rather than detecting the direct impact of the ions as in TOF/MS. This gives the Orbitrap the capacity not only to build up the image current of ions but also to improve the sensitivity. Various Orbitrap instruments are available commercially. To date, the LTQ-Orbitrap from Thermo Scientific, a hybrid LTQ (Linear Trap Quadrupole) Orbitrap mass spectrometry system, is one of the most commonly used systems in metabolomics. This hybrid system consists of three parts. The first is a linear ion trap, which is fast and efficient at fragmenting the analyte, although fragmentation can also occur in the Orbitrap (Scigelova and Makarov, 2006). The second part is the C-trap, which is named after its shape (Scigelova and Makarov, 2006); it acts as an electronic gate for ions entering the Orbitrap (the third part) and transfers the ions from the ion trap in short pulses that are suitable for analysis within the Orbitrap (Makarov et al., 2006).

There are many advantages of the Orbitrap over most other analysers. It has a high mass accuracy (mass error of less than 2 ppm) (Lim et al., 2007), a high resolving power (over 100000 FWHM at m/z 400), which varies with m/z, and a wide dynamic range (ca. 5000) (Scigelova and Makarov, 2006), which means that the analyser can detect small (low-intensity) peaks with a high level of accuracy even when dominant peaks are present in the same spectrum.
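The mass accuracy and resolving power figures quoted above can be translated into absolute m/z values with two short formulas: the mass error in ppm is the difference between measured and theoretical m/z divided by the theoretical m/z and multiplied by 10⁶, and the FWHM peak width is m/z divided by the resolving power. In the Python sketch below the function names and the measured value are hypothetical.

```python
def mass_error_ppm(measured_mz, theoretical_mz):
    """Return the mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def peak_width_fwhm(mz, resolving_power):
    """Return the peak width (full width at half maximum) for a given resolving power."""
    return mz / resolving_power


# Hypothetical measurement of an ion whose theoretical m/z is 400.0000.
print(f"Mass error: {mass_error_ppm(400.0008, 400.0):.2f} ppm")
print(f"Peak width at m/z 400, resolving power 100000: {peak_width_fwhm(400.0, 100_000):.4f}")
```

At m/z 400, a 2 ppm error corresponds to 0.0008 m/z units, and a resolving power of 100000 corresponds to a peak only 0.004 m/z units wide at half height.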

The scientific community always aims to obtain high-performance equipment at the lowest possible cost. Thermo Scientific recently introduced the 'Exactive™', a benchtop Orbitrap mass spectrometer at around half the price of its hybrid counterpart (Koulman et al., 2009; Bateman et al., 2009). It was designed to be highly accurate and to provide high resolution full scans, and is a stand-alone mass analyser with no ion trap at the front. The lower price has made it more affordable for ordinary laboratories and it is now used in metabolomics studies (Lu et al., 2010; Khreit et al., 2013).

1.2.2.3 Data analysis

Metabolomics data are rich in information; therefore, data collection is frequently followed by multivariate analysis (MVA). The way in which any analysis is designed depends on the information that the researcher wishes to obtain. As its name suggests, multivariate analysis involves observing and analysing many variables for more than one object (sample), in this case hundreds to thousands of metabolite features across a large number of samples. This is difficult to visualise but can be explained in abstract terms: if each of the n variables (metabolites) defines one axis of an 'n-dimensional hyperspace', every sample has a unique position in that hyperspace. The advantage of MVA is that it reduces the dimensions of the hyperspace to a much smaller number of components; this can be achieved using both unsupervised and supervised statistical methods (Goodacre, 2007).

The data that have been collected from the metabolomics experiments in this work generally fall into two main categories. The first category includes data such as lists of retention time-mass variable pairs, or identified deconvoluted metabolites with their intensities, reported within each analysed sample. This type of data is generally collected from chromatography or CE linked to MS. The second category encompasses data such as fingerprint type traces in the form of wavenumbers and absorbance for each analysed sample collected from FT-IR analysis.

1.2.2.3.1 Unsupervised methods

Unsupervised techniques attempt to describe similarities or dissimilarities between the obtained data, without using any previous knowledge regarding potential relationships between the samples under investigation. The data are presented to the algorithm which conducts the analysis and then visually depicts these relationship patterns. The most common methods used for this are principal components analysis (PCA) and hierarchical cluster analysis (HCA). Once the clustering is complete and either ordination score plots or dendrograms have been created, then these need to be manually analysed and interpreted.

1.2.2.3.1.1 Principal component analysis (PCA)

PCA is the starting point for most multivariate data analyses (Boccard et al., 2010). It is one of the earliest multivariate techniques and is also the most extensively used in chemometrics. PCA describes the variance of a set of multivariate data in terms of a set of principal components (PCs), in an attempt to reduce the number of original variables while retaining the original data variance in the form of the new PC indices. Over 90% of the total explained variance (TEV) can often be accounted for by the first few PCs, with the first PC representing the greatest amount of data variance and subsequent PCs successively accounting for a reduced amount of variance. The terms PC scores and loadings are used to describe the PCA data. The former are the transformed variable values while the latter are the weights by which the original data variables should be multiplied to obtain the PC scores. Similarities and differences in the data set can be quickly visualised by plotting the data in the space (2D or 3D) that has been defined by the PC scores accounting for the greatest variance (i.e. a PCA scores plot), which can enable samples to be discriminated effectively (Sumner et al., 2003; Manly, 1994). The original variables (metabolites) that contribute to the PCs that explain the desired experimental variance can then be identified within the PCA loadings plot, which graphically represents which variables have the greatest weighting in each PC (Joliffe, 1986; Causton, 1987).
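The relationship between loadings, scores and explained variance can be made concrete with a small worked example. The Python sketch below extracts only the first PC from a tiny, hypothetical data set by power iteration on the covariance matrix, using the standard library alone; it is an illustration under these assumptions and not the software used in this work, where a full decomposition would normally come from an established numerical package.

```python
import math


def mean_centre(rows):
    """Subtract the column mean from every column (variable)."""
    n_rows, n_cols = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in rows]


def covariance_matrix(centred):
    """Return the sample covariance matrix of mean-centred data."""
    n_rows, n_cols = len(centred), len(centred[0])
    return [[sum(row[i] * row[j] for row in centred) / (n_rows - 1)
             for j in range(n_cols)] for i in range(n_cols)]


def first_principal_component(cov, iterations=200):
    """Approximate the leading eigenvector (PC1 loadings) by power iteration."""
    size = len(cov)
    loadings = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        product = [sum(cov[i][j] * loadings[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(value * value for value in product))
        loadings = [value / norm for value in product]
    # The Rayleigh quotient gives the variance (eigenvalue) captured by PC1.
    variance = sum(loadings[i] * cov[i][j] * loadings[j]
                   for i in range(size) for j in range(size))
    return loadings, variance


# Hypothetical intensities: 4 samples (rows) x 3 metabolite features (columns).
data = [[2.0, 1.0, 0.1], [4.0, 2.1, 0.0], [6.0, 2.9, 0.2], [8.0, 4.0, 0.1]]
centred = mean_centre(data)
cov = covariance_matrix(centred)
loadings, pc1_variance = first_principal_component(cov)
# Scores are the mean-centred data multiplied by the loadings.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centred]
explained = pc1_variance / sum(cov[i][i] for i in range(len(cov)))
print("PC1 loadings:", [round(w, 3) for w in loadings])
print("PC1 scores:  ", [round(s, 3) for s in scores])
print(f"Variance explained by PC1: {explained:.1%}")
```

In this example the first two features rise together across the samples, so they receive the largest loadings and a single PC captures almost all of the variance.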

1.2.2.3.2 Supervised methods

Supervised methods generally entail providing the statistical algorithm with a priori information on class structure, so that the algorithm knows which samples belong to which experimental classes. Supervised analyses aim to achieve a mathematical model that can associate some or all of the target traits with the inputs. Any error between the already established target and the response of the model (i.e. the output) needs to be made as small as possible, generally via one of the following methods: artificial neural networks (ANNs), discriminant function analysis (DFA) or partial least squares (PLS) (Goodacre, 2007). However, care must be taken when interpreting such models since they are easily overtrained and will fit the data to the a priori classes even on the basis of information that can be regarded as background noise; for this reason, performing a validation step on the model is essential.

1.2.2.3.2.1 Discriminant function analysis (DFA)

Discriminant function analysis (DFA) is a supervised projection method. In DFA, prior knowledge of the data set class structure is utilised to minimise within-group variance whilst simultaneously maximising between-group variance (Manly, 1994). DFA is usually applied to the PCs obtained from PCA, which further reduces the dimensionality of the data (PC-DFA). PC-DFA creates measures of the differences both between classes and within classes by using what is known about the class structure (Johnson et al., 2003; Kaderbhai et al., 2003).

1.2.2.3.2.2 Partial least squares (PLS)

PLS is a supervised learning technique. It builds a linear model representing the relationship between the dependent variables (Y) and the independent variables (X) in the form Y = XB + E, where B contains the regression coefficients and E is the error, i.e. the difference between the observed and predicted Y values. The greater the absolute value of a coefficient in B, the greater the influence of that independent variable on the predictive model. The aim of the optimum model is to identify those factors in the input matrix (X) that best describe the variance of the input variables while simultaneously correlating as strongly as possible with the target variables (Y) (Vinzi et al., 2010).
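The form Y = XB + E can be illustrated with the simplest possible case: a single response variable and a single PLS component. The Python sketch below is a minimal standard-library implementation of one-component PLS1 on hypothetical data; the function names are invented for this example, and practical models usually extract several components and choose their number by validation.

```python
import math


def fit_pls1_one_component(x_rows, y_values):
    """Fit a one-component PLS1 model and return (coefficients, intercept)."""
    n_rows, n_cols = len(x_rows), len(x_rows[0])
    x_means = [sum(row[j] for row in x_rows) / n_rows for j in range(n_cols)]
    y_mean = sum(y_values) / n_rows
    x_c = [[row[j] - x_means[j] for j in range(n_cols)] for row in x_rows]
    y_c = [value - y_mean for value in y_values]

    # Weight vector: the direction in X space with maximal covariance with y.
    weights = [sum(x_c[i][j] * y_c[i] for i in range(n_rows)) for j in range(n_cols)]
    norm = math.sqrt(sum(w * w for w in weights))
    weights = [w / norm for w in weights]

    # Scores: projection of each sample onto the weight vector.
    scores = [sum(x_c[i][j] * weights[j] for j in range(n_cols)) for i in range(n_rows)]

    # Regress y on the scores, then express the result as coefficients B on X.
    slope = sum(t * y for t, y in zip(scores, y_c)) / sum(t * t for t in scores)
    coefficients = [slope * w for w in weights]
    intercept = y_mean - sum(b * m for b, m in zip(coefficients, x_means))
    return coefficients, intercept


def predict(x_rows, coefficients, intercept):
    """Return predicted y values for each row of X."""
    return [intercept + sum(b * x for b, x in zip(coefficients, row)) for row in x_rows]


# Hypothetical data: two perfectly correlated features and one response.
x_data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y_data = [3.0, 6.0, 9.0, 12.0]
coefficients, intercept = fit_pls1_one_component(x_data, y_data)
predicted = predict(x_data, coefficients, intercept)
residuals = [y - y_hat for y, y_hat in zip(y_data, predicted)]
print("B =", [round(b, 3) for b in coefficients])
print("E =", [round(e, 6) for e in residuals])
```

The second feature has the larger coefficient in B because it varies most strongly with the response; the residuals E vanish here only because the example data are exactly linear.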

1.2.2.3.2.3 Cross-validation

Statistical validation of a supervised method is required to minimise bias and overtraining. Bootstrap cross-validation is one tool that can be employed to validate the results (e.g. data from PC-DFA): the prediction performance of a model is estimated by means of this re-sampling technique. Bootstrap randomly selects N samples from a set containing N samples, each time recording the sample and replacing it back into the set. On average, approximately 63.2% of the samples are selected at least once and form the training set, while the remaining 36.8% of non-selected samples are used as the test set; this process is performed 1000 times (Efron, 1981).
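The 63.2% figure follows from sampling with replacement: the chance that a given sample is never drawn in N draws is (1 - 1/N)^N, which approaches 1/e (about 36.8%) as N grows. The Python sketch below simulates the splitting step only, without fitting any model; the function name, the set size of 100 and the fixed random seed are illustrative assumptions.

```python
import math
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap training set and return it with its out-of-bag test set."""
    # Sample N indices with replacement; some samples are drawn more than once.
    training = [rng.randrange(n_samples) for _ in range(n_samples)]
    selected = set(training)
    # Samples that were never drawn form the test set for this iteration.
    test = [index for index in range(n_samples) if index not in selected]
    return training, test


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
n_samples, n_iterations = 100, 1000
fractions = []
for _ in range(n_iterations):
    training, test = bootstrap_split(n_samples, rng)
    fractions.append(len(set(training)) / n_samples)

mean_fraction = sum(fractions) / n_iterations
print(f"Mean fraction of samples in the training set: {mean_fraction:.3f}")
print(f"Large-N expectation, 1 - 1/e:                 {1 - math.exp(-1):.3f}")
```

In a full validation, a model would be trained on each training set and its predictions for the corresponding test set collected over all iterations.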

1.3 Aims and objectives of the thesis

The overall aim of this study was to examine the effects of antibiotics which indirectly or directly inhibit DNA synthesis (trimethoprim and ciprofloxacin) on E. coli isolates at the metabolome and lipidome levels, as well as to generate a reproducible approach to discriminate between E. coli isolates (including those of the same sequence type), which may assist in selecting the analytical techniques suitable for the intended metabolomics investigation. This study comprised two main parts, elucidation of drug action and method development, which highlight the utility of metabolomics in the field of infectious disease, particularly urinary tract infections. The main aim was addressed through the following objectives:

1. To investigate the direct and off-target effects of the basic antibiotic trimethoprim on E. coli K-12 metabolome at pH 5 and 7 (mimicking human urine pH).
2. To develop a high-throughput FT-IR spectroscopy sample preparation approach to distinguish between different E. coli isolates based on sequence type.
3. To develop a lipidomics workflow using FT-IR spectroscopy and LC-MS to distinguish between control and stressed pathogenic E. coli isolates in terms of metabolic fingerprint and lipid profiling upon challenge with ciprofloxacin.
4. To investigate and compare the ability of different analytical approaches and classical microbiological tests to discriminate between E. coli ST131 isolates.

1.4 References

Al Zweiri, M., Sills, G. J., Leach, J. P., Brodie, M. J., Robertson, C., Watson, D. G. & Parkinson, J. A. 2010. Response to drug treatment in newly diagnosed epilepsy: a pilot study of 1H NMR- and MS-based metabonomic analysis. Epilepsy Research, 88, 189-195.
Allen, J., Davey, H. M., Broadhurst, D., Heald, J. K., Rowland, J. J., Oliver, S. G. & Kell, D. B. 2003. High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nature Biotechnology, 21, 692-696.
Allwood, J. W., Ellis, D. I., Heald, J. K., Goodacre, R. & Mur, L. A. J. 2006. Metabolomic approaches reveal that phosphatidic and phosphatidyl glycerol phospholipids are major discriminatory non-polar metabolites in responses by Brachypodium distachyon to challenge by Magnaporthe grisea. Plant Journal, 46, 351-368.
Allwood, J. W. & Goodacre, R. 2010. An introduction to liquid chromatography-mass spectrometry instrumentation applied in plant metabolomic analyses. Phytochemical Analysis, 21, 33-47.
AlRabiah, H., Correa, E., Upton, M. & Goodacre, R. 2013. High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy. Analyst, 138, 1363-1369.
Ashcroft, A. E. 1997. Ionisation methods in organic mass spectrometry, Cambridge, The Royal Society of Chemistry.
Atherton, H. J., Bailey, N. J., Zhang, W., Taylor, J., Major, H., Shockcor, J., Clarke, K. & Griffin, J. L. 2006. A combined 1H-NMR spectroscopy- and mass spectrometry-based metabolomic study of the PPAR-alpha null mutant mouse defines profound systemic changes in metabolism linked to the metabolic syndrome. Physiological Genomics, 27, 178-186.
Bateman, K. P., Kellmann, M., Muenster, H., Papp, R. & Taylor, L. 2009. Quantitative-qualitative data acquisition using a benchtop orbitrap mass spectrometer. Journal of the American Society for Mass Spectrometry, 20, 1441-1450.
Bedair, M. & Sumner, L. W. 2008. Current and emerging mass-spectrometry technologies for metabolomics. Trends in Analytical Chemistry, 27, 238-250.
Berdy, J. 1974. Recent developments of antibiotic research and classification of antibiotics according to chemical structure. Advances in Applied Microbiology, 18, 309-406.
Bleakney, W. 1929. A new method of positive ray analysis and its application to the measurement of ionisation potentials in mercury vapor. Physical Review, 34, 157-160.
Boccard, J., Veuthey, J. & Rudaz, S. 2010. Knowledge discovery in metabolomics: an overview of MS data handling. Journal of Separation Science, 33, 290-304.
Broadhurst, D. I. & Kell, D. B. 2006. Statistical strategies for avoiding false discoveries in metabolomics and related experiments. Metabolomics, 2, 171-196.
Brown, M., Dunn, W. B., Ellis, D. I., Goodacre, R., Handl, J., Knowles, J. D., O'Hagan, S., Spasic, I. & Kell, D. B. 2005. A metabolome pipeline: from concept to data to knowledge. Metabolomics, 1, 39-51.
Buchholz, A., Takors, R. & Wandrey, C. 2001. Quantification of intracellular metabolites in Escherichia coli K12 using liquid chromatographic-electrospray ionisation tandem mass spectrometric techniques. Analytical Biochemistry, 295, 129-137.
Bundy, J. G., Willey, T. L., Castell, R. S., Ellar, D. J. & Brindle, K. M. 2005. Discrimination of pathogenic clinical isolates and laboratory strains of Bacillus cereus by NMR-based metabolomic profiling. FEMS Microbiology Letters, 242, 127-136.
Buszewski, B. & Noga, S. 2012. Hydrophilic interaction liquid chromatography (HILIC)-a powerful separation technique. Analytical and Bioanalytical Chemistry, 402, 231-247.
Causton, D. 1987. A biologist advanced mathematics, London, Allen and Unwin.
Cerda, B. A., Breuker, K., Horn, D. M. & McLafferty, F. W. 2001. Charge/radical site initiation versus coulombic repulsion for cleavage of multiply charged ions. Charge solvation in poly(alkene glycol) ions. Journal of the American Society for Mass Spectrometry, 12, 565-570.
Chen, G., Pramanik, B. N., Liu, Y.-H. & Mirza, U. A. 2007. Applications of LC/MS in structure identifications of small molecules and proteins in drug discovery. Journal of Mass Spectrometry, 42, 279-287.
Cloutier, M. J. 1995. Antibiotics: mechanisms of action and the acquisition of resistance-when magic bullets lose their magic. American Journal of Pharmaceutical Education, 59, 167-172.
Cornish-Bowden, A. & Cardenas, M. L. 2000. Technological and medical implications of metabolic control analysis, Dordrecht, Kluwer Academic Publishers.
Correa, E., Sletta, H., Ellis, D. I., Hoel, S., Ertesvag, H., Ellingsen, T. E., Valla, S. & Goodacre, R. 2012. Rapid reagentless quantification of alginate biosynthesis in Pseudomonas fluorescens bacteria mutants using FT-IR spectroscopy coupled to multivariate partial least squares regression. Analytical and Bioanalytical Chemistry, 403, 2591-2599.
Cuddy, P. G. 1997. Antibiotic classification: implications for drug selection. Critical Care Nursing Quarterly, 20, 89-102.
Davis, S. C., Makarov, A. A. & Hughes, J. D. 1999. Ultrafast gas chromatography using time-of-flight mass spectrometry. Rapid Communications in Mass Spectrometry, 13, 237-241.
de Hoffmann, E. & Stroobant, V. 2002. Mass spectrometry: principles and applications, Chichester, John Wiley & Sons.
De Vos, R. C. H., Moco, S., Lommen, A., Keurentjes, J. J. B., Bino, R. J. & Hall, R. D. 2007. Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. Nature Protocols, 2, 778-791.
Dempster, A. J. 1918. A new method of positive ray analysis. Physical Review, 11, 316-325.
Dole, M., Mack, L. L., Hines, R. L., Mobley, R. C., Ferguson, L. D. & Alice, M. B. 1968. Molecular Beams of macroions. The Journal of Chemical Physics, 49, 2240-2249.
Dominguez, H., Rollin, C., Guyonvarch, A., Guerquin-Kern, J. L., Cocaign-Bousquet, M. & Lindley, N. D. 1998. Carbon-flux distribution in the central metabolic pathways of Corynebacterium glutamicum during growth on fructose. European Journal of Biochemistry, 254, 96-102.
Drozd, J. 1986. Chemical derivatisation in gas chromatography, Amsterdam, Elsevier Science.
Duarte, N. C., Becker, S. A., Jamshidi, N., Thiele, I., Mo, M. L., Vo, T. D., Srivas, R. & Palsson, B. O. 2007. Global reconstruction of the human metabolic network based on genomic and bibliomic data. Proceedings of the National Academy of Sciences of the United States of America, 104, 1777-1782.
Dunn, W. B. 2008. Current trends and future requirements for the mass spectrometric investigation of microbial, mammalian and plant metabolomes. Physical Biology, 5, 011001.
Dunn, W. B., Bailey, N. J. C. & Johnson, H. E. 2005. Measuring the metabolome: current analytical technologies. Analyst, 130, 606-625.
Dunn, W. B., Broadhurst, D., Begley, P., Zelena, E., Francis-McIntyre, S., Anderson, N., Brown, M., Knowles, J. D., Halsall, A., Haselden, J. N., Nicholls, A. W., Wilson, I. D., Kell, D. B., Goodacre, R. & Human Serum Metabolome, H. C. 2011. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols, 6, 1060-1083.
Dunn, W. B. & Ellis, D. I. 2005. Metabolomics: current analytical platforms and methodologies. Trends in Analytical Chemistry, 24, 285-294.
Efron, B. 1981. Nonparametric estimates of standard error: the jackknife, the bootstrap and other methods. Biometrika, 68, 589-599.


Ekman, R., Silberring, J., Westman-Brinkmalm, A. M., Kraj, A., Desiderio, D. M. & Nibbering, N. M. 2008. Mass spectrometry: instrumentation, interpretation, and applications, Hoboken, John Wiley & Sons. Ellis, D. & Goodacre, R. 2006. Metabolic fingerprinting in disease diagnosis: biomedical applications of infrared and Raman spectroscopy. Analyst, 131, 875-885. Erlich, H. 1973. Molecular Biology of rifomycin, New York, MSS Information Corporation. Fenn, J. B. 2003. Electrospray wings for molecular elephants (Nobel Lecture). Angewandte Chemie International Edition, 42, 3871-3894. Fiehn, O. 2001. Combining genomics, metabolome analysis, and biochemical modelling to understand metabolic networks. Comparative and Functional Genomics, 2, 155-168. Fiehn, O. 2002. Metabolomics - the link between genotypes and phenotypes. Plant Molecular Biology, 48, 155-171. Finch, R. G. 2004. Antibiotic resistance: a view from the prescriber. Nature Reviews Microbiology, 2, 989-994. Fish, D. N., Piscitelli, S. C. & Danziger, L. H. 1995. Development of resistance during antimicrobial therapy: a review of antibiotic classes and patient characteristics in 173 studies. Pharmacotherapy, 15, 279-291. Fukusaki, E. & Kobayashi, A. 2005. Plant metabolomics: potential for practical operation. Journal of Bioscience and Bioengineering, 100, 347-354. GIL. 2003. euGenes: Genomic Information for Eukaryotic Organisms [Online]. Bloomington. Available: http://eugenes.org/ [Accessed 22/04/ 2013]. Glassop, D., Roessner, U., Bacic, A. & Bonnett, G. D. 2007. Changes in the sugarcane metabolome with stem development. Are they related to sucrose accumulation? Plant and Cell Physiology, 48, 573-584. Goodacre, R. 2007. Metabolomics of a superorganism. Journal of Nutrition, 137, 259S- 266S. Goodacre, R., Vaidyanathan, S., Dunn, W. B., Harrigan, G. G. & Kell, D. B. 2004. Metabolomics by numbers: acquiring and understanding global metabolite data. Trends in Biotechnology, 22, 245-252. Goodacre, R., York, E. 
V., Heald, J. K. & Scott, I. M. 2003. Chemometric discrimination of unfractionated plant extracts analysed by electrospray mass spectrometry. Phytochemistry, 62, 859-863. Goulden, J. D. S. & Sharpe, M. E. 1958. The infrared absorption spectra of Lactobacilli. Journal of General Microbiology, 19, 76-86. Greenwood, D. 2000. Antimicrobial chemotherapy, Oxford, Oxford University Press. Griffin, J. L. & Shockcor, J. P. 2004. Metabolic profiles of cancer cells. Nature Reviews Cancer, 4, 551-561. Guilhaus, M., Selby, D. & Mlynski, V. 2000. Orthogonal acceleration time-of-flight mass spectrometry. Mass Spectrometry Reviews, 19, 65-107. Hall, R., Beale, M., Fiehn, O., Hardy, N., Sumner, L. & Bino, R. 2002. Plant metabolomics: the missing link in functional genomics strategies. Plant Cell, 14, 1437-1440. Harrigan, G. G. & Goodacre, R. 2003. Metabolic Profiling: its role in biomarker discovery and gene function analysis, Norwell, Springer. Hemstrom, P. & Irgum, K. 2006. Hydrophilic interaction chromatography. Journal of Separation Science, 29, 1784-1821. Hirai, M. Y., Yano, M., Goodenowe, D. B., Kanaya, S., Kimura, T., Awazuhara, M., Arita, M., Fujiwara, T. & Saito, K. 2004. Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proceedings of the National Academy of Sciences of the United States of America, 101, 10205-10210. Hirayama, A., Kami, K., Sugimoto, M., Sugawara, M., Toki, N., Onozuka, H., Kinoshita, T., Saito, N., Ochiai, A., Tomita, M., Esumi, H. & Soga, T. 2009. Quantitative metabolome profiling of colon and stomach cancer microenvironment by capillary electrophoresis time-of-flight mass spectrometry. Cancer Research, 69, 4918-4925.


Hollywood, K., Brison, D. R. & Goodacre, R. 2006. Metabolomics: current technologies and future trends. Proteomics, 6, 4716-4723. Holmberg, S. D., Solomon, S. L. & Blake, P. A. 1987. Health and economic impacts of antimicrobial resistance. Reviews of Infectious Diseases, 9, 1065-1078. Hu, Q., J. Noll, R., Li, H., Makarov, A., Hardman, M. & Cooks, R. G. 2005. The Orbitrap: a new mass spectrometer. Journal of Mass Spectrometry, 40, 430-443. Idborg, H., Zamani, L., Edlund, P. O., Schuppe-Koistinen, I. & Jacobsson, S. P. 2005. Metabolic fingerprinting of rat urine by LC/MS: part 1. Analysis by hydrophilic interaction liquid chromatography-electrospray ionisation mass spectrometry. Journal of Chromatography B, 828, 9-13. Iribarne, J. & Thomson, B. 1976. On the evaporation of small ions from charged droplets. Journal of Chemical Physics, 64, 2287. Iseman, M. D. 1993. Treatment of multidrug-resistant tuberculosis. the New England Journal of Medicine, 329, 784-791. Issaq, H. J., Abbott, E. & Veenstra, T. D. 2008. Utility of separation science in metabolomic studies. Journal of Separation Science, 31, 1936-1947. Jacoby, G. A. 2005. Mechanisms of resistance to quinolones. Clinical Infectious Diseases, 41, S120-S126. Jensen, N. B. S., Jokumsen, K. V. & Villadsen, J. 1999. Determination of the phosphorylated sugars of the embden-meyerhoff-parnas pathway in Lactococcus lactis using a fast sampling technique and solid phase extraction. Biotechnology and Bioengineering, 63, 356-362. Johnson, H. E., Broadhurst, D., Goodacre, R. & Smith, A. R. 2003. Metabolic fingerprinting of salt-stressed tomatoes. Phytochemistry, 62, 919-928. Joliffe, I. T. 1986. Principal component analysis, New York, Springer. Jonsson, P., Gullberg, J., Nordstrom, A., Kusano, M., Kowalczyk, M., Sjostrom, M. & Moritz, T. 2004. A strategy for identifying differences in large series of metabolomic samples analysed by GC/MS. Analytical Chemistry, 76, 1738-1745. Kaderbhai, N. N., Broadhurst, D. I., Ellis, D. 
I., Goodacre, R. & Kell, D. B. 2003. Functional genomics via metabolic footprinting: monitoring metabolite secretion by Escherichia coli tryptophan metabolism mutants using FT–IR and direct injection electrospray mass spectrometry. Comparative and Functional Genomics, 4, 376- 391. Katayama, Y., Ito, T. & Hiramatsu, K. 2000. A new class of genetic element, staphylococcus cassette chromosome mec, encodes methicillin resistance in Staphylococcus aureus. Antimicrobial Agents and Chemotherapy, 44, 1549-1555. Khreit, O. I. G., Grant, M. H., Zhang, T., Henderson, C., Watson, D. G. & Sutcliffe, O. B. 2013. Elucidation of the phase I and phase II metabolic pathways of (±)-4'- methylmethcathinone (4-MMC) and (±)-4'-(trifluoromethyl)methcathinone (4- TFMMC) in rat liver hepatocytes using LC-MS and LC-MS2. Journal of Pharmaceutical and Biomedical Analysis, 72, 177-185. Kim, D.-H., Jarvis, R. M., Xu, Y., Oliver, A. W., Allwood, J. W., Hampson, L., Hampson, I. N. & Goodacre, R. 2010. Combining metabolic fingerprinting and footprinting to understand the phenotypic response of HPV16 E6 expressing cervical carcinoma cells exposed to the HIV anti-viral drug lopinavir. Analyst, 135, 1235-1244. Kim, J., Jung, Y., Song, B., Bong, Y. S., Ryu, D. H., Lee, K. S. & Hwang, G. S. 2013. Discrimination of cabbage (Brassica rapa ssp. pekinensis) cultivars grown in different geographical areas using 1H NMR-based metabolomics. Food Chemistry, 137, 68-75. Kind, T., Tolstikov, V., Fiehn, O. & Weiss, R. H. 2007. A comprehensive urinary metabolomic approach for identifying kidney cancer. Analytical Biochemistry, 363, 185-195. King, R., Bonfiglio, R., Fernandez-Metzler, C., Miller-Stein, C. & Olah, T. 2000. Mechanistic investigation of ionisation suppression in electrospray ionisation. Journal of the American Society for Mass Spectrometry, 11, 942-950.


Kondo, Y., Nishiumi, S., Shinohara, M., Hatano, N., Ikeda, A., Yoshie, T., Kobayashi, T., Shiomi, Y., Irino, Y., Takenawa, T., Azuma, T. & Yoshida, M. 2011. Serum fatty acid profiling of colorectal cancer by gas chromatography/mass spectrometry. Biomarkers in Medicine, 5, 451-460. Kopka, J. 2006. Current challenges and developments in GC-MS based metabolite profiling technology. Journal of Biotechnology, 124, 312-322. Koulman, A., Woffendin, G., Narayana, V. K., Welchman, H., Crone, C. & Volmer, D. A. 2009. High-resolution extracted ion chromatography, a new tool for metabolomics and lipidomics using a second-generation Orbitrap mass spectrometer. Rapid Communications in Mass Spectrometry, 23, 1411-1418. Krastanov, A. 2010. Metabolomics - the state of art. Biotechnology & Biotechnological Equipment, 24, 1537-1543. Lasch, P., Haensch, W., Lewis, E. N., Kidder, L. H. & Naumann, D. 2002. Characterisation of colorectal adenocarcinoma sections by spatially resolved FT-IR microspectroscopy. Applied Spectroscopy, 56, 1-9. Levy, S. B. 1989. Evolution and spread of tetracycline resistance determinants. Journal of Antimicrobial Chemotherapy, 24, 1-3. Levy, S. B. 1992. Active efflux mechanisms for antimicrobial resistance. Antimicrobial Agents and Chemotherapy, 36, 695-703. Levy, S. B. 2001. Antibiotic resistance: consequences of inaction. Clinical Infectious Diseases, 33, S124-S129. Levy, S. B. & Marshall, B. 2004. Antibacterial resistance worldwide: causes, challenges and responses. Nature Medicine, 10, S122-S129. Liebler, D. C. 2002. Introduction to proteomics: Tools for the new biology, Totowa, Humana Press. Lim, H., Chen, J., Sensenhauser, C., Cook, K. & Subrahmanyam, V. 2007. Metabolite identification by data-dependent accurate mass spectrometric analysis at resolving power of 60 000 in external calibration mode using an LTQ/Orbitrap. Rapid Communications in Mass Spectrometry, 21, 1821-1832. Livermore, D. M. 2001. Of Pseudomonas, porins, pumps and carbapenems. 
Journal of Antimicrobial Chemotherapy, 47, 247-250. Lu, W., Clasquin, M. F., Melamud, E., Amador-Noguez, D., Caudy, A. A. & Rabinowitz, J. D. 2010. Metabolomic analysis via reversed-phase ion-pairing liquid chromatography coupled to a stand alone Orbitrap mass spectrometer. Analytical Chemistry, 82, 3212-3221. Lutz, N. W., Sweedler, J. V. & Wevers, R. A. 2013. Methodologies for metabolomics: experimental strategies and techniques, Cambridge, Cambridge University Press. Makarov, A. 2000. Electrostatic axially harmonic orbital trapping: a high-performance technique of mass analysis. Analytical Chemistry, 72, 1156-1162. Makarov, A., Denisov, E., Kholomeev, A., Balschun, W., Lange, O., Strupat, K. & Horning, S. 2006. Performance evaluation of a hybrid linear ion trap/Orbitrap mass spectrometer. Analytical Chemistry, 78, 2113-2120. Mamyrin, B., Karataev, V., Shmikk, D. & Zagulin, V. 1973. The mass-reflectron, a new nonmagnetic time-of-flight mass spectrometer with high resolution. Soviet Physics: Journal of Experimental and Theoretical Physics Letters, 37, 45-48. Manly, B. 1994. Multivariate Statistical Methods, London, Chapman and Hall. Mantsch, H. H. & Chapman, D. 1996. Infrared spectroscopy of biomolecules, New York, Wiley. Mauri, P. & Pietta, P. 2000. Electrospray characterisation of selected medicinal plant extracts. Journal of Pharmaceutical and Biomedical Analysis, 23, 61-68. McFadden, W. H., Day, J. C., Teranishi, R. & Black, D. R. 1963. Use of capillary gas chromatography with a time-of-flight mass spectrometer. Journal of Food Science, 28, 316-319. McGowan, J. E. 1983. Antimicrobial resistance in hospital organisms and its relation to antibiotic use. Reviews of Infectious Diseases, 5, 1033-1048.


Moco, S., Bino, R. J., Vorst, O., Verhoeven, H. A., de Groot, J., van Beek, T. A., Vervoort, J. & de Vos, C. H. R. 2006. A liquid chromatography-mass spectrometry-based metabolome database for tomato. Plant Physiology, 141, 1205-1218. Mora, J. F. D. L., Van Berkel, G. J., Enke, C. G., Cole, R. B., Martinez-Sanchez, M. & Fenn, J. B. 2000. Electrochemical processes in electrospray ionisation mass spectrometry. Journal of Mass Spectrometry, 35, 939-952. Mozayani, A. & Raymon, L. P. 2003. Handbook of drug interactions: a clinical and forensic guide, New York, Humana Press. Mulvey, M. R. P. & Simor, A. E. M. D. 2009. Antimicrobial resistance in hospitals: how concerned should we be? Canadian Medical Association Journal, 180, 408-415. Naumann, D. 2001. FT-infrared and FT-Raman spectroscopy in biomedical research. Applied Spectroscopy Reviews, 36, 239-298. Naumann, D., Helm, D. & Labischinski, H. 1991. Microbiological characterisations by FT- IR spectroscopy. Nature, 351, 81-82. Nelson, W. H. 1991. Modern techniques for rapid microbiological analysis, New York, Wiley. Neu, H. C. 1992. The crisis in antibiotic resistance. Science, 257, 1064-1073. Nicholson, J. K., Lindon, J. C. & Holmes, E. 1999. 'Metabonomics': understanding the metabolic responses of living systems to pathophysiological stimuli via multivariate statistical analysis of biological NMR spectroscopic data. Xenobiotica, 29, 1181- 1189. Nier, A. O. 1947. A mass spectrometer for isotope and gas analysis. Review of Scientific Instruments, 18, 398-411. Nikaido, H. 1993. Transport across the bacterial outer membrane. Journal of Bioenergetics and Biomembranes, 25, 581-589. O'Grady, F. & Lambert, H. P. 1997. Antibiotic and chemotherapy: anti-infective agents and their use in therapy, New York, Churchill Livingstone. O‟Halloran, G., Fluegge, R., Betts, J. & Everett, W. 1964. Parts 1 and 2 determination of chemical species prevalent in plasma jet. Southfield: Technical Laboratories Divisions. Oh, J. 
E., Krapfenbauer, K., Fountoulakis, M., Frischer, T. & Lubec, G. 2004. Evidence for the existence of hypothetical proteins in human bronchial epithelial, fibroblast, amnion, lymphocyte, mesothelial and kidney cell lines. Amino Acids, 26, 9-18. Oliver, S. G., Winson, M. K., Kell, D. B. & Baganz, F. 1998. Systematic functional analysis of the yeast genome. Trends in Biotechnology, 16, 373-378. Oresic, M. 2009. Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction. Nutrition Metabolism and Cardiovascular Diseases, 19, 816-824. Petrich, W., Dolenko, B., Früh, J., Ganz, M., Greger, H., Jacob, S., Keller, F., Nikulin, A. E., Otto, M., Quarder, O., Somorjai, R. L., Staib, A., Werner, G. & Wielinger, H. 2000. Disease pattern recognition in infrared spectra of human sera with diabetes mellitus as an example. Applied Optics, 39, 3372-3379. Phelps, C. E. 1989. Bug/drug resistance. Sometimes less is more. Medical Care, 27, 194- 203. Picó, Y. 2008. Food contaminants and residue analysis, Toronto, Elsevier. Pope, G. A., Mackenzie, D. A., Defemez, M., Aroso, M., Fuller, L. J., Mellon, F. A., Dunn, W. B., Brown, M., Goodacre, R., Kell, D. B., Marvin, M. E., Louis, E. J. & Roberts, I. N. 2007. Metabolic footprinting as a tool for discriminating between brewing yeasts. Yeast, 24, 667-679. Queenan, A. M. & Bush, K. 2007. Carbapenemases: the versatile β-lactamases. Clinical Microbiology Reviews, 20, 440-458. Ransohoff, D. F. 2005. Bias as a threat to the validity of cancer molecular-marker research. Nature Reviews Cancer, 5, 142-149. Roessner, U., Wagner, C., Kopka, J., Trethewey, R. N. & Willmitzer, L. 2000. Simultaneous analysis of metabolites in potato tuber by gas chromatography-mass spectrometry. Plant Journal, 23, 131-142.


Rubin, R. J., Harrington, C. A., Poon, A., Dietrich, K., Greene, J. A. & Moiduddin, A. 1999. The economic impact of Staphylococcus aureus infection in New York City hospitals. Emerging Infectious Diseases, 5, 9-17. Saga, T. & Yamaguchi, K. 2009. History of antimicrobial agents and resistant bacteria. Japan Medical Association Journal, 52, 103-108. Schauer, N., Steinhauser, D., Strelkov, S., Schomburg, D., Allison, G., Moritz, T., Lundgren, K., Roessner-Tunali, U., Forbes, M. G., Willmitzer, L., Fernie, A. R. & Kopka, J. 2005. GC-MS libraries for the rapid identification of metabolites in complex biological samples. FEBS Letters, 579, 1332-1337. Schlegel, H. G. & Zaborosch, C. 1993. General microbiology, Cambridge, Cambridge University Press. Schmitt, J. & Flemming, H. C. 1998. FTIR-spectroscopy in microbial and material analysis. International Biodeterioration & Biodegradation, 41, 1-11. Scigelova, M. & Makarov, A. 2006. Orbitrap mass analyser - overview and applications in proteomics. Proteomics, 6, 16-21. Seeley, J. V., Micyus, N. J., Bandurski, S. V., Seeley, S. K. & McCurry, J. D. 2007. Microfluidic Deans switch for comprehensive two-dimensional gas chromatography. Analytical Chemistry, 79, 1840-1847. Serkova, N., Fuller, T. F., Klawitter, J., Freise, C. E. & Niemann, C. U. 2005. 1H-NMR-based metabolic signatures of mild and severe ischemia/reperfusion injury in rat kidney transplants. Kidney International, 67, 1142-1151. Stephens, W. E. 1946. A pulsed mass spectrometer with time dispersion. Physical Review, 69, 691. Stuart, B. 1996. Modern infrared spectroscopy, Chichester, John Wiley & Sons Ltd. Sumner, L. W., Mendes, P. & Dixon, R. A. 2003. Plant metabolomics: large-scale phytochemistry in the functional genomics era. Phytochemistry, 62, 817-836. Takahashi, H., Morimoto, T., Ogasawara, N. & Kanaya, S. 2011. AMDORAP: non-targeted metabolic profiling based on high-resolution LC-MS. BMC Bioinformatics, 12, 259. Taylor, A. J. & Linforth, R. S. T. 2003.
Direct mass spectrometry of complex volatile and non-volatile flavour mixtures. International Journal of Mass Spectrometry, 223-224, 179-191. Tolstikov, V. V. & Fiehn, O. 2002. Analysis of highly polar compounds of plant origin: combination of hydrophilic interaction chromatography and electrospray ion trap mass spectrometry. Analytical Biochemistry, 301, 298-307. Tolstikov, V. V., Lommen, A., Nakanishi, K., Tanaka, N. & Fiehn, O. 2003. Monolithic silica-based capillary reversed-phase liquid chromatography/electrospray mass spectrometry for plant metabolomics. Analytical Chemistry, 75, 6737-6740. Tweeddale, H., Notley-McRobb, L. & Ferenci, T. 1998. Effect of slow growth on metabolism of Escherichia coli, as revealed by global metabolite pool ("Metabolome") analysis. Journal of Bacteriology, 180, 5109-5116. Urbanczyk-Wochniak, E., Luedemann, A., Kopka, J., Selbig, J., Roessner-Tunali, U., Willmitzer, L. & Fernie, A. R. 2003. Parallel analysis of transcript and metabolic profiles: a new approach in systems biology. EMBO Reports, 4, 989-993. Vaidyanathan, S., Kell, D. B. & Goodacre, R. 2002. Flow-injection electrospray ionisation mass spectrometry of crude cell extracts for high-throughput bacterial identification. Journal of the American Society for Mass Spectrometry, 13, 118-128. van der Goot, H. 2002. Trends in Drug Research III, Amsterdam, Elsevier Viant, M. R., Bundy, J. G., Pincetich, C. A., de Ropp, J. S. & Tjeerdema, R. S. 2005. NMR- derived developmental metabolic trajectories: an approach for visualising the toxic actions of trichloroethylene during embryogenesis. Metabolomics, 1, 149-158. Villas-Boas, S. G., Hojer-Pedersen, J., Akesson, M., Smedsgaard, J. & Nielsen, J. 2005a. Global metabolite analysis of yeast: evaluation of sample preparation methods. Yeast, 22, 1155-1169. Villas-Boas, S. G., Mas, S., Akesson, M., Smedsgaard, J. & Nielsen, J. 2005b. Mass spectrometry in metabolome analysis. Mass Spectrometry Reviews, 24, 613-646.


Villas-Boas, S. G., Nielsen, J., Smedsgaard, J., Hansen, M. A. E. & Roessner-Tunali, U. 2007. Metabolome analysis: an introduction, Hoboken, John Wiley & Sons. Vinzi, V. E., Chin, W. W., Henseler, J. & Wang, H. 2010. Handbook of partial least squares, Heidelberg, Springer. Waksman, S. A. & Lechevalier, H. A. 1962. The actinomycetes, London, Bailliere. Walsh, C. 2003. Antibiotics: actions, origins, resistance, Washington DC, ASM Press. Ward, J. L., Harris, C., Lewis, J. & Beale, M. H. 2003. Assessment of 1H NMR spectroscopy and multivariate analysis as a technique for metabolite fingerprinting of Arabidopsis thaliana. Phytochemistry, 62, 949-957. Watson, D. G. 1999. Pharmaceutical analysis: a textbook for pharmacy students and pharmaceutical chemists, Edinburgh, Churchill Livingstone. Watson, J. T. & Sparkman, O. D. 2007. Introduction to mass spectrometry: instrumentation, applications and strategies for data interpretation, Chichester, John Wiley & Sons Ltd. Wax, R. G., Lewis, K., Salyers, A. A. & Taber, H. 2001. Bacterial resistance to antimicrobials, New York, Marcel Dekker. Webber, M. A. & Piddock, L. J. V. 2003. The importance of efflux pumps in bacterial antibiotic resistance. Journal of Antimicrobial Chemotherapy, 51, 9-11. Williams, D. H. & Fleming, I. 1995. Spectroscopic methods in organic chemistry, London, McGraw-Hill. Winder, C. L., Dunn, W. B., Schuler, S., Broadhurst, D., Jarvis, R., Stephens, G. M. & Goodacre, R. 2008. Global metabolic profiling of Escherichia coli cultures: an evaluation of methods for quenching and extraction of intracellular metabolites. Analytical Chemistry, 80, 2939-2948. Wise, R. 2002. Antimicrobial resistance: priorities for action. Journal of Antimicrobial Chemotherapy, 49, 585-586. Yamashita, M. & Fenn, J. B. 1984. Electrospray ion source. Another variation on the free-jet theme. Journal of Physical Chemistry, 88, 4451-4459.


Chapter Two

2 Chapter Two pH plays a role in the mode of action of trimethoprim on Escherichia coli.

Haitham AlRabiah1, J. William Allwood1,2, Elon Correa1, Yun Xu1 and Royston Goodacre1*

1School of Chemistry and Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK

2Current address: Clinical and Environmental Metabolomics, School of Bioscience, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

*Correspondence to Roy Goodacre: [email protected]

This chapter is the manuscript of an article prepared for submission. William Allwood assisted with operating the mass spectrometers and with data processing, while Elon Correa and Yun Xu contributed to data analysis. Royston Goodacre contributed to this work through supervision and guidance of the study.


Abstract

In this investigation, metabolomics-based approaches were applied to understand the interaction of the antibiotic trimethoprim with Escherichia coli K-12 at sub-minimum inhibitory concentration (sub-MIC) levels (MIC ≈ 0.2 mg/L, as well as 0.03 and 0.003 mg/L). Trimethoprim inhibits dihydrofolate reductase and is thereby an indirect inhibitor of nucleic acid synthesis. Because trimethoprim is a weak base, two pH levels (5 and 7) were selected to mimic the urine pH range of a healthy person. This also allowed us to investigate the effect on bacterial metabolism when the antibiotic exists in different ionisation states. UHPLC-MS on a Thermo LTQ-Orbitrap XL MS was employed to detect trimethoprim inside the bacterial cells. More of the drug was recovered from cells grown at pH 7 than from cells grown at pH 5, and this correlated with classical growth curve measurements.

Preliminary experiments used Fourier transform infrared (FT-IR) spectroscopy to establish that reproducible phenotypes were recovered under all 8 conditions (3 drug levels and control × 2 pH levels), after which GC-MS was used to generate global (non-targeted) metabolic profiles. In addition to direct mode of action effects, where nucleotides decreased at pH 7 with increasing trimethoprim levels, off-target pH-related effects were observed for many of the amino acids measured. We also observed general stress-related effects: the osmoprotectant trehalose was significantly higher at increased antibiotic levels at pH 7, and this correlated with the consumption of glucose and fructose and with an increase in pyruvate-related products from the TCA cycle as well as lactate and alanine. Alanine is a known regulator of sugar metabolism, which may explain why it increased: to enhance sugar consumption and thus trehalose production. We believe these results provide a wider view of the action of trimethoprim. Metabolomics has also indicated several alternative areas of metabolism that should be investigated further to increase understanding of the off-target effects of this antibiotic.


2.1 Introduction

One of the most effective mechanisms of drug action is enzyme inhibition, although the mechanisms underlying the specific modes of action are often not fully understood (Voet and Voet, 2004; Quinlivan et al., 2000). This is typically because an antibiotic is assumed to be an inhibitor of a specific enzyme (or indeed another target), without recognising that the chemical can have other 'off-target' effects, such as binding to unidentified enzymes or indirect interactions with other metabolic pathways that may affect the performance of the drug (Kwon et al., 2008). The range and complexity of cellular chemical reactions (the metabolic network) increase the challenge of understanding the mode of action of antibiotics, as multiple changes in the metabolic network will occur during antibiotic-induced abiotic perturbation (Kwon et al., 2008). Metabolomics is considered a powerful approach for measuring the phenotypic response following antimicrobial challenge (Scheltema et al., 2010). Analysis of metabolomes has increased dramatically in recent years due to the introduction of ultra-high mass resolution mass spectrometers (Allwood and Goodacre, 2010), which allow accurate identification of small molecules in complex metabolite extracts (Breitling et al., 2006; Dunn et al., 2013). Indeed, this approach has already been used for analysing various pathogen phenotypes (Scheltema et al., 2010).

The biosynthesis of essential metabolites, such as purines, thymine, glycine and methionine, generally uses folates as cofactors that either add or remove one-carbon units. Several therapeutics, including anticancer agents like pemetrexed and antibiotics such as trimethoprim, target folate metabolism (Gangjee et al., 2007; Bushby and Hitchings, 1968). Folates can be found in three different oxidation/reduction states (viz., dihydrofolate (DHF), folate or pteroylglutamate, and tetrahydrofolate (THF)) and are synthesised from guanosine 5'-triphosphate (GTP), p-aminobenzoic acid (pABA) and glutamate. THF is formed by the reduction of DHF, catalysed by the enzyme dihydrofolate reductase (DHFR). Various folates such as 5-methyl-THF, 5-formyl-THF, 5-formimino-THF, 10-formyl-THF, 5,10-methenyl-THF and 5,10-methylene-THF can be formed by substituting tetrahydrofolate species with one-carbon units to produce active donors involved in various biosynthetic reactions (Kwon et al., 2008).


Urinary tract infections (UTIs) are very common, and it is estimated that 50% of women will acquire a UTI during their lifetime (Fihn, 2003; Griebling, 2005). A study of midstream urine samples spanning 252 centres in 17 countries revealed that Escherichia coli accounted for 77% of all isolates, 80% of general infections and 40% of nosocomial infections (Ronald, 2003; Kahlmeter, 2003). The weak base antifolate drug trimethoprim resulted from the work of Hitchings and his group at Burroughs Wellcome, USA, from the 1940s to the 1960s. Hitchings and colleagues studied the cellular actions of biologically important heterocyclic purines on the basis that interference with the associated processes might lead to the discovery of therapeutic effects (Anderson et al., 2012). The Hitchings group successfully developed several therapeutically active agents, and Hitchings and Elion shared the Nobel Prize in Physiology or Medicine in 1988 in honour of their work on the discovery of important principles in drug treatment (Hitchings, 1989). Trimethoprim is still used therapeutically today and is particularly effective in treating both community and nosocomial UTIs (Baerheim, 2001). It is mainly used to treat uncomplicated UTIs and acts by inhibiting bacterial DHFR, in turn reducing the pool of active tetrahydrofolates which, as detailed above, are needed for the synthesis of various essential metabolites that are important precursors for nucleic acid biosynthesis (Katzung, 1995).

In this study, E. coli K-12 was challenged with different concentrations of trimethoprim at different pH levels (pH 5 and 7) and analysed by Fourier transform infrared (FT-IR) spectroscopy and gas chromatography-mass spectrometry (GC-MS) to produce global snapshots of the bacterial phenotypic and untargeted metabolic profiles, respectively. We believe this metabolomics-based approach provides a greater level of insight into and understanding of trimethoprim's mode(s) of action. pH was varied in this investigation because trimethoprim is known to be largely excreted unchanged in human urine, and the normal urine pH of a healthy person ranges between 4.6 and 7.5 (Wilson, 2004; Anderson et al., 2012). The bacterial intracellular trimethoprim levels were estimated using liquid chromatography-mass spectrometry (LC-MS), as this antibiotic is ionised within this pH range and this may affect its ability to be transported across the cell membrane (Dobson and Kell, 2008).
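
The pH dependence of trimethoprim ionisation mentioned above can be illustrated with the Henderson-Hasselbalch relationship for a weak base. The following is a minimal Python sketch and not part of the original study: the pKa of approximately 7.1 is an assumed literature value for trimethoprim, and the function name is hypothetical.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a weak base present in its protonated (cationic) form.

    Henderson-Hasselbalch for a base B with conjugate acid BH+:
        [BH+] / ([BH+] + [B]) = 1 / (1 + 10 ** (pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed literature pKa for trimethoprim; not a value reported in this chapter.
PKA_TRIMETHOPRIM = 7.1

for ph in (5.0, 7.0):
    protonated = fraction_protonated(ph, PKA_TRIMETHOPRIM)
    print(
        f"pH {ph:.1f}: {100 * protonated:.1f}% protonated, "
        f"{100 * (1 - protonated):.1f}% neutral"
    )
```

Under this assumed pKa, trimethoprim would be almost entirely protonated at pH 5 and only a little more than half protonated at pH 7, so a larger neutral fraction would be available at the higher pH.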


2.2 Materials and Methods

General maintenance and growth of E. coli K-12 MG1655 are described in the Supplementary Information, which also includes preliminary investigations of growth optimisation, different media and different pH levels, as well as details of the calculation of the minimum inhibitory concentrations (MICs) of trimethoprim at pH 5, 7 and 9.

2.2.1 Antibiotic perturbation of E. coli

In 100 mL conical flasks, 18 mL of LB medium at pH 5, 7 or 9 was inoculated with 1 mL of bacteria (see Supplementary Information) and 1 mL of trimethoprim at 0.2, 0.03 or 0.003 mg/L. Control samples were identical except that the 1 mL of trimethoprim was substituted with 1 mL of distilled water (the dilution solvent of the drug). Each condition was replicated six times and each flask was incubated for 18 h at 37 °C and 200 rpm.
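
The flask layout described above can be enumerated as follows. This is an illustrative Python sketch only: the variable names are hypothetical, the water control is represented as a dose of 0.0 mg/L, and the concentration labels are used exactly as given in the text without assuming whether they refer to the added solution or to the final culture.

```python
from itertools import product

PH_LEVELS = (5, 7, 9)
# 0.0 stands for the distilled-water control (hypothetical encoding).
DRUG_LEVELS_MG_L = (0.0, 0.003, 0.03, 0.2)
N_REPLICATES = 6

# Volumes per flask as stated in the text.
MEDIUM_ML, INOCULUM_ML, ADDITION_ML = 18.0, 1.0, 1.0
TOTAL_ML = MEDIUM_ML + INOCULUM_ML + ADDITION_ML

# One record per flask: every pH x dose combination, six replicates each.
flasks = [
    {"pH": ph, "trimethoprim_mg_L": dose, "replicate": rep}
    for ph, dose, rep in product(
        PH_LEVELS, DRUG_LEVELS_MG_L, range(1, N_REPLICATES + 1)
    )
]

print(f"{len(flasks)} flasks, {TOTAL_ML:.0f} mL culture volume each")
```

Listing the design explicitly makes it easy to check that each pH and dose combination has the same number of biological replicates before samples are split for FT-IR and GC-MS.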

The overnight culture of each replicate was split for both FT-IR and GC-MS to make sure that results were obtained from the same biological cultures. For FT-IR, 450 µL from each culture was collected and the biomass was washed three times with physiological saline and re-suspended in 400 µL of saline. For GC-MS, 15 mL was processed as described in Supplementary Information.

In order to estimate the amount of trimethoprim inside the E. coli cells, cellular extracts were prepared and analysed by LC-MS, and trimethoprim was quantified against a 20-point calibration curve constructed using a trimethoprim reference standard. For UHPLC-MS, a Thermo Accela UHPLC system (Thermo-Fisher Ltd., Hemel Hempstead, UK) coupled to a Thermo LTQ-Orbitrap XL MS system (Thermo-Fisher, Bremen, Germany) was employed. The methods used are described by Kim et al. (2010).
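
Quantification against a calibration curve, as described above, can be sketched as a straight-line fit followed by inverse prediction. The example below is a hedged Python illustration using NumPy: the standard concentrations, detector responses and function names are all hypothetical and synthetic, and a simple unweighted linear model is assumed, which may differ from the calibration model actually used.

```python
import numpy as np


def fit_calibration(concentrations, responses):
    """Fit response = slope * concentration + intercept by least squares."""
    slope, intercept = np.polyfit(concentrations, responses, deg=1)
    return slope, intercept


def predict_concentration(response, slope, intercept):
    """Invert the calibration line to estimate concentration from response."""
    return (np.asarray(response, dtype=float) - intercept) / slope


# Synthetic 20-point calibration series (hypothetical values, mg/L).
standards = np.linspace(0.01, 0.2, 20)
rng = np.random.default_rng(0)
peak_area = 5.0e6 * standards + 1.0e4 + rng.normal(0.0, 2.0e3, standards.size)

slope, intercept = fit_calibration(standards, peak_area)

# Estimate the concentration of a hypothetical cell extract.
extract_area = 2.6e5
estimate = predict_concentration(extract_area, slope, intercept)
print(f"Estimated trimethoprim concentration: {float(estimate):.4f} mg/L")
```

In practice the estimate is only meaningful when the extract response falls within the range covered by the standards.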

Full details of the materials and methods are provided in the Supplementary Information (see Figures S 2.1-S 2.3).

2.2.2 FT-IR spectroscopy

Clean 96-well zinc selenide (ZnSe) plates (Bruker Ltd., Coventry, UK) were used as the sample carrier. 20 µL of the above bacterial preparations were spotted onto these plates and oven dried at 40 °C for 45 min (as detailed by AlRabiah et al. (2013; Chapter 3)). High-throughput screening (HTS) FT-IR spectroscopic analysis was carried out on a Bruker Equinox 55 infrared spectrometer (Bruker Ltd.) equipped with an HTX™ module according to the method of Winder et al. (2006). All spectra were obtained in the 4000–600 cm⁻¹ range, with 64 scans acquired at 4 cm⁻¹ resolution. These experimental conditions were maintained during all measurements.

After analysis, the FT-IR data were converted to tab-delimited ASCII files prior to data analysis in MATLAB 2010a (The Mathworks Inc., Natick, USA) and R version 2.13.1 (R Foundation for Statistical Computing, Vienna, Austria). Prior to multivariate analysis (vide infra), CO2 signals were removed as detailed by AlRabiah et al. (2013; Chapter 3) and FT-IR data were baseline corrected using an extended multiplicative signal correction (EMSC) algorithm (Martens et al., 2003). All data were subsequently autoscaled prior to analysis (Goodacre et al., 2007).
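The autoscaling step can be sketched in a few lines of Python: each variable (column) is mean-centred and divided by its standard deviation, so that every wavenumber bin carries equal weight in the subsequent multivariate analysis. The toy matrix is illustrative only, and the use of the sample standard deviation (n - 1 denominator) is an assumption rather than a detail taken from the methods above.

```python
from statistics import mean, stdev


def autoscale(spectra):
    """Column-wise autoscaling: subtract the mean, divide by the sample SD."""
    columns = list(zip(*spectra))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [
        # A constant column (SD of zero) is mapped to 0.0 to avoid division by zero.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, sds)]
        for row in spectra
    ]


# Toy matrix: 3 spectra (rows) x 2 wavenumber bins (columns); values are invented.
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = autoscale(toy)
```

After scaling, both columns of the toy matrix become identical even though their raw magnitudes differ tenfold, which is the purpose of the operation.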

2.2.3 GC-MS

For GC-MS, samples inoculated at pH 9 were excluded from the analysis because the extreme effect of the drug at this pH level prevented the collection of adequate biomass for analysis. For the remaining conditions, 15 mL from each flask was collected for further processing (Figure S 2.1).

GC-TOF/MS was conducted using a LECO Pegasus III TOF/MS operated in GC-MS mode (Leco Corp., St. Joseph, MO, USA), with a Gerstel MPS-2 autosampler (Gerstel, Baltimore, MD, USA) and an Agilent 6890N GC × GC with a split/splitless injector and Agilent LPD split-mode inlet liner (Agilent Technologies, Stockport, UK). Full details of the GC-MS protocol are provided in the Supplementary Information; the protocol follows the accepted Metabolomics Standards Initiative (MSI) guidelines (Sumner et al., 2007) and our published protocols, and included pooled samples to act as quality controls (QC) (Dunn et al., 2011; Begley et al., 2009). The only difference in this study is that 80% methanol rather than 100% methanol was used for metabolite extraction, to enhance the recovery of polar small molecules.

Following GC-MS, these data were processed using the deconvolution method reported by Begley et al. (2009). In addition, prior to statistical analysis, QC samples were used as in the work of Wedge et al. (2011) to provide data quality assurance by evaluating and eliminating mass features that showed high deviation within QC samples.
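One common way to express "high deviation within QC samples" is the relative standard deviation (RSD) of each feature across the pooled QC injections. The Python sketch below filters features on that basis. The feature names, intensities and the 20% cut-off are all assumptions made for illustration; the criterion actually applied is the one described by Wedge et al. (2011).

```python
from statistics import mean, stdev


def qc_rsd(values):
    """Relative standard deviation (%) of one feature across QC injections."""
    m = mean(values)
    return float("inf") if m == 0 else 100.0 * stdev(values) / abs(m)


def filter_features(qc_table, max_rsd=20.0):
    """Keep features whose QC RSD is at or below max_rsd (an assumed cut-off)."""
    return [name for name, values in qc_table.items() if qc_rsd(values) <= max_rsd]


# Hypothetical QC intensities for two mass features.
qc_table = {
    "feature_A": [100.0, 102.0, 98.0, 101.0],   # reproducible across QCs
    "feature_B": [100.0, 160.0, 40.0, 120.0],   # highly variable across QCs
}
kept = filter_features(qc_table)
```

Only the reproducible feature survives the filter, so it alone would be carried forward to statistical analysis.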

2.2.4 Metabolomics data analysis

For FT-IR and GC-MS, the multivariate data analysis included unsupervised principal component analysis (PCA) and supervised principal component-discriminant function analysis (PC-DFA). PC-DFA depends on prior knowledge of the experimental class structure and on the number of retained PCs to discriminate between groups (different classes). The PC-DFA models were validated via 1000 bootstrap cross-validations (Correa et al., 2012), and the validation results are reported (as percentage of correct classification) in the legends of the respective PC-DFA scores plots.
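The resampling scaffold behind a bootstrap validation can be sketched as follows. Each round draws a training set with replacement, and the samples that were never drawn form the test set; the spread of test-set accuracies across rounds can then be summarised as a percentile interval. This is a minimal sketch, not the procedure of Correa et al. (2012): the model-fitting step is deliberately left out, the sample count of 48 assumes 8 classes with six replicates each, and the nearest-rank percentile rule is an assumption.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw a training set with replacement; undrawn samples form the test set."""
    train = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(train)
    test = [i for i in range(n_samples) if i not in chosen]
    return train, test


def percentile_interval(values, level=95.0):
    """Empirical percentile interval using a simple nearest-rank rule."""
    ordered = sorted(values)
    tail = (100.0 - level) / 200.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return lower, upper


rng = random.Random(0)  # fixed seed so the resampling is reproducible
splits = [bootstrap_split(48, rng) for _ in range(1000)]

# In a full analysis, a PC-DFA model would be fitted on each `train` index set
# and its percentage of correct classification recorded on the matching `test`
# set; percentile_interval() would then be applied to those 1000 percentages.
```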

In addition, for FT-IR spectroscopy, a multi-block (MB)-PCA model known as consensus PCA (CPCA) (Xu and Goodacre, 2012) was subsequently constructed to aid in spectral interpretation. The first CPCA model related the antibiotic dosing concentration trend for each pH condition as an individual block, and a second model was constructed to illustrate the distribution of samples at different pH levels between control samples and three different drug concentrations as individual blocks.

All multivariate data analyses were performed in R, and all scripts are available from the authors on request.


2.3 Results and Discussion

2.3.1 Determination of the optimum growth conditions (media and pH levels) for E. coli K-12

E. coli K-12 was exposed to different concentrations of trimethoprim at different pH levels, and preliminary experiments established that the optimum medium to use was LB (Figure S 2.4), which was therefore used throughout this work. E. coli was cultured in different pH environments (3, 5, 7 or 9). No growth occurred in extreme acidic conditions (Figure S 2.5), perhaps because an environment of pH < 4 typically has a bactericidal effect on E. coli (Giannella et al., 1972; Waterman and Small, 1998; Zhu et al., 2006). A consideration of the effect of pH on bacterial growth is important as our experiment investigates the effect of trimethoprim challenge on E. coli at varying pHs, which were chosen to mimic the pH of the natural urine environment and which will affect the ionisation status of the antibiotic; the pKa of trimethoprim is ≈ 7.4 (Aagaard et al., 1991) and the ionisation of the N1 and N3 on the ring is discussed later.

The pH of the cytoplasm (pHi) of E. coli is regulated between 7.2 and 7.8 (Wilks and Slonczewski, 2007). If changes occur in the environmental pH (pHo), the bacterium tries to preserve nucleic acid and protein stability, as well as enzymatic activity, by maintaining this range (Wilks and Slonczewski, 2007). E. coli uses several mechanisms to maintain pH homeostasis and one of the most common appears to be cation-dependent proton flux (Salmond et al., 1984). As can be seen from Figure S 2.5, the highest growth occurs at pH 7, where ΔpH (pHi - pHo) is approximately zero. Therefore, pH 7 is the optimum of the three pH levels.

Although the enteric bacterium E. coli can preserve the activity of its nucleic acids, proteins and enzymes in a pH range from 4.5 to 9 (Wilks and Slonczewski, 2007), a comparison between pH 5 and pH 9 showed that at pH 5 the growth curve was higher (Figure S 2.5), indicating that E. coli K-12 can adapt to mildly acidic conditions better than basic conditions. This may be because under the alkaline conditions of pH 9, homeostasis makes high energy demands on the cell and protons are lost (Slonczewski et al., 1981). In addition, when E. coli is cultured in a medium that contains amino acids, e.g. LB, it has a greater possibility of surviving in acidic conditions (Foster, 2004).

A microscopic view of these E. coli at different pH (Figure S 2.6) reveals variations in the size of cells. At pH 5 and 7, the cells are typical of E. coli, whilst at pH 9 the cells are slightly shorter and are affected by this mildly alkaline environmental pH.

2.3.2 Determination of the MIC of trimethoprim in E. coli K-12

In order to measure subtle antibiotic effects on E. coli, it is important to use levels that are below the MIC of trimethoprim; otherwise, all that will be measured is cell death, and hence differences in biomass level rather than metabolic shifts. Therefore, E. coli was challenged with different concentrations of trimethoprim (Figure S 2.7). From Figure S 2.7, it can be seen that the MIC of trimethoprim in E. coli K-12 under optimum conditions (LB medium at pH 7) is approximately 0.2 mg/L; therefore this and lower concentrations (0.03 and 0.003 mg/L) were chosen to challenge E. coli K-12 in order to determine the effect of the antibiotic at a range of concentrations from the MIC to levels that had little or no effect on growth.

2.3.3 Challenge of E. coli K-12 with different concentrations of trimethoprim at different pH levels

Trimethoprim is a heterocyclic weak base with pKa 7.4 (Aagaard et al., 1991) (Figure S 2.8 a). As mentioned above, it acts on dihydrofolate reductase and is categorised as an antibiotic that inhibits nucleic acid synthesis.

The effect of different pH levels on the drug molecules can be characterised according to the Henderson-Hasselbalch equation for weak bases, which can be rewritten in a simple way to calculate the percentage of ionisation, as seen in Equation 2.1. The effect of pH on the partition coefficient of bases can be expressed as shown in Equation 2.2, where Papp is the apparent partition coefficient (P), which varies with pH (Watson, 1999).
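For reference, the standard weak-base forms of these two relationships can be written as below. They are assumed here to correspond to Equations 2.1 and 2.2 respectively, with P taken as the partition coefficient of the un-ionised base and P_app as its apparent value at a given pH; the first expression reproduces the ionisation percentages quoted in this section for a pKa of 7.4.

```latex
\begin{align}
\%\ \text{ionisation} &= \frac{100}{1 + 10^{(\mathrm{pH} - \mathrm{p}K_a)}} \\
P_{\mathrm{app}} &= \frac{P}{1 + 10^{(\mathrm{p}K_a - \mathrm{pH})}}
\end{align}
```

As pH rises above the pKa the ionised fraction falls towards zero and P_app approaches P, which is consistent with better membrane penetration under alkaline conditions.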


Figure S 2.8 b shows that at pH 9 there was no growth when the bacteria were exposed to 0.2 mg/L of the antibiotic. Ionisation calculations indicate that at this pH trimethoprim remained largely non-ionised (only 2.5% ionised), which facilitated its penetration through the cell membrane of the microbial cell (O'Grady and Lambert, 1997), thus inhibiting growth.

At pH 7, which is the optimum pH for growth of the bacterium, it was found that 71.5% of the drug was ionised. Clearly the non-ionised remainder was able to penetrate and have a measurable effect on this bacterium (Figure S 2.8 c).

By contrast, at pH 5, 99.6% of trimethoprim was ionised, which reduced its ability to penetrate the cell membrane. Although at 0.2 mg/L there was a slight effect on bacterial growth (Figure S 2.8 d), which provides evidence that trimethoprim passed into the cell, at lower dose levels there was no clear effect on growth. This may be due to the ability of trimethoprim (molecular weight 290.3 g/mol) to pass through porins, which are transmembrane proteins in the outer membrane that hydrophilic molecules (molecular weight up to 600 g/mol in the case of E. coli) can penetrate by passive diffusion (Yeagle, 1993).
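The ionisation percentages quoted above can be checked with a short Python sketch. It assumes the usual Henderson-Hasselbalch rearrangement for a weak base, in which the ionised percentage is 100 / (1 + 10^(pH - pKa)), together with the pKa of 7.4 cited in the text; the function name is hypothetical.

```python
def percent_ionised_base(ph, pka):
    """Percentage of a weak base present in its protonated (ionised) form."""
    return 100.0 / (1.0 + 10.0 ** (ph - pka))


PKA_TRIMETHOPRIM = 7.4  # value quoted in the text (Aagaard et al., 1991)

# Ionised percentage at each culture pH, rounded to one decimal place.
ionised = {ph: round(percent_ionised_base(ph, PKA_TRIMETHOPRIM), 1)
           for ph in (5.0, 7.0, 9.0)}
# ionised == {5.0: 99.6, 7.0: 71.5, 9.0: 2.5}
```

The computed values match the 99.6%, 71.5% and 2.5% figures given for pH 5, 7 and 9.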

In order to establish whether trimethoprim penetrates the bacterial cell wall, targeted LC-MS was conducted to quantify the drug within E. coli. This work focused on pH 5 for the reasons given above, and this was compared with pH 7 as a control; both of these pH levels are relevant as they lie within the normal pH range of human urine.

Generation of a standard curve for trimethoprim (Figure S 2.3) established that at a level of 0.2 mg/L the detectable signal with LC-MS was poor. Therefore, 0.8 mg/L of trimethoprim was used to ensure that the drug could be detected by LC-MS. The effect on growth of E. coli is shown in Figure 2.1 a, which shows that the drug, when added at the beginning of the culture (lag phase), had a stronger effect at pH 7 (light blue curve) than at pH 5 (red curve). These curves agree with the data presented in Figure S 2.8 c and d (i.e. in terms of the drug effect at 0.2 mg/L) and with the literature, which shows that trimethoprim has a profound effect during the bacterial lag phase (Sangurdekar et al., 2011). By contrast, when the drug was added after 5 h (during the exponential phase) there was no effect at pH 7 (orange curve) and only a slight effect on the growth curve at pH 5 (green curve) compared with controls. This means that the integrity of the bacteria is not compromised and biomass yield is high enough to allow accurate estimations of drug uptake, or otherwise, from these bacteria.

As detailed in the Supplementary Information, LC-MS was used to estimate the relative amount of trimethoprim inside the cell (Figure 2.1 b). As expected, the highest level of the drug was recovered from cells at pH 7 when trimethoprim was added at the beginning of the lag phase (t = 0 h), while the second highest was when the bacterium was challenged at t = 0 h with the drug at pH 5. This relative difference is attributed to the ionisation of trimethoprim: at pH 5 the nearly fully ionised drug is less able to penetrate the cell wall, and the uptake that does occur is presumably via porins.

It was interesting to observe that when the bacterium was challenged at mid-exponential phase (t = 5 h) at both pH 5 and 7, regardless of the ionised state of the drug, the intracellular levels of the antibiotic (Figure 2.1 b) were at their lowest, and this is presumably why these cultures exhibited little reduction in growth rate (Figure 2.1 a).


[Figure 2.1. Panel (a): optical density (OD 600 nm) against time (hours) for control, 0.8 mg/L added at t = 0 h and 0.8 mg/L added at t = 5 h, each at pH 5 and pH 7. Panel (b): relative amount of trimethoprim (mg/L per OD unit) at pH 5 and pH 7 for drug added at t = 0 h and at t = 5 h.]

Figure 2.1: (a) Growth curves of E. coli K-12 at pH 5 (dashed line) and pH 7 (solid line). For pH 5, the dashed blue line represents control samples, dashed red indicates samples challenged with 0.8 mg/L of trimethoprim added at the beginning of the lag phase (t = 0 h) and dashed green denotes samples challenged with 0.8 mg/L of trimethoprim and added at mid-exponential phase (t = 5 h). For pH 7, the solid purple line represents control samples, solid light blue indicates samples challenged with 0.8 mg/L of trimethoprim added at the beginning of the lag phase (t = 0 h) and solid orange denotes samples challenged with 0.8 mg/L of trimethoprim and added at the exponential phase (t = 5 h). (b) Column chart representing relative E. coli intracellular levels of trimethoprim after challenging with 0.8 mg/L of the drug at pH 5 (red columns) and pH 7 (blue columns) at different growth stages (time = 0 and 5 h) as detected by LC-MS analysis.


2.3.4 Metabolic fingerprinting of E. coli K-12 using FT-IR spectroscopy

E. coli K-12 cells were cultured in LB medium at three pH levels (5, 7, 9) and exposed to three concentrations of trimethoprim (0.003, 0.03 and 0.2 mg/L), giving 12 different conditions including three controls. FT-IR spectra were recorded from the dried cell biomass in transmission mode at all three pH levels. It can be seen from Figure S 2.9 b that at pH 9 some spectra gave the response of empty wells (flat baselines), resulting from the complete inhibition of E. coli growth at this pH; these corresponded to exposure to MIC levels (0.2 mg/L) of trimethoprim. Due to the very low (or in some cases no) signal, all FT-IR data from cultures at pH 9 were excluded from the remaining experiments. Prior to multivariate analysis, appropriate scaling and normalisation were conducted for all 8 conditions at pH 5 and 7; the effects of these mathematical operations are shown in Figure S 2.10. Subsequently, PCA and supervised PC-DFA were applied to these spectra.

Figure 2.2 a shows the PCA scores plot of PC1 versus PC2; the variance explained by PC1 is 78.9% and by PC2 12.8%. The largest difference among these samples is the dominant phenotypic shift in E. coli due to exposure to 0.2 mg/L of trimethoprim at pH 7; these samples are clearly separated from all others along PC1. Next, PC-DFA was applied, based upon the first 20 PCs (accounting for a total explained variance (TEV) of 99.99%) and the a priori knowledge of the different conditions (8 classes in total), and was validated as detailed above (the 95% confidence ranges are provided in parentheses for the 8 groups in Figure 2.2 b). It is clear from this PC-DFA scores plot that cells exposed to 0.2 mg/L at pH 5 could now also be clearly differentiated. Moreover, PC-DF1, which accounts for the most group variance, separates the cells grown at pH 7, located on the right-hand side of the plot, from those grown at pH 5, found on the left-hand side. In addition, PC-DF2 generally explains the exposure of cells in both pH environments to increasing levels of trimethoprim.
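The explained-variance figures reported for PCA, and the total explained variance (TEV) carried by the PCs passed on to DFA, can be computed from a singular value decomposition of the mean-centred data. The following NumPy sketch shows the calculation on synthetic random data, so its numbers are not comparable with those above. The matrix dimensions are assumptions: 48 rows stands for 8 conditions with six replicates, and 200 columns is an arbitrary number of spectral variables.

```python
import numpy as np


def pca_explained_variance(X):
    """Return PC scores and the percentage of variance explained by each PC."""
    Xc = X - X.mean(axis=0)                      # mean-centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U * s                               # sample coordinates on each PC
    explained = 100.0 * s**2 / np.sum(s**2)      # variance share per PC, in %
    return scores, explained


rng = np.random.default_rng(0)
X = rng.normal(size=(48, 200))                   # synthetic stand-in for spectra

scores, explained = pca_explained_variance(X)
tev_first_20 = explained[:20].sum()              # TEV of the PCs fed into DFA
```

The first two columns of `scores` are what a PC1 versus PC2 scores plot displays, and `explained[0]` and `explained[1]` are the corresponding axis percentages.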


Figure 2.2: (a) PCA scores plot of PC1 vs. PC2 after CO2 removal around 2350 cm⁻¹ and EMSC scaling. The total explained variance (TEV) of PC1 is 78.9% and for PC2 is 12.8%. (b) PC-DFA score plots of pH 5 and 7 samples. 20 PCs were extracted from PCA and used as inputs to DFA. These 20 PCs explain 99% of TEV; the legend in the plot shows the 95% confidence interval (CI) for the correct classification of the eight conditions. C, control.


As two factors, pH and antibiotic level, potentially interact, MB-PCA was used to separate their effects. Figure 2.3 shows the results of MB-PCA, in which two block scores were derived for the two pH sub-groups. The distribution of samples exposed to different concentrations of trimethoprim at each pH is now clearly revealed in the 1st PC, and the pH 7 and pH 5 plots are congruent. The same process was repeated for the antibiotic dose effect; Figure S 2.11 presents this, where four block scores were derived for the drug dose based sub-groups, therefore focusing upon the effect of the drug dose at the two different pH levels. The distribution of samples at pH 5 and pH 7 revealed a clear separation at 0.2 mg/L, and partial separation at 0.03 and 0.003 mg/L. Additionally, some separation can be seen in the control samples (0 mg/L), which is consistent with the growth curves (Figure S 2.8) and again highlights the varying phenotypic response to the different pH environments.

The loadings plots from all three chemometric analyses were complex (data not shown) and did not clearly reveal any obvious features. Indeed, the chemical resolution of IR spectroscopy is at the functional group level rather than at the level of specific metabolites and thus in order to study the subtle effects of trimethoprim on the intracellular metabolome of E. coli at pH 5, as well as the rather more extreme effects at pH 7, a more sensitive and advanced analytical technique such as chromatography linked to mass spectrometry is required. It was expected that by including pH as well as sub-MIC antibiotic levels, we might be able to observe a wider response of E. coli to the drug in conditions similar to the pH range of urine, thus helping to elucidate the mechanism of action of the drug in vivo.


Figure 2.3: FT-IR multi-block PCA score plots showing the relationship between the effect of different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) and that of different pH levels. Block scores plots showing the distribution of samples with different concentrations at (a) pH 5 and (b) pH 7.


2.3.5 Metabolic profiling of E. coli K-12 using GC-MS

The same bacterial samples that were analysed by FT-IR spectroscopy were processed for GC-MS (Figure S 2.1). Following the MSI reporting standards for metabolite identification (Sumner et al., 2007), 43 metabolites were identified at Level 1 (RI (± 20 RI units) and MS matched to our in-house reference standard (80% similarity)), 20 were identified at Level 2 (putative MS match to external library (80% similarity)) and 4 at Level 3 (metabolite class indicated), while 92 were unknown (Level 4) (see Table S 2.1 for details of these metabolites and their relative abundance).
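The identification criteria listed above can be expressed as a small decision rule. The Python sketch below is illustrative only: the function and argument names are hypothetical, and treating the 80% similarity and the 20 RI units as inclusive limits is an assumption. The level counts are those reported in the text.

```python
def msi_level(ri_delta=None, inhouse_similarity=None,
              library_similarity=None, class_known=False,
              ri_window=20.0, min_similarity=80.0):
    """Assign an MSI identification level (1 to 4) to one GC-MS feature."""
    if (ri_delta is not None and inhouse_similarity is not None
            and abs(ri_delta) <= ri_window
            and inhouse_similarity >= min_similarity):
        return 1  # RI and mass spectrum both match an in-house standard
    if library_similarity is not None and library_similarity >= min_similarity:
        return 2  # putative match to an external library spectrum
    if class_known:
        return 3  # only the metabolite class is indicated
    return 4      # unknown


# Numbers of metabolites reported at each level in this study.
reported = {1: 43, 2: 20, 3: 4, 4: 92}
total_features = sum(reported.values())  # 159 features in total
```

A feature with an in-house spectral match but a retention index outside the window falls through to the lower levels rather than being counted at Level 1.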

GC-MS data were subjected to a multivariate analysis after data pre-processing. Initially, PCA was applied (data not shown) but unlike FT-IR spectroscopy, no separation was observed in the PCA scores plot and therefore PC-DFA was employed (Figure 2.4).

Figure 2.4: PC-DFA score plots of GC-MS profiles (25 PCs were extracted from PCA and used as inputs to DFA, explaining 99% of the TEV). The legend in the figure shows the 95% CI for the correct classification of the 8 conditions. Significantly altered metabolites were mined through a combination of PC-DFA loadings and univariate significance testing (Student t-test). C, control.


In this plot, clustering related to both the pH effect and the antibiotic dose effect was apparent, and this was very similar to the class separation observed in the PC-DFA of the FT-IR data. Exposure of E. coli to trimethoprim at pH 7 had a more marked effect on the intracellular metabolome than exposure of the equivalent cells at pH 5, and antibiotic-related trajectories can be seen for both pH environments moving from 0 (control) through 0.003 to 0.03 and to MIC levels at 0.2 mg/L.

The next stage was to relate the changes observed by GC-MS to the mechanisms of microbial response to both pH and trimethoprim.

2.3.6 Phenotypic response of E. coli K-12 to varying pH and trimethoprim exposure

A summary of the overall effects at both pH levels and the change in intracellular metabolites with respect to trimethoprim dose is shown in Figure 2.5. For full details of the relative metabolite levels, the reader is referred to Table S 2.1, while Figure S 2.12 provides an overlay of metabolite changes for all 8 conditions on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway (Metabolic pathways - Escherichia coli K-12 MG1655) for the central metabolism of E. coli K-12 MG1655 (Kanehisa and Goto, 2000).

At pH 7, where the permeability of the drug molecules is higher (as discussed above), as expected metabolites linked with the dihydrofolate reductase (DHFR) enzyme (the main target of trimethoprim) generally show a stronger response than metabolites extracted from samples incubated at pH 5 (see Figure 2.5). It is known that DHFR plays a key role in the folate biosynthesis pathway. Therefore, a direct outcome of blocking DHFR is to deprive the cell of tetrahydrofolate (THF) and thus dihydrofolate accumulates. This in turn inhibits folylpoly-γ-glutamate synthetase (FP-γ-GS) (Figure 2.5) (Kwon et al., 2008).

As shown in the metabolic scheme (Figure 2.5), this may indirectly result in the accumulation of glutamate, which would explain the rapid rise in the level of glutamate we observe when the drug has its strongest activity (pH 7, 0.2 mg/L) (Figure S 2.12). As detailed in EcoCyc, this non-essential amino acid is involved in numerous reactions including the biosynthesis of ornithine and proline. This explains the similarity in the levels of glutamate, ornithine and proline under all conditions, which are at their highest under the same conditions, i.e. when the drug is very active (Figure 2.5). In ornithine biosynthesis, glutamate condenses with acetyl CoA to produce N-acetyl-glutamate, a precursor in ornithine synthesis (EcoCyc, 2012c). In proline biosynthesis, glutamate is first phosphorylated to L-glutamate-5-phosphate and subsequently reduced to glutamate-5-semialdehyde, which is converted to pyrroline-5-carboxylate, which is then reduced to proline (EcoCyc, 2012d). Proline acts as an osmoprotectant in bacteria (Csonka and Hanson, 1991), and it has been reported that glutamate also acts as an osmolyte in E. coli under specific growth conditions (Strom and Kaasen, 1993).

Trehalose, a disaccharide that consists of two glucose moieties, was first known as a 'storage' metabolite (a source of carbon and energy), and later it was reported that trehalose also acts as a protectant during adverse growth conditions in both prokaryotic and eukaryotic cells (Strom and Kaasen, 1993). In our study, we find that under pH 7 and 0.2 mg/L trimethoprim conditions, the drug has its strongest effect on the cellular phenotype (Figure S 2.8) and this stress effect is reflected in the elevated levels of trehalose (MSI level 3) that we observe (Figures 2.5 and S 2.12). Under osmotic stress, it has been reported that an osmotically regulated enzyme, trehalose phosphate synthase, is stimulated mainly by K+ and consumes glucose 6-phosphate and UDP-glucose to produce trehalose (Giaever et al., 1988). This could explain the concomitant reduction in the level of glucose and other sugars in general (Figure 2.5) when trehalose is elevated (Figure S 2.12). Alternatively, one or both of the trehalase enzymes (which are periplasmically and cytoplasmically located) that split trehalose into two glucose molecules may be blocked or inhibited; this would lead to an increase in the level of trehalose and reduce the pool of glucose available in the cell, thus obstructing glycolysis (Horlacher et al., 1996). For sugars in general, the depletion of their levels after drug challenge may therefore be due to the stress of the drug, which increases catabolism and the consumption of sugars to generate a range of compatible solutes which act as osmoprotectants. This reduction in sugars may in turn lead to an increase in the level of alanine, an amino acid that in higher organisms acts as a regulator in sugar metabolism and glycolysis (Gerich et al., 2001) (Figure 2.5 and S 2.12). In addition, it may simply be that carbon has been mobilised for (osmo)protection rather than being channelled directly into protein synthesis per se.


The direct effect of blocking dihydrofolate reductase, which is expected when the drug is near its MIC, is a reduction in THF. Consequently, there will be a depletion of THF-polyglutamate (THF (glu)n), a key metabolite in the biosynthesis of 10-formyl THF and 5,10-methylene THF, resulting in a reduction in these compounds (Figure 2.5); unfortunately, none of these metabolites were directly observed in our experiment as we conducted untargeted GC-MS rather than targeted LC-MS. The first of these compounds, 10-formyl THF, is a substrate of an enzyme called phosphoribosylglycinamide formyltransferase, which takes part in inosine monophosphate biosynthesis (EcoCyc, 2012a). Reduction in this substrate results in a reduction in inosine monophosphate, which acts as a precursor of purine nucleotides, and thus results in a depletion of adenine and guanine (Michal, 1998), which we do observe with untargeted GC-MS (Figure 2.5 and S 2.12). Similarly, 5,10-methylene-THF is reduced to 5-methyl-THF, which then methylates homocysteine to produce low levels of methionine, an essential amino acid that is converted to N-formyl-L-methionine, a starting amino acid in protein biosynthesis (Michal, 1998; EcoCyc, 2012b). Methionine levels are also seen to decrease in our experiment (Figure 2.5). Methionine acts as a regulator of the first enzyme in its de novo biosynthesis (homoserine O-succinyltransferase), which produces O-succinyl-homoserine by transferring the succinyl group to homoserine from succinyl-CoA (EcoCyc, 2012b). When methionine is at a low level (as found at high drug concentrations at pH 7) (Figure 2.5 and S 2.12), this may additionally result from extensive consumption in the feedback inhibition, possibly resulting in an accumulation of homoserine, which acts as a competitive inhibitor of glutamate dehydrogenase, an enzyme that has a role in a reversible reaction to produce and consume glutamate (Sakamoto et al., 1975).
In addition, there is also a reduction of 5,10-methylene THF, which is used by thymidylate synthase to methylate deoxyuridine 5'-monophosphate (dUMP) and produce deoxythymidine 5'-monophosphate (dTMP). It has been reported that a reduction in dTMP results in a reduction of thymine (Michal, 1998; Jia et al., 2009); the latter is seen in our metabolic profiles (Figure 2.5 and S 2.12). Unlike in some other organisms such as Candidatus Phytoplasma mali, the reaction mediated here by thymidylate synthase is currently thought to be irreversible and cannot therefore directly explain the reduction of dUMP (BioCyc, 2012). Rather, a possible explanation is that in E. coli the deamination of deoxycytidine 5'-triphosphate (dCTP) to deoxyuridine 5'-triphosphate (dUTP) uses deoxythymidine 5'-triphosphate (dTTP), which is reduced by the reduction in dTMP. This reduction results in the depletion of dUTP, which causes a depletion in dUMP and thus in uracil (Michal, 1998), which we do observe (Figure 2.5).

In general, we observed that all nucleotides were down-regulated in line with increasing antibiotic concentration, and this response was greater at pH 7 than at pH 5, and this is likely to be due to the high level of trimethoprim entering the cells (Figure 2.1). Although guanine shows the same response, it has a unique response at pH 5 under high antibiotic dose, where its level increased considerably (Figure 2.5 and Figure S 2.12). We can find no explanation for this increase in guanine level under this condition and as there is a lack of appropriate literature we do not feel it necessary to generate ambiguous hypotheses. We also observe that many amino acids, including histidine, tyrosine, leucine, valine and phenylalanine (Figure 2.5, surrounded by a dotted rectangle), have the same response as guanine under this condition, including different levels and ratios compared with other conditions (Figure S 2.12). This may reflect a common feature among these metabolites which results in them having almost the same response. For example, it was found that guanosine 5'-diphosphate 3'-diphosphate (ppGpp) is a histidine regulator in Salmonella typhimurium. This may explain why histidine and guanine gave similar responses under the eight conditions (Stephens et al., 1975).

As for tyrosine, phenylalanine and tryptophan, it was found that these aromatic amino acids, which are the downstream products of a folate precursor called chorismate, gave the same response after 2 h of treatment (Kwon et al., 2010). However, in this experiment, when samples were collected after 18 h of drug exposure it was found that only phenylalanine and tyrosine had the same response in that both accumulated most at the highest dose of the drug at pH 5, similar to histidine and guanine. By contrast, tryptophan accumulated at the same high dose at pH 7, where the drug is highly active and affects the growth of these bacteria. This shows that tyrosine and phenylalanine, the downstream products of prephenate, gave similar responses to guanine, unlike tryptophan (Figure 2.5), which had similar levels to alanine under all conditions (Figure S 2.12); this may be due to a common function or pathway between them (Kang et al., 2011). Figure 2.5 (and Figure S 2.12) also shows that there is a correlation between tryptophan and glutamate, which is one of the products of tryptophan biosynthesis (EcoCyc, 2012f).

The branched chain amino acids valine, leucine and isoleucine have strongly interrelated biosynthetic pathways. Leucine and valine originate from 2-oxoisovalerate, while isoleucine originates from threonine (Figure 2.5) (EcoCyc, 2012e; Michal, 1998). This explains their similar responses when challenged with trimethoprim at both pH levels (Figure S 2.12).

Turning to the detection of aspartic acid, it can be seen that its level decreased with increasing drug dose, regardless of the pH level (Figure 2.5 and Figure S 2.12). Phosphorylation of this amino acid is the starting point of synthesis of many amino acids including lysine, a basic amino acid that showed contrasting levels to those of its precursor aspartate at pH 7 (Figure 2.5 and Figure S 2.12) (Michal, 1998). Aspartate also acts as a precursor of nicotinamide, which showed strong depletion when the drug was highly active, perhaps because of an extensive use of its products, nicotinamide adenine dinucleotide (NAD) and nicotinamide adenine dinucleotide phosphate (NADP), which act as coenzymes and play a central role in metabolic reactions (Michal, 1998). Tryptophan was found at high levels under the most extreme condition (i.e. 0.2 mg/L of trimethoprim at pH 7). Although there is evidence that tryptophan acts as a precursor in nicotinamide synthesis, the direct relationship between these two metabolites in E. coli is yet to be reported (Michal, 1998). Nevertheless, quinolinate is one of the end products of tryptophan metabolism and is involved in nicotinamide metabolism, which may be taken as evidence of a correlation between these two metabolites in E. coli (KEGG, 2012a; KEGG, 2012b).

Alanine levels were observed to be high when the drug is highly active and the bacterium is under stress from exposure to trimethoprim. This may be correlated to an extensive consumption of sugars, resulting in an increase in the level of pyruvate, an end-product of glycolysis. Pyruvate acts as a substrate of valine-pyruvate aminotransferase (EcoCyc, 2012g) and high levels of pyruvate result in an increase in alanine, an amino acid that acts as a regulator of sugar metabolism in higher organisms (Gerich et al., 2001). A potential consequence of the overflow of metabolism from the consumption of monosaccharides (Figure 2.5) is the excessive production of pyruvate generated via glycolysis. The cell would need to deal with this overproduction of pyruvate and this would in turn result in an increase in the level of lactic acid and tricarboxylic acid (TCA) cycle intermediates. We certainly observe a direct correlation of lactate to pyruvate (Figure 2.5 and S 2.12) and the only two metabolites that we detected by GC-MS from the TCA cycle were citrate and malate, and both had their highest levels at pH 7 and 0.2 mg/L trimethoprim.

[Figure 2.5: pathway diagram, not reproduced. It shows trimethoprim blocking dihydrofolate reductase (DHFR) at the dihydrofolate to tetrahydrofolate (THF) step, together with the connected folate, nucleotide, sugar and amino acid pathways. Legend: after challenge with the drug, each metabolite is marked U (up-regulated) or D (down-regulated), with separate markers for pH 7 and pH 5.]

Figure 2.5: Metabolic effects of trimethoprim challenge on E. coli K-12 at pH 5 and pH 7. When partially ionised at pH 7, trimethoprim is seen to impact on metabolism directly associated with the dihydrofolate pathway, as well as off-target effects upon nucleotide, sugar and amino acid metabolism, glycolysis, the TCA cycle, and up-regulation of osmoprotectants. When trimethoprim is in a poorly ionised state (pH 5), it appears to have a profound effect upon the up-regulation of amino acid metabolism.


2.4 Conclusion

Global metabolic profiling was used in this study because it has the potential to improve understanding of the mode of action of antibiotics on bacteria, both at the level of immediate on-target metabolic effects and by highlighting important areas of metabolism that are affected by off-target effects of the drugs. In this study, we chose to investigate the mode of action of trimethoprim. This is a basic molecule that acts as an indirect inhibitor of nucleic acid synthesis by blocking dihydrofolate reductase. E. coli K-12 was challenged with three concentrations of trimethoprim (0.2, 0.03 and 0.003 mg/L ≤ MIC) at two pH levels (5 and 7); pH was considered important as this mimics the typical pH range of healthy human urine.

Known direct effects of trimethoprim were readily observed: nucleotides decreased and glutamate increased when E. coli was exposed to this antibiotic, a direct consequence of trimethoprim blocking dihydrofolate reductase. This can be seen more clearly at pH 7, where the drug molecules have a higher permeability through the cell membrane. Whilst glutamate levels increased under this condition, and the same was observed for its products, ornithine and proline, a contrary response was seen for methionine, an amino acid which blocks glutamate dehydrogenase indirectly. All the nucleotides (except guanine) that were detected by this global metabolic profile were down-regulated more significantly at pH 7 compared with pH 5.

With respect to more indirect drug effects, guanine showed an unexpected response when challenged with the highest antibiotic dose at pH 5. Its level increased significantly, as did those of many of the amino acids that were measured, including histidine, leucine, valine, tyrosine and phenylalanine. By contrast, tryptophan, which shares the same precursor as tyrosine and phenylalanine, only showed a significant increase with trimethoprim concentration at pH 7.

The osmoprotectant sugar trehalose increased significantly at high trimethoprim levels at pH 7 and was anti-correlated with its substrate glucose; fructose was also heavily consumed under these conditions. The fate of these two monosaccharides was either to generate trehalose or, as would more normally be expected, to be catabolised via glycolysis to pyruvate. Under these conditions, pyruvate increased, as did its immediate products within the TCA cycle (citrate and malate), as well as alanine and lactic acid, which are produced directly from pyruvate.

In conclusion, as well as measuring the direct effect on nucleotide metabolism that trimethoprim is known to have, we also observed pH-dependent antibiotic effects on amino acid profiles and, most significantly, increased levels of trehalose, an osmoprotectant that is produced when bacteria are under stress. These results provide a wider view of the action of trimethoprim, and metabolomics has also indicated several alternative areas of metabolism to be investigated further by time-course metabolic profiling, targeted metabolite quantification, and fluxomic-based investigation.


2.5 References

Aagaard, J., Gasser, T., Rhodes, P. & Madsen, P. O. 1991. MICs of ciprofloxacin and trimethoprim for Escherichia coli: influence of pH, inoculum size and various body fluids. Infection, 19, S167-S169.
Allwood, J. W. & Goodacre, R. 2010. An introduction to liquid chromatography-mass spectrometry instrumentation applied in plant metabolomic analyses. Phytochemical Analysis, 21, 33-47.
AlRabiah, H., Correa, E., Upton, M. & Goodacre, R. 2013. High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy. Analyst, 138, 1363-1369.
Anderson, R., Groundwater, P., Todd, A. & Worsley, A. 2012. Antibacterial agents: chemistry, mode of action, mechanisms of resistance and clinical applications, Hoboken, Wiley.
Baerheim, A. 2001. Empirical treatment of uncomplicated cystitis. Keep it simple. British Medical Journal, 323, 1197-1198.
Begley, P., Francis-McIntyre, S., Dunn, W. B., Broadhurst, D. I., Halsall, A., Tseng, A., Knowles, J., Goodacre, R. & Kell, D. B. 2009. Development and performance of a gas chromatography−time-of-flight mass spectrometry analysis for large-scale nontargeted metabolomic studies of human serum. Analytical Chemistry, 81, 7038-7046.
BioCyc 2012. Candidatus Phytoplasma mali reaction: 2.1.1.45. Menlo Park: SRI International.
Breitling, R., Pitt, A. R. & Barrett, M. P. 2006. Precision mapping of the metabolome. Trends in Biotechnology, 24, 543-548.
Bushby, S. R. M. & Hitchings, G. H. 1968. Trimethoprim, a sulphonamide potentiator. British Journal of Pharmacology, 33, 72-90.
Correa, E., Sletta, H., Ellis, D. I., Hoel, S., Ertesvag, H., Ellingsen, T. E., Valla, S. & Goodacre, R. 2012. Rapid reagentless quantification of alginate biosynthesis in Pseudomonas fluorescens bacteria mutants using FT-IR spectroscopy coupled to multivariate partial least squares regression. Analytical and Bioanalytical Chemistry, 403, 2591-2599.
Csonka, L. N. & Hanson, A. D. 1991. Prokaryotic osmoregulation: genetics and physiology. Annual Review of Microbiology, 45, 569-606.
Dobson, P. D. & Kell, D. B. 2008. Carrier-mediated cellular uptake of pharmaceutical drugs: an exception or the rule? Nature Reviews Drug Discovery, 7, 205-220.
Dunn, W., Erban, A., Weber, R. M., Creek, D., Brown, M., Breitling, R., Hankemeier, T., Goodacre, R., Neumann, S., Kopka, J. & Viant, M. 2013. Mass appeal: metabolite identification in mass spectrometry-focused untargeted metabolomics. Metabolomics, 9, 44-66.
Dunn, W. B., Broadhurst, D., Begley, P., Zelena, E., Francis-McIntyre, S., Anderson, N., Brown, M., Knowles, J. D., Halsall, A. & Haselden, J. N. 2011. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols, 6, 1060-1083.
EcoCyc 2012a. Escherichia coli K-12 substr. MG1655 pathway: 5-aminoimidazole ribonucleotide biosynthesis I. Menlo Park: SRI International.
EcoCyc 2012b. Escherichia coli K-12 substr. MG1655 pathway: methionine biosynthesis I. Menlo Park: SRI International.
EcoCyc 2012c. Escherichia coli K-12 substr. MG1655 pathway: ornithine biosynthesis. Menlo Park: SRI International.
EcoCyc 2012d. Escherichia coli K-12 substr. MG1655 pathway: proline biosynthesis I. Menlo Park: SRI International.
EcoCyc 2012e. Escherichia coli K-12 substr. MG1655 pathway: superpathway of leucine, valine, and isoleucine biosynthesis. Menlo Park: SRI International.


EcoCyc 2012f. Escherichia coli K-12 substr. MG1655 pathway: tryptophan biosynthesis. Menlo Park: SRI International.
EcoCyc 2012g. Escherichia coli K-12 substr. MG1655 reaction: 2.6.1.66. Menlo Park: SRI International.
Fihn, S. D. 2003. Acute uncomplicated urinary tract infection in women. The New England Journal of Medicine, 349, 259-266.
Foster, J. W. 2004. Escherichia coli acid resistance: tales of an amateur acidophile. Nature Reviews Microbiology, 2, 898-907.
Gangjee, A., Jain, H. D. & Kurup, S. 2007. Recent advances in classical and non-classical antifolates as antitumor and antiopportunistic infection agents: part I. Anti-Cancer Agents in Medicinal Chemistry, 7, 524-542.
Gerich, J. E., Meyer, C., Woerle, H. J. & Stumvoll, M. 2001. Renal gluconeogenesis. Its importance in human glucose homeostasis. Diabetes Care, 24, 382-391.
Giaever, H. M., Styrvold, O. B., Kaasen, I. & Strom, A. R. 1988. Biochemical and genetic characterisation of osmoregulatory trehalose synthesis in Escherichia coli. Journal of Bacteriology, 170, 2841-2849.
Giannella, R., Zamcheck, N. & Broitman, S. A. 1972. Gastric acid barrier to ingested microorganisms in man: studies in vivo and in vitro. Gut, 13, 251-256.
Goodacre, R., Broadhurst, D., Smilde, A. K., Kristal, B. S., Baker, J. D., Beger, R., Bessant, C., Connor, S., Capuani, G. & Craig, A. 2007. Proposed minimum reporting standards for data analysis in metabolomics. Metabolomics, 3, 231-241.
Griebling, T. L. 2005. Urologic diseases in America project trends in resource use for urinary tract infections in women. Journal of Urology, 173, 1281-1287.
Hitchings, G. H. 1989. Selective inhibitors of dihydrofolate reductase. In Vitro Cellular & Developmental Biology, 25, 303-310.
Horlacher, R., Uhland, K., Klein, W., Ehrmann, M. & Boos, W. 1996. Characterisation of a cytoplasmic trehalase of Escherichia coli. Journal of Bacteriology, 178, 6250-6257.
Jia, J., Zhu, F., Ma, X. H., Cao, Z. W. W., Li, Y. X. X. & Chen, Y. Z. 2009. Mechanisms of drug combinations: interaction and network perspectives. Nature Reviews Drug Discovery, 8, 111-128.
Kahlmeter, G. 2003. An international survey of the antimicrobial susceptibility of pathogens from uncomplicated urinary tract infections: the ECO.SENS Project. Journal of Antimicrobial Chemotherapy, 51, 69-76.
Kanehisa, M. & Goto, S. 2000. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 28, 27-30.
Kang, L., Shaw, A. C., Xu, D., Xia, W., Zhang, J., Deng, J., Woldike, H. F., Liu, Y. & Su, J. 2011. Upregulation of MetC is essential for D-alanine-independent growth of an alr/dadX-deficient Escherichia coli strain. Journal of Bacteriology, 193, 1098-106.
Katzung, B. G. 1995. Basic and clinical pharmacology, New York, Appleton & Lange.
KEGG 2012a. Nicotinate and nicotinamide metabolism - Escherichia coli K-12 MG1655. Kyoto: Kanehisa Laboratories.
KEGG 2012b. Tryptophan metabolism - Escherichia coli K-12 MG1655. Kyoto: Kanehisa Laboratories.
Kim, D.-H., Jarvis, R. M., Allwood, J. W., Batman, G., Moore, R. E., Marsden-Edwards, E., Hampson, L., Hampson, I. N. & Goodacre, R. 2010. Raman chemical mapping reveals site of action of HIV protease inhibitors in HPV16 E6 expressing cervical carcinoma cells. Analytical and Bioanalytical Chemistry, 398, 3051-3061.
Kwon, Y. K., Higgins, M. B. & Rabinowitz, J. D. 2010. Antifolate-induced depletion of intracellular glycine and purines inhibits thymineless death in E. coli. ACS Chemical Biology, 5, 787-795.
Kwon, Y. K., Lu, W., Melamud, E., Khanam, N., Bognar, A. & Rabinowitz, J. D. 2008. A domino effect in antifolate drug action in Escherichia coli. Nature Chemical Biology, 4, 602-608.


Martens, H., Nielsen, J. P. & Engelsen, S. B. 2003. Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures. Analytical Chemistry, 75, 394-404.
Michal, G. 1998. Biochemical Pathways: an atlas of biochemistry and molecular biology, New York, Wiley-Spektrum.
O'Grady, F. & Lambert, H. P. 1997. Antibiotic and chemotherapy: anti-infective agents and their use in therapy, New York, Churchill Livingstone.
Quinlivan, E. P., McPartlin, J., Weir, D. G. & Scott, J. 2000. Mechanism of the antimicrobial drug trimethoprim revisited. The FASEB Journal, 14, 2519-2524.
Ronald, A. 2003. The etiology of urinary tract infection: traditional and emerging pathogens. Disease-a-Month, 49, 71-82.
Sakamoto, N., Kotre, A. M. & Savageau, M. A. 1975. Glutamate dehydrogenase from Escherichia coli: purification and properties. Journal of Bacteriology, 124, 775-783.
Salmond, C. V., Kroll, R. G. & Booth, I. R. 1984. The effect of food preservatives on pH homeostasis in Escherichia coli. Journal of General Microbiology, 130, 2845-2850.
Sangurdekar, D. P., Zhang, Z. & Khodursky, A. B. 2011. The association of DNA damage response and nucleotide level modulation with the antibacterial mechanism of the anti-folate drug trimethoprim. BMC Genomics, 12, 583.
Scheltema, R. A., Decuypere, S., T'Kindt, R., Dujardin, J.-C., Coombs, G. H. & Breitling, R. 2010. The potential of metabolomics for Leishmania research in the post-genomics era. Parasitology, 137, 1291-1302.
Slonczewski, J. L., Rosen, B. P., Alger, J. R. & Macnab, R. M. 1981. pH homeostasis in Escherichia coli: measurement by 31P nuclear magnetic resonance of methylphosphonate and phosphate. Proceedings of the National Academy of Sciences of the United States of America-Biological Sciences, 78, 6271-6275.
Stephens, J. C., Artz, S. W. & Ames, B. N. 1975. Guanosine 5'-diphosphate 3'-diphosphate (ppGpp): positive effector for histidine operon transcription and general signal for amino-acid deficiency. Proceedings of the National Academy of Sciences of the United States of America, 72, 4389-4393.
Strom, A. R. & Kaasen, I. 1993. Trehalose metabolism in Escherichia coli: stress protection and stress regulation of gene expression. Molecular Microbiology, 8, 205-210.
Sumner, L. W., Amberg, A., Barrett, D., Beale, M. H., Beger, R., Daykin, C. A., Fan, T. W. M., Fiehn, O., Goodacre, R., Griffin, J. L., Hankemeier, T., Hardy, N., Harnly, J., Higashi, R., Kopka, J., Lane, A. N., Lindon, J. C., Marriott, P., Nicholls, A. W., Reily, M. D., Thaden, J. J. & Viant, M. R. 2007. Proposed minimum reporting standards for chemical analysis. Metabolomics, 3, 211-221.
Voet, D. & Voet, J. G. 2004. Biochemistry, Hoboken, John Wiley & Sons.
Waterman, S. R. & Small, P. L. C. 1998. Acid-sensitive enteric pathogens are protected from killing under extremely acidic conditions of pH 2.5 when they are inoculated onto certain solid food sources. Applied and Environmental Microbiology, 64, 3882-3886.
Watson, D. G. 1999. Pharmaceutical analysis: a textbook for pharmacy students and pharmaceutical chemists, Edinburgh, Churchill Livingstone.
Wedge, D. C., Allwood, J. W., Dunn, W., Vaughan, A. A., Simpson, K., Brown, M., Priest, L., Blackhall, F. H., Whetton, A. D., Dive, C. & Goodacre, R. 2011. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Analytical Chemistry, 83, 6689-6697.
Wilks, J. C. & Slonczewski, J. L. 2007. pH of the cytoplasm and periplasm of Escherichia coli: rapid measurement by green fluorescent protein fluorimetry. Journal of Bacteriology, 189, 5601-5607.
Wilson, M. 2004. Microbial inhabitants of humans: their ecology and role in health and disease, Cambridge, Cambridge University Press.


Winder, C. L., Gordon, S. V., Dale, J., Hewinson, R. G. & Goodacre, R. 2006. Metabolic fingerprints of Mycobacterium bovis cluster with molecular type: implications for genotype-phenotype links. Microbiology, 152, 2757-2765.
Xu, Y. & Goodacre, R. 2012. Multiblock principal component analysis: an efficient tool for analysing metabolomics data which contain two influential factors. Metabolomics, 8, 37-51.
Yeagle, P. 1993. The membrane of cells, London, Academic Press.
Zhu, H., Hart, C. A., Sales, D. & Roberts, N. B. 2006. Bacterial killing in gastric juice - effect of pH and pepsin on Escherichia coli and Helicobacter pylori. Journal of Medical Microbiology, 55, 1265-1270.


2.6 Supplementary Information

2.6.1 Supplementary Materials and Methods

2.6.1.1 General Chemicals

Unless otherwise stated, all chemicals were obtained from Fisher Scientific Ltd. (Loughborough, UK), and all solvents and acids were obtained from Sigma-Aldrich (Gillingham, UK).

2.6.1.2 Microorganisms

Escherichia coli K-12 (MG1655) was used for this investigation and its genome is available (Blattner et al., 1997).

2.6.1.3 Media

Nutrient agar (NA) was prepared from a preparatory mixture (beef extract 3 g/L, peptone 5 g/L, NaCl 8 g/L and 12 g/L of agar no. 2) (Lab-M, Bury, UK) following the manufacturer's instructions (28 g in 1 L of deionised water) and then autoclaved (at 121 ºC and 15 psi for 15 min).

Lysogeny broth (LB) was prepared by dissolving 10 g of tryptone (Formedia, Hunstanton, UK), 5 g of yeast extract (Amersham Life Sciences, Cleveland, USA) and 10 g of NaCl in 1 L of reverse osmosis water, which was then autoclaved (121 ºC, 45 min, 15 psi).

Phage broth (ψ) was prepared by dissolving 0.5 g MgSO4.7H2O (VWR International Ltd, Lutterworth, UK), 0.74 g CaCl2.2H2O (VWR International Ltd, Lutterworth, UK), 1 g glucose, 5 g tryptone (Oxoid Ltd., Basingstoke, UK), 5 g yeast extract (Oxoid Ltd.) and 5 g Lab Lemco Powder (Oxoid Ltd.) in 800 mL of distilled water then adjusting to pH 7.2 with NaOH. This was made up to 1 L with distilled water and finally autoclaved (121 ºC, 45 min, 15 psi).

Nutrient broth (NB) was prepared from a preparatory mixture (beef extract 1 g/L, yeast extract 2 g/L, peptone 5 g/L, NaCl 5 g/L) (Lab-M) following the manufacturer's instructions (13 g in 1 L of H2O) and then autoclaved (121 ºC, 45 min, 15 psi).


To determine the minimum inhibitory concentration (MIC) of trimethoprim (Sigma-Aldrich), 100 mg was dissolved in 50 mL distilled water and 500 µL glacial acetic acid (Andrews, 2001). A series of trimethoprim dilutions (10, 8, 4, 2, 1.5, 1.4, 1.3, 1.2, 1.1, 1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.09, 0.08, 0.06, 0.05, 0.03, 0.02, 0.003, 0.002 mg/L) was then prepared with distilled water (Andrews, 2001). MIC is defined as “the lowest concentration of the agent that completely inhibits the growth of the test organism” (Madigan et al., 2000). Alternatively, it can be defined as the concentration of an agent producing 90% inhibition of the original inoculum (Aagaard et al., 1991). Within this chapter, the first definition of MIC applies, since the experimentally determined MIC was based upon 100% inhibition.
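The first MIC definition above (complete inhibition) can be illustrated with a minimal Python sketch. The function name `find_mic`, the "no growth" tolerance and all OD readings are hypothetical and chosen only for illustration; they are not data from this study.

```python
from typing import Dict, Optional


def find_mic(endpoint_od: Dict[float, float],
             blank_od: float,
             tolerance: float = 0.02) -> Optional[float]:
    """Return the lowest concentration (mg/L) giving complete inhibition.

    endpoint_od maps drug concentration -> OD at the end of incubation.
    A well counts as 'no growth' when its OD is within `tolerance` of the
    uninoculated blank (the tolerance is an assumption of this sketch).
    Concentrations are scanned from high to low, so the result is only
    accepted if every higher concentration is also inhibitory.
    """
    mic = None
    for conc in sorted(endpoint_od, reverse=True):
        if endpoint_od[conc] - blank_od > tolerance:
            break  # growth detected: stop at the last inhibitory concentration
        mic = conc
    return mic


# Invented endpoint readings for five wells of a dilution series.
readings = {0.4: 0.09, 0.3: 0.10, 0.2: 0.10, 0.1: 0.45, 0.05: 0.80}
print(find_mic(readings, blank_od=0.09))  # prints 0.2
```

Scanning from the highest concentration downwards avoids reporting a low concentration as the MIC when a single well fails to grow by chance.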

2.6.1.4 Growth characteristics

E. coli was streaked on a NA plate to obtain axenic colonies. Biomass was collected from single colonies to prepare 1 mL 20% [v/v] glycerol working inoculum stocks which were stored at -20 ºC. A 1 mL master stock of bacteria was also stored at -80 ºC (Figure S 2.1).

Gram staining was performed using a Sigma-Aldrich kit, and the stained bacterial cells were observed with a Zeiss LSM 510 META confocal microscope using the ×100 objective (Carl Zeiss Ltd., Welwyn Garden City, UK).

Starting growth condition. At the start of each experiment, 49 mL of medium was inoculated with 1 mL of working stock and incubated at 37 ºC in a shaking incubator at 200 rpm for 24 h. The overnight cultures (1 mL) were diluted with 49 mL of fresh medium and further incubated at 37 ºC, 200 rpm for 1 h. These new cultures were diluted with physiological saline (0.9% [w/v] NaCl) to 0.5 McFarland standard optical density (OD) at 600 nm using a Biomate 5 spectrophotometer (Thermo, Hemel Hempstead, UK) and used as experimental inocula (Figure S 2.1).

Estimation of bacterial biomass. In order to standardise the size of the inocula for growth curve experiments, 40 mL of the culture was diluted and washed twice with physiological saline (0.9% [w/v] NaCl). The bacterial turbidity was adjusted to the 0.5 McFarland standard (OD 0.1 ± 0.02 at 600 nm) using a Biomate 5 (Figure S 2.1). The 0.5 McFarland standard was prepared by adding 85 mL of 1% [w/v] H2SO4 to 0.5 mL of 1.175% [w/v] barium chloride dihydrate (BaCl2.2H2O), making the mixture up to 100 mL with deionised water and mixing well.
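Adjusting a washed culture to the 0.5 McFarland turbidity (OD 0.1 at 600 nm) is a C1V1 = C2V2 dilution. The sketch below assumes that OD is proportional to cell density over the range used; the function name and the example OD of 0.5 are hypothetical.

```python
from typing import Tuple


def dilution_volumes(od_measured: float,
                     od_target: float,
                     final_volume_ml: float) -> Tuple[float, float]:
    """Solve C1*V1 = C2*V2 using OD as a proxy for concentration.

    Returns (culture_ml, saline_ml): the volume of culture to take and the
    volume of physiological saline needed to reach final_volume_ml at
    od_target. Assumes OD scales linearly with cell density.
    """
    if od_measured <= 0 or od_target <= 0 or final_volume_ml <= 0:
        raise ValueError("OD values and volume must be positive")
    if od_measured < od_target:
        raise ValueError("culture is already more dilute than the target")
    culture_ml = od_target * final_volume_ml / od_measured
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical example: a culture reading OD 0.5 diluted to OD 0.1 in 10 mL.
culture, saline = dilution_volumes(od_measured=0.5, od_target=0.1,
                                   final_volume_ml=10.0)
print(f"{culture:.2f} mL culture + {saline:.2f} mL saline")
```

In practice the diluted suspension would still be re-measured, since the text specifies a tolerance of ± 0.02 around the target OD.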

The bacterial growth curves were measured as OD at 600 nm in a Bioscreen spectrophotometer (Labsystems, Basingstoke, UK). The Bioscreen was run with the following settings: 10 min preheating, incubation temperature 37 ºC, continuous medium shake, measurement interval 10 min and 24 h for the total experiment.

180 µL of LB at pH 5 and 7 (± 0.2) was inoculated with 10 µL of 0.5 McFarland standard inocula and 10 µL serial dilutions of trimethoprim in Bioscreen plates. The control samples consisted of 180 µL of LB at pH 5 and 7 (± 0.2) with 10 µL of 0.5 McFarland standard inocula and 10 µL distilled water. Incubation was for 18-24 h, as described above. Four pilot experiments were performed on E. coli K-12.

Each class in all Bioscreen experiments was prepared in five biological replicates, and all growth was conducted at 37 °C for 24 h.
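As an illustration of how the replicate growth curves described above might be summarised, the following sketch builds the measurement time axis (one reading every 10 min for 24 h) and averages replicate curves point by point. It assumes a reading is taken at time zero, which the text does not state; the function names and toy values are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def time_axis_min(interval_min: int = 10, total_h: int = 24) -> List[int]:
    """Measurement times in minutes, assuming a first reading at t = 0."""
    return list(range(0, total_h * 60 + 1, interval_min))


def summarise_replicates(
        curves: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return (mean, sample SD) of OD at each time point across replicates."""
    if len(curves) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    if len({len(c) for c in curves}) != 1:
        raise ValueError("all replicate curves must have the same length")
    return [(mean(point), stdev(point)) for point in zip(*curves)]


times = time_axis_min()
print(len(times))  # 145 readings per well under the stated assumption

# Toy example with two short replicate curves (invented OD values).
summary = summarise_replicates([[0.10, 0.20, 0.40], [0.12, 0.22, 0.44]])
for t, (m, sd) in zip(times, summary):
    print(t, round(m, 3), round(sd, 3))
```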

2.6.1.5 Pilot experiments

1. Comparison of growth curves between the different media. 190 µL of LB, ψ and NB media were inoculated with 10 µL of 0.5 McFarland standard inocula in Bioscreen plates.

2. Comparison of growth curves in LB media at pH 3, 5, 7 and 9. 190 µL of LB at pH 3, 5, 7 and 9 (± 0.2) were inoculated with 10 µL of 0.5 McFarland standard inocula in Bioscreen plates.

3. Determination of the drug MIC in neutral LB media (pH 7). 180 µL of LB were inoculated with 10 µL of 0.5 McFarland standard inocula and 10 µL serial dilutions of trimethoprim in Bioscreen plates. The control samples consisted of 180 µL of LB with 10 µL of 0.5 McFarland standard inocula and 10 µL of distilled water.

4. Challenge of E. coli with trimethoprim at different pH levels. 180 µL of LB at pH 5, 7 and 9 (± 0.2) were inoculated with 10 µL of 0.5 McFarland standard inocula and 10 µL serial dilutions of trimethoprim in Bioscreen plates. The control samples consisted of 180 µL of LB at pH 5, 7 and 9 (± 0.2) with 10 µL of 0.5 McFarland standard inocula and 10 µL of distilled water.


2.6.1.6 GC-MS

Chemicals for GC-MS analysis

All materials were purchased from Sigma-Aldrich unless otherwise stated. Pyridine (extra dry), hexane, methoxylamine hydrochloride, and N-methyl-N-trimethylsilyl-trifluoroacetamide (MSTFA) were obtained from Acros Organics (Loughborough, UK). The internal standards malonic acid-d2, succinic acid-d4, glycine-d5 and leucine-d4 were purchased from Sigma-Aldrich, as were methanol and water, which were of analytical grade or higher purity.

Sample preparation

Samples for FT-IR spectroscopy and GC-MS were collected from the same respective cultures. For GC-MS, samples inoculated at pH 9 were excluded from the analysis because the extremely strong effect of the drug at this pH level prevented the collection of adequate biomass for analysis. For the remaining conditions, 15 mL from each flask was collected and used for further experiments (Figure S 2.1).

Metabolic Quenching

15 mL from each flask of the overnight culture was collected, added to a double volume of cold 60% methanol (-48 ºC) and mixed quickly (Winder et al., 2008). The quenched culture was centrifuged for 10 min at 4800 g and -8 ºC. 2 mL of the supernatant was collected to assess the leakage of internal metabolites and the remainder was removed rapidly. The pellet was centrifuged for a further 2 min to remove residual supernatant. The cell pellets and collected supernatants were stored at -80 ºC until metabolite extraction and further analysis (see Figure S 2.1).

Metabolite extraction

Methanol was used as the extraction solvent. The biomass pellets were suspended in 1 mL of 80% methanol (-48 ºC), transferred to 2 mL tubes, flash frozen in liquid nitrogen and placed on wet ice. Once semi-defrosted, the samples were vortexed thoroughly for approximately 30 s.


The freeze-thaw and vortex cycle was repeated a further two times to maximise extraction of intracellular metabolites. The suspensions were centrifuged for 5 min at 13000 g and -9 ºC. The supernatants were transferred to clean 2 mL tubes and kept on dry ice. 500 µL of 80% methanol (-48 ºC) was added to the pellet and the whole procedure was repeated; the second extraction aliquot was combined with the first (on dry ice) and vortexed thoroughly (Figure S 2.1). For GC-MS samples, 825 µL of each extract (normalised to OD and made up with 80% methanol) was spiked with 100 µL of internal standard (0.3 mg/L succinic-d4 acid, malonic-d2 acid and glycine-d5 in HPLC grade water). Quality control (QC) samples were created by combining 60 µL from each sample and mixing thoroughly; the QC mix was divided into 4 QC samples, each containing 606 µL of the QC mix (the OD averaged volume for all experimental replicates). All samples were dried for 16 h using a speed vacuum concentrator operated at 30 ºC (Eppendorf 5301 concentrator, Eppendorf, Cambridge, UK).
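The text states that each extract was normalised to OD and made up to 825 µL with 80% methanol, but it does not give the exact scaling rule. The sketch below shows one plausible scheme, which is an assumption rather than the procedure actually used: the extract volume is taken in inverse proportion to the culture OD, so that the sample with the lowest OD contributes the full 825 µL. Sample names and OD values are invented.

```python
from typing import Dict, Tuple


def normalised_volumes(sample_od: Dict[str, float],
                       total_ul: float = 825.0) -> Dict[str, Tuple[float, float]]:
    """Return {sample: (extract_ul, methanol_ul)} for OD normalisation.

    Assumed rule (not stated in the source): extract volume scales as
    OD_min / OD_sample, and 80% methanol makes each aliquot up to total_ul,
    so every aliquot represents the same amount of biomass.
    """
    if not sample_od or min(sample_od.values()) <= 0:
        raise ValueError("OD values must be positive")
    od_min = min(sample_od.values())
    volumes = {}
    for name, od in sample_od.items():
        extract_ul = total_ul * od_min / od
        volumes[name] = (extract_ul, total_ul - extract_ul)
    return volumes


# Invented ODs for a control and a drug-challenged culture.
for name, (extract, methanol) in normalised_volumes(
        {"pH7_control": 1.2, "pH7_0.2mgL": 0.6}).items():
    print(f"{name}: {extract:.1f} µL extract + {methanol:.1f} µL 80% methanol")
```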

GC-MS sample derivatisation

A two-step chemical derivatisation was used because many metabolite classes within central metabolism are non-volatile. Before GC-MS analysis, carbonyl moieties were first substituted via methoxyamination and the products were then silylated. In order to remove residual condensation, samples removed from -80 ºC storage were placed in a speed vacuum concentrator for 30 min. The extracts were then dissolved in 50 µL of 20 mg/mL O-methoxylamine hydrochloride in pyridine, vortexed and incubated at 60 ºC for 30 min in a dry-block heater. Next, 50 μL of MSTFA was added and the extracts were mixed again and incubated at 60 ºC for 30 min. Then 20 μL of retention index marker solution (0.3 mg/mL docosane, nonadecane, decane, dodecane, and pentadecane in pyridine) was added before 15 min of centrifugation at 15800 g. The supernatant (100 μL) was transferred to 2 mL amber glass GC-MS vials fitted with 200 µL inserts prior to analysis. With the GC-MS analytical method employed, a throughput of 40 samples per day is possible. To achieve a higher level of sample chemical stability, randomised batches of 40 samples per day were therefore derivatised across the analysis period, with QC samples also derivatised across multiple batches so as to provide a measure of derivatisation (technical) and instrument (analytical) error.
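The n-alkane markers added during derivatisation (decane, dodecane, pentadecane, nonadecane and docosane) allow a retention index to be assigned to each peak. The source does not state which formula was used; the sketch below assumes the linear (van den Dool and Kratz) retention index commonly applied to temperature-programmed GC, and the marker retention times are invented.

```python
from bisect import bisect_right
from typing import Dict

# Carbon numbers of the retention index markers named in the text.
ALKANE_CARBONS = {"decane": 10, "dodecane": 12, "pentadecane": 15,
                  "nonadecane": 19, "docosane": 22}


def retention_index(rt: float, marker_rt: Dict[int, float]) -> float:
    """Linear retention index from bracketing n-alkane markers.

    marker_rt maps alkane carbon number -> retention time (same units as rt).
    RI = 100 * [n + (N - n) * (rt - t_n) / (t_N - t_n)], where n and N are
    the carbon numbers of the markers eluting just before and after the peak.
    """
    markers = sorted(marker_rt.items(), key=lambda item: item[1])
    times = [t for _, t in markers]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the marker range")
    hi = min(bisect_right(times, rt), len(times) - 1)
    (n_lo, t_lo), (n_hi, t_hi) = markers[hi - 1], markers[hi]
    return 100.0 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))


# Invented marker retention times in seconds, keyed by carbon number.
markers = {10: 300.0, 12: 400.0, 15: 550.0, 19: 750.0, 22: 900.0}
print(retention_index(450.0, markers))  # prints 1300.0
```

Peaks eluting before decane or after docosane are rejected here rather than extrapolated, which is a design choice of this sketch.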

Instrumentation

In order to achieve unbiased statistical analysis, it was important to prevent analytical drift in relation to the sample experimental class. This was done by analysing the samples over separate randomised analytical blocks. The QC samples were analysed across the analytical batches at regular intervals. To perform initial column conditioning to the sample matrix, five injections of QC sample were performed, followed by five injections of experimental samples and then a QC injection. At the end of the block run, once all experimental samples had been analysed, three QC injections were made. This procedure was repeated for each daily block of samples ('derivatisation blocks') until sample analysis was completed. GC-TOF/MS system stability was assessed across both derivatisation batches, run as a single experimental block. This was done by calculating the relative standard deviation (RSD) of each feature detected within the QC samples and removing metabolite features with an RSD greater than 25%, which were deemed unreliably detected. In accordance with the methods of Begley et al. (2009) and Wedge et al. (2011), the equipment used for the GC-TOF/MS analyses included a LECO Pegasus III TOF/MS operated in GC-MS mode (Leco Corp., St. Joseph, MO, USA), with a Gerstel MPS-2 autosampler (Gerstel, Baltimore, MD, USA) and an Agilent 6890N GC × GC with a split/splitless injector and Agilent LPD split-mode inlet liner (Agilent Technologies, Stockport, UK). A 30 m × 0.25 mm × 0.25 μm VF17-MS bonded phase capillary column (Varian, Oxford, UK) was used at a constant helium carrier gas flow of 1 mL/min. The temperature program was as follows: 4 min hold at 70 ºC, 20 ºC/min to 300 ºC, 4 min hold. A split ratio of 4:1 was used for sample injections of 1 μL. The operational temperature of the injector was 280 ºC, and after 30 s, a 25 mL/min gas saver flow was used, and the transfer line was held at 240 ºC. The mass spectrometer had a source temperature of 220 ºC and was operated at 70 eV ionisation energy, and acquired m/z 45-600 at 20 Hz.
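The injection sequence described above can be sketched as follows. This is one reading of the text (five conditioning QC injections, a QC after every five experimental samples while samples remain, and three closing QC injections); the function name, the fixed random seed and the sample labels are hypothetical.

```python
import random
from typing import List, Sequence


def build_run_order(samples: Sequence[str], seed: int = 0,
                    lead_qc: int = 5, group: int = 5,
                    tail_qc: int = 3) -> List[str]:
    """Randomise samples and interleave QC injections within one block."""
    rng = random.Random(seed)      # fixed seed so the order is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)          # randomise order with respect to class

    order = ["QC"] * lead_qc       # column conditioning injections
    for start in range(0, len(shuffled), group):
        order.extend(shuffled[start:start + group])
        if start + group < len(shuffled):
            order.append("QC")     # QC between groups of samples
    order.extend(["QC"] * tail_qc)  # closing QC injections
    return order


# Twelve invented sample labels for one analytical block.
block = [f"sample_{i:02d}" for i in range(1, 13)]
for position, name in enumerate(build_run_order(block), start=1):
    print(position, name)
```

Randomising before interleaving keeps the QC spacing regular while decoupling injection position from experimental class.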

GC-TOF/MS data pre-processing and analysis

Raw GC-TOF/MS data were processed using the exact method of Begley et al. (2009) and Wedge et al. (2011), which was based on the 'Compare' capability of LECO's ChromaTOF v3.25 software (Leco Corp., St. Joseph, MO, USA). In this method, a set of reference spectra is compiled for a list of QC representative metabolites, including QC samples from within each analytical block. All later samples are then searched against the reference table that has been created. For the reference table to be unbiased, all peaks evident in a representative QC sample that were within an appropriate range for signal/noise (S/N) ratio and chromatographic peak width could be included. Peak identities were assigned, whenever possible, based on similarity matching to mass spectral entries in the NIST (National Institute of Standards and Technology) library and GMD (Golm Metabolome Database) (Kopka et al., 2005) for putative level identification, or by comparing the mass spectrum and retention index with an in-house generated metabolite library of authentic reference compounds. A peak width of 1.8 s and a minimum S/N ratio of 10 were applied as the peak detection parameters in ChromaTOF. Relative quantification for each metabolite was performed on the basis of internal standards, and retention indices were calculated using retention markers. The quantified peak areas for each metabolite within each sample, along with their metabolite identities and the sample information, were transferred to an XY matrix in Microsoft Excel. Prior to statistical analysis, QC samples were used as in the work of Wedge et al. (2011) to provide data quality assurance by evaluating and eliminating mass features that showed high deviation within QC samples.
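The QC-based quality assurance step can be illustrated with a short sketch that computes the relative standard deviation (RSD) of each feature across the QC injections and keeps only features at or below the 25% threshold given in the Instrumentation section. The feature names and peak areas are invented, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence


def rsd_percent(values: Sequence[float]) -> float:
    """Relative standard deviation (%) using the sample standard deviation."""
    centre = mean(values)
    if centre == 0:
        return float("inf")  # undefined RSD: treat the feature as unreliable
    return 100.0 * stdev(values) / centre


def reliable_features(qc_areas: Dict[str, Sequence[float]],
                      max_rsd: float = 25.0) -> List[str]:
    """Names of features whose QC RSD does not exceed max_rsd."""
    return [name for name, areas in qc_areas.items()
            if rsd_percent(areas) <= max_rsd]


# Invented QC peak areas for two features across four QC injections.
qc = {
    "alanine": [100.0, 105.0, 95.0, 102.0],     # tight: retained
    "unknown_17": [50.0, 120.0, 20.0, 90.0],    # scattered: removed
}
print(reliable_features(qc))  # prints ['alanine']
```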


[Figure S 2.1: workflow diagram, not reproduced. Text recoverable from the diagram:

Culture preparation: bacteria streaked on an NA plate; axenic colonies collected into 1 mL of 20% glycerol and stored at -20 ºC; 1 mL transferred to 49 mL of LB medium for 24 h, then 1 mL to 49 mL of LB medium for 1 h; 40 mL centrifuged and washed twice with normal saline, then diluted to the exact concentration (C1V1 = C2V2).

(1) Bioscreen plate: 180 µL LB with 10 µL drug (test) or 10 µL water (control); 18 h incubation in a Bioscreen spectrophotometer (Labsystems, Basingstoke, UK) at 37 ºC with continuous medium shake, OD 600 nm, measurement interval 10 min. Flask cultures: 18 mL LB with 1 mL drug (test) or 1 mL water (control); 18 h incubation. Eight classes: control, 0.003, 0.03 and 0.2 mg/L trimethoprim at each of pH 5 and pH 7.

(2) FT-IR: cells washed with normal saline, reconstituted in 1 mL, OD measured for normalisation, and 20 µL spotted onto a ZnSe FT-IR plate. High throughput screening (HTS) FT-IR spectroscopic analysis was carried out using a Bruker Equinox 55 infrared spectrometer (Bruker Ltd., Coventry, UK) equipped with an HTX™ module according to the method of Winder et al. (2006). FT-IR spectra were recorded directly from the dried cell biomass in transmission mode.

(3) GC-MS: culture added to pre-chilled (-48 ºC) 60% methanol with quick mixing; centrifugation at 4800 g (-8 ºC) for 10 min; 2 mL of supernatant taken (leakage), the rest removed and the pellet collected; pellet transferred to 2 mL tubes in pre-chilled (-48 ºC) 80% methanol; flash freezing in liquid nitrogen for 1 min, defrosting on wet ice and vortexing for 30 s (freeze/thaw cycle, 3 times); after the third cycle, centrifugation at 13000 g (-9 ºC) for 5 min and retrieval of the supernatant to a clean tube; 500 µL of 80% methanol added to the remaining pellet and the whole procedure repeated; the two aliquots combined, mixed and centrifuged; normalisation with 80% methanol depending on the OD; internal standards added; drying overnight; two-step derivatisation with (1) methoxyamine hydrochloride and (2) MSTFA. GC-TOF/MS profiling was performed with a LECO Pegasus III EI-TOF/MS (Begley et al., 2009). Deconvolution of GC-MS profiles was performed within LECO ChromaTOF; metabolite identifications were performed by MS library matching to the NIST05, Max Planck Golm Metabolome Database, and in-house libraries.]

Figure S 2.1: General scheme of sample preparation including (1) analysis by Bioscreen to determine the MIC of trimethoprim and produce the growth curves of E. coli K-12 at pH 5 and 7 with and without drug challenge. (2) FT-IR analysis of samples after washing with normal saline. (3) GC-MS analysis of samples after quenching and extraction using 60% and 80% cold (-48 ºC) methanol respectively.


2.6.1.7 Trimethoprim quantification from E. coli using LC-MS

All materials were purchased from Sigma-Aldrich unless otherwise stated.

Sample preparation

18 mL of LB medium at pH 5 and 7 in 100 mL conical flasks was inoculated with 1 mL of 0.5 McFarland standard inoculum, and 1 mL of 0.8 mg/L of trimethoprim (four times higher than the MIC) was added at one of two time points (time = 0 or 5 h). Control samples were identical except that the 1 mL of trimethoprim was replaced with 1 mL of distilled water (the dilution solvent of the drug) at the beginning of the experiment. Each condition was prepared in triplicate, except the control samples, which were prepared in six replicates so that three could be spiked with drug after extraction (explained further below). All samples were incubated for 6 h at 37 ºC and 200 rpm. 15 mL from each flask was collected and used for further experiments (Figure S 2.2).

Trimethoprim extraction

Quenching of metabolism was identical to that used for the GC-MS samples (see above), and the metabolite and drug extraction was similar to that applied for the GC-MS samples. However, the LC-MS samples were not normalised to the OD 600 reading by dilution with 80% methanol prior to sample drying. Instead, all samples were prepared and analysed in an identical fashion, and the detected relative concentration levels of trimethoprim were then normalised to the OD 600 reading.

Instrumentation

For LC-MS analysis, the samples were reconstituted in 100 µL of water, vortex mixed and centrifuged for 15 min at 10000 g. The supernatants were transferred to analytical vials with 200 µL fixed inserts, stored in the autosampler at 5 ºC and analysed within 24 h of reconstitution in LC-MS positive ionisation mode. Ultra high performance LC separations were performed on a Thermo Accela UHPLC system according to the following method. A Hypersil Gold C18 reversed-phase column (100 mm × 2.1 mm, 1.9 µm particle size) was used (Thermo-Fisher Ltd., Hemel Hempstead, UK). The UHPLC was operated at a flow rate of 400 µL/min and the column was maintained at a temperature of 50 ºC. The gradient programme for solvent A (HPLC water and 0.1% formic acid) and solvent B (HPLC methanol and 0.1% formic acid) was as follows: 100% A 0–1 min, 100% A–100% B 1–12 min, 100% B 12–20 min, 100% A 20–22 min. Prior to sample analysis, a new LC column was conditioned with the linear gradient conditions for 40 min at a flow rate of 400 µL/min. A sample injection volume of 10 µL was employed in wasteless mode. After each sample analysis, the UHPLC system was conditioned with the initial gradient solvent conditions, thus returning the system to a clean state prior to the analysis of the next sample. Autosampler syringe and line washes were performed with 80% methanol. The Thermo LTQ-Orbitrap XL MS system was operated using Xcalibur software (Thermo-Fisher Ltd.) exactly following the method described by Dunn et al. (2008). Prior to the analytical batch runs, the LTQ and Orbitrap were tuned to optimise conditions for the detection of ions in the mid detection range of m/z 100−1000 and calibrated according to the manufacturer's predefined methods in both ESI polarities with the manufacturer's recommended calibration mixture, consisting of caffeine, sodium dodecyl sulfate, sodium taurocholate, the tetrapeptide MRFA and Ultramark 1621. The Orbitrap was operated in full scan mode at a mass resolution of 30000 (FWHM defined at m/z 400) and a scan speed of 0.4 s. The ESI conditions were optimised to allow efficient ionisation and ion transmission of trimethoprim without causing in-source fragmentation, leading to the detection of an intact parent mass ion ([M+H]+) with high sensitivity.
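As an illustration of the gradient programme given above, the following minimal sketch returns the percentage of solvent B at any time in the 22 min run. It assumes a linear ramp between 1 and 12 min and an instantaneous return to 100% A after 20 min; the function name and the treatment of the segment boundaries are assumptions made for illustration only.

```python
def percent_b(t_min):
    """Percentage of solvent B at t_min minutes into the 22 min programme.

    Assumes a linear ramp from 1 to 12 min and a step back to 100% A
    immediately after 20 min (boundary handling is an assumption).
    """
    if t_min < 0 or t_min > 22:
        raise ValueError("time outside the 0-22 min programme")
    if t_min <= 1:
        return 0.0          # 100% A hold
    if t_min <= 12:
        return 100.0 * (t_min - 1) / 11  # linear ramp A -> B
    if t_min <= 20:
        return 100.0        # 100% B hold
    return 0.0              # re-equilibration at 100% A


for t in (0.5, 6.5, 15, 21):
    print(t, percent_b(t))
```

Under these assumptions the mobile phase is half solvent B at 6.5 min, the midpoint of the ramp.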

A trimethoprim reference standard was run for the purpose of constructing a twenty-point calibration curve (0.05, 0.06, 0.08, 0.09, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.8 mg/L) (Figure S 2.3). The response of the reference standard in solution was compared with its response when spiked into an intracellular sample matrix; no major changes in the measured peak area for trimethoprim ([M+H]+, m/z 291.143) were observed. The intracellular samples and reference standards were run in a completely randomised order. Blank samples were interspersed throughout the analytical run to allow carry-over to be assessed. The peak areas were calculated for the extracted ion chromatogram of trimethoprim ([M+H]+, m/z 291.143) and imported into Microsoft Excel.


Data processing

Using Microsoft Excel 2007, the peak areas obtained for the trimethoprim reference standard were plotted against the known concentrations to produce a calibration curve. The extracted peak areas for trimethoprim within the intracellular extracts were then interpolated from the calibration curve to give the concentration of trimethoprim within each intracellular extract. As a final step, the intracellular concentrations were normalised to the OD 600 reading taken from the original bacterial culture, thus providing a normalised relative quantification value that could be compared between intracellular extracts and experimental conditions.
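The data processing steps above were carried out in Excel; the same logic can be sketched in a few lines of Python. This is a minimal sketch assuming a straight-line (ordinary least-squares) calibration and normalisation by simple division by the OD 600 reading. The function names, standard concentrations, peak areas and OD value are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Read a concentration (mg/L) off the calibration line."""
    return (area - intercept) / slope


def normalise_to_od(concentration, od600):
    """Divide by the culture OD 600 reading (assumed normalisation)."""
    return concentration / od600


# Hypothetical standards (mg/L) and their peak areas.
standards = [0.1, 0.5, 1.0, 1.5]
areas = [8.0e6, 3.6e7, 7.1e7, 1.06e8]

slope, intercept = fit_line(standards, areas)
conc = concentration_from_area(2.9e7, slope, intercept)
relative_level = normalise_to_od(conc, od600=0.8)
print(round(conc, 3), round(relative_level, 3))
```

Reading concentrations off the fitted line is only meaningful within the range covered by the standards, which is why the sample peak area in this example falls between the lowest and highest standard.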


[Figure S 2.2: flow diagram of the sample preparation workflow. Text box in the figure:]

Intracellular levels of trimethoprim were relatively quantified by LC-ESI+/MS (UHPLC-LTQ-Orbitrap XL MS, m/z 100-1000, calibrated following the manufacturer's recommendations). 10 µL injections were run on a water-methanol (0.1% formic acid) reversed-phase gradient (Hypersil Gold C18, 100 × 2.1 mm, 1.9 µm particle size; Thermo-Fisher Ltd.). A trimethoprim reference standard was run for the purpose of constructing a twenty-point calibration curve. Peak areas were extracted for the protonated adduct of trimethoprim and normalised to OD within Microsoft Excel.

Figure S 2.2: General scheme of sample preparation including (1) analysis by Bioscreen to produce the growth curves of E. coli K-12 at pH 5 and 7 after challenge with 0.8 mg/L of trimethoprim added at two time points: (I) at the beginning of the lag phase (time = 0 h) and (II) at the mid-exponential phase (time = 5 h). (2) LC-MS analysis of sample extracts, for relative quantification of the intracellular drug levels after quenching and extraction using 60% and 80% cold (-48 ºC) methanol respectively.


[Figure S 2.3: plot of peak area (0 to 1.4 × 10^8) against concentration (0 to 2 mg/L); R² = 0.9881.]

Figure S 2.3: Calibration curve of trimethoprim concentration built from 20 different gradient concentrations of trimethoprim.


2.6.2 Supplementary Results

[Figure S 2.4: plot of optical density (OD 600 nm, 0 to 1.2) against time (0 to 25 hours). Legend: lysogeny broth (LB), nutrient broth (NB), phage broth (Ψ).]

Figure S 2.4: Growth curves of E. coli K-12 in three different media. The blue plot indicates LB; red NB and green Ψ.

[Figure S 2.5: plot of optical density (OD 600 nm, 0 to 1.2) against time (0 to 25 hours). Legend: pH 3, pH 5, pH 7, pH 9.]

Figure S 2.5: Growth curves of E. coli K-12 at four different pH values in the same medium (LB). The blue plot indicates pH 3; red pH 5; green pH 7 and purple pH 9.


[Figure S 2.6: three micrograph panels (a), (b) and (c), each with a 20 µm scale bar.]

Figure S 2.6: Optical microscopic image (×100 magnification) of E. coli K-12 inoculated in LB medium at different pH levels. (a) pH 5; (b) pH 7; (c) pH 9.

[Figure S 2.7: plot of optical density (OD 600 nm, 0 to 1.0) against time (0 to 25 hours). Legend: control, 8 mg/L, 2 mg/L, 0.3 mg/L, 0.2 mg/L, 0.03 mg/L, 0.003 mg/L.]

Figure S 2.7: Growth curves of E. coli K-12 (at pH 7 in LB) exposed to different concentrations of trimethoprim. Blue indicates control samples (0 mg/L); red 8 mg/L; green 2 mg/L; purple 0.3 mg/L; turquoise 0.2 mg/L; orange 0.03 mg/L and light blue 0.003 mg/L.


[Figure S 2.8: panel (a), chemical structure; panels (b) to (d), plots of optical density (OD 600 nm) against time (0 to 30 hours). Legend for each plot: control, 0.003 mg/L, 0.03 mg/L, 0.2 mg/L.]

Figure S 2.8: (a) Chemical structure of trimethoprim (blue circles show the main ionisation points on the structure in acidic media). Growth curves of E. coli K-12 exposed to different concentrations of trimethoprim. Blue indicates control samples (0 mg/L); red 0.003 mg/L; green 0.03 mg/L and purple 0.2 mg/L at (b) pH 9, (c) pH 7 and (d) at pH 5.

[Figure S 2.9: four panels (a) to (d), each a plot of absorbance (arbitrary units) against wavenumber (4000 to 500 cm-1).]

Figure S 2.9: FT-IR spectra obtained from E. coli K-12. (a) After exposure to four concentrations of trimethoprim (0.2, 0.03, 0.003 and 0 mg/L) at three different pH values (pH 5, 7 and 9). There were six biological replicates for each condition; each replicate was analysed three times, totalling 18 spectra for each condition (total number of spectra = 216). (b) After exposure to different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) at pH 9 (total number of spectra = 72). (c) After exposure to different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) at pH 7 (total number of spectra = 72). (d) After exposure to different concentrations of trimethoprim (0, 0.003, 0.03, 0.2 mg/L) at pH 5 (total number of spectra = 72).


[Figure S 2.10: two panels (a) and (b), each a plot of absorbance (arbitrary units) against wavenumber (4000 to 500 cm-1).]

Figure S 2.10: (a) FT-IR spectra obtained from E. coli K-12 after exposure to four concentrations of trimethoprim (0.2, 0.03, 0.003 and 0 mg/L) at two different pH values (5 and 7). (b) FT-IR spectra after removal of the CO2 band at ≈ 2350 cm-1 and EMSC scaling (Martens et al., 2003).

[Figure S 2.11: four PCA scores plot panels (a) to (d).]

Figure S 2.11: FT-IR multi-block PCA scores plot showing the distribution of samples with different pH levels at different drug concentrations: (a) 0 mg/L, (b) 0.003 mg/L, (c) 0.03 mg/L and (d) 0.2 mg/L.


[Figure S 2.12: metabolic pathway map with embedded bar charts of normalised peak area for each sample class (pH 5 and pH 7; control, 0.003, 0.03 and 0.2 mg/L trimethoprim). Metabolites labelled in the charts: valine, proline, thymine, tyrosine, ornithine, histidine, tryptophan, phenylalanine, nicotinamide, leucine, guanine, inosine, glutamic acid, adenine, methionine, uracil, lysine, glucose, alanine, fructose, isoleucine, malic acid, lactic acid, citric acid, trehalose and aspartic acid.]

Figure S 2.12: KEGG metabolic pathway of E. coli K-12 MG1655 highlighting significant metabolites with their relative levels subjected to different concentrations of trimethoprim at different pH levels (depicted in Materials and Methods). For significantly changed metabolites between pH and drug levels see Table S 2.2.

Table S 2.1: List of metabolites detected by GC-MS after extraction from control and stressed E. coli K-12 with trimethoprim at two different pH levels (5 and 7). See additional information on the attached CD.

Table S 2.2: ANOVA analysis of the significantly changed metabolites between different conditions (attached CD).


2.6.3 Supplementary References

Aagaard, J., Gasser, T., Rhodes, P. & Madsen, P. O. 1991. MICs of ciprofloxacin and trimethoprim for Escherichia coli: influence of pH, inoculum size and various body fluids. Infection, 19, S167-S169.

Andrews, J. M. 2001. Determination of minimum inhibitory concentrations. Journal of Antimicrobial Chemotherapy, 48, 5-16.

Begley, P., Francis-McIntyre, S., Dunn, W. B., Broadhurst, D. I., Halsall, A., Tseng, A., Knowles, J., Goodacre, R. & Kell, D. B. 2009. Development and performance of a gas chromatography−time-of-flight mass spectrometry analysis for large-scale nontargeted metabolomic studies of human serum. Analytical Chemistry, 81, 7038-7046.

Blattner, F. R., Plunkett, G., Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B. & Shao, Y. 1997. The complete genome sequence of Escherichia coli K-12. Science, 277, 1453-1462.

Dunn, W. B., Broadhurst, D., Brown, M., Baker, P. N., Redman, C. W. G., Kenny, L. C. & Kell, D. B. 2008. Metabolic profiling of serum using ultra performance liquid chromatography and the LTQ-Orbitrap mass spectrometry system. Journal of Chromatography B, 871, 288-298.

Kopka, J., Schauer, N., Krueger, S., Birkemeyer, C., Usadel, B., Bergmuller, E., Dormann, P., Weckwerth, W., Gibon, Y., Stitt, M., Willmitzer, L., Fernie, A. R. & Steinhauser, D. 2005. GMD@CSB.DB: the Golm metabolome database. Bioinformatics, 21, 1635-1638.

Madigan, M., Martinko, J. & Parker, J. 2000. Brock biology of microorganisms, London, Prentice Hall International.

Martens, H., Nielsen, J. P. & Engelsen, S. B. 2003. Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures. Analytical Chemistry, 75, 394-404.

Wedge, D. C., Allwood, J. W., Dunn, W., Vaughan, A. A., Simpson, K., Brown, M., Priest, L., Blackhall, F. H., Whetton, A. D., Dive, C. & Goodacre, R. 2011. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Analytical Chemistry, 83, 6689-6697.

Winder, C. L., Dunn, W. B., Schuler, S., Broadhurst, D., Jarvis, R., Stephens, G. M. & Goodacre, R. 2008. Global metabolic profiling of Escherichia coli cultures: an evaluation of methods for quenching and extraction of intracellular metabolites. Analytical Chemistry, 80, 2939-2948.


Chapter Three

3 Chapter Three

High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy

Haitham AlRabiah1, Elon Correa1, Mathew Upton2,3 and Royston Goodacre1*

1School of Chemistry and Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK

2School of Medicine, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PL, UK

3Current address: Plymouth University Peninsula Schools of Medicine and Dentistry, Plymouth University, Drake Circus, Plymouth, PL4 8AA, UK

*Correspondence to Roy Goodacre: [email protected]

This chapter constitutes a published research article:

AlRabiah, H., Correa, E., Upton, M. & Goodacre, R. 2013. High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy. Analyst, 138, 1363-1369.

Elon Correa participated in data analysis and Mathew Upton provided the bacterial samples studied in this work. Royston Goodacre contributed to this work through guiding and supervising the investigation.


Abstract

Fourier transform infrared (FT-IR) spectroscopy is an established rapid whole-organism fingerprinting method that generates metabolic fingerprints from bacteria that reflect the phenotype of the microorganism under investigation. However, whilst FT-IR spectroscopy is fast (typically 10 s − 1 min per sample), the approaches for microbial sample preparation can be time consuming as plate culture or shake flasks are used for growth of the organism. We report a new approach that allows micro-cultivation of bacteria from low volumes (typically 200 µL) to be coupled with FT-IR spectroscopy. This approach is fast and easy to perform and gives equivalent data to the lengthier and more expensive shake flask cultivations (sample volume = 20 mL). With this micro-culture approach, we also demonstrate high reproducibility of the metabolic fingerprints. The approach allowed separation of different isolates of Escherichia coli involved in urinary tract infection, including members of the globally disseminated ST131 clone, with respect to both genotype and resistance or otherwise to the antibiotic ciprofloxacin.


3.1 Introduction

One of the most common infections in humans, and in females in particular, is urinary tract infection (UTI). It is estimated that, worldwide, UTI will affect approximately 50% of women at some point in their lives (Fihn, 2003; Griebling, 2005). Of these, approximately 50% will have recurrent UTI and some of these women will suffer from chronic UTIs (Scholes et al., 2000). Escherichia coli is the microorganism most commonly responsible for UTI, both community-acquired and nosocomial (acquired whilst in hospital) (Baerheim, 2001). In an international survey of midstream urine samples taken at 252 centres in 17 countries, it was found that 77% of all isolates, 80% of general infections and 40% of nosocomial infections could be attributed to E. coli (Kahlmeter, 2003; Ronald, 2003). More recent molecular epidemiological analyses indicate that a relatively limited number of clones of uropathogenic E. coli (UPEC) cause the majority of UTI and that members of these clones are often multi-drug resistant (Lau et al., 2008). A particularly important clone is sequence type (ST)131, which is globally disseminated (Rogers et al., 2011).

In order to identify UTI pathogens, traditional biochemical methods based on growth and nutritional properties are used. However, in general these are protracted (being dependent on the growth rate of the pathogen), which can compromise the effectiveness of treatment of infections. Whilst recent molecular techniques have led to more rapid solutions for characterising microorganisms, they tend to rely on already known DNA sequences for identification, and the costs incurred by such methods, coupled with the highly specialised equipment required, render them impractical for use in routine laboratories (Kastanos et al., 2009; Winder and Goodacre, 2004). Therefore, fast, accurate and automated methods that are relatively inexpensive need to be developed. The ideal technique would identify sources of infection effectively and would also allow differentiation to sub-species level for epidemiological purposes. One possible solution for this so-called whole-organism fingerprinting is Fourier transform infrared (FT-IR) spectroscopy, which is an effective technique for bacterial identification and discrimination. The advantages of this method are that it is non-destructive and can create distinct spectral fingerprints for different microorganisms (Helm et al., 1991; Naumann et al., 1991; Meyers, 2000). The FT-IR spectra generated from bacteria can be used to examine cell components such as proteins, nucleic acids, carbohydrates and lipids; the spectral features arise from the vibrational modes of the different functional groups within these biochemical classes (Stuart, 2004). Research has shown that FT-IR spectroscopy, alongside multivariate data analysis, can give quick, user-friendly and relatively inexpensive screening with the ability to differentiate bacteria at different taxonomic levels (Garip et al., 2007; Goodacre et al., 1996; Lin et al., 2005; Mariey et al., 2001; Maquelin et al., 2002).

Current methods for FT-IR spectroscopy require that the bacteria are grown and then sampled, and this can be labour intensive when many different bacterial isolates need to be analysed. Whilst Naumann and colleagues have increased the speed of analysis using micro-cultivation and infrared analysis via FT-IR microscopy, this requires expensive instrumentation and quite detailed image analysis for automation (Choo-Smith et al., 2001). Thus, the aim of this study was to develop a high-throughput methodology that readily couples bacterial growth requirements with FT-IR spectroscopy. We illustrate this approach for the discrimination between UPEC isolates with different genetic sequence types and different susceptibilities to quinolones, a group of antibiotics that inhibit nucleic acid synthesis directly by targeting DNA gyrase and topoisomerase IV. The developed method is rapid and less labour intensive than larger scale approaches, and up to 200 samples can be analysed simultaneously under identical growth conditions, which is a prerequisite for whole-organism fingerprinting methods that measure bacterial phenotype rather than genotype per se.


3.2 Materials and Methods

3.2.1 Microorganisms

Ten UPEC isolates were recovered from urine samples submitted to the bacteriology laboratory at Central Manchester Foundation Trust. These consisted of two isolates that were quinolone sensitive (isolates 173 and 191 of ST131) and eight isolates that were quinolone resistant (isolates 160, 163 and 164 of ST131; isolates 152, 161, 162, 169 and 171 that were non-ST131).

Each isolate was streaked on an agar plate to obtain axenic colonies. Biomass was collected from these single colonies to prepare 1 mL 20% [v/v] glycerol working bacterial stocks, which were stored at -20 ºC. Isolates were routinely cultured in lysogeny broth (LB) medium at 37 ºC for 18 h.

3.2.2 Media

Lysogeny broth (LB) was prepared by dissolving 10 g of tryptone (Formedia, Hunstanton, UK), 5 g of yeast extract (Amersham Life Sciences, Cleveland, USA) and 10 g of sodium chloride (Fisher Scientific Ltd., Loughborough, UK) in 1 L of reverse osmosis water, and was then autoclaved (121 ºC, 15 psi, 15 min).

3.2.3 Antibiotics

In order to calculate the minimum inhibitory concentration (MIC) for ciprofloxacin, 100 mg of the hydrochloride salt (Discovery Fine Chemicals, Dorset, UK) was dissolved in 50 mL distilled water. The final concentrations tested were: 100, 80, 50, 40, 25, 20, 12.5, 10, 8, 6.25, 5, 3.125, 2.5, 1.56, 1.25, 1, 0.78, 0.625, 0.5, 0.39, 0.3125, 0.3, 0.25, 0.1563, 0.1, 0.078, 0.05, 0.039, 0.03, 0.025, 0.02, 0.0195, and 0.0025 mg/L according to Andrews (2001).
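The arithmetic behind preparing such a concentration series can be sketched with the dilution relationship C1V1 = C2V2. The sketch below takes only the stock preparation (100 mg in 50 mL) from the text; the 1000 µL final volume, the function names and the choice of target concentrations are hypothetical, and the nominal stock concentration ignores any correction for the salt form.

```python
def stock_concentration_mg_per_l(mass_mg, volume_ml):
    """Nominal stock concentration in mg/L from mass (mg) and volume (mL)."""
    return mass_mg * 1000.0 / volume_ml


def stock_volume_ul(c_stock, c_final, final_volume_ul):
    """Volume of stock (µL) needed so that C1 * V1 = C2 * V2."""
    if c_final > c_stock:
        raise ValueError("target concentration exceeds stock concentration")
    return c_final * final_volume_ul / c_stock


stock = stock_concentration_mg_per_l(100.0, 50.0)  # 100 mg in 50 mL

# Hypothetical final volume of 1000 µL for a few of the listed targets.
for target in (100.0, 50.0, 25.0):
    v1 = stock_volume_ul(stock, target, 1000.0)
    print(f"{target} mg/L: {v1:.1f} µL stock + {1000.0 - v1:.1f} µL diluent")
```

For the very low concentrations in the list, the direct volumes become impractically small, so in practice an intermediate dilution of the stock would be needed; that step is not described in the text and is not modelled here.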

3.2.4 Growth conditions

For the start of each experiment, 49 mL of medium was inoculated with 1 mL of working stock from the freezer and incubated at 37 ºC in a shaking incubator at 200 rpm for 24 h. After this incubation, cultures (1 mL from each) were diluted with 49 mL fresh medium and further incubated at 37 ºC, 200 rpm for 1 h. These new axenic cultures were diluted to a 0.5 McFarland standard, measured as optical density (OD) at 600 nm using a Biomate 5 spectrophotometer (Thermo, Hemel Hempstead, UK), and used as experimental inocula. Briefly, 40 mL of the culture was diluted and washed twice with physiological saline (0.9% [w/v] NaCl), and the bacterial turbidity was adjusted to be equivalent to a 0.5 McFarland standard (OD 0.1 ± 0.02 at 600 nm).

The bacterial growth curves (typically n = 5 per experiment) were measured as OD at 600 nm in a Bioscreen spectrophotometer (Labsystems, Basingstoke, UK). This 'Bioscreen' was run at the following settings: 10 min preheating, then an incubation temperature of 37 ºC, with continuous medium shake, a measurement interval of 10 min and, in general, 18 h for the total experiment. Typical growth curves, including when bacteria were exposed to antibiotics, are shown in Figure S 3.1.

3.2.5 FT-IR spectroscopy

3.2.5.1 Sample preparation

Five different methods of sample preparation were investigated in order to develop a high-throughput method for discrimination between E. coli isolates of different clonal lineages; see Figures S 3.2 and S 3.3 for cartoons of this process.

3.2.5.1.1 Method 1

19 mL of LB medium was inoculated with 1 mL of 0.5 McFarland standard equivalent inocula of pathogenic isolates in 100 mL flasks and incubated for 18 h at 37 ºC and 200 rpm. 2 mL from each culture was collected and 450 µL from this volume was washed 3 times with normal saline and re-suspended in 400 µL of normal saline. 20 µL was spotted onto a zinc selenide (ZnSe) plate (Bruker Ltd, Coventry, UK) and oven dried at 40 ºC for 45 min.

3.2.5.1.2 Method 2

19 mL of LB medium was inoculated with 1 mL of 0.5 McFarland standard inocula of pathogenic isolates in 100 mL flasks and incubated for 18 h at 37 ºC and 200 rpm. 20 µL from each flask was spotted directly onto a ZnSe plate and oven dried at 40 ºC for 45 min.

3.2.5.1.3 Method 3

190 µL of LB medium was inoculated with 10 µL of 0.5 McFarland standard inocula of pathogenic isolates in a Bioscreen plate and incubated as described above (Section 3.2.4). In order to produce enough biomass, each condition was cultured in 5 wells and 150 µL from each well was collected and mixed. 20 µL from each well was collected and spotted directly onto a ZnSe plate and oven dried at 40 °C for 45 min.

3.2.5.1.4 Method 4

This was identical to Method 3 except that after collection (450 µL) the biomass was washed 3 times with normal saline and re-suspended in 400 µL of normal saline. 20 µL was spotted onto a ZnSe plate and oven dried at 40 °C for 45 min.

3.2.5.1.5 Method 5

The supernatants from Method 4 were also collected and 20 µL was spotted on a ZnSe plate and oven dried at 40 °C for 45 min.

3.2.5.2 Instrumentation

Prior to analysis, 96-well ZnSe plates were washed with 10% sodium dodecyl sulfate (SDS) solution, rinsed three times with deionised water and then rinsed three times with analytical grade propan-2-ol. The plates were finally rinsed with deionised water and air-dried at room temperature.

High-throughput screening (HTS) FT-IR spectroscopic analysis was carried out using a Bruker Equinox 55 infrared spectrometer (Bruker Ltd.) equipped with an HTX™ module using previously published methods (Winder et al., 2006). FT-IR spectra were recorded directly from the dried cell biomass in transmission mode using a deuterated triglycine sulfate (DTGS) detector. A background spectrum was collected for each measurement from the reference well position (A1, 94104, 18720 µm) of the ZnSe plates. All spectra were obtained in the 4000–600 cm⁻¹ range, and 64 scans were acquired at 4 cm⁻¹ resolution. These experimental conditions were maintained during all measurements. Spectral acquisition and background subtraction were performed using OPUS software (Bruker Ltd.). The FT-IR data were converted and analysed using MATLAB 2010a (The MathWorks Inc., Natick, USA) and R version 2.13.1 (R Foundation for Statistical Computing, Vienna, Austria).

3.2.5.3 Data analysis

For FT-IR data, CO2 signals were removed from the spectra at 2400–2275 cm⁻¹ (and the gap filled with a trend), and the region below 700 cm⁻¹ was also removed (Winder et al., 2006), because these features can cause spurious variation in multivariate analysis. FT-IR data were then baseline corrected according to the extended multiplicative signal correction (EMSC) scaling method (Martens et al., 2003). EMSC was originally developed to reduce the disturbing effect of light scattering, which arises because small particles scatter light more than larger ones (Naes et al., 1990). This type of normalisation takes the information registered in the spectra and attempts to separate physical light-scattering effects from the light actually absorbed by molecules.
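The two pre-processing steps can be sketched as follows. This is an illustration under stated assumptions rather than the scripts used in this work: the CO2 band is filled with a straight line between its neighbouring points, and EMSC is shown in a basic form in which each spectrum is modelled as an offset, a multiple of a reference spectrum and a quadratic baseline. The function names, the choice of reference and the synthetic spectra are all hypothetical.

```python
import numpy as np


def fill_co2_region(wavenumbers, spectrum, lo=2275.0, hi=2400.0):
    """Replace the CO2 band (lo..hi cm^-1) with a linear trend joining
    the nearest points on either side of the band."""
    wn = np.asarray(wavenumbers, dtype=float)
    y = np.asarray(spectrum, dtype=float).copy()
    inside = (wn >= lo) & (wn <= hi)
    # np.interp needs increasing x, so sort the points outside the band.
    order = np.argsort(wn[~inside])
    y[inside] = np.interp(wn[inside], wn[~inside][order], y[~inside][order])
    return y


def emsc(spectra, wavenumbers, reference=None):
    """Basic EMSC: fit spectrum = a + b*reference + d*t + e*t^2 and
    return (spectrum - a - d*t - e*t^2) / b for every spectrum."""
    X = np.asarray(spectra, dtype=float)
    wn = np.asarray(wavenumbers, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    # Scale wavenumbers to [-1, 1] so the polynomial terms are well conditioned.
    t = 2.0 * (wn - wn.min()) / (wn.max() - wn.min()) - 1.0
    design = np.column_stack([np.ones_like(t), ref, t, t ** 2])
    coefs, *_ = np.linalg.lstsq(design, X.T, rcond=None)
    a, b, d, e = coefs                      # one value per spectrum
    baseline = np.outer(a, np.ones_like(t)) + np.outer(d, t) + np.outer(e, t ** 2)
    return (X - baseline) / b[:, None]


# Synthetic demonstration (made-up spectra, not data from this work).
wn = np.linspace(4000.0, 600.0, 400)
ref = np.exp(-((wn - 1650.0) / 40.0) ** 2)            # a single absorption band
raw = np.vstack([0.10 + 1.5 * ref + 1e-5 * wn,        # offset, gain and slope
                 -0.05 + 0.7 * ref - 2e-5 * wn])
co2 = (wn >= 2275.0) & (wn <= 2400.0)
raw[:, co2] += 0.3                                    # artificial CO2 artefact
cleaned = np.vstack([fill_co2_region(wn, s) for s in raw])
keep = wn >= 700.0                                    # discard the region below 700 cm^-1
corrected = emsc(cleaned[:, keep], wn[keep], reference=ref[keep])
print("both spectra recovered:", np.allclose(corrected, ref[keep], atol=1e-8))
```

With real data the reference is usually the mean spectrum of the data set, which is what the function falls back to when no reference is supplied.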

Data were then analysed using principal component analysis (PCA). PCA is an unsupervised method that uses no a priori knowledge of the experimental structure and is used to reduce the dimensionality of the data. The objective of PCA is to explain the variance-covariance structure of a set of variables through a few linear combinations of these variables (Johnson and Wichern, 2007). Much of the original data variability can be accounted for by a small number of principal components (PCs), which are then used for data reduction and visual data interpretation. The PCA results are discussed in terms of scores and loadings: the scores are the transformed variable values, and the loadings are the weights by which the original data variables are multiplied to obtain the scores.

After PCA, a supervised method known as discriminant function analysis (DFA) was applied to the retained PCs (i.e. PC-DFA). DFA uses a priori knowledge of the experimental class structure to discriminate between groups (Manly, 1994; Windig et al., 1983); the algorithm works to maximise between-group variance and minimise within-group variance (Varmuza and Filzmoser, 2009). In the present work, the PCs were used as inputs for DFA and the results were validated by 1000 bootstrap cross-validations. In this process, 50 PC-DFA models were built, which used from the 1st to the nth PC (where n was set to a maximum of 50). Each of the models was rigorously tested using re-sampling methods to check that the clustering was not over-fitted.

Bootstrap is a re-sampling technique that can be applied as cross-validation to estimate the prediction performance of a model. The basic idea of this method is to select randomly, with replacement, N samples from a set containing exactly N samples. All selected samples, including the repetitions, are then used as a training set and the non-selected samples are used as a test set (Efron, 1981). One can think of this as having all samples analysed (N = ||X|| in our case, where ||X|| denotes the number of samples) in a bag. A single sample is taken out of the bag at random and its number noted; this sample now forms part of the training data, and it is placed back into the bag. This random picking process is repeated until ||X|| samples are in the training set. Some samples will be used multiple times; on average 63.2% of all of the samples will have been selected for training, and the remaining 36.8% will be used as the test data.
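The PC-DFA and bootstrap steps described above can be sketched as follows. This is a generic illustration and not the MATLAB or R code used in this work. It assumes that PCA is computed by singular value decomposition, that DFA is the Fisher-type eigen-decomposition of within-group and between-group scatter, that PCA and DFA are refitted on each training resample, and that test samples are assigned to the nearest group centroid in DF space. The data are synthetic and all names are hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """PCA by SVD of mean-centred data: returns mean, loadings and % variance per PC."""
    mean = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    tev = 100.0 * s ** 2 / np.sum(s ** 2)
    return mean, Vt[:n_components].T, tev[:n_components]


def dfa_fit(scores, labels):
    """Directions that maximise between-group relative to within-group scatter."""
    groups = np.unique(labels)
    grand_mean = scores.mean(axis=0)
    p = scores.shape[1]
    Sw, Sb = np.zeros((p, p)), np.zeros((p, p))
    for g in groups:
        block = scores[labels == g]
        centred = block - block.mean(axis=0)
        Sw += centred.T @ centred
        diff = (block.mean(axis=0) - grand_mean)[:, None]
        Sb += len(block) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][: len(groups) - 1]
    return eigvecs.real[:, order]


def bootstrap_round(X, labels, n_pcs, rng):
    """Train PC-DFA on one bootstrap resample and score the left-out samples."""
    n = len(labels)
    train = rng.integers(0, n, size=n)            # N draws with replacement
    test = np.setdiff1d(np.arange(n), train)      # samples never drawn
    mean, loadings, _ = pca_fit(X[train], n_pcs)
    W = dfa_fit((X[train] - mean) @ loadings, labels[train])

    def project(A):
        return ((A - mean) @ loadings) @ W

    groups = np.unique(labels[train])
    centroids = np.array(
        [project(X[train][labels[train] == g]).mean(axis=0) for g in groups]
    )
    dist = np.linalg.norm(project(X[test])[:, None, :] - centroids[None, :, :], axis=2)
    predicted = groups[dist.argmin(axis=1)]
    return float(np.mean(predicted == labels[test])), len(test) / n


# Synthetic data: three well-separated groups of 20 "spectra" with 50 variables.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)
X = rng.normal(0.0, 0.1, size=(60, 50))
X[labels == 1, :10] += 1.0
X[labels == 2, 10:20] += 1.0

results = [bootstrap_round(X, labels, n_pcs=5, rng=rng) for _ in range(100)]
accuracies = [r[0] for r in results]
test_fractions = [r[1] for r in results]
print(f"mean test accuracy: {np.mean(accuracies):.3f}")
print(f"mean test fraction: {np.mean(test_fractions):.3f}")
```

The mean test fraction approaches (1 - 1/N)^N, which tends to 1/e (about 36.8%) as N grows, in line with the 63.2% and 36.8% split quoted above.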

For the best model, which was typically built using the first 30 PCs, 1000 bootstraps were conducted (i.e. 1000 PC-DFA models were generated) and statistics were computed on the test sets only. We calculated the 95% confidence interval (CI) and, in the PC-DFA scores plots, quote in parentheses the lower and upper bounds of the CI obtained over all 1000 models. We also provide the chi-square (χ²) statistic computed on the contingency table represented by the classification matrix (also known as the confusion matrix) built over the 1000 models. The objective of this χ² test was to compare the mean expected (real instances) and the mean observed (predicted instances) values in terms of the true positive, true negative, false positive and false negative cells of the classification matrix. The null hypothesis is that the expected and observed means are equal, which would suggest that the models are accurate. Therefore, the higher the p-value of the χ² statistic (the closer to 1), the weaker the evidence against the null hypothesis, and the models are then considered statistically valid.
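One common way to obtain such lower and upper bounds from a set of bootstrap results is the percentile interval, sketched below. Whether the intervals reported here were computed in exactly this way is an assumption; the function name is hypothetical and the accuracy values are simulated purely for illustration.

```python
import math
import random


def percentile_ci(values, level=0.95):
    """Lower and upper percentile bounds enclosing `level` of the values."""
    ordered = sorted(values)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1.0 - tail) * last)]
    return lower, upper


# Simulated test-set accuracies for 1000 bootstrap models (made-up numbers).
rng = random.Random(1)
accuracies = [min(1.0, max(0.0, rng.gauss(0.90, 0.02))) for _ in range(1000)]
low, high = percentile_ci(accuracies)
print(f"95% CI of test accuracy: {low:.3f} to {high:.3f}")
```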

Conversion of FT-IR absorbance spectra to an XY data matrix and the subsequent multivariate data analyses were performed in MATLAB 2010a and R version 2.13.1. All scripts used for data analysis are available from the authors on request.

3.3 Results and Discussion

Pathogenic isolates of E. coli of different sequence types (ST131 and non-ST131) were challenged with different concentrations of ciprofloxacin hydrochloride, which is completely soluble in water (Andrews, 2001), to determine the susceptibility to the drug and the MIC of each isolate (Figure S 3.1 displays example growth curves). It was found that isolates 160, 161, 163 and 169 were fully resistant, whilst isolates 173 and 191 were fully sensitive. In addition, isolates 152, 162, 164 and 171 showed different levels of susceptibility to ciprofloxacin (Table 3.1).

Table 3.1: MIC of ciprofloxacin hydrochloride for pathogenic isolates of E. coli of different sequence types (ST131 and non-ST131) with different susceptibilities to quinolones

Isolate number   Sequence type   Quinolone phenotype   MIC range determination (mg/L)*
152              non-ST131       R                     0.25-0.1
161              non-ST131       R                     No effect
162              non-ST131       R                     12.5-10
169              non-ST131       R                     No effect
171              non-ST131       R                     0.5-0.25
160              ST131           R                     No effect
163              ST131           R                     No effect
164              ST131           R                     Approx. 50
173              ST131           S                     0.03-0.02
191              ST131           S                     0.05-0.03

* The highest concentration of ciprofloxacin hydrochloride used was 100 mg/L; R and S indicate resistance or susceptibility to quinolones, respectively.
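The MIC ranges in Table 3.1 bracket the lowest tested concentration that prevented growth and the highest tested concentration that still permitted it. A minimal sketch of how such a bracket could be read from end-point OD values is given below. The cut-off of 10% of the control OD is an assumption made for illustration and is not the criterion used in this work, and the readings are invented.

```python
def mic_bracket(final_od_by_conc, control_od, growth_fraction=0.1):
    """Return (lowest inhibitory concentration, highest non-inhibitory
    concentration) in mg/L.

    A culture counts as inhibited when its final OD is below
    growth_fraction * control_od (an assumed cut-off). Either element is
    None when no tested concentration falls in that class.
    """
    threshold = growth_fraction * control_od
    inhibited = [c for c, od in final_od_by_conc.items() if od < threshold]
    grew = [c for c, od in final_od_by_conc.items() if od >= threshold]
    lowest_inhibitory = min(inhibited) if inhibited else None
    highest_growing = max(grew) if grew else None
    return lowest_inhibitory, highest_growing


# Invented final OD600 readings after 18 h (not data from Figure S 3.1).
readings = {0.1: 1.05, 0.25: 0.04, 0.5: 0.03}
print(mic_bracket(readings, control_od=1.20))
```

An isolate that grows at every tested concentration returns None for the first element, which corresponds to the "No effect" entries in the table.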

Initial experiments in micro-culture on the Bioscreen instrument were conducted using a modified method of sample preparation on just two isolates (162 and 163); these were chosen because they belong to different sequence types, and the results were compared with those of the standard shake flask method. The first step in modifying the conditions was the use of a Bioscreen plate rather than flasks, which has several advantages over the usual method of preparing samples in shaking flasks. First, it is less laborious because the growth curves are generated directly and the same culture can then be used for analysis by FT-IR spectroscopy. This gives more confidence, as the growth profile and FT-IR spectra are from exactly the same biological replicate. Second, it is more accurate because automation in the Bioscreen reduces the errors and variation that manual techniques can introduce. Finally, the Bioscreen can hold two 100-well sterile honeycomb plates, making it possible to run 200 samples at the same time under exactly the same conditions, which is difficult when following the classical microbial growth method.

To compare the new method and the shake flask method on isolates 162 and 163, each was cultured in multiple biological replicates both in Bioscreen plates and in 100 mL flasks, under the same conditions. After growth, the biomass was collected (Figure S 3.2) and analysed using FT-IR spectroscopy either directly or after washing with physiological saline. The FT-IR data were subjected to PCA and the scores plots for each of the four conditions are reported in the Supplementary Information (Figure S 3.4). An additional plot of all cultures together (Figure S 3.5) also helps to illustrate the effects of washing and of culturing in Bioscreen plates versus shake flasks. A clear distinction between these isolates can be observed regardless of the method of culturing, and this is irrespective of the culture volume of the two methods: 20 mL for the flask cultures compared to just 200 µL in the Bioscreen micro-culture instrument.

As the previous experiments showed that micro-culture and flasks gave equivalent clustering, and considering the advantages of the micro-cultivation strategy, this approach was explored further. Another series of experiments on ten isolates was conducted to determine the effects of biomass washing by comparing three different preparation methods: direct analysis (non-washed samples), washed samples, and the supernatants from the washed samples. All were analysed by FT-IR spectroscopy and the resulting spectroscopic fingerprints were compared using cluster analysis.

Figure 3.1: (a) Raw FT-IR spectra obtained from isolates 152, 160, 161, 162, 163, 164, 169, 171, 173 and 191 before washing the bacterial cells; the spectra are offset in the Y-axis for ease of visualisation. (b) PC-DFA scores plot of non-washed samples (30 PCs were extracted from PCA and used as inputs to DFA; these 30 PCs explain 99.97% of the total explained variance (TEV)); the legend beside the plot shows the 95% confidence interval (CI) for the 10 bacterial isolates estimated from the DFA model validation over 1000 independent bootstrap cross-validations.

The FT-IR spectra from the direct analysis of the bacterial biomass are shown in Figure 3.1 a. These all display very similar features, which demonstrates the need for cluster analysis using PC-DFA. It can be seen clearly from Figure 3.1 b that isolates of ST131 formed a definite and tight cluster located on the right (positive side) of DF1, while non-ST131 isolates were spread on the left (negative side) of DF1. This separation according to ST in the non-washed samples was encouraging and was not observed in either the supernatants (Figure S 3.6) or the washed samples (Figure S 3.7). Further analysis of the STs individually from the non-washed samples (Figure 3.2) revealed that there did appear to be some separation according to whether the E. coli isolates were resistant or otherwise to ciprofloxacin. This was not observed in the washed bacterial cells or their supernatants (data not shown).

If we inspect Figure 3.2 in more detail, it can be readily observed that the sensitive isolates (red symbols) were generally closer to each other than the resistant isolates (blue symbols) in both ST131 and non-ST131 bacteria. With respect to the ST131 isolates (Figure 3.2 a), another potential advantage of this FT-IR spectroscopic analysis is that the distribution of isolates of the same sequence type might depend on their susceptibility to ciprofloxacin, since the sensitive isolates are separated from the resistant isolates in the second discriminant function (DF2).

In simple terms, the main differences between these sample preparation methods are that the direct analysis will measure everything: the bacteria themselves as well as any effect they have had on medium consumption and bacterial excretions (exometabolome) whilst during the washing process these two factors are recovered and analysed separately. As the differences between the direct analysis and the other two sample preparation methods are so profound, they do require some explanation.

The first area where difference may occur is due to changes to the medium during growth. The non-washed samples of isolates from both ST groups contained the media and the discrimination might be due to members of one ST preferentially consuming specific ingredients in the medium more than the other ST (Gibreel et al., 2012). Whilst differentiation was not obviously manifest in the supernatant analysis (Figure S 3.6), combination of the supernatant with the cells may have contributed to spectral differences clearly observed with respect to ST (Figure 3.1).

Figure 3.2: PC-DFA scores plots of the non-washed method for (a) ST131 and (b) non-ST131 isolates. In both cases, 30 PCs were extracted from PCA and used as inputs to DFA. Those 30 PCs explain 99.98% of the data variance in both data sets (ST131 and non-ST131). CI, confidence interval.

It has been reported previously that different E. coli strains utilise different compounds (Klemperer et al., 1979; Olukoya, 1986), which may support the above hypothesis. We have also recently demonstrated that UPEC from different STs vary significantly in their metabolic potential, indicating the importance of biochemical variation between isolates (Gibreel et al., 2012).

Secondly, the washing process (even in physiological saline; 0.9% [w/v] NaCl) may cause variations in the biomass due to stress and lysis. The third and most important area concerns the impact of secretions from the isolates. The exometabolome has been the subject of studies in many different fields, such as clinical medicine (Dunn et al., 2009) and plant-pathology (Allwood et al., 2010) as well as microbiology (Behrends et al., 2009; Kaderbhai et al., 2003; Mapelli et al., 2008). In functional genomics studies, where the aim is to assign function to genes that lack annotation, metabolic footprinting can be used to assess the effect of gene knockouts on the phenotype of microbial mutants. Thus, it has been demonstrated that using analytical techniques such as FT-IR spectroscopy and mass spectrometry (GC-MS and DIMS), it is possible to discriminate between different E. coli and yeast isolates with gene mutations (Allen et al., 2003; Kaderbhai et al., 2003; Mas et al., 2007).

All three suggestions could be valid, and further investigation is required to understand why the washing of cells alters the phenotypic information and hence the clustering. This could include using less complex media and analysing the supernatant at the higher resolution afforded by mass spectrometry, to see whether certain metabolites in the bacterial footprint correspond to different STs or antibiotic phenotypes.

3.4 Conclusion

A modified method of sample preparation for FT-IR spectroscopy was examined and applied to a group of UPEC. The first step was to compare micro-culturing in a Bioscreen with traditional shake flask culture, and equivalent phenotypes were revealed using FT-IR spectroscopy. Following this, further whole-organism fingerprinting analyses established that the most information-rich FT-IR spectra were generated when biomass was analysed directly from the micro-cultures. We therefore have confidence in the validity and reproducibility of this approach and believe it can be utilised to handle large numbers of bacterial isolates without the need for sample washing.

From this study, it can be seen that a high-throughput micro-culture method coupled with FT-IR spectroscopy enabled discrimination between uropathogenic bacterial isolates from important clones. Due to the high-throughput nature of this approach, it has the major advantage that one is now able to analyse several hundred or even thousands of bacterial samples per day. In addition, it is non-destructive and inexpensive, requiring only simple sample preparation. It can therefore be concluded that this method would be suitable for future FT-IR spectroscopy-based investigations of bacteria for identification or characterisation purposes.

3.5 References

Allen, J., Davey, H. M., Broadhurst, D., Heald, J. K., Rowland, J. J., Oliver, S. G. & Kell, D. B. 2003. High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nature Biotechnology, 21, 692-696.
Allwood, J. W., Clarke, A., Goodacre, R. & Mur, L. A. J. 2010. Dual metabolomics: a novel approach to understanding plant-pathogen interactions. Phytochemistry, 71, 590-597.
Andrews, J. M. 2001. Determination of minimum inhibitory concentrations. Journal of Antimicrobial Chemotherapy, 48, 5-16.
Baerheim, A. 2001. Empirical treatment of uncomplicated cystitis. Keep it simple. British Medical Journal, 323, 1197-1198.
Behrends, V., Ebbels, T. M. D., Williams, H. D. & Bundy, J. G. 2009. Time-resolved metabolic footprinting for nonlinear modeling of bacterial substrate utilisation. Applied and Environmental Microbiology, 75, 2453-2463.
Choo-Smith, L. P., Maquelin, K., van Vreeswijk, T., Bruining, H. A., Puppels, G. J., Thi, N. A. G., Kirschner, C., Naumann, D., Ami, D., Villa, A. M., Orsini, F., Doglia, S. M., Lamfarraj, H., Sockalingum, G. D., Manfait, M., Allouch, P. & Endtz, H. P. 2001. Investigating microbial (micro)colony heterogeneity by vibrational spectroscopy. Applied and Environmental Microbiology, 67, 1461-1469.
Dunn, W. B., Brown, M., Worton, S. A., Crocker, I. P., Broadhurst, D., Horgan, R., Kenny, L. C., Baker, P. N., Kell, D. B. & Heazell, A. E. P. 2009. Changes in the metabolic footprint of placental explant-conditioned culture medium identifies metabolic disturbances related to hypoxia and pre-eclampsia. Placenta, 30, 974-980.
Efron, B. 1981. Nonparametric estimates of standard error: the jackknife, the bootstrap and other methods. Biometrika, 68, 589-599.
Fihn, S. D. 2003. Acute uncomplicated urinary tract infection in women. The New England Journal of Medicine, 349, 259-266.
Garip, S., Bozoglu, F. & Severcan, F. 2007. Differentiation of mesophilic and thermophilic bacteria with Fourier transform infrared spectroscopy. Applied Spectroscopy, 61, 186-192.
Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Bolton, F. J., Fox, A. J. & Upton, M. 2012. High metabolic potential may contribute to the success of ST131 uropathogenic Escherichia coli. Journal of Clinical Microbiology, 50, 3202-3207.
Goodacre, R., Timmins, E. M., Rooney, P. J., Rowland, J. J. & Kell, D. B. 1996. Rapid identification of Streptococcus and Enterococcus species using diffuse reflectance-absorbance Fourier transform infrared spectroscopy and artificial neural networks. FEMS Microbiology Letters, 140, 233-239.
Griebling, T. L. 2005. Urologic diseases in America project trends in resource use for urinary tract infections in women. Journal of Urology, 173, 1281-1287.
Helm, D., Labischinski, H., Schallehn, G. & Naumann, D. 1991. Classification and identification of bacteria by Fourier-transform infrared spectroscopy. Journal of General Microbiology, 137, 69-79.
Johnson, R. A. & Wichern, D. W. 2007. Applied multivariate statistical analysis, Upper Saddle River, Pearson/Prentice Hall.
Kaderbhai, N. N., Broadhurst, D. I., Ellis, D. I., Goodacre, R. & Kell, D. B. 2003. Functional genomics via metabolic footprinting: monitoring metabolite secretion by Escherichia coli tryptophan metabolism mutants using FT-IR and direct injection electrospray mass spectrometry. Comparative and Functional Genomics, 4, 376-391.
Kahlmeter, G. 2003. An international survey of the antimicrobial susceptibility of pathogens from uncomplicated urinary tract infections: the ECO.SENS Project. Journal of Antimicrobial Chemotherapy, 51, 69-76.
Kastanos, E. K., Kyriakides, A., Hadjigeorgiou, K. & Pitris, C. 2009. A novel method for urinary tract infection diagnosis and antibiogram using Raman spectroscopy. Journal of Raman Spectroscopy, 41, 958-963.
Klemperer, R. M. M., Ismail, N. & Brown, M. R. W. 1979. Effect of R plasmid RPI on the nutritional requirements of Escherichia coli in batch culture. Journal of General Microbiology, 115, 325-331.
Lau, S. H., Reddy, S., Cheesbrough, J., Bolton, F. J., Willshaw, G., Cheasty, T., Fox, A. J. & Upton, M. 2008. Major uropathogenic Escherichia coli strain isolated in the northwest of England identified by multilocus sequence typing. Journal of Clinical Microbiology, 46, 1076-1080.
Lin, M. S., Al-Holy, M., Chang, S. S., Huang, Y. Q., Cavinato, A. G., Kang, D. H. & Rasco, B. A. 2005. Rapid discrimination of Alicyclobacillus strains in apple juice by Fourier transform infrared spectroscopy. International Journal of Food Microbiology, 105, 369-376.
Manly, B. 1994. Multivariate statistical methods. A primer, London, Chapman and Hall.
Mapelli, V., Olsson, L. & Nielsen, J. 2008. Metabolic footprinting in microbiology: methods and applications in functional genomics and biotechnology. Trends in Biotechnology, 26, 490-497.
Maquelin, K., Kirschner, C., Choo-Smith, L. P., van den Braak, N., Endtz, H. P., Naumann, D. & Puppels, G. J. 2002. Identification of medically relevant microorganisms by vibrational spectroscopy. Journal of Microbiological Methods, 51, 255-271.
Mariey, L., Signolle, J. P., Amiel, C. & Travert, J. 2001. Discrimination, classification, identification of microorganisms using FTIR spectroscopy and chemometrics. Vibrational Spectroscopy, 26, 151-159.
Martens, H., Nielsen, J. P. & Engelsen, S. B. 2003. Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures. Analytical Chemistry, 75, 394-404.
Mas, S., Villas-Boas, S. G., Hansen, M. E., Akesson, M. & Nielsen, J. 2007. A comparison of direct infusion MS and GC-MS for metabolic footprinting of yeast mutants. Biotechnology and Bioengineering, 96, 1014-1022.
Meyers, R. A. 2000. Encyclopedia of analytical chemistry: applications, theory, and instrumentation, Chichester, Wiley.
Naes, T., Isaksson, T. & Kowalski, B. 1990. Locally weighted regression and scatter correction for near-infrared reflectance data. Analytical Chemistry, 62, 664-673.
Naumann, D., Helm, D. & Labischinski, H. 1991. Microbiological characterisations by FT-IR spectroscopy. Nature, 351, 81-82.
Olukoya, D. K. 1986. Nutritional variation in Escherichia coli. Journal of General Microbiology, 132, 3231-3234.
Rogers, B. A., Sidjabat, H. E. & Paterson, D. L. 2011. Escherichia coli O25b-ST131: a pandemic, multiresistant, community-associated strain. Journal of Antimicrobial Chemotherapy, 66, 1-14.
Ronald, A. 2003. The etiology of urinary tract infection: traditional and emerging pathogens. Disease-a-Month, 49, 71-82.
Scholes, D., Hooton, T. M., Roberts, P. L., Stapleton, A. E., Gupta, K. & Stamm, W. E. 2000. Risk factors for recurrent urinary tract infection in young women. The Journal of Infectious Diseases, 182, 1177-1182.
Stuart, B. H. 2004. Infrared spectroscopy: fundamentals and applications, Chichester, John Wiley & Sons.
Varmuza, K. & Filzmoser, P. 2009. Introduction to multivariate statistical analysis in chemometrics, Boca Raton, CRC Press.
Winder, C. L. & Goodacre, R. 2004. Comparison of diffuse-reflectance absorbance and attenuated total reflectance FT-IR for the discrimination of bacteria. Analyst, 129, 1118-1122.
Winder, C. L., Gordon, S. V., Dale, J., Hewinson, R. G. & Goodacre, R. 2006. Metabolic fingerprints of Mycobacterium bovis cluster with molecular type: implications for genotype-phenotype links. Microbiology, 152, 2757-2765.
Windig, W., Haverkamp, J. & Kistemaker, P. G. 1983. Interpretation of sets of pyrolysis mass spectra by discriminant analysis and graphical rotation. Analytical Chemistry, 55, 81-88.

3.6 Supplementary Information

[Figure S 3.1 image: ten growth-curve panels, one per isolate (152, 161, 162, 169, 171, 160, 163, 164, 173 and 191). Each panel plots optical density (OD 600 nm) against time (hours) for a control culture and for cultures exposed to ciprofloxacin at concentrations between 0.02 and 100 mg/L.]

Figure S 3.1: Typical growth curves of 10 pathogenic isolates of E. coli showing different susceptibilities to ciprofloxacin.

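[Figure S 3.2 image: workflow diagram. Bacteria are streaked on an NA plate and axenic colonies are collected; 1 mL of stock (20% glycerol, stored at -20 °C) is added to 49 mL of LB medium and kept for 24 h, and 1 mL of this culture is added to 49 mL of LB medium and kept for 1 h. 40 mL is centrifuged, washed 2 times with normal saline and diluted to the exact concentration (C1V1 = C2V2). For flask culture, 1 mL is added to 19 mL of LB and incubated for 18 h; 2 mL is taken and either washed 3 times with normal saline before 20 µL is spotted (1) or 20 µL is spotted directly (2). For Bioscreen culture (190 µL of LB, 18 h incubation), 20 µL is spotted directly (3) or after washing 3 times with normal saline (4). All samples are spotted onto the FT-IR (ZnSe) plate.]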

Figure S 3.2: Comparison of four different methods of sample preparation for FT-IR analysis: (1) after washing with saline and (2) directly from the flask. From the Bioscreen plate (3) directly and (4) after washing.

[Figure S 3.3 image: workflow diagram. The inoculum is prepared as in Figure S 3.2 and cultured in the Bioscreen plate (190 µL of LB, 18 h incubation); 20 µL is then spotted onto the FT-IR (ZnSe) plate directly (3), after washing (4), or from the supernatant (5).]

Figure S 3.3: A comparison of three different methods of sample preparation for FT-IR analysis: (3) directly from Bioscreen plate, (4) after washing and (5) supernatant.

Figure S 3.4: PCA scores plots of PC1 vs. PC2 after CO2 removal at ca. 2350 cm⁻¹ followed by EMSC scaling. (a) Samples cultured in a Bioscreen plate and not washed; the total explained variance (TEV) of PC1 is 82.0% and of PC2 12.7%. (b) Samples cultured in a Bioscreen plate and washed (TEV of PC1 69.2% and of PC2 14.9%). (c) Samples cultured in flasks and not washed (TEV of PC1 90.6% and of PC2 5.6%). (d) Samples grown in flasks and washed (TEV of PC1 78.3% and of PC2 11.7%). Circles represent isolate 162 and diamonds represent isolate 163.

Figure S 3.5: PCA scores plot of the full data for isolates 162 and 163, cultured under different conditions. PC1 vs. PC2 after CO2 removal around 2350 cm⁻¹ and EMSC scaling. TEV of PC1 is 95.0% and of PC2 2.9%. Closed symbols denote washed samples and open symbols non-washed samples. Circles represent isolate 162 and diamonds represent isolate 163. Red symbols, shaking flasks; blue symbols, Bioscreen plates.

Figure S 3.6: (a) Raw FT-IR spectra obtained from supernatants of isolates 152, 160, 161, 162, 163, 164, 169, 171, 173 and 191; the spectra are offset in the Y-axis for ease of visualisation. (b) PC-DFA scores plot of supernatant samples (50 PCs were extracted from PCA and used as inputs to DFA; these 50 PCs explain 99.98% of the total explained variance (TEV)); the legend beside the plot shows the 95% confidence interval (CI) for the 10 bacterial isolates estimated from the DFA model validation over 1000 independent bootstrap cross-validations.

Figure S 3.7: (a) Raw FT-IR spectra obtained from isolates 152, 160, 161, 162, 163, 164, 169, 171, 173 and 191 after washing; the spectra are offset in the Y-axis for ease of visualisation. (b) PC-DFA scores plot of washed samples (50 PCs were extracted from PCA and used as inputs to DFA; these 50 PCs explain 99.73% of the total explained variance (TEV)); the legend beside the plot shows the 95% confidence interval (CI) for the 10 bacterial isolates estimated from the DFA model validation over 1000 independent bootstrap cross-validations.

Chapter Four

4 Chapter Four

A workflow for bacterial metabolic fingerprinting and lipid profiling: application to ciprofloxacin challenged Escherichia coli

J. William Allwood1,3*†, Haitham AlRabiah1*, Elon Correa1, Andrew Vaughan1, Yun Xu1, Mathew Upton2,4, Royston Goodacre1

1School of Chemistry and Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK

2School of Medicine, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PL, UK

3Current address: Clinical and Environmental Metabolomics, School of Bioscience, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

4Current address: Plymouth University Peninsula Schools of Medicine and Dentistry, Plymouth University, Drake Circus, Plymouth, PL4 8AA, UK

* J. William Allwood and Haitham AlRabiah contributed equally to this publication
† Correspondence to J. William Allwood: [email protected]

This chapter constitutes a manuscript submitted to Metabolomics. William Allwood participated in sample preparation and the operation of mass spectrometry. Andrew Vaughan contributed to data processing while Elon Correa and Yun Xu participated in data analysis. Mathew Upton provided the bacterial samples studied in this work. Royston Goodacre contributed to this work through supervision and guidance of the study.


Abstract

The field of lipidomics focuses upon the non-targeted analysis of lipid composition. The process follows similar routines to those applied in conventional metabolic profiling, although lipidomics differs with respect to the sample preparation steps. Conventionally, lipidomics has applied analytical techniques such as direct infusion mass spectrometry (DIMS) and, more recently, reversed phase liquid chromatography-mass spectrometry (RPLC-MS) for the detection of mono-, di-, and tri-acyl glycerols, phospholipids, and other complex lipophilic species such as sterols. The field is rapidly expanding, especially with respect to the clinical sciences, where it is known that changes in lipid composition, particularly of phospholipids, are commonly associated with many disease processes. As a proof of principle study, a small number of Escherichia coli isolates were selected on the basis of their sensitivity to a second generation fluoroquinolone antibiotic, ciprofloxacin: E. coli isolates 161 and 171 (non-ST131 sequence isolates, which are resistant and sensitive respectively) and E. coli isolates 160 and 173 (ST131 sequence isolates, which are resistant and susceptible respectively). Control (non-antibiotic challenged) and antibiotic challenged cultures for each of the isolates were compared for changes in metabolite and lipid composition as detected by Fourier transform infrared (FT-IR) spectroscopy and RPLC-MS, and appraised with a variety of chemometric data analysis approaches. The developed bacterial lipidomics workflow was deemed to be highly reproducible. It led to the detection of a large array of lipid classes and highlighted a range of significant lipid alterations that differed in regulation between susceptible and resistant E. coli isolates.


4.1 Introduction

Lipids are biological compounds which perform a range of functions, including storing energy, acting as signalling molecules and as structural moieties (Vance and Vance, 1996). Because the many different building blocks of lipids can be combined in multiple ways, an extremely wide variation in the structures of lipids is possible. It has been suggested that there are theoretically around 200 000 lipids within the major lipid classes (Oresic, 2011). Lipids are, by nature, hydrophobic and can often be dissolved in organic solvents (Smith, 1997). Fahy et al. (2005) defined lipids as “hydrophobic or amphipathic small molecules that may originate entirely or in part by carbanion-based condensations of thioesters (fatty acids, polyketides, etc.) and/or by carbocation-based condensations of isoprene units (prenols, sterols, etc.)”. More recently, it has been highlighted that whilst lipids are regarded as being useful for storing energy while cells are being processed, they are actually crucial to cellular regulation in their own right (Köfeler et al., 2012).

Lipids have been extensively researched for the last 50 or more years as they perform such a crucial biological role. However, other fields of science such as molecular biology, proteomics and genomics have latterly generated more interest. For successful biomedical research into lipids, it must be possible to quantify all types of lipids accurately. Yet it is very difficult to create analytical platforms that are sensitive enough to detect intact lipid molecular species in particular biological systems, and this held up progress in the field for some time (Oresic et al., 2008). Fortunately, in the last ten years or so, lipid research has been given a new lease of life thanks to developments in analytical techniques prompted by research into metabolomics (Goodacre et al., 2004; van der Greef et al., 2004). This has enabled the detection and quantification of many intact lipid molecular species, and 'lipidomics' has developed as a result. Lipidomics is defined as “the full characterisation of lipid molecular species and of their biological roles with respect to expression of proteins involved in lipid metabolism and function, including gene regulation” (Spener et al., 2003). It is a subspecialty of metabolomics, which encompasses the study of other biological molecules (metabolites) including amino acids, nucleic acids and sugars.


Lipidomics focuses upon the non-targeted analysis of lipid composition; the process of data collection and statistical analysis follows very similar routines to those applied in conventional metabolic profiling, though the sample preparation steps applied are different. Analytical techniques such as direct infusion mass spectrometry (DIMS) (Goodacre et al., 2002; Han and Gross, 2005; Allwood et al., 2006) have been applied as an analytical tool for lipidomics, and reversed phase liquid chromatography-mass spectrometry (RPLC-MS) (Kolak et al., 2007; Wedge et al., 2011) has occasionally been applied to detect mono-, di-, and tri-acyl glycerols, phospholipids, and other complex lipophilic species such as sterols and ceramides. An example of a difference in processes applied in metabolomics and lipidomics is that both use LC-MS, but the column and the mobile phase gradient used with lipids are likely to be different.

Given the influence and importance of microbes upon many areas of biological, clinical, biotechnological and environmental research, they are a significant target for metabolomics studies in general (Tang, 2011). In clinical studies, the modes of action of antibiotics and the mechanisms of resistance to them remain a fertile area in which more research is required to improve understanding of how antibiotics act and how they interact with bacteria. Many studies have focused upon exploring the polar central metabolism of bacteria under the stress of antibiotics (Halouska et al., 2007; Kwon et al., 2010; Kwon et al., 2008). There have been fewer studies of lipids and non-polar metabolites. Therefore, the principal aim of this study was to establish and validate a method for lipid profiling of microbes under antibiotic stress using RPLC-MS.

Ciprofloxacin, a fluoroquinolone antibiotic that inhibits nucleic acid synthesis directly by targeting topoisomerase enzymes (Greenwood, 2000), was chosen to challenge four pathogenic E. coli isolates. These included isolates 160 and 173 of ST131 sequence type, which are classified as resistant and sensitive respectively, as well as isolates 161 and 171 of non-ST131 sequence types, which are classified as resistant and intermediate resistant respectively. These were selected to provide a range of E. coli responses to ciprofloxacin challenge.


4.2 Materials and Methods

4.2.1 Chemicals

All chemicals used were of analytical reagent or a higher purity grade. All reference standards were of 99% minimum purity. All materials were purchased from Sigma-Aldrich (Gillingham, UK) unless otherwise stated. Ciprofloxacin hydrochloride was obtained from Discovery Fine Chemicals (Dorset, UK), and a stock solution was prepared by dissolving 100 mg in 50 mL of sterile distilled water. Further dilutions were also made with sterile distilled water. HPLC grade methanol, chloroform and water were purchased from Sigma-Aldrich. Formic acid was purchased from VWR International (East Grinstead, UK).
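The stock concentration implied by these quantities, and the volume of stock needed for any further dilution, follow from simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the dilution helper assumes the standard relation C1V1 = C2V2 rather than reproducing the dilution scheme actually used.

```python
def stock_concentration_mg_per_l(mass_mg, volume_ml):
    """Concentration in mg/L of a solution of mass_mg dissolved in volume_ml."""
    return mass_mg / (volume_ml / 1000.0)


def stock_volume_needed(c_stock, c_target, v_target):
    """Volume of stock required for a dilution, from C1 * V1 = C2 * V2.

    The result is in the same volume unit as v_target.
    """
    if c_target > c_stock:
        raise ValueError("target concentration exceeds stock concentration")
    return c_target * v_target / c_stock


# 100 mg in 50 mL, as stated in Section 4.2.1.
stock = stock_concentration_mg_per_l(100.0, 50.0)
print(stock)

# Hypothetical example: 10 mL of a 100 mg/L working solution.
print(stock_volume_needed(stock, 100.0, 10.0))
```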

4.2.2 Escherichia coli isolates

Four clinical E. coli isolates were obtained from urine samples submitted to the bacteriology laboratory at Central Manchester Foundation Trust. The isolates were selected on the basis of sequence type and sensitivity towards quinolones. A sensitive isolate (isolate 173) and a resistant isolate (isolate 160) of sequence type ST131, as well as two isolates of non-ST131 sequence type that were resistant and partially resistant to quinolones (isolates 161 and 171 respectively), were selected for the lipidomic analysis.

4.2.3 Preparation of E. coli inoculates

The pathogenic E. coli isolates were cultured in lysogeny broth (LB): 10 g tryptone (Formedia, Hunstanton, UK), 5 g yeast extract (Amersham Life Sciences, Cleveland, USA) and 10 g sodium chloride (Fisher Scientific Ltd., Loughborough, UK), dissolved in 1 L of reverse osmosis water then autoclaved (121 ºC, 15 min and 15 psi). Each 49 mL flask culture was inoculated with 1 mL of bacterial stock (20% [v/v] glycerol) and incubated in a shaking incubator for 24 h at 37 ºC and 200 rpm. The overnight cultures were then diluted (1 mL from each) into 49 mL of fresh medium and incubated for a further hour, again at 37 ºC and 200 rpm. Physiological saline (0.9% [w/v] NaCl) was used to wash and dilute these cultures to a 0.5 McFarland standard, with optical density (OD) measured at 600 nm on a Biomate 5 spectrophotometer (Thermo, Hemel Hempstead, UK). The resultant cultures were used as experimental inocula (Figure 4.1).

4.2.4 Determining ciprofloxacin minimal inhibitory concentration (MIC)

A Bioscreen spectrophotometer (Labsystems, Basingstoke, UK) measuring OD at 600 nm was used to record bacterial growth curves for each condition; five biological replicates were used. The following settings were used for the Bioscreen: 10 min preheating, an incubation temperature of 37 ºC, continuous medium shake, and measurement intervals of 10 min. The total time for the experiment was 18 h. In Bioscreen plates, 180 µL of LB was inoculated with 10 µL of the experimental inocula and 10 µL of ciprofloxacin hydrochloride solution, ranging from 0.0025 to 100 mg/L (Andrews, 2001). The control samples consisted of 180 µL LB with 10 µL of experimental inocula and 10 µL sterile distilled water (drug dilution solvent).
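One way an MIC range could be read off such growth data is sketched below. This is an illustration, not the procedure used in this work: the function name, the OD threshold that counts as "no growth" and the example readings are all assumptions, and the numbers are invented rather than measured.

```python
def mic_bracket(endpoint_od, no_growth_od=0.1):
    """Bracket the MIC from end-of-run OD600 readings.

    endpoint_od maps each tested concentration (mg/L) to the OD600 at the end
    of the run. Returns (highest concentration that still grew, lowest
    concentration that did not). The second value is None when growth was
    seen at every tested concentration.
    no_growth_od is a hypothetical cut-off, not a value taken from the text.
    """
    tested = sorted(endpoint_od)
    inhibited = [c for c in tested if endpoint_od[c] < no_growth_od]
    if not inhibited:
        return (tested[-1], None)
    upper = inhibited[0]
    grew_below = [c for c in tested if c < upper]
    lower = grew_below[-1] if grew_below else None
    return (lower, upper)


# Invented readings for illustration only.
example = {0.02: 1.10, 0.025: 0.85, 0.03: 0.04, 0.05: 0.03}
print(mic_bracket(example))
```

Under this sketch a fully resistant isolate returns None as the upper bound, which corresponds to a "no effect" result over the tested range.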

4.2.5 E. coli culture and antibiotic challenge for metabolic fingerprinting and lipid profiling

For each of the four isolates, 18 mL of LB medium in a 100 mL flask was inoculated with 1 mL of experimental inocula and incubated for 18 h at 37 ºC and 200 rpm. E. coli isolates 160 and 173 were challenged with 1 mL of 0.02 mg/L ciprofloxacin hydrochloride, and isolates 161 and 171 with 1 mL of 0.3 mg/L ciprofloxacin hydrochloride. For the control, non-treated samples, the ciprofloxacin hydrochloride solution was replaced with an equal volume of sterile distilled water, which is the dilution solvent of the drug (Figure 4.1). For each E. coli isolate, six biological replicate cultures were prepared: three were challenged with ciprofloxacin hydrochloride and three were used as controls. The concentrations of antibiotic were selected according to the previously established MICs (see Section 4.2.4). After antibiotic challenge the cultures were incubated for a further 18 h at 37 ºC and 200 rpm prior to sample collection (Figure 4.1).


4.2.6 E. coli sample collection and quenching for metabolic fingerprinting and lipid profiling

The sample collection and quenching of metabolism followed the procedures developed by Winder et al. (2008) (Figure 4.1). From each culture, 15 mL was collected and quenched in 30 mL of 60% cold methanol (-48 ºC, chilled on dry ice) and mixed quickly; the quenched culture was then centrifuged for 10 min at 4800 g and -8 ºC. The supernatant was removed rapidly. The remaining bacterial pellets were further centrifuged for 2 min and the residual supernatant removed. At this stage, the quench supernatant may be sampled for the purpose of assessing metabolite leakage, which is probably more relevant to the analysis of the polar metabolome. The bacterial pellets were stored at -80 ºC until metabolite extraction and LC-MS analysis were performed.

For Fourier transform infrared (FT-IR) spectroscopy, a 1 mL sub-sample of each bacterial culture was collected and spotted directly following the method of AlRabiah et al. (2013; Chapter 3). A further 1 mL from each culture was collected for OD determination at 600 nm using the Biomate 5. The OD measurement provides a factor for sample normalisation to account for differences between the cell densities obtained for the various replicate cultures (Figure 4.1).
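As an illustration of how the OD reading can act as a normalisation factor, Section 4.2.9 reconstitutes each extract in 100 µL of solvent per OD 600 nm of 0.32. The sketch below assumes that this means a volume directly proportional to the measured OD; the function name and the example OD values are hypothetical.

```python
def reconstitution_volume_ul(od600, volume_per_reference_ul=100.0, reference_od=0.32):
    """Solvent volume (in microlitres) scaled linearly with the culture OD600.

    Assumes simple proportionality: a culture at the reference OD receives
    volume_per_reference_ul, and denser cultures receive proportionally more,
    so that extracts end up at comparable biomass per unit volume.
    """
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return volume_per_reference_ul * (od600 / reference_od)


# Hypothetical OD readings for three replicate cultures.
for od in (0.32, 0.48, 0.64):
    print(od, round(reconstitution_volume_ul(od), 1))
```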

4.2.7 Fourier transform infrared (FT-IR) spectroscopy

A 96-well zinc selenide (ZnSe) FT-IR plate (Bruker Ltd., Coventry, UK) was first washed with 10% sodium dodecyl sulfate (SDS), rinsed three times with deionised water, rinsed three times with analytical grade propan-2-ol, then rinsed with deionised water and air-dried at room temperature (AlRabiah et al., 2013; Chapter 3). 20 µL from each culture was spotted directly and the plate was oven dried at 40 ºC for 45 min. A Bruker Equinox 55 infrared spectrometer (Bruker Ltd.) equipped with an HTX™ module was used to carry out high-throughput screening (HTS) FT-IR spectroscopic analysis, following the method of Winder et al. (2006). The dried cell biomass was analysed in transmission mode to collect the FT-IR absorbance spectra. For each measurement, a background spectrum was collected from a blank reference well. All spectra were acquired over the 4000 to 600 cm-1 range, from 64 scans at a resolution of 4 cm-1. OPUS software (Bruker Ltd.) was used for spectral acquisition and background subtraction, after which the data were exported directly to R version 2.13.1 (R Foundation for Statistical Computing, Vienna, Austria) for chemometric analysis.

4.2.8 Sample extraction for LC-MS lipid profiling

The bacterial cell pellets were suspended in 1 mL of freshly prepared and pre-chilled (-20 ºC for a minimum of 24 h) HPLC grade methanol:chloroform solution (1:1). The samples were thoroughly vortexed and shaken for 15 min, after which 0.5 mL of cold HPLC grade water was added to the sample extract. The samples were mixed and centrifuged at 4800 g for 3 min to aid phase separation. The upper polar phase was removed and put into a clean 2 mL microcentrifuge tube. 150 µL of the lower non-polar phase was recovered and centrifuged at 13363 g, and 100 µL was recovered to another clean 2 mL microcentrifuge tube, thus removing any particulate matter. The non-polar phase samples were left to evaporate at room temperature until completely dry prior to transfer to storage at -80 ºC.

4.2.9 LC-MS analysis

All samples were analysed on an Accela UHPLC system (Thermo-Fisher Ltd., Hemel Hempstead, UK) coupled to an electrospray LTQ-Orbitrap XL hybrid mass spectrometry system (Thermo-Fisher, Bremen, Germany). For UHPLC-MS analysis, the samples were reconstituted in 8:2 HPLC grade methanol:water (100 µL per OD 600 nm of 0.32; see Table S 4.1), vortex mixed and centrifuged for 15 min at 13363 g. A quality control (QC) sample was prepared by combining an equal volume of each extract and vortex mixing thoroughly. Each sample extract was transferred to a single analytical vial with a 200 µL fixed insert, capped, stored in the autosampler at 5 ºC and analysed within 48 h of reconstitution in both positive (LC-MS+ve) and negative (LC-MS-ve) ionisation modes. The samples were analysed over two separate analytical blocks (one per ESI polarity), each completely randomised. UHPLC separations were performed with a method identical to that previously described (Dunn et al., 2011; Wedge et al., 2011). Briefly, 5 µL of extract was injected onto a Hypersil GOLD UHPLC C18 column (100 mm × 2.1 mm × 1.9 µm, Thermo-Fisher Ltd.); the UHPLC was operated at a flow rate of 400 µL/min and the column was maintained at a temperature of 50 ºC. The gradient programme for solvent A (HPLC water and 0.1% formic acid) and solvent B (HPLC methanol and 0.1% formic acid) was as follows: LC-MS+ve, 100% A held for 1 min, 0–100% B over 11 min, 100% B held for 8 min, returning to 100% A over 2 min (total run time of 22 min); LC-MS-ve, 100% A held for 1 min, 0–100% B over 16 min, 100% B held for 5 min, returning to 100% A over 2 min (total run time of 24 min). The Thermo LTQ-Orbitrap XL MS system was operated under Xcalibur software (Thermo-Fisher Ltd.) and the method described by Wedge et al. (2011) was followed precisely. Prior to sample analysis, the LTQ-Orbitrap MS was tuned to optimise detection of ions in the m/z 100−1000 range and calibrated according to the manufacturer's predefined methods for both ESI polarities. Data were acquired in the Orbitrap mass analyser operating at a mass resolution of 30000 (FWHM defined at m/z 400) and a scan speed of 0.4 s. For each analytical block, 20 injections of QC sample were first performed for column conditioning, after which 5 injections of experimental samples were made, followed by a QC injection. This was repeated until all samples within the block were analysed. Finally, three QC injections were made at the end of the block run. Prior to and in between each analytical block, the ESI ion tube and spray deflector were cleaned by ultrasonication for 15 min in 8:2 HPLC grade methanol:water acidified with 1% formic acid.
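The injection sequence described for each analytical block can be sketched as a simple run-order generator. The function and sample names below are hypothetical. The sketch also assumes that a QC follows a final, incomplete group of samples and that the three closing QC injections are made in addition to that bracketing QC, neither of which the text specifies.

```python
import random


def injection_order(samples, seed=0, conditioning_qcs=20, group_size=5, closing_qcs=3):
    """Build a run order: conditioning QCs, randomised samples with a QC after
    every group of group_size injections, then closing QCs."""
    rng = random.Random(seed)
    randomised = rng.sample(samples, len(samples))  # complete randomisation
    order = ["QC"] * conditioning_qcs
    for start in range(0, len(randomised), group_size):
        order.extend(randomised[start:start + group_size])
        order.append("QC")  # bracketing QC after each group of samples
    order.extend(["QC"] * closing_qcs)
    return order


# Hypothetical labels for 8 classes x 3 biological replicates.
samples = [f"S{i:02d}" for i in range(1, 25)]
order = injection_order(samples)
print(len(order), order[20:27])
```

Interspersing QC injections in this way is what allows the later RSD-based quality filter to be computed across the whole analytical run.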

4.2.10 Processing of raw LC-MS profiles and lipid identification

The UHPLC-MS raw data profiles were first converted into NetCDF format using the file conversion programme within the Xcalibur software. Peak deconvolution was performed using the freely available XCMS software (http://masspec.scripps.edu/xcms/xcms.php) as described previously (Wedge et al., 2011; Dunn et al., 2008). The XCMS deconvolution produces a Microsoft Excel based XY matrix of mass spectral features (with related accurate m/z and retention time variable pairs) × sample, with the peak area entered wherever the mass spectral feature was detected in a sample. The data matrix was then signal corrected so that peaks failing quality control (greater than 20% relative standard deviation (RSD) within QC samples across the analytical run) were removed. Each sample was then normalised to its total peak area. The putative identification of lipid features detected on the UHPLC-MS platform was performed by applying the PUTMEDID-LCMS set of workflows as previously described (Brown et al., 2011). Multiple identifications can be observed for a single lipid feature, as different lipids can be detected with the same accurate m/z (e.g. isomers with the same molecular formula). A single lipid can also be detected as multiple features, as different types of ion (e.g. protonated and sodiated ions). A table of the putative lipid assignments is available in the Supplementary Information (Table S 4.2).
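The QC filter and the total peak area normalisation described above can be sketched as follows. This is a minimal illustration and not the code used in this work: the function names and the example peak table are hypothetical, the table is assumed to be complete (no missing values), and the sample standard deviation is assumed for the RSD.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    centre = mean(values)
    if centre == 0:
        return float("inf")
    return 100.0 * stdev(values) / centre


def filter_and_normalise(peak_table, qc_ids, max_rsd=20.0):
    """peak_table maps feature -> {sample_id: peak area}.

    Features whose RSD across the QC injections exceeds max_rsd are dropped,
    then every sample is divided by its total remaining peak area.
    """
    kept = {
        feature: areas
        for feature, areas in peak_table.items()
        if rsd_percent([areas[q] for q in qc_ids]) <= max_rsd
    }
    sample_ids = {s for areas in kept.values() for s in areas}
    totals = {s: sum(areas[s] for areas in kept.values()) for s in sample_ids}
    return {
        feature: {s: area / totals[s] for s, area in areas.items()}
        for feature, areas in kept.items()
    }


# Invented peak areas for three features, three QC injections and two samples.
table = {
    "f1": {"QC1": 100, "QC2": 105, "QC3": 95, "S1": 50, "S2": 200},
    "f2": {"QC1": 100, "QC2": 200, "QC3": 50, "S1": 80, "S2": 90},
    "f3": {"QC1": 10, "QC2": 10, "QC3": 10, "S1": 150, "S2": 200},
}
result = filter_and_normalise(table, ["QC1", "QC2", "QC3"])
print(sorted(result), result["f1"]["S1"])
```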

4.2.11 Statistical analyses

4.2.11.1 Principal component analysis (PCA)

The objective of PCA is to explain the variance-covariance structure of a set of variables through a few linear combinations of these variables (Johnson and Wichern, 2007). Much of the original data variability can be accounted for by a small number of PCs, which are then used for data reduction and visual data interpretation. The PCA results are described in terms of component scores and loadings. The component scores (PCs) are the transformed variable values and the loadings are the weights by which the original data variables should be multiplied to obtain the component scores. PC scores and loadings plots, as well as the associated loadings weights for each FT-IR spectral wavenumber or UHPLC-MS feature (RT-m/z pair), were computed using the R statistics package.
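The relationship between loadings and scores can be made concrete with a small sketch. The analyses in this chapter were carried out in R; the Python code below is only an illustration, extracts the first PC alone, and uses power iteration on the sample covariance matrix, which is an assumption made for brevity and not the algorithm used in this work. The function name and the toy data are hypothetical.

```python
import math
import random


def first_principal_component(rows, iterations=500, seed=0):
    """Return (loadings, scores, explained_fraction) for PC1.

    rows is a list of samples, each a list of variable values.
    Assumes the data are not constant, so the covariance matrix is non-zero.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample variance-covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    rng = random.Random(seed)
    w = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    eigenvalue = sum(w[i] * sum(cov[i][j] * w[j] for j in range(p)) for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    # Scores are the mean-centred data multiplied by the loadings.
    scores = [sum(c[j] * w[j] for j in range(p)) for c in centred]
    return w, scores, eigenvalue / total_variance


# Toy data in which the second variable is exactly twice the first.
loadings, scores, explained = first_principal_component([[1, 2], [2, 4], [3, 6], [4, 8]])
print(loadings, explained)
```

In this toy case a single component carries all of the variance, which is the situation PCA exploits when a few PCs summarise many correlated spectral variables.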

4.2.11.2 Principal component-discriminant function analysis

Discriminant function analysis (DFA) is a supervised technique that discriminates between different groups using a priori knowledge of class membership. The algorithm works to maximise between-group variance and minimise within-group variance (Varmuza and Filzmoser, 2009). In the present work, the PCs were used as inputs for DFA and the results were validated by 1000 bootstrap cross-validations. Bootstrap is a re-sampling technique that can be applied as cross-validation to estimate the prediction performance of a model. The basic idea of this method is to select randomly, with replacement, N samples from a set containing exactly N samples. All selected samples, including the repetitions, are then used as a training set and the non-selected samples are used as a test set (Efron, 1981). One can think of this as having all samples analysed (N = ||X|| for our case) in a bag. A single sample is taken out of the bag randomly and its number noted; this sample now forms part of the training data, and the sample is placed back into the bag. This random sample picking process is repeated until ||X|| samples are in the training set. Some samples will be used multiple times, and on average 63.2% of all of the samples will have been selected for training; the remaining 36.8% are used as the test data. PC-DFA scores and loadings plots, as well as the associated loadings weights for each FT-IR spectral wavenumber or UHPLC-MS feature (RT-m/z pair), were computed using the R statistics package.
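The bag analogy translates directly into code. The sketch below is illustrative only (the function names are hypothetical and the original analysis was performed in R). It draws one bootstrap training and test split and evaluates the expected fraction of distinct samples drawn, 1 - (1 - 1/N)^N, which approaches 1 - 1/e, or about 63.2%, as N grows.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; unselected indices form the test set."""
    train = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(train)
    test = [i for i in range(n_samples) if i not in chosen]
    return train, test


def expected_unique_fraction(n_samples):
    """Expected fraction of distinct samples appearing in one bootstrap draw."""
    return 1.0 - (1.0 - 1.0 / n_samples) ** n_samples


rng = random.Random(0)
train, test = bootstrap_split(24, rng)
print(len(train), len(set(train)), len(test))
print(expected_unique_fraction(24), expected_unique_fraction(10**6))
```

For small data sets the expected training fraction is slightly above the limiting value, so 63.2% is best read as the large-N average.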

4.2.11.3 Uni- and multivariate feature selection

To identify the most relevant metabolites generated by this study, we combined the power of two well-known statistical methods. The first is a partial least squares (PLS) regression algorithm and the second is an analysis of variance (ANOVA) statistical test. PLS is a supervised learning method that relates a set of independent variables X (metabolites) to a set of dependent variables Y (the identity of the samples). PLS projects the X and Y variables into sets of orthogonal latent variables, scores of X and scores of Y, so that the covariance between these two sets of latent variables is maximised (Martens and Naes, 1992). The purpose of PLS is to build a linear model Y = XB + E, where B is a matrix of regression coefficients and E represents the difference (error) between observed and predicted Y values (Vinzi et al., 2010). The size of the absolute value of the coefficient for each independent variable represents the influence of that variable on the prediction or dependent variable. The higher the absolute value of the coefficient is, the higher the influence of the variable.

ANOVA is a statistical method that examines the difference in means across multiple groups of interest (Hair et al., 2005). It is a generalisation of a t-test to more than two groups, with the advantage that ANOVA reduces the chances of committing a type I error when comparing more than two groups. For each of the treatment types investigated, we compared the control versus treatment samples to determine the top 50 most significantly changed metabolites using the following approach. First, a PLS algorithm was applied to the data and the metabolites were sorted by the magnitude of their corresponding PLS coefficient values from the highest to the lowest. Second, the ANOVA statistical test was applied to all metabolites. Finally, starting with the metabolite with the highest PLS coefficient, we chose the top 50 metabolites whose p-value computed by ANOVA was less than 0.05. All calculations were again computed using the R statistics package.
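The final selection step can be sketched as below, assuming that the PLS regression coefficients and the ANOVA p-values have already been computed for each metabolite. The function name and the example values are hypothetical, and the sketch covers only the ranking and filtering, not the PLS or ANOVA calculations themselves.

```python
def select_features(pls_coefficients, anova_p_values, top_n=50, alpha=0.05):
    """Rank features by |PLS coefficient| and keep the first top_n with p < alpha."""
    ranked = sorted(pls_coefficients,
                    key=lambda feature: abs(pls_coefficients[feature]),
                    reverse=True)
    significant = [feature for feature in ranked if anova_p_values[feature] < alpha]
    return significant[:top_n]


# Invented coefficients and p-values for four features.
coefficients = {"a": -3.0, "b": 2.0, "c": 0.5, "d": 1.0}
p_values = {"a": 0.01, "b": 0.20, "c": 0.03, "d": 0.04}
print(select_features(coefficients, p_values, top_n=2))
```

Ranking on the absolute value means that strongly negative coefficients count as much as strongly positive ones, in line with the description of coefficient magnitude above.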


[Figure 4.1 schematic; labels recoverable from the image. (1) Bacteria streaked on an NA plate and axenic colonies collected; 1 mL of 20% glycerol stock (stored at -20 ºC) in 49 mL of LB medium for 24 h, then 1 mL in 49 mL of LB medium for 1 h; centrifuged, washed twice with normal saline and diluted to the required concentration (C1V1 = C2V2). Bioscreen plate: 180 µL LB with 10 µL drug (test) or 10 µL water (control); 18 h incubation in the Bioscreen spectrophotometer (Labsystems, Basingstoke, UK) at 37 ºC with continuous medium shake, OD 600 nm, measurement interval 10 min. (2) Flask culture: 18 mL LB with 1 mL drug (test) or 1 mL water (control), 18 h incubation; OD measured for normalisation; 20 µL spotted directly onto the ZnSe FT-IR plate for HTS FT-IR analysis in transmission mode (Bruker Equinox 55 with HTX™ module; Winder et al., 2006). (3) Quenching in pre-chilled (-48 ºC) 60% methanol with quick mixing; 4800 g (-8 ºC) for 10 min; supernatant removed (2 mL taken for leakage) and pellet collected; pellet reconstituted in 1 mL of pre-chilled (-20 ºC) methanol:chloroform solution (1:1), vortexed and shaken for 15 min; 0.5 mL of cold HPLC water added; 4800 g (-8 ºC) for 3 min; polar phase collected; 150 µL taken from the non-polar phase, centrifuged at 13363 g, 100 µL taken and evaporated at room temperature; reconstituted in methanol:water (8:2) according to the OD, vortexed, centrifuged and transferred to an analytical vial; UHPLC-LTQ-Orbitrap XL MS, m/z 100-1000, ESI+ and ESI-, reverse phase gradient on a Hypersil Gold C18 column (100 x 2.1 mm, 1.9 µm particle size; Thermo-Fisher Ltd.). Eight classes: ST131 isolates 160 and 173 (control, 0.02 mg/L); non-ST131 isolates 161 and 171 (control, 0.3 mg/L).]

Figure 4.1: Schematic of sample preparation including: (1) Bioscreen analysis to determine the MIC of ciprofloxacin and produce the growth curves of pathogenic E. coli isolates; (2) FT-IR analysis of samples; (3) LC-MS analysis of samples after quenching with cold (-48 ºC) methanol and extraction with (1:1) methanol:chloroform.


4.3 Results and Discussion

4.3.1 Determination of the ciprofloxacin minimal inhibitory concentration (MIC)

The minimum inhibitory concentration (MIC), as its name suggests, is the lowest concentration that can completely inhibit the growth of bacteria. It is important to use ciprofloxacin levels lower than the MIC to measure subtle antibiotic effects on E. coli: if higher levels are used, all that will be measured is cell death rather than the alteration of the lipidome, which is the goal of this experiment. It was therefore vital to determine these concentrations. A wide range of ciprofloxacin hydrochloride concentrations (from 0.0025 to 100 mg/L) was used to challenge four pathogenic E. coli isolates of different sequence types (ST131 and non-ST131) in order to find the MIC of each isolate. The growth of E. coli isolates 161 (non-ST131 sequence type) and 160 (ST131 sequence type), regarded as fully resistant, was not inhibited even at ciprofloxacin concentrations up to 100 mg/L (Figure 4.2 a, c and e). With the intermediate-resistant E. coli isolate 171 (non-ST131 sequence type), the ciprofloxacin MIC fell between 0.25 and 0.5 mg/L (Figure 4.2 d). With the sensitive E. coli isolate 173 (ST131 sequence type), however, the ciprofloxacin MIC fell between 0.02 and 0.03 mg/L (Figure 4.2 b).

The concentrations of antibiotic used to challenge bacteria in sample generation for metabolic fingerprinting and lipid profiling were selected based upon these results (Figure 4.2 e). In order to enable direct comparison of the metabolic responses of the E. coli ST131 sequence type isolates 160 and 173, a common antibiotic concentration of 0.02 mg/L was chosen, as it was in the MIC range of the sensitive isolate 173. For the non-ST131 sequence type isolates 161 and 171, an antibiotic concentration of 0.3 mg/L was chosen, as it fell within the MIC range of 0.25−0.5 mg/L for isolate 171 and was high enough to bring about an inhibitory effect upon growth without total bacterial death occurring (Figure 4.2).


[Figure 4.2, panels a to d: growth curves of optical density (OD 600 nm) against time (hours). (a) Isolate 160: control, 100, 50, 25 and 12.5 mg/L. (b) Isolate 173: control, 0.05, 0.03, 0.025 and 0.02 mg/L. (c) Isolate 161: control, 100, 50, 25 and 12.5 mg/L. (d) Isolate 171: control, 1, 0.5, 0.25 and 0.1 mg/L.]

(e)
Isolate number   Sequence type   Quinolone phenotype      MIC range determination (mg/L)*   Selected dose to challenge the bacteria (mg/L)
160              ST131           Fully resistant          No effect                         0.02
173              ST131           Sensitive                0.03−0.02                         0.02
161              non-ST131       Fully resistant          No effect                         0.3
171              non-ST131       Intermediate resistant   0.5−0.25                          0.3
* Ciprofloxacin hydrochloride highest dose used is 100 mg/L.

Figure 4.2: OD 600 nm growth curves collected during 0-18 h post antibiotic challenge to establish minimal inhibitory concentrations (MICs) of ciprofloxacin hydrochloride against E. coli ST131 sequence isolates: (a) 160 (resistant) and (b) 173 (sensitive). Non-ST131 sequence isolates include: (c) 161 (resistant) and (d) 171 (intermediate resistant). (e) Table of predicted MIC (mg/L ciprofloxacin hydrochloride).

4.3.2 Fourier transform infrared (FT-IR) spectroscopy based metabolic fingerprinting

E. coli pathogenic isolates were exposed to different concentrations of ciprofloxacin depending on the MIC of the sensitive isolate within each sequence type group (as explained above), giving eight different conditions including four controls. Each condition was prepared in three biological replicates to give 24 samples. These samples were first analysed by FT-IR spectroscopy. This is a high-throughput analytical technique that is used as an initial screening tool due to its ability to analyse a large number of samples in a very short time and more cheaply than with mass spectrometric or other spectroscopic approaches (viz. nuclear magnetic resonance (NMR) spectroscopy).


FT-IR spectra (data not shown) were recorded from the dried cell biomass in transmission mode. CO2 signals were removed from the spectra, as they may cause spurious variation in multivariate analysis, and the FT-IR data were baseline corrected as described by AlRabiah et al. (2013; Chapter 3) using the extended multiplicative signal correction (EMSC) scaling method (Martens et al., 2003).

The data from the FT-IR spectroscopy experiment were subjected to a multivariate analysis technique known as PC-DFA (Varmuza and Filzmoser, 2009) to show the distribution of samples based on their metabolic fingerprints, and a PC-DF scores plot was generated (Figure 4.3 a). The ST131 isolates (160 and 173) clustered tightly in the negative quadrant of the plot (negative side of the DF1 and DF2 axes), while both non-ST131 isolates (161 and 171) were distributed far from this quadrant. Isolate 161 separated from the ST131 sequence type isolates largely along the positive side of the DF2 axis, whereas isolate 171 clustered further away on the positive side of the DF1 axis. This distribution of the isolates matched the sequence type: ST131 formed a close cluster, while the non-ST131 isolates, which may comprise more than one sequence type, were distributed far from the ST131 cluster and did not form a definite cluster. These results matched and confirmed those of AlRabiah et al. (2013; Chapter 3), as the same sample preparation protocol was followed.

E. coli isolates classed as fully resistant (isolates 160 and 161), indicated by blue symbols, revealed very little difference between control and ciprofloxacin challenged samples, which means that there are no significant metabolic changes at the level of the metabolic fingerprint detected by FT-IR spectroscopy. The sensitive E. coli isolate 173 and the intermediate resistant isolate 171 (red symbols) both showed a clear separation between the control and ciprofloxacin challenged sample clusters. Challenged samples shifted away from the control cluster towards the negative side of DF2; this shift is indicated by red arrows in Figure 4.3 a.

As a next step, PC-DFA loadings were produced from pair-wise comparisons (Figures 4.3 b, c, d and e). These show the contribution made by specific IR peaks to the separation between control and stressed samples of the same isolate. Although the fully resistant E. coli isolates 160 and 161 belong to different sequence types and are clustered far from each other in the PC-DFA scores plot, they gave very similar loadings plots for the spectral change due to ciprofloxacin challenge (Figures 4.3 b and d). In these loadings plots, the most marked spectral changes are at 1645 and 1705 cm-1, which can be assigned to the amide spectral region. This may indicate that, although these isolates are fully resistant, variations in protein content occur after drug stress. The sensitive isolate 173 and the intermediate resistant isolate 171 also showed similar spectral changes to each other on challenge with ciprofloxacin (Figures 4.3 c and e). The clearest changes in these isolates were at 1653 and 1700 cm-1, again corresponding to the amide spectral region and indicating alterations in protein content in response to ciprofloxacin challenge. Further alterations were found at ~1500 cm-1; these are seen most clearly in isolate 173, where the drug shows its strongest effect, but there is no obvious explanation for them. The tight clustering of samples of the same experimental class within the PC-DFA scores plot (Figure 4.3 a) indicated high technical reproducibility of sample growth and collection, making these samples suitable for more in-depth analysis of bacterial lipids by LC-MS.

Figure 4.3: FT-IR spectra analysed using principal component-discriminant function analysis (PC-DFA) of E. coli challenged for 18 h with ciprofloxacin and control bacteria. (a) PC-DFA scores plot (DF1 vs. DF2). PC-DFA loadings plots derived by comparisons of ciprofloxacin challenged versus control samples for each respective isolate: (b) ST131 sequence isolate 160 (resistant); (c) ST131 sequence isolate 173 (sensitive); (d) non-ST131 sequence isolate 161 (resistant); (e) non-ST131 sequence isolate 171 (intermediate resistance). C, control; CI, confidence interval.

4.3.3 Lipid profiling of pathogenic E. coli isolates before and after challenging with ciprofloxacin hydrochloride using LC-MS

Whilst FT-IR spectroscopy was performed directly on the bacterial biomass, this is not possible with LC-MS. Therefore, bacterial samples from the same biological replicates were quenched, extracted and then analysed by LC-MS to produce a lipid profile of each bacterial sample. Lipid extracts were analysed in both LC-MS+ve and LC-MS-ve ionisation modes, and peak deconvolution was performed as described above. A total of 2933 and 1505 mass spectral features were extracted from the LC-MS+ve and LC-MS-ve mode datasets respectively. Putative identification of lipid features was performed by applying the PUTMEDID-LCMS set of workflows (Brown et al., 2011) (Table S 4.2). A first look at the putatively identified lipid species showed that the phospholipids of E. coli were mainly assigned as phosphatidylcholines (PCs) and/or phosphatidylethanolamines (PEs). Previous reports state that the major lipid classes of E. coli are PEs, phosphatidylglycerols (PGs) and cardiolipins, with minor contributions from phosphatidylserines (PSs) and phosphatidic acids (PAs) (Ames, 1968), so the major lipid species detected in this study are more likely to be PEs than PCs. The PC and/or PE phospholipids were largely detected in LC-MS+ve mode, though some were also observed in LC-MS-ve mode. A small number of PAs were observed in both modes, and a number of PGs were observed predominantly in LC-MS-ve mode. In addition to phospholipids, a large number of acylglycerols and free fatty acids were detected, as well as further diverse lipid species such as sterols and sphingolipids. The putative identifications provided by this method correspond to levels 2 and 3 of the Metabolomics Standards Initiative (MSI) guidelines (Sumner et al., 2007).
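Putative identification by accurate mass rests on comparing an observed m/z with the theoretical m/z of a candidate formula and adduct. The following is a minimal sketch of that comparison, not the PUTMEDID-LCMS implementation. The [M+H]+ adduct, the 5 ppm tolerance and the function names are assumptions made for illustration, and the pairing of the formula C35H68NO8P with m/z 662.4763 is read from the panel labels of Figure 4.5.

```python
import re

# Monoisotopic masses (u), rounded to six decimal places.
MONOISOTOPIC = {
    "C": 12.000000,
    "H": 1.007825,
    "N": 14.003074,
    "O": 15.994915,
    "P": 30.973762,
}
PROTON = 1.007276  # mass of a proton (u)


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C35H68NO8P'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def matches_protonated(observed_mz, formula, tolerance_ppm=5.0):
    """True if observed_mz agrees with [M+H]+ of formula within tolerance."""
    theoretical = monoisotopic_mass(formula) + PROTON
    return abs(ppm_error(observed_mz, theoretical)) <= tolerance_ppm


# Assumed pairing taken from the Figure 4.5 panel labels.
print(matches_protonated(662.4763, "C35H68NO8P"))
```

Because PCs and PEs with the same total composition share a molecular formula, a mass match of this kind cannot distinguish them, which is why the assignments in this chapter are reported as PC and/or PE.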

After the PUTMEDID-LCMS workflow, the datasets were signal corrected and peaks that failed quality control (RSD greater than 20% within the QC samples across the analytical run) were removed. This left 2790 and 1415 mass spectral features in the LC-MS+ve and LC-MS-ve datasets respectively, a loss of 4.9% and 6.0% of features. Mass spectral features detected within the first 0.5 min of the chromatograms were also excluded, since they fall within the LC void volume and are likely to arise from polar metabolites. This resulted in the exclusion of a further 148 and 98 mass spectral features for ESI positive and negative modes respectively, so that the final LC-MS+ve and LC-MS-ve datasets contained 2642 and 1317 mass spectral features respectively. The low level of exclusion by the quality control procedure alone indicates that the UHPLC-MS data are of good quality. Finally, each sample in both datasets was normalised to its total peak area, and a range of multivariate and univariate methods was then used to analyse these lipidomic data.
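The filtering and normalisation steps described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the processing code used in this study: the feature records, field names (`rt`, `qc_areas`) and function names are hypothetical, and a feature is treated as lying in the void volume when its retention time is below 0.5 min.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) of a list of peak areas."""
    return stdev(values) / mean(values) * 100.0


def filter_features(features, max_rsd=20.0, void_cutoff_min=0.5):
    """Keep features that pass the QC RSD limit and elute after the void volume.

    Each feature is a dict with 'rt' (min) and 'qc_areas' (peak areas in the
    QC injections). Field names are hypothetical.
    """
    kept = []
    for feature in features:
        if rsd_percent(feature["qc_areas"]) > max_rsd:
            continue  # fails quality control
        if feature["rt"] < void_cutoff_min:
            continue  # elutes within the LC void volume
        kept.append(feature)
    return kept


def normalise_to_total_area(sample_areas):
    """Divide every peak area in one sample by that sample's total area."""
    total = sum(sample_areas)
    return [area / total for area in sample_areas]


# Hypothetical features: A passes, B is in the void volume, C fails the RSD test.
example_features = [
    {"id": "A", "rt": 14.52, "qc_areas": [100.0, 105.0, 95.0]},
    {"id": "B", "rt": 0.30, "qc_areas": [50.0, 51.0, 49.0]},
    {"id": "C", "rt": 16.00, "qc_areas": [10.0, 20.0, 30.0]},
]
kept_ids = [f["id"] for f in filter_features(example_features)]
print(kept_ids)
print(normalise_to_total_area([2.0, 6.0]))
```

The feature counts quoted in the text are internally consistent with these two filters: 2790 - 148 = 2642 and 1415 - 98 = 1317.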

As a first step in evaluating the LC-MS datasets, they were subjected to multivariate analysis. PC-DFA was applied to show the distribution of the various samples and sample groups in relation to each other (Figure 4.4). The central position of the QC samples, close to the origin of both DF1 and DF2, indicates that after the deconvolution and quality control steps the QC samples still represent an average data point. The tight clustering and overlap of the QC samples also shows the high level of analytical reproducibility given by the LC-MS instrumentation and methodologies used. Starting with the negative mode dataset, the PC-DFA scores plot (Figure 4.4 a) shows that both ST131 isolates (160 and 173) lie close to each other in the positive quadrant of the plot (i.e. the positive side of both the DF1 and DF2 axes), whereas the non-ST131 isolates are distributed far from this quadrant. These observations match those from the FT-IR scores plot, which may hint that the two platforms share a common feature in terms of compound detection. This would need to be confirmed with a larger number of samples and will be the subject of further work. The PC-DFA scores plots of the LC-MS data (Figure 4.4) are also very similar to that of the FT-IR data (Figure 4.3) in terms of the shift of samples after ciprofloxacin challenge. The resistant isolates (160 and 161) show a slight separation between control and challenged samples in both LC-MS-ve (Figure 4.4 a) and LC-MS+ve (Figure 4.4 b) ionisation modes. In contrast, isolates 173 and 171, which are considered sensitive and intermediate resistant respectively, both show a clear separation between the control and the challenged samples. The challenged samples shifted along the DF1 axis, either to the positive side in the LC-MS-ve mode plot (Figure 4.4 a) or to the negative side in the LC-MS+ve mode plot (Figure 4.4 b). Overall, the PC-DFA scores plots of the LC-MS datasets show that whilst the lipid profiles of the resistant bacterial isolates (160 and 161) do change when challenged by ciprofloxacin, these changes are small compared with the large alterations in lipid profile that result from ciprofloxacin challenge in the sensitive isolate (173) and the intermediate resistant isolate (171).

Figure 4.4: PC-DFA scores plot of LC-MS data from E. coli challenged for 18 h with ciprofloxacin and control bacteria: (a) PC-DFA scores plot (DF1 vs. DF2) of LC-MS-ve mode data; (b) PC-DFA scores plot (DF1 vs. DF2) of LC-MS+ve mode data. QC, quality control; C, control; CI, confidence interval.

The next objective after PC-DFA was to apply univariate and multivariate significance testing to select the lipids (mass spectral features) that were significantly altered between control and ciprofloxacin challenged samples for each bacterial isolate independently. The results of the two approaches were combined as follows. A PLS algorithm was first applied to the data, and the mass spectral features were sorted by the magnitude of their PLS coefficients. ANOVA was then applied to all mass spectral features. Finally, starting with the mass spectral feature with the highest PLS coefficient, the top 50 features with an ANOVA p-value below 0.05 were selected (Tables S 4.3 and S 4.4). For each bacterial isolate, the significant features were sorted according to retention time. Since certain lipid species were represented by multiple mass spectral features, the top 50 list was next reduced to include only the major (most intense) ion observed for each lipid species. The reduced lists for each isolate and ESI polarity were then combined into a comprehensive list of the significant lipid mass spectral features across all bacterial isolates, for both the positive and negative mode ESI datasets. Trend plots were then generated for each of the statistically significant lipid features, and these were grouped on the basis of response to ciprofloxacin challenge: first, lipid features that were commonly up- or down-regulated across both resistant and sensitive E. coli isolates (Figure S 4.1); second, lipid features that were up- or down-regulated specifically in susceptible or specifically in resistant isolates (Figure 4.5); and third, lipid features that were differentially regulated between resistant and sensitive isolates (Figure 4.6).
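The selection logic described above can be outlined in a few lines. The sketch assumes that the PLS coefficients and ANOVA p-values have already been computed elsewhere; the field names (`pls_coef`, `anova_p`, `lipid`, `intensity`, `rt`), the function names and the example values are hypothetical and serve only to illustrate the order of the steps.

```python
def select_top_features(features, p_threshold=0.05, top_n=50):
    """Rank by |PLS coefficient| and keep the first top_n with p < threshold.

    Each feature is a dict with 'pls_coef' and 'anova_p', both assumed to
    have been computed beforehand.
    """
    ranked = sorted(features, key=lambda f: abs(f["pls_coef"]), reverse=True)
    significant = [f for f in ranked if f["anova_p"] < p_threshold]
    return significant[:top_n]


def keep_major_ion(features):
    """Keep only the most intense feature for each putative lipid species.

    The result is sorted by retention time, as in the text.
    """
    best = {}
    for feature in features:
        current = best.get(feature["lipid"])
        if current is None or feature["intensity"] > current["intensity"]:
            best[feature["lipid"]] = feature
    return sorted(best.values(), key=lambda f: f["rt"])


# Hypothetical features for one isolate and one ESI polarity.
example = [
    {"id": 1, "pls_coef": 0.9, "anova_p": 0.20, "lipid": "PE 32:1", "intensity": 5.0, "rt": 14.9},
    {"id": 2, "pls_coef": -0.8, "anova_p": 0.01, "lipid": "PE 32:1", "intensity": 9.0, "rt": 15.0},
    {"id": 3, "pls_coef": 0.5, "anova_p": 0.03, "lipid": "PE 32:1", "intensity": 2.0, "rt": 15.0},
    {"id": 4, "pls_coef": 0.4, "anova_p": 0.04, "lipid": "PA 34:1", "intensity": 3.0, "rt": 12.0},
    {"id": 5, "pls_coef": 0.1, "anova_p": 0.001, "lipid": "DG 35:4", "intensity": 1.0, "rt": 16.7},
]
top = select_top_features(example, top_n=3)
major = keep_major_ion(top)
print([f["id"] for f in top], [f["id"] for f in major])
```

Ranking on the absolute coefficient means that strongly down-regulated features are retained alongside strongly up-regulated ones, while the ANOVA threshold removes features whose large coefficient is not supported by the univariate test.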

4.3.4 A wide selection of E. coli lipid species are altered by challenge with ciprofloxacin, with many showing differential regulation between sensitive and resistant E. coli isolates

After the important lipids had been grouped according to their response to ciprofloxacin challenge, the putative assignment of each lipid species and the level of saturation within its fatty acid components were considered alongside the pattern of response, so that potentially significant trends within the results could be recognised. A first series of lipid alterations showed common responses in both resistant and sensitive isolates (Figure S 4.1). Within this group, higher levels of up-regulation were typically observed in the sensitive isolate 173 and the intermediate resistant isolate 171, and most of these lipids were putatively assigned as PGs, PAs/PGs, and PGs/DGs (diacylglycerols). Higher levels of down-regulation were likewise observed in the sensitive isolate 173 and the intermediate resistant isolate 171. These down-regulated lipids were putatively assigned as PC/PE/PE-NMe (N-methyl PE), and one was putatively assigned as a highly unsaturated DG with a total chain composition of 40:10 or 38:7 (depending on the ESI adduct ion formed). Comparing these phospholipids according to their putative assignments, the up-regulated phospholipids were more saturated (usually one or two unsaturated bonds) than the down-regulated phospholipids (usually up to four unsaturated bonds).
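The saturation comparison above relies on the shorthand notation in which a lipid is written as a class followed by total acyl carbons and total double bonds (for example, DG 40:10). A minimal parser for that notation is sketched below; the function name and the regular expression are illustrative assumptions, and the pattern covers only the simple forms that appear in this chapter (an optional bracket and an optional ether "O-" prefix).

```python
import re

_SHORTHAND = re.compile(r"\s*([A-Za-z-]+)\s*\(?(?:O-)?(\d+):(\d+)\)?")


def parse_lipid(shorthand):
    """Return (lipid_class, total_carbons, double_bonds) from e.g. 'PC 32:1'."""
    match = _SHORTHAND.match(shorthand)
    if match is None:
        raise ValueError(f"cannot parse {shorthand!r}")
    lipid_class, carbons, double_bonds = match.groups()
    return lipid_class, int(carbons), int(double_bonds)


def mean_double_bonds(shorthands):
    """Average number of double bonds across a list of shorthand names."""
    counts = [parse_lipid(s)[2] for s in shorthands]
    return sum(counts) / len(counts)


# The two alternative assignments quoted in the text for the unsaturated DG.
print(parse_lipid("DG 40:10"))
print(parse_lipid("DG 38:7"))
print(parse_lipid("PE-NMe (O-28:0)"))
```

Applied to the lists of up-regulated and down-regulated assignments, `mean_double_bonds` gives a single number per group, which is one simple way of expressing the difference in saturation described in the text.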

The lipid features that were up- or down-regulated specifically in susceptible or in resistant isolates are shown in Figure 4.5. The lipids that were significantly up-regulated in the sensitive isolates generally showed a higher level of up-regulation in the fully sensitive isolate 173 than in the intermediate resistant isolate 171. Six phospholipids assigned as PEs or PCs were up-regulated, as was one DG. The putative assignments suggest high levels of saturation within the phospholipid fatty acids, with only one or two unsaturated bonds present, but not within the DG. Most of the significantly down-regulated lipids responded to a similar degree in the sensitive and intermediate resistant isolates. A few phospholipids followed this response and were putatively assigned as PCs/PEs, PAs/PSs, PCs and Lyso-PC. Cholesteryl-beta-D-glucoside and steroid-like species (possibly hopanoids) were also down-regulated. A few lipids were specifically altered in the resistant isolates 160 and 161. These were all highly down-regulated upon challenge with ciprofloxacin, and were putatively assigned as DGs and PAs.

Figure 4.5: LC-MS trend plots of lipids significantly altered in response to ciprofloxacin. Lipids specifically regulated in susceptible or resistant isolates: (a) significantly up-regulated in susceptible isolates; (b) significantly down-regulated in susceptible isolates; (c) significantly down-regulated in resistant isolates. Each plot shows the average peak area for the sample classes QC, 161 C, 161 0.3, 171 C, 171 0.3, 160 C, 160 0.02, 173 C and 173 0.02, and is labelled with the putative lipid assignment, ESI mode, peak number, retention time and m/z. Values plotted are the mean m/z lipid peak areas and error bars represent the standard error within the non-averaged data for each experimental class. Lipid identification was conducted as in Brown et al. (2011), see Tables S 4.2, S 4.3, S 4.4.

The most interesting series of lipid alterations observed were probably those that were differentially regulated between the sensitive and resistant isolates (Figure 4.6). Most commonly, the significant lipids were down-regulated in the resistant isolates (160 and 161) and up-regulated in the sensitive and intermediate resistant isolates (173 and 171), with the majority being putatively assigned as PCs, PCs/PEs and PAs. A small number of DGs also fell within this response class, with putative assignments suggesting potentially high levels of unsaturation within the fatty acid components. Interestingly, an ion putatively assigned as cholesteryl 11-hydroperoxy-eicosatetraenoate was very strongly up-regulated, but only in the intermediate resistant isolate 171. By comparison, only a small number of lipids were up-regulated in the resistant isolates and down-regulated in the sensitive isolate (173). One of these ions was putatively assigned as a PG with highly saturated fatty acids, and another as either a PA/PG, again with highly saturated fatty acids, or a DG with a different degree of unsaturation.
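The three response groups used for the trend plots follow a simple rule based on the direction of change in the susceptible and resistant isolates. The sketch below states that rule explicitly. It assumes that each direction has already been reduced to "up", "down" or "none", where "none" stands for no significant change; the function name and group labels are illustrative only.

```python
def response_group(susceptible_dir, resistant_dir):
    """Classify a lipid by its direction of change in each isolate group.

    Directions are 'up', 'down' or 'none' (no significant change).
    """
    valid = {"up", "down", "none"}
    if susceptible_dir not in valid or resistant_dir not in valid:
        raise ValueError("direction must be 'up', 'down' or 'none'")
    if susceptible_dir == "none" and resistant_dir == "none":
        return "unchanged"
    if susceptible_dir == resistant_dir:
        return "common"  # same response in both groups (Figure S 4.1)
    if resistant_dir == "none":
        return "specific to susceptible"  # Figure 4.5 a and b
    if susceptible_dir == "none":
        return "specific to resistant"  # Figure 4.5 c
    return "differential"  # opposite responses (Figure 4.6)


print(response_group("up", "down"))
print(response_group("down", "none"))
```

Under this rule the lipids in Figure 4.6 a correspond to `response_group("up", "down")` and those in Figure 4.6 b to `response_group("down", "up")`, both of which fall in the differential group.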

Figure 4.6: LC-MS trend plots of lipids significantly altered in response to ciprofloxacin: lipids differentially regulated between susceptible and resistant isolates: (a) up-regulated in susceptible isolates and down-regulated in resistant isolates; (b) up-regulated in resistant isolates and down-regulated in susceptible isolates. Each plot shows the average peak area for the sample classes QC, 161 C, 161 0.3, 171 C, 171 0.3, 160 C, 160 0.02, 173 C and 173 0.02, and is labelled with the putative lipid assignment, ESI mode, peak number, retention time and m/z. Values plotted are the mean m/z lipid peak areas and error bars represent the standard error within the non-averaged data for each experimental class. Lipid identification was conducted as in Brown et al. (2011), see Tables S 4.2, S 4.3, S 4.4.

Several trends emerged within the LC-MS lipidomics dataset. The most significant is probably that the phospholipids were predominantly assigned to the PC/PE and PA/PG classes, and that their putative assignments suggest high levels of saturation within the fatty acids. Conversely, many ions putatively assigned as DGs were significant and had a higher proportion of unsaturated fatty acids than the phospholipids. Characteristically, the levels of up- or down-regulation observed were much higher for the sensitive isolate 173 and the intermediate resistant isolate 171 than for the resistant isolates (Figure 4.5). Since the PC-DFA models (Figure 4.4) showed a much greater change in the sensitive and intermediate resistant isolates, this behaviour of the statistically significant lipid species in the trend plots is not surprising. Interestingly, the majority of statistically significant lipids were up-regulated in isolates 171 and 173 whilst being down-regulated in isolates 160 and 161, with only a small number of significant lipids showing the converse trend (Figures 4.5 and 4.6). In response to ciprofloxacin challenge, the sensitive isolates in general showed high levels of up-regulation of PCs or PEs, whereas the resistant isolates were commonly characterised by down-regulation of PCs or PEs, which suggests that the bacteria may be undergoing different homeoviscous adaptations.

It is of no great surprise that the majority of lipids significantly altered by ciprofloxacin challenge in this study were putatively assigned as PEs or PCs, given that previous research has shown PEs to be one of the major phospholipid species detected in E. coli (Ames, 1968). One possible explanation is that ciprofloxacin treatment specifically affects PEs or PCs. Another is that, because these are among the major lipid species of E. coli, their higher concentrations make changes in their levels more likely to be detected, while the other phospholipid species may be close to, or below, the detection limit of the LC-MS instrumentation used. If the latter were the case, then alterations in the other major lipid classes of E. coli might also be expected, particularly PGs, PSs and cardiolipins (Ames, 1968). There have been a number of previous investigations in this field, each slightly different in nature. Some were based upon extracted liposomes (Bensikaddour et al., 2008) and some on extracted membrane lipids of E. coli (Leying et al., 1986; Merino et al., 2003), rather than on bacterial cells within cultures, as in this study. In these systems ciprofloxacin was found to interact at the water-phospholipid head group interface of the lipid membrane, as revealed by atomic force microscopy (Merino et al., 2003) and by a combination of quasi-elastic light scattering and steady-state fluorescence anisotropy, alongside attenuated total reflectance (ATR)-FT-IR and 31P NMR spectroscopy (Bensikaddour et al., 2008). Given these observations, it is important to use a highly sensitive technique such as LC-MS to investigate lipid alterations within the 'true' bacterial cell. The development of online LC-MS/MS methods and offline MSn analyses of pre-fractionated samples (van der Hooft et al., 2011; van der Hooft et al., 2012), which not only assign lipid identities unambiguously but also identify levels of saturation and the positions of unsaturated bonds within fatty acids, will represent significant progress for studies of this type. Once developed, these methods, with their ability to confirm lipid identifications (to MSI level 1) and to elucidate unsaturated fatty acid bond positions, will enhance our ability to understand how ciprofloxacin modulates fatty acid structure within the putatively assigned PE and PC phospholipids.


4.4 Conclusion

The LC-MS workflow developed to analyse lipid species from E. coli challenged with an antibiotic such as ciprofloxacin has been shown to be a highly reproducible technique that is valuable for discovery phase studies aimed at elucidating bacterial antibiotic resistance mechanisms. In addition, combining a lipid profiling approach with alternative metabolomics platforms that are suited to profiling primary metabolism (e.g. GC-MS, as in Chapter 3) may be very successful in the study of the mechanisms that underlie antibiotic modes of action within sensitive isolates. The workflow used in this study has identified a series of interesting and potentially clinically significant lipid alterations in E. coli upon challenge with ciprofloxacin. To improve further the level of information gained, online RPLC-MS/MS and offline direct infusion MSn methods need to be developed on the LTQ-Orbitrap XL MS system to identify lipid species across all classes. This would enable phospholipid classes and fatty acid constituents to be identified by using orthogonal RT information and MS/MS spectral fragmentation compared with analytical standards. The positions of unsaturated bonds within the fatty acid constituents of phospholipids could be identified by using online fraction collection and performing MS3>5 level experiments via collision induced dissociation (CID) and higher-energy collision dissociation (HCD) methods. This study has demonstrated the great potential of LC-MS lipidomics for the study of antibiotic modes of action and bacterial modes of resistance, though further improvements in methods for the unambiguous identification of lipid species are required to maximise the biological information obtained by such a non-targeted lipidomics approach.


4.5 References

Allwood, J. W., Ellis, D. I., Heald, J. K., Goodacre, R. & Mur, L. A. J. 2006. Metabolomic approaches reveal that phosphatidic and phosphatidyl glycerol phospholipids are major discriminatory non-polar metabolites in responses by Brachypodium distachyon to challenge by Magnaporthe grisea. Plant Journal, 46, 351-368.

AlRabiah, H., Correa, E., Upton, M. & Goodacre, R. 2013. High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy. Analyst, 138, 1363-1369.

Ames, G. F. 1968. Lipids of Salmonella typhimurium and Escherichia coli: structure and metabolism. Journal of Bacteriology, 95, 833-843.

Andrews, J. M. 2001. Determination of minimum inhibitory concentrations. Journal of Antimicrobial Chemotherapy, 48, 5-16.

Bensikaddour, H., Snoussi, K., Lins, L., Van Bambeke, F., Tulkens, P. M., Brasseur, R., Goormaghtigh, E. & Mingeot-Leclercq, M. P. 2008. Interactions of ciprofloxacin with DPPC and DPPG: fluorescence anisotropy, ATR-FTIR and 31P NMR spectroscopies and conformational analysis. Biochimica Et Biophysica Acta, 1778, 2535-2543.

Brown, M., Wedge, D. C., Goodacre, R., Kell, D. B., Baker, P. N., Kenny, L. C., Mamas, M. A., Neyses, L. & Dunn, W. B. 2011. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets. Bioinformatics, 27, 1108-1112.

Dunn, W. B., Broadhurst, D., Begley, P., Zelena, E., Francis-McIntyre, S., Anderson, N., Brown, M., Knowles, J. D., Halsall, A., Haselden, J. N., Nicholls, A. W., Wilson, I. D., Consortium, T. H., Kell, D. B. & Goodacre, R. 2011. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols, 6, 1060-1083.

Dunn, W. B., Broadhurst, D., Brown, M., Baker, P. N., Redman, C. W. G., Kenny, L. C. & Kell, D. B. 2008. Metabolic profiling of serum using ultra performance liquid chromatography and the LTQ-Orbitrap mass spectrometry system. Journal of Chromatography B, 871, 288-298.

Efron, B. 1981. Nonparametric estimates of standard error: the jackknife, the bootstrap and other methods. Biometrika, 68, 589-599.

Fahy, E., Subramaniam, S., Brown, H. A., Glass, C. K., Merrill, A. H., Murphy, R. C., Raetz, C. R. H., Russell, D. W., Seyama, Y., Shaw, W., Shimizu, T., Spener, F., van Meer, G., VanNieuwenhze, M. S., White, S. H., Witztum, J. L. & Dennis, E. A. 2005. A comprehensive classification system for lipids. Journal of Lipid Research, 46, 839-861.

Goodacre, R., Vaidyanathan, S., Bianchi, G. & Kell, D. B. 2002. Metabolic profiling using direct infusion electrospray ionisation mass spectrometry for the characterisation of olive oils. Analyst, 127, 1457-1462.

Goodacre, R., Vaidyanathan, S., Dunn, W. B., Harrigan, G. G. & Kell, D. B. 2004. Metabolomics by numbers: acquiring and understanding global metabolite data. Trends in Biotechnology, 22, 245-252.

Greenwood, D. 2000. Antimicrobial chemotherapy, Oxford, Oxford University Press.

Hair, J. F., Black, W. C. & Babin, B. J. 2005. Multivariate data analysis, New York, Prentice Hall.

Halouska, S., Chacon, O., Fenton, R. J., Zinniel, D. K., Barletta, R. G. & Powers, R. 2007. Use of NMR metabolomics to analyse the targets of D-cycloserine in mycobacteria: role of D-alanine racemase. Journal of Proteome Research, 6, 4608-4614.

Han, X. L. & Gross, R. W. 2005. Shotgun lipidomics: electrospray ionisation mass spectrometric analysis and quantitation of cellular lipidomes directly from crude extracts of biological samples. Mass Spectrometry Reviews, 24, 367-412.

Johnson, R. A. & Wichern, D. W. 2007. Applied multivariate statistical analysis, Upper Saddle River, Pearson/Prentice Hall.

Köfeler, H. C., Fauland, A., Rechberger, G. N. & Trötzmüller, M. 2012. Mass spectrometry based lipidomics: an overview of technological platforms. Metabolites, 2, 19-38.
Kolak, M., Westerbacka, J., Velagapudi, V. R., Wagsater, D., Yetukuri, L., Makkonen, J., Rissanen, A., Hakkinen, A.-M., Lindell, M., Bergholm, R., Hamsten, A., Eriksson, P., Fisher, R. M., Oresic, M. & Yki-Jarvinen, H. 2007. Adipose tissue inflammation and increased ceramide content characterise subjects with high liver fat content independent of obesity. Diabetes, 56, 1960-1968.
Kwon, Y. K., Higgins, M. B. & Rabinowitz, J. D. 2010. Antifolate-induced depletion of intracellular glycine and purines inhibits thymineless death in E. coli. ACS Chemical Biology, 5, 787-795.
Kwon, Y. K., Lu, W., Melamud, E., Khanam, N., Bognar, A. & Rabinowitz, J. D. 2008. A domino effect in antifolate drug action in Escherichia coli. Nature Chemical Biology, 4, 602-608.
Leying, H., Suerbaum, S., Kroll, H. P., Karch, H. & Opferkuch, W. 1986. Influence of beta-lactam antibiotics and ciprofloxacin on composition and immunogenicity of Escherichia coli outer membrane. Antimicrobial Agents and Chemotherapy, 30, 475-480.
Martens, H. & Naes, T. 1992. Multivariate calibration, Chichester, Wiley.
Martens, H., Nielsen, J. P. & Engelsen, S. B. 2003. Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures. Analytical Chemistry, 75, 394-404.
Merino, S., Domenech, O., Diez, I., Sanz, F., Vinas, M., Montero, M. T. & Hernandez-Borrell, J. 2003. Effects of ciprofloxacin on Escherichia coli lipid bilayers: an atomic force microscopy study. Langmuir, 19, 6922-6927.
Oresic, M. 2011. Informatics and computational strategies for the study of lipids. Biochimica Et Biophysica Acta, 1811, 991-999.
Oresic, M., Hanninen, V. A. & Vidal-Puig, A. 2008. Lipidomics: a new window to biomedical frontiers. Trends in Biotechnology, 26, 647-652.
Smith, A. D. 1997. Oxford dictionary of biochemistry and molecular biology, Oxford, Oxford University Press.
Spener, F., Lagarde, M., Géloên, A. & Record, M. 2003. What is lipidomics? European Journal of Lipid Science and Technology, 105, 481-482.
Sumner, L. W., Amberg, A., Barrett, D., Beale, M. H., Beger, R., Daykin, C. A., Fan, T. W. M., Fiehn, O., Goodacre, R., Griffin, J. L., Hankemeier, T., Hardy, N., Harnly, J., Higashi, R., Kopka, J., Lane, A. N., Lindon, J. C., Marriott, P., Nicholls, A. W., Reily, M. D., Thaden, J. J. & Viant, M. R. 2007. Proposed minimum reporting standards for chemical analysis. Metabolomics, 3, 211-221.
Tang, J. 2011. Microbial Metabolomics. Current Genomics, 12, 391-403.
van der Greef, J., Stroobant, P. & van der Heijden, R. 2004. The role of analytical sciences in medical systems biology. Current Opinion in Chemical Biology, 8, 559-565.
van der Hooft, J. J. J., Vervoort, J., Bino, R. J., Beekwilder, J. & de Vos, R. C. H. 2011. Polyphenol identification based on systematic and robust high-resolution accurate mass spectrometry fragmentation. Analytical Chemistry, 83, 409-416.
van der Hooft, J. J. J., Vervoort, J., Bino, R. J. & de Vos, R. C. H. 2012. Spectral trees as a robust annotation tool in LC-MS based metabolomics. Metabolomics, 8, 691-703.
Vance, D. E. & Vance, J. E. 1996. Biochemistry of lipids, lipoproteins and membranes, Amsterdam, Elsevier.
Varmuza, K. & Filzmoser, P. 2009. Introduction to multivariate statistical analysis in chemometrics, Boca Raton, CRC Press.
Vinzi, V. E., Chin, W. W., Henseler, J. & Wang, H. 2010. Handbook of partial least squares, London, Springer.
Wedge, D. C., Allwood, J. W., Dunn, W., Vaughan, A. A., Simpson, K., Brown, M., Priest, L., Blackhall, F. H., Whetton, A. D., Dive, C. & Goodacre, R. 2011. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Analytical Chemistry, 83, 6689-6697.
Winder, C. L., Dunn, W. B., Schuler, S., Broadhurst, D., Jarvis, R., Stephens, G. M. & Goodacre, R. 2008. Global metabolic profiling of Escherichia coli cultures: an evaluation of methods for quenching and extraction of intracellular metabolites. Analytical Chemistry, 80, 2939-2948.
Winder, C. L., Gordon, S. V., Dale, J., Hewinson, R. G. & Goodacre, R. 2006. Metabolic fingerprints of Mycobacterium bovis cluster with molecular type: implications for genotype-phenotype links. Microbiology, 152, 2757-2765.

4.6 Supplementary Information

Table S 4.1: Normalised sample reconstitution volumes used prior to LC-MS analysis. C, control.

Columns: sample number; sample; flask; absorbance at 600 nm (average OD of 3 technical replicates); class; LC-MS injection order; normalised reconstitution volume (µL)*.

Sample no.  Sample  Flask  Absorbance  Class  Injection order  Volume (µL)*
1           161     C1     0.365       1      1                114.1
2           161     C2     0.376       1      9                117.5
3           161     C3     0.375       1      17               117.2
4           161     0.31   0.376       2      2                117.5
5           161     0.32   0.3695      2      10               115.5
6           161     0.33   0.357       2      18               111.6
7           171     C1     0.396       3      3                123.8
8           171     C2     0.36        3      11               112.5
9           171     C3     0.374       3      19               116.9
10          171     0.31   0.322       4      4                100.6
11          171     0.32   0.3215      4      12               100.5
12          171     0.33   0.412       4      20               128.8
13          160     C1     0.391       5      5                122.2
14          160     C2     0.362       5      13               113.1
15          160     C3     0.374       5      21               116.9
16          160     0.021  0.377       6      6                117.8
17          160     0.022  0.378       6      14               118.1
18          160     0.023  0.36        6      22               112.5
19          173     C1     0.3905      7      7                122
20          173     C2     0.387       7      15               120.9
21          173     C3     0.372       7      23               116.3
22          173     0.021  0.446       8      8                139.4
23          173     0.022  0.4345      8      16               135.8
24          173     0.023  0.3245      8      24               101.4

*(Minimum volume (100 µL)/Minimum OD (0.32)) × Sample OD.
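The footnote formula can be written out as a short calculation. The following is a minimal sketch only: the function and constant names are illustrative assumptions and are not part of the original workflow; the two constants are the values quoted in the footnote.

```python
MIN_VOLUME_UL = 100.0  # minimum reconstitution volume quoted in the footnote (µL)
MIN_OD = 0.32          # minimum OD quoted in the footnote


def reconstitution_volume(sample_od, min_volume=MIN_VOLUME_UL, min_od=MIN_OD):
    """Return the OD-normalised reconstitution volume in µL.

    Implements (minimum volume / minimum OD) x sample OD from the table footnote.
    """
    return (min_volume / min_od) * sample_od


# Sample 1 (isolate 161, flask C1) has an OD of 0.365 in Table S 4.1.
print(round(reconstitution_volume(0.365), 1))  # 114.1
```

Scaling every sample by the same factor means that samples with a higher cell density are reconstituted in proportionally more solvent, so that the injected extracts correspond to comparable amounts of biomass.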

[Figure S 4.1, plots not reproduced. Each panel is a bar chart of average peak area (y-axis) for the classes QC, 161 C, 161 0.3, 171 C, 171 0.3, 160 C, 160 0.02, 173 C and 173 0.02 (x-axis).]

a. Common response: up-regulated

Phospholipids
PG (34:1) (C40H77O10P) / PA (36:1) (C39H75O8P): ESI- mode, peak no. 1174, RT 21.14 min, m/z 747.5152
PG (36:2) (C42H79O10P): ESI- mode, peak no. 1228, RT 21.47 min, m/z 773.5302
PG (36:1) (C42H81O10P) / DG (41:4) (C44H78O5): ESI- mode, peak no. 1234, RT 22.88 min, m/z 775.5464

Others
Lipids shown: 1-tetradecanyl-2-(8-[3]-ladderane-octanyl)-sn-glycero-3-phospho-(1'-sn-glycerol) (C40H75O8P); 1-octadecyl-heptadecanoate / ceroplastic acid (C35H70O2)
Peaks shown: ESI+ mode, peak no. 2206, RT 16.50 min, m/z 591.5348; ESI- mode, peak no. 1198, RT 20.96 min, m/z 759.5144

b. Common response: down-regulated

Phospholipids/Diacylglycerols
PC 32:4 / PE 35:4 (C40H72NO8P) / PC 30:1 / PE 33:1 (C38H74NO8P) / PE-NMe2(O-28:0) (C35H74NO6P): ESI+ mode, peak no. 2512, RT 17.56 min, m/z 725.5102
PC (30:1) / PE (33:1) (C38H74NO8P) / PE-NMe2(O-28:0) (C35H74NO6P): ESI- mode, peak no. 1075, RT 19.99 min, m/z 702.5059
DG 40:10 / DG 38:7 (C41H66O5): ESI+ mode, peak no. 2378, RT 14.84 min, m/z 683.4645

Figure S 4.1: LC-MS trend plots of lipids significantly altered in response to ciprofloxacin. Common lipid responses between susceptible and resistant isolates showing lipids that are either (a) up-regulated or (b) down-regulated. Values plotted are the mean m/z lipid peak areas and error bars represent the standard error within the non-averaged data for each experimental class. Lipid identification was conducted as in Brown et al. (2011), see Tables S 4.2, S 4.3, S 4.4.

Table S 4.2: LC-MS lipidomics table of class averages and within class standard error, PLS coefficient, ANOVA p-value, and putative metabolite identifications for LC-MS+ve and LC-MS-ve mode data. See additional information on the attached CD.

Table S 4.3: LC-MS+ve mode data set of the top 50 PLS coefficients for control compared to ciprofloxacin challenge for each respective isolate. See additional information on the attached CD.

Table S 4.4: LC-MS-ve mode data set of the top 50 PLS coefficients for control compared to ciprofloxacin challenge for each respective isolate. See additional information on the attached CD.

Chapter Five

5 Chapter Five

Multiple metabolomics of uropathogenic E. coli reveal different information content in terms of metabolic potential compared to virulence factors

Haitham AlRabiah1, Yun Xu1, Nicholas J. Rattray1, Andrew A. Vaughan1, Tarek Gibreel2,3, Ali Sayqal1, Mathew Upton2,3, J. William Allwood1,4 and Royston Goodacre1*

1School of Chemistry and Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK

2School of Medicine, University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PL, UK

3Current address: Plymouth University Peninsula Schools of Medicine and Dentistry, Plymouth University, Drake Circus, Plymouth, PL4 8AA, UK

4Current address: Clinical and Environmental Metabolomics, School of Bioscience, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

*Correspondence to Roy Goodacre: [email protected]

This chapter is a submitted manuscript to Analyst. Yun Xu participated in analysis of data generated in this study. Nicholas Rattray and William Allwood contributed to the operation of mass spectrometry. Andrew Vaughan contributed to data processing. Tarek Gibreel and Mathew Upton provided the bacterial samples with microbiological information. Ali Sayqal provided technical assistance. Royston Goodacre supervised and guided the investigation undertaken in this study.

Abstract

Metabolomics is an analytical science that refers to the analysis of the total complement of all the metabolites found in a cell, tissue or fluid. As no single approach can achieve this, there are many different methods that are used for metabolomics and these generally rely on different types of analytical platforms. Whilst these approaches are apparently selected in an unbiased manner, they are usually dictated by the favourite analytical methods in the laboratory performing the metabolomics investigations. These may inadvertently introduce chemical selectivity into the analysis of the biological system under study, and this may then result in bias within the investigation. In order to investigate this, we analysed a collection of uropathogenic Escherichia coli (UPEC) from the globally disseminated ST131 clone. The selected isolates had previously undergone extensive characterisation using classical microbiological methods for a variety of metabolic tests (MT) and virulence factors (VF). Clearly, the former should correlate well with metabolomics as it is aimed at categorising the small molecule complement of cells. By contrast, the virulence determinants are generally larger molecules associated with the bacterial cell wall and include capsule antigens, fimbriae or toxins and one would expect these to be less well correlated with metabolomics measurements as these are structural in nature and found on the outside of the cell. These 11 UPEC isolates were analysed by four metabolomics approaches including: Fourier transform infrared (FT-IR) spectroscopy, gas chromatography-mass spectrometry (GC-MS) after derivatisation of polar non-volatile analytes, as well as reversed phase liquid chromatography-mass spectrometry (RPLC-MS) in both positive (LC-MS+ve) and negative (LC-MS-ve) electrospray ionisation modes. A comparison of the discriminatory ability of these four metabolomics methods with the metabolic test and virulence factors test was made using Procrustes transformations and we used rigorous statistics including multiple testing on these data to ascertain which methods produce congruent results; this congruence was in terms of correlation in PCA space and bootstrapping was used to generate probability values of whether these correlations were real or by chance. We found that FT-IR and LC-MS-ve mode, but not LC-MS+ve mode, were comparable with each other and gave highly similar clustering compared with the virulence factors tests. By contrast, FT-IR and LC-MS-ve were not comparable to the metabolic tests. We found that the GC-MS profiles were significantly more congruent with the metabolic tests than the virulence determinants. Although LC-MS+ve was not statistically correlated with either, visual inspection of clusters with the metabolic test suggested there may be some loose congruence between the two methods. We conclude that metabolomics investigations may be biased to the analytical platform that is used for analysis. This is a reflection of the chemistry employed by the methods: e.g., LC-MS will target lipophilic species when RPLC is used compared to GC-MS which due to the derivatisation methods employed predominantly selects polar small molecules. We therefore believe that multiple platforms should be employed where possible and that the analyst should consider that there is a danger of false correlations between the analytical data and the biological characteristics of interest if the full metabolome has not been measured.

5.1 Introduction

Metabolomics aims to categorise the small molecular weight complement of cells, tissue and biofluids (Fiehn, 2002; Oliver et al., 1998; Dunn et al., 2011b), and although arguably an ‘ancient’ science (Oresic, 2009), a plethora of analytical platforms, mainly based on mass spectrometry (MS) and various molecular separation techniques including gas chromatography (GC) and liquid chromatography (LC), have made it possible to detect small molecules in biological matrices (Dunn et al., 2013).

In practice, the detection of the full metabolome is still unachievable by a single analytical tool due to the chemical complexity of metabolites, great variations in their concentration levels and various other reasons such as analyte lability (Goodacre et al., 2004). Therefore, in addition to MS, other detection techniques such as nuclear magnetic resonance (NMR) spectroscopy and vibrational spectroscopies (viz. Fourier transform infrared (FT-IR) and Raman) are used as complementary analytical approaches. In particular, FT-IR spectroscopy is considered to be a low cost, high-throughput technique making it a first option for preliminary experiments to give a preview of the experiment direction before more advanced tools are employed (Dunn et al., 2005).

The question arises as to exactly how complementary these methods are. For example, in FT-IR spectroscopy sample extraction is usually not performed and the method provides chemical information at the level of molecular vibrations, not isolated metabolites per se. By contrast, MS-based studies are performed usually after extraction and usually after GC or LC. All of these processes introduce selectivity to the analysis and hence potential analytical bias. If we consider GC-MS using methanol extraction followed by a two-stage methoxyamination and silylation (Winder et al., 2008; Roessner et al., 2000), one is generally selecting metabolites from central metabolism such as sugars, sugar phosphates, amino acids and small fatty acids etc. For LC-MS, using reversed phase liquid chromatography (RPLC) one targets more lipophilic species and another choice made is the polarity of the ion source in terms of positive or negative electrospray ionisation. It is currently unlikely that people have the resources to include all possible analytical approaches and therefore choices are made on which are the most appropriate or accessible to select.

Therefore, in this study, we used a range of metabolomics platforms on a microbiologically characterised set of uropathogenic Escherichia coli (UPEC) isolates that all belong to the same sequence type (ST131), an important and globally disseminated clone (Lau et al., 2008). Due to the platforms available in our laboratory, we selected FT-IR spectroscopy, GC-MS of polar non-volatile analytes, and RPLC-MS in both ESI positive and negative modes. Both GC-MS and LC-MS analyses were performed on 80% methanol extracts. Once the data were collected, we used a series of chemometric methods to compare the differentiation ability of all four methods. Moreover, these were compared with genotypic and phenotypic characteristics that are measured during investigation of the pathogenic potential of UPEC and included data for a panel of metabolic tests and virulence factor carriage.

5.2 Materials and Methods

5.2.1 General chemicals

Unless otherwise stated, all chemicals were supplied by Fisher Scientific Ltd. (Loughborough, UK), and all solvents and acids were obtained from Sigma-Aldrich (Gillingham, UK).

5.2.2 Microorganisms

The 11 uropathogenic Escherichia coli (UPEC) isolates examined were obtained from bacteriuria urine samples submitted to the bacteriology laboratory at the Central Manchester Foundation Trust. The isolates were all from the ST131 lineage and resistant to quinolones due to different genetic mechanisms (Table S 5.1). Identification of virulence capacity and metabolic profile have been previously described (Gibreel et al., 2012a; Gibreel et al., 2012b) and these are provided in Tables S 5.2 and S 5.3.

5.2.3 Preparation of E. coli inoculates for metabolic fingerprinting and metabolic profiling

Samples were prepared according to the protocols described by AlRabiah et al. (2013; Chapter 3) with the only exception being that samples were incubated for 21 h rather than 18 h (see Figure S 5.1 for details). After cultivation of the bacteria (see Supplementary Information), each of the 4 biological replicates was split for FT-IR, GC-MS and LC-MS to ensure that results were obtained from the same biological cultures. For GC-MS and LC-MS, 15 mL from each replicate was collected, quenched and extracted according to the procedures developed by Winder et al. (2008). The only difference in this study is that for metabolite extraction 80% methanol was used rather than 100% methanol to enhance the recovery of polar small molecules. Samples for GC-MS and LC-MS, including quality control samples (QCs), were normalised to optical density (OD) and made up with 80% methanol. Further sample processing steps were applied to the GC-MS samples (adding internal standards, a two-step chemical derivatisation and adding retention index marker solutions). LC-MS samples were reconstituted in 100 µL HPLC grade water, vortex mixed and centrifuged before instrumental analysis. Full details of sample preparations are available in the Supplementary Information.

5.2.4 FT-IR spectroscopy

A Bruker Equinox 55 infrared spectrometer (Bruker Ltd., Coventry, UK) equipped with a HTX™ module was used for FT-IR spectroscopic analysis using the method described by AlRabiah et al. (2013; Chapter 3) and Winder et al. (2006). Spectra were collected in the range of 4000−600 cm⁻¹, with 64 co-adds and at a resolution of 4 cm⁻¹.

5.2.5 GC-MS

A LECO Pegasus III TOF/MS (Leco Corp., St. Joseph, MO, USA) was used to conduct GC-TOF/MS and its mode of operation is provided in the Supplementary Information following our established GC-MS protocol (Begley et al., 2009; Dunn et al., 2011a), which follows Metabolomics Standards Initiative (MSI) guidelines (Sumner et al., 2007). After GC-MS, data were processed via the deconvolution method of Begley et al. (2009). QC samples were used before statistical analysis, as described by Wedge et al. (2011), to give quality assurance of data by evaluating and removing mass features exhibiting high deviation within the QC samples.
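The QC-based quality assurance step can be illustrated with a relative standard deviation (RSD) filter. This is a sketch under stated assumptions rather than the procedure of Wedge et al. (2011): the chapter does not give the deviation metric or the cut-off, so the use of RSD, the 20% threshold, the function names and the peak-area values below are all hypothetical.

```python
from statistics import mean, stdev


def qc_rsd_percent(values):
    """Relative standard deviation (%) of one mass feature across QC injections."""
    return 100.0 * stdev(values) / mean(values)


def filter_features(qc_table, max_rsd=20.0):
    """Keep features whose QC RSD does not exceed max_rsd (hypothetical cut-off)."""
    return [name for name, values in qc_table.items()
            if qc_rsd_percent(values) <= max_rsd]


# Made-up QC peak areas for two features (illustrative values only).
qc_table = {
    "feature_A": [100.0, 102.0, 98.0, 101.0],  # reproducible across QCs
    "feature_B": [50.0, 120.0, 10.0, 90.0],    # highly variable across QCs
}
kept = filter_features(qc_table)
print(kept)  # ['feature_A']
```

Because the QC samples are nominally identical, variation across them reflects analytical rather than biological variability, which is why poorly reproducible features are removed before statistical analysis.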

5.2.6 LC-MS

UHPLC-MS analysis was carried out on an Accela UHPLC autosampler system (Thermo-Fisher Ltd., Hemel Hempstead, UK) coupled to an electrospray LTQ-Orbitrap XL hybrid mass spectrometry system (Thermo-Fisher, Bremen, Germany) as previously described by Dunn et al. (2011a) and Wedge et al. (2011) and highlighted in the Supplementary Information section. Note that the same samples were analysed twice: once in positive and again in negative ESI modes. QCs were also used as detailed by Wedge et al. (2011) to provide quality assurance of the LC-MS data.

5.2.7 Data analysis

The pre-processed FT-IR, GC-MS and LC-MS data (see Supplementary Information for full details) were first analysed using principal component analysis (PCA). The scores of the first n PCs (PCs 1 to n), which together explained ~75% of the total variance, were then subjected to discriminant function analysis (DFA). DFA was calibrated with 11 classes (one for each of the 11 E. coli isolates) and the first 3 discriminant functions (ordinates) were retained. In order to make visualisation easier, and more importantly to balance the number of samples for Procrustes analysis (vide infra), as each class contained 36 FT-IR spectra (4 biological replicates, 3 spots for each and 3 measurements off each spot), these were mean-averaged to generate 11 DFA coordinates for the 11 isolates. In a similar fashion for GC-MS and LC-MS (in both ionisation modes), where each sample was represented by 4 injections (1 for each of the 4 biological replicates), the resulting DFA scores were also averaged.
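Two of the bookkeeping steps in this paragraph, choosing how many PCs to pass to DFA and mean-averaging the replicate scores per isolate, can be sketched as follows. This is illustrative only: the function names are hypothetical, the "at least 75%" rule is an assumed reading of "~75%", and the PCA and DFA computations themselves are not shown.

```python
def n_components_for_variance(eigenvalues, target=0.75):
    """Smallest n such that the first n PCs explain at least `target` of the variance.

    `eigenvalues` are the PCA eigenvalues sorted in descending order.
    """
    total = sum(eigenvalues)
    cumulative = 0.0
    for n, value in enumerate(eigenvalues, start=1):
        cumulative += value
        if cumulative / total >= target:
            return n
    return len(eigenvalues)


def class_means(scores, labels):
    """Mean-average replicate score vectors so that each class yields one coordinate."""
    sums, counts = {}, {}
    for vector, label in zip(scores, labels):
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, v in enumerate(vector):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


# Toy example: four eigenvalues; the first two PCs explain 80% of the variance.
print(n_components_for_variance([5.0, 3.0, 1.0, 1.0]))  # 2
```

Averaging to one coordinate per isolate gives every data set the same 11 rows, which the subsequent Procrustes comparison requires.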

In addition to the analytical metabolomics data, the E. coli isolates had also been subjected to classical microbiological testing. Metabolic activity was probed via 47 biochemical tests (Table S 5.3) designed to measure carbon source utilisation and enzymatic activity using the Vitek 2 ID-GNB card and the Vitek 2 compact Automated Expert System (bioMérieux, Basingstoke, UK) (Gibreel et al., 2012b). The virulence capabilities (Table S 5.2) of these strains were investigated through genetic screening for the presence of 30 ExPEC associated virulence factors (VF) encompassing five categories (adhesins, toxins, siderophores, capsule and ‘miscellaneous’) (Gibreel et al., 2012a).

These metabolic tests (MT) and VF tests are characters that are both represented as present/absent data. These are clearly very different to the FT-IR, GC-MS and LC-MS quantitative data which are all continuous data. To make these two different data types comparable with each other, the pattern of the MT and the VF test data sets were also projected into ordination space using the following procedure: first, a pair-wise distance matrix was calculated to measure the dissimilarity between every pair of the isolates using the Jaccard distance (Jaccard, 1912); next, principal coordinate analysis (PCoA) was performed on the square-rooted distance matrix and the first 3 PCs were retained (Borg and Groenen, 2005).
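The first step of this procedure, building the square-rooted Jaccard distance matrix from present/absent profiles, can be sketched as below. The function names and the three toy profiles are illustrative assumptions, and the PCoA step itself (the eigendecomposition of the double-centred matrix) is not shown.

```python
import math


def jaccard_distance(a, b):
    """Jaccard distance between two present/absent (1/0) profiles.

    Characters absent from both profiles do not contribute to the distance.
    """
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def sqrt_distance_matrix(profiles):
    """Pair-wise matrix of square-rooted Jaccard distances, as used as input to PCoA."""
    n = len(profiles)
    return [[math.sqrt(jaccard_distance(profiles[i], profiles[j]))
             for j in range(n)] for i in range(n)]


# Three toy isolates scored for four hypothetical characters (1 = present).
profiles = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
matrix = sqrt_distance_matrix(profiles)
```

In this toy case the first two isolates share two of the three characters present in either, giving a Jaccard distance of 1/3, whereas the first and third share none and are at the maximum distance of 1.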

The result of the above analysis was six different ordination analyses: PC-DFA from the four metabolomics data sets and PCoA from the metabolic tests and virulence factors. In order to compare the similarity in the discriminatory ability generated by these different analyses, Procrustes analysis was performed on all possible data set pairs (Andrade et al., 2004). In this process, the similarity is measured in terms of the Procrustes error, which varies from 0 to 1, where 0 indicates a perfect match and 1 indicates that the two sets of clusters are completely different. The statistical significance of these similarities was assessed using a Procrustean test procedure (Jackson, 1995). For each comparison, 10000 permutation tests were performed by permuting the order of the samples in the data sets and subsequently performing the Procrustes analysis. The Procrustes errors of these permutations were recorded to form a null distribution. The observed Procrustes error was then compared against the null distribution and an empirical p-value was derived by counting the number of cases when the Procrustes error obtained from the permuted data sets was lower than the observed error; this was then divided by 10000 (the total number of the permutation tests).
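The logic of the Procrustes error and its permutation test can be sketched in a few lines. This is not the implementation used in this study: to stay dependency-free, the sketch is restricted to two-dimensional configurations (three ordinates were retained here), it assumes the standardised form of the error (both configurations centred and scaled to unit sum of squares), and all function names are hypothetical.

```python
import math
import random


def standardise(points):
    """Centre a list of (x, y) points and scale it to unit sum of squares."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    centred = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centred))
    return [(x / norm, y / norm) for x, y in centred]


def procrustes_error_2d(a, b):
    """Procrustes error (0 = perfect match, 1 = no match) for 2-D configurations."""
    xs, ys = standardise(a), standardise(b)
    # Entries of the 2x2 cross-product matrix A = X^T Y.
    a11 = sum(x[0] * y[0] for x, y in zip(xs, ys))
    a12 = sum(x[0] * y[1] for x, y in zip(xs, ys))
    a21 = sum(x[1] * y[0] for x, y in zip(xs, ys))
    a22 = sum(x[1] * y[1] for x, y in zip(xs, ys))
    frob_sq = a11 ** 2 + a12 ** 2 + a21 ** 2 + a22 ** 2
    det = a11 * a22 - a12 * a21
    # For singular values s1, s2 of A: (s1 + s2)^2 = ||A||_F^2 + 2|det A|.
    trace_sq = frob_sq + 2.0 * abs(det)
    return max(0.0, 1.0 - trace_sq)


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Empirical p-value: fraction of permutations with a lower error than observed."""
    rng = random.Random(seed)
    observed = procrustes_error_2d(a, b)
    shuffled = list(b)
    lower = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute the sample order of one data set
        if procrustes_error_2d(a, shuffled) < observed:
            lower += 1
    return observed, lower / n_perm
```

A configuration compared with a rotated, rescaled and shifted copy of itself gives an error of (numerically) zero, and the permutation test then asks how often a random re-labelling of the samples matches at least as well.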

If any of the pair-wise comparisons indicated comparable clusters, it would also be interesting to investigate which variables in the metabolomics data sets (i.e. FT-IR, GC-MS and LC-MS in both positive (LC-MS+ve) and negative (LC-MS-ve) ionisation mode data sets) were mainly responsible for the matched patterns revealed after the Procrustes rotation. This was achieved by first projecting the loadings of the PCA into the PC-DFA space using the DFA loadings and then rotating these again using the Procrustes orthogonal rotation matrix. The resultant loadings were denoted as Procrustes rotated loadings. The variables with significantly high loadings were the ones that contributed most to the matched pattern after the Procrustes rotation.
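In matrix form, the projection and rotation described above can be written compactly. The symbols are introduced here purely for illustration and do not appear in the original text: P denotes the PCA loadings (variables × retained PCs), W the DFA loadings (retained PCs × discriminant functions) and Q the orthogonal rotation matrix obtained from the Procrustes fit.

```latex
\[
  \mathbf{L}_{\mathrm{rot}} = \mathbf{P}\,\mathbf{W}\,\mathbf{Q}
\]
```

Each row of the resulting matrix corresponds to one original variable (a wavenumber or a mass feature), so rows with large absolute values indicate the variables contributing most to the matched pattern.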

5.3 Results and Discussion

In clinical microbiology, bacterial characterisation is largely dependent on phenotypic methods such as biochemical tests and bacterial morphology. These are time consuming and often provide limited information when compared with modern bioanalytical techniques. The two most common biochemical tests that microbiologists use are (i) those based on metabolic tests which involve growth on selective media to test for specific enzymes and (ii) assays for virulence factors which often reflect how the microorganism interacts with its environment and include its adhesins and capsule as well as any toxins produced. In general terms, metabolic tests reflect the organism’s metabolic potential whilst some virulence tests probe the surface of the microorganism, as it is this surface that interacts with the environment.

To assess the level of information that metabolomics data may generate from microbiological samples, we compare four metabolomics approaches with each other and, importantly, with these two classical microbiology tests from a range of UPEC isolated from a local hospital. The results from the metabolomics methods, MT and VF, were analysed using cluster analysis and these generated six different ordination scores plots: four PC-DFA plots from the FT-IR, GC-MS and LC-MS in both positive and negative ionisation modes, and PCoA from the MT and VF. The resulting cluster plots then need to be compared and this is very difficult by eye. For example, the comparison of two sets of clusters in three dimensions requires one to: (i) first translate the spatial clustering (arrangement of samples) of one sample set onto the other, so that they are now both centred together; (ii) next, the clusters are scaled so that they are of equivalent size and (iii) finally, the clusters are aligned by rotation. Of course for simple shapes, this can be done by eye. The problem is that for the comparison of clusters generated from six different methods (as in this study), the number of unique comparisons that needs to be made is 15, and these need to be ranked and objectively assessed. Therefore in this study, we used a series of Procrustes transformations.
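The figure of 15 unique comparisons follows from choosing 2 of the 6 ordinations. A minimal sketch of the enumeration, with the method labels taken from the text:

```python
from itertools import combinations

methods = ["LC-MS+ve", "LC-MS-ve", "GC-MS", "FT-IR", "VF", "MT"]

# Every unordered pair of methods is compared once.
pairs = list(combinations(methods, 2))
print(len(pairs))  # 15
```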

The Procrustes errors with the associated p-values of the pattern comparisons were calculated as described above and are presented in Table 5.1. In this table, the comparisons which revealed very similar spatial arrangements of the clusters from the PCoA and PC-DFA are highlighted in yellow. A Venn diagram-like figure reflecting these overall comparisons is shown in Figure 5.1. This figure was constructed by first performing PCoA on the Procrustes errors table and converting it to a 2D X-Y scatter scores plot. Next, we calculated the 95% chi-square (χ²) confidence regions (these are the ellipses shown in the plot) around each class, assuming that each has a covariance matrix of the same size; this presumes that following the Procrustes transformation all resulting data transformations would have the same scale. It is clear from this comparison in Figure 5.1 that there are mainly four congruent pairs of clusters. In Table 5.1, these can be judged by having a low p-value (< 0.01; from multiple testing). These are highlighted below:

1. The LC-MS profiles in negative mode and the virulence factor test data had the highest similarity level, with a Procrustes error of 0.4533 and an associated p-value of 0.0002 (i.e. only 2 out of 10000 permutations obtained a lower Procrustes error).
2. The FT-IR spectra also obtained a statistically significant similarity to the VF test data with a p-value of 0.0072. By contrast, GC-MS and LC-MS in positive mode did not have a significant similarity to the VF test data (p > 0.01).
3. The GC-MS metabolite data obtained a very significant similarity to the classical metabolic tests (p = 0.0006), while the other 3 data sets had no significant similarity to this type of data (p > 0.01). We note that in Figure 5.1, there is some congruence between GC-MS and the VF but this is not as strong as with the MT.
4. For the comparisons between the four metabolomics data types, the FT-IR data and LC-MS profiles in the negative mode had similar shapes, and this was to be expected as both were very similar to the VF test.
5. Finally, there was low similarity between the VF test and the metabolic test as p > 0.01.

Table 5.1: The Procrustes errors with the associated p-values of the pair-wise comparisons.

            LC-MS+ve            LC-MS-ve             GC-MS                FT-IR                VF                 MT
LC-MS+ve    -
LC-MS-ve    0.6699 (p=0.0543)   -
GC-MS       0.9239 (p=0.7521)   0.7423 (p=0.0903)    -
FT-IR       0.9344 (p=0.8118)   0.5333 (p=0.0059)*   0.8973 (p=0.3701)    -
VF          0.8855 (p=0.5633)   0.4533 (p=0.0002)*   0.6603 (p=0.0107)    0.5429 (p=0.0072)*   -
MT          0.7782 (p=0.2021)   0.6737 (p=0.072)     0.5681 (p=0.0006)*   0.7843 (p=0.2195)    0.6653 (p=0.091)   -

Values highlighted in yellow in the original (marked here with *) are considered significant (p < 0.01) and indicate pairs of methods that provide equivalent clusters/shapes.

[Plot not reproduced: PCoA on the Procrustes error matrix, PC 1 (x-axis) against PC 2 (y-axis), with points labelled GC-MS, MT, VF, FT-IR, LC-MS -ve and LC-MS +ve.]

Figure 5.1: Venn diagram-like plot showing the overall clustering congruence between the four analytical approaches and the two microbiological tests. See text for explanation of its construction.
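The 95% chi-square confidence regions used in constructing Figure 5.1 can be sketched for the two-dimensional case. This is an illustration under stated assumptions, not the code used to draw the figure: the function names are hypothetical, and the covariance matrix passed in is whatever common covariance is assumed for all classes, which the text does not specify numerically.

```python
import math


def chi2_quantile_2dof(p=0.95):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    With 2 d.o.f. the CDF is 1 - exp(-x / 2), so the quantile has a closed form.
    """
    return -2.0 * math.log(1.0 - p)


def confidence_ellipse(centre, cov, p=0.95, n_points=100):
    """Boundary points of the p-level confidence ellipse for a 2x2 covariance matrix."""
    (sxx, sxy), (_, syy) = cov
    # Cholesky factor of the covariance matrix: cov = L L^T.
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 * l21)
    r = math.sqrt(chi2_quantile_2dof(p))
    points = []
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        u, v = r * math.cos(t), r * math.sin(t)
        # Map the circle of radius r through L and shift to the class centre.
        points.append((centre[0] + l11 * u, centre[1] + l21 * u + l22 * v))
    return points
```

Using one common covariance matrix gives every method an ellipse of the same size and orientation, so overlap between ellipses depends only on how close the points lie in the PCoA plot.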

5.3.1 Interpretation of FT-IR spectra

FT-IR spectroscopy is not a particularly popular metabolic fingerprinting method but it has been extensively used for so called ‘whole-organism fingerprinting’ (Goodacre et al., 1998) for bacterial characterisations due to its high-throughput nature with minimal sample preparation (Garip et al., 2007; Mariey et al., 2001; Naumann et al., 1991; Goodacre et al., 1996). In this study, FT-IR was applied to discriminate between isolates with the same sequence type and the FT-IR clusters had similar scores to those from virulence factor tests (Figure 5.2 and Table 5.1). Figure 5.2 shows the results from both the FT-IR (in red) and VF (in blue) where it can be seen that, in FT-IR, isolate 48 forms a cluster that is distinct from the other isolates, but is collocated with results from its VF test. Inspection of Table S 5.2, which shows the scores of the different virulence tests, reveals isolate 48 is the only isolate with a negative score for PAI. PAI is an acronym for pathogenicity islands, which are mobile genetic elements that carry the genes responsible for the production of many virulence factors, including protein secretion systems, toxins, adhesins and many others (Hacker and Kaper, 2000). FT-IR spectra from intact bacteria contain information on fatty acids, amides, polysaccharides, proteins and amino acids. As these virulence factors may be located in the membrane (outer surface of the organism), it is likely that FT-IR spectroscopy is detecting the loss of these as the whole organism is analysed and hence that is why it is located away from the other 10 isolates.

Isolates 52, 75, 160 and 164 share the same VF profile, except that isolates 160 and 164 are negative for traT (Table S 5.2). All four isolates cluster together in the FT-IR data and are located reasonably close to their respective clusters from VF; they are located on the positive side of PC1 (Figure 5.2) and this may reflect that these isolates are all positive for the afa/draBC surface adhesins. Isolate 2 is also coincident with these four isolates in terms of FT-IR spectra but is very different for VF, and this disparity was also observed for the comparison of LC-MS in negative ionisation mode with VF (vide infra).

Capsular association factors (kpsMT K5 and kpsMT II) are extracellular and this may be reflected in the FT-IR spectra. Isolates 2, 25, 48, 183, 184 and 230 are positive for both these factors and, with the exception of isolate 2, are located on the negative part of PC1. Isolate 124 is also associated with these isolates and this may be a consequence of it being negative for afa/draBC as discussed above.


Finally, no relationship between FT-IR spectra and traT was evident from this analysis and this was also observed for the LC-MS conducted in the negative ionisation mode.

[Figure 5.2 plot: FT-IR vs. VF test, PC 1 versus PC 2. Red: FT-IR; blue: VF test. Points are labelled with isolate numbers.]

Figure 5.2: Superimposed scatter plots of PCoA scores of the first two components of the VF tests and Procrustean-transformed FT-IR spectra.

5.3.2 Interpretation of LC-MS profiles

The same 11 E. coli isolates from uropathogenic infections were also analysed by reversed-phase LC-MS. As discussed above, 80% methanol extracts were prepared from these bacterial cultures and MS was performed in both positive (LC-MS+ve) and negative (LC-MS-ve) ionisation modes. Comparisons were made with VF and MT and it was found that LC-MS-ve mode showed a higher level of similarity with VF tests than FT-IR spectroscopy did (Table 5.1 and Figure 5.3). Moreover, because of these congruencies between [LC-MS-ve and VF] and [FT-IR and VF], it was not surprising that the [LC-MS-ve and FT-IR] comparison was also very similar (Table 5.1).

There were, however, two minor differences between the LC-MS-ve comparison with VF (Figure 5.3) and the FT-IR spectroscopic comparison with VF (Figure 5.2), and these are briefly highlighted below:

• The first disparity is that isolates 2, 25 and 184 were collocated in LC-MS-ve mode whereas they were markedly spread in PC1 in FT-IR. We note that they possess identical VF test results (Table S 5.2) and a possible explanation for this is that LC-MS-ve is detecting these preferentially compared with FT-IR (Table 5.1).

• The second difference is that in FT-IR, isolates 2, 52, 75, 160 and 164 were very closely clustered together. By contrast, in LC-MS-ve isolates 160 and 164 'moved' to the positive parts of PC1 and PC2 and cluster very closely with their respective VF tests, whilst isolates 2, 52 and 75 are now collocated near the origin with isolates 124 and 183 (Figure 5.3).

It is possible that some of these small differences are because in LC-MS a methanolic extract is used compared to FT-IR where whole-organism fingerprinting is used. The similarity between the differentiation ability of FT-IR and LC-MS-ve with VF is interesting and this may reflect that both metabolomics methods are preferentially detecting cell wall components. As discussed above, FT-IR analyses the intact bacteria and certainly contains information on proteins and lipids, amongst other cellular components. In LC-MS, as reversed phase LC is used with the negative ionisation mode, more lipophilic species are analysed that may be associated with the cell wall and this has been reported before for direct infusion MS (Vaidyanathan et al., 2001; Vaidyanathan et al., 2002).

In the positive mode of LC-MS, very little similarity with VF was observed (Table 5.1). By contrast, comparison of LC-MS+ve with MT (Figure S 5.2) showed some congruence, although this was not statistically significant and so will not be discussed here.

[Figure 5.3 plot: LC-MS negative mode vs. VF test, PC 1 versus PC 2. Red: LC-MS -ve; blue: VF test. Points are labelled with isolate numbers.]

Figure 5.3: Superimposed scatter plots of PCoA scores of the first two components of the VF tests and Procrustean-transformed LC-MS negative mode data.

5.3.3 Interpretation of GC-MS profiles

The GC-MS approach used here is the one popularised by Fiehn and colleagues (Fiehn et al., 2000); it generates information-rich metabolite profiles of polar analytes and so mainly covers metabolites involved in central metabolism. As can be seen from Figure 5.4, there is high similarity between the GC-MS profiles for the 11 bacteria (highlighted in red) and the metabolic tests (in blue); the similarity match is 0.5681 and is highly significant with p = 0.0006 (Table 5.1).

Isolates 160 and 164 share exactly the same results from MT and they are located close to each other on the positive side of PC1, together with isolate 183. Following Procrustes transformation of the PC-DFA scores from the GC-MS data, isolates 160 and 164 are very similar and are recovered far from all other isolates; the other isolates are congruent with their MT, except for isolate 230, which is positive in PC2.

[Figure 5.4 plot: GC-MS vs. metabolic test, PC 1 versus PC 2. Red: GC-MS; blue: metabolic test. Points are labelled with isolate numbers.]

Figure 5.4: Superimposed scatter plots of PCoA scores of the first two components of the metabolic tests and Procrustean-transformed GC-MS data.
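The comparison of two score configurations, and the permutation p-value attached to it, can be sketched as follows. This is a minimal illustration, assuming NumPy, of a standard Procrustes fit (centring, unit scaling, optimal rotation) followed by a PROTEST-style permutation test in the spirit of Jackson (1995). The function names, the toy data and the number of permutations are hypothetical, and the exact similarity statistic reported in Table 5.1 may be defined differently from the residual `m2` used here.

```python
import numpy as np

def procrustes_m2(X, Y):
    """Procrustes residual after centring, unit scaling and optimal rotation of Y onto X."""
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)
    X0 = X0 / np.linalg.norm(X0)               # unit Frobenius norm
    Y0 = Y0 / np.linalg.norm(Y0)
    # Sum of singular values is the Procrustes correlation; m2 = 1 - correlation^2
    s = np.linalg.svd(Y0.T @ X0, compute_uv=False)
    return 1.0 - s.sum() ** 2

def protest(X, Y, n_perm=999, seed=0):
    """Permutation p-value: how often a random re-labelling of Y fits X as well as observed."""
    rng = np.random.default_rng(seed)
    m2_obs = procrustes_m2(X, Y)
    hits = sum(procrustes_m2(X, Y[rng.permutation(len(Y))]) <= m2_obs
               for _ in range(n_perm))
    return m2_obs, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 11 objects in 2 dimensions; Y is X rotated, scaled and shifted
rng = np.random.default_rng(1)
X = rng.normal(size=(11, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = 2.5 * X @ R + np.array([1.0, -3.0])

m2, p = protest(X, Y)
```

A residual near zero means the two configurations agree after rotation and scaling, and a small p-value means that few random re-labellings of the isolates agree as well as the observed pairing.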

Inspection of the MT (Table S 5.3) reveals that 160 and 164 differ from all other isolates in that they scored positive in the GlyA (glycine arylamidase) test, which detects the glycine arylamidase enzyme. Arylamidase enzymes mainly hydrolyse peptides containing L-amino acids with an unsubstituted α-amino group in the N-terminal residue (Behal and Cox, 1968) and one of the main amino acids released by this enzyme is leucine (Riley and Behal, 1971). Therefore, the PC-DFA loading plots from GC-MS were produced (Figure S 5.3) and it was found that two variables were highly discriminatory (variables 17 and 49). Variable 17 was identified as leucine by matching to an in-house database and shows a much higher level in these two isolates than in the other E. coli (Figure 5.5 a). Moreover, arylamidase enzymes are involved in 8 of the metabolic tests in this experiment (Table S 5.3) and isolates 160 and 164 have the highest scores in these tests compared with the others.

The other variable that was identified as significant (Figure S 5.3) was variable 49, which unfortunately we are unable to identify. When this feature is plotted for the 11 isolates (Figure 5.5 b) it is also elevated in isolates 160 and 164, confirming its importance as a discriminatory metabolite feature. We note that isolate 183 also has increased levels compared with all the other isolates, although its level is not as high as those of 160 and 164.


In terms of metabolic tests, isolate 183 is closer to isolates 160 and 164, as can be seen from its blue coding in Figure 5.4, and in GC-MS it is recovered to the right of the other 8 isolates and in the positive part of PC1. It shares the same metabolic results with these two isolates in all tests except the GlyA and PHOS (phosphatase) tests. A notable phosphatase signal is to be expected, as the production of alkaline phosphatase is induced by the alkaline environment generated by peptide metabolism (Van Dien and Keasling, 1998). This is reflected in the phosphate levels measured by GC-MS (Figure 5.5 c).

[Figure 5.5 plots: (a) Variable id. = 17, Leucine; (b) Variable id. = 49, Unknown; (c) Phosphate. In each panel the x-axis is Isolate i.d. (2, 25, 48, 52, 75, 124, 160, 164, 183, 184, 230) and the y-axis is Normalised Intensities.]

Figure 5.5: Box-whisker plots for each isolate demonstrating the concentration level of candidate intracellular metabolites from (a) variable 17 (leucine), (b) variable 49 (unknown), and (c) phosphate.


5.4 Conclusion

In metabolomics investigations, choices of the most appropriate analytical method have to be made. To date, most of these have been based on early decisions about instrument procurement, owing to the expense of metabolomics instrumentation. The question arises as to whether equivalent results are generated by all platforms. In this investigation, we attempted to address this by analysing a set of well characterised uropathogenic E. coli isolates that had been analysed by a battery of metabolic tests (n = 47) and virulence factor determinations (n = 30). These tests probe different parts of the bacterial cell. Metabolic tests probe the enzyme component of the bacteria and are usually focused on central metabolism and carbon utilisation. By contrast, virulence factors tend to be cell wall associated and include adhesins, capsules and toxins.

Four different approaches for metabolomics were investigated. FT-IR spectroscopy was employed directly on intact bacteria for metabolic fingerprinting, or what is often described as whole-organism fingerprinting. Following quenching and extraction using methanol, GC-MS was performed following a two-stage derivatisation, and LC-MS was performed using reversed phase LC directly on the methanolic extracts in both positive and negative ion source modes.

In order to compare the clustering patterns from the six different analyses with one another, Procrustes transformations were performed and this allowed objective assessment of the similarity of the cluster patterns in terms of the spatial arrangement of the 11 E. coli isolates in either PCoA or PC-DFA scores space. We found that FT-IR and LC-MS in negative ionisation mode were comparable with each other and also with the virulence factor tests but not comparable to the metabolic tests. By contrast, GC-MS compared well with metabolic tests but not the virulence determinants. Although LC-MS in the positive ionisation mode was not statistically correlated with either, visual inspection of clusters with the metabolic tests suggested there may be some loose congruence between the two methods.

In conclusion, we believe that whenever possible more than one metabolomics modality should be used, and that the analyst should consider carefully the analytical techniques employed, as the results will certainly reflect the chemical bias of the methods used. We know, for example, that LC-MS mainly targets lipophilic species when reversed phase is used; by contrast, GC-MS mainly focuses on polar small molecules. There may be a danger of false correlations between the analytical data and the biological characteristics of interest if the full metabolome has not been measured. This is clearly demonstrated in this study, where the GC-MS data predominantly correlate with the metabolic tests, whilst LC-MS in negative ionisation mode and FT-IR spectroscopy correlate with the virulence determinants. Of course, if we had not known about these two different types of inherent characteristics we might have jumped to false conclusions, and the same rules are likely to apply when metabolomics is employed to study higher organisms such as mammalian systems and plants, as well as complex body fluids.


5.5 References

AlRabiah, H., Correa, E., Upton, M. & Goodacre, R. 2013. High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy. Analyst, 138, 1363-1369.

Andrade, J. M., Gomez-Carracedo, M. P., Krzanowski, W. & Kubista, M. 2004. Procrustes rotation in analytical chemistry, a tutorial. Chemometrics and Intelligent Laboratory Systems, 72, 123-132.

Begley, P., Francis-McIntyre, S., Dunn, W. B., Broadhurst, D. I., Halsall, A., Tseng, A., Knowles, J., Goodacre, R. & Kell, D. B. 2009. Development and performance of a gas chromatography-time-of-flight mass spectrometry analysis for large-scale nontargeted metabolomic studies of human serum. Analytical Chemistry, 81, 7038-7046.

Behal, F. J. & Cox, S. T. 1968. Arylamidase of Neisseria catarrhalis. Journal of Bacteriology, 96, 1240-1248.

Borg, I. & Groenen, P. J. F. 2005. Modern multidimensional scaling: theory and applications, London, Springer.

Dunn, W., Erban, A., Weber, R. M., Creek, D., Brown, M., Breitling, R., Hankemeier, T., Goodacre, R., Neumann, S., Kopka, J. & Viant, M. 2013. Mass appeal: metabolite identification in mass spectrometry-focused untargeted metabolomics. Metabolomics, 9, 44-66.

Dunn, W. B., Bailey, N. J. C. & Johnson, H. E. 2005. Measuring the metabolome: current analytical technologies. Analyst, 130, 606-625.

Dunn, W. B., Broadhurst, D., Begley, P., Zelena, E., Francis-McIntyre, S., Anderson, N., Brown, M., Knowles, J. D., Halsall, A. & Haselden, J. N. 2011a. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols, 6, 1060-1083.

Dunn, W. B., Broadhurst, D. I., Atherton, H. J., Goodacre, R. & Griffin, J. L. 2011b. Systems level studies of mammalian metabolomes: the roles of mass spectrometry and nuclear magnetic resonance spectroscopy. Chemical Society Reviews, 40, 387-426.

Fiehn, O. 2002. Metabolomics - the link between genotypes and phenotypes. Plant Molecular Biology, 48, 155-171.

Fiehn, O., Kopka, J., Dormann, P., Altmann, T., Trethewey, R. N. & Willmitzer, L. 2000. Metabolite profiling for plant functional genomics. Nature Biotechnology, 18, 1157-1161.

Garip, S., Bozoglu, F. & Severcan, F. 2007. Differentiation of mesophilic and thermophilic bacteria with Fourier transform infrared spectroscopy. Applied Spectroscopy, 61, 186-192.

Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Bolton, F. J., Fox, A. J. & Upton, M. 2012a. High metabolic potential may contribute to the success of ST131 uropathogenic Escherichia coli. Journal of Clinical Microbiology, 50, 3202-3207.

Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Fox, A. J., Bolton, F. J. & Upton, M. 2012b. Population structure, virulence potential and antibiotic susceptibility of uropathogenic Escherichia coli from Northwest England. Journal of Antimicrobial Chemotherapy, 67, 346-356.

Goodacre, R., Timmins, E. M., Burton, R., Kaderbhai, N., Woodward, A. M., Kell, D. B. & Rooney, P. 1998. Rapid identification of urinary tract infection bacteria using hyperspectral whole-organism fingerprinting and artificial neural networks. Microbiology, 144, 1157-1170.

Goodacre, R., Timmins, E. M., Rooney, P. J., Rowland, J. J. & Kell, D. B. 1996. Rapid identification of Streptococcus and Enterococcus species using diffuse reflectance-absorbance Fourier transform infrared spectroscopy and artificial neural networks. FEMS Microbiology Letters, 140, 233-239.

Goodacre, R., Vaidyanathan, S., Dunn, W. B., Harrigan, G. G. & Kell, D. B. 2004. Metabolomics by numbers: acquiring and understanding global metabolite data. Trends in Biotechnology, 22, 245-252.

Hacker, J. & Kaper, J. B. 2000. Pathogenicity islands and the evolution of microbes. Annual Review of Microbiology, 54, 641-679.

Jaccard, P. 1912. The distribution of the flora in the alpine zone. New Phytologist, 11, 37-50.

Jackson, D. A. 1995. Protest: a procrustean randomisation test of community environment concordance. Ecoscience, 2, 297-303.

Lau, S. H., Kaufmann, M. E., Livermore, D. M., Woodford, N., Willshaw, G. A., Cheasty, T., Stamper, K., Reddy, S., Cheesbrough, J., Bolton, F. J., Fox, A. J. & Upton, M. 2008. UK epidemic Escherichia coli strains A-E, with CTX-M-15 beta-lactamase, all belong to the international O25:H4-ST131 clone. Journal of Antimicrobial Chemotherapy, 62, 1241-1244.

Mariey, L., Signolle, J. P., Amiel, C. & Travert, J. 2001. Discrimination, classification, identification of microorganisms using FTIR spectroscopy and chemometrics. Vibrational Spectroscopy, 26, 151-159.

Naumann, D., Helm, D. & Labischinski, H. 1991. Microbiological characterisations by FT-IR Spectroscopy. Nature, 351, 81-82.

Oliver, S. G., Winson, M. K., Kell, D. B. & Baganz, F. 1998. Systematic functional analysis of the yeast genome. Trends in Biotechnology, 16, 373-378.

Oresic, M. 2009. Metabolomics, a novel tool for studies of nutrition, metabolism and lipid dysfunction. Nutrition Metabolism and Cardiovascular Diseases, 19, 816-824.

Riley, P. S. & Behal, F. J. 1971. Amino acid-β-naphthylamide hydrolysis by Pseudomonas aeruginosa arylamidase. Journal of Bacteriology, 108, 809-816.

Roessner, U., Wagner, C., Kopka, J., Trethewey, R. N. & Willmitzer, L. 2000. Simultaneous analysis of metabolites in potato tuber by gas chromatography-mass spectrometry. Plant Journal, 23, 131-142.

Sumner, L. W., Amberg, A., Barrett, D., Beale, M. H., Beger, R., Daykin, C. A., Fan, T. W. M., Fiehn, O., Goodacre, R., Griffin, J. L., Hankemeier, T., Hardy, N., Harnly, J., Higashi, R., Kopka, J., Lane, A. N., Lindon, J. C., Marriott, P., Nicholls, A. W., Reily, M. D., Thaden, J. J. & Viant, M. R. 2007. Proposed minimum reporting standards for chemical analysis. Metabolomics, 3, 211-221.

Vaidyanathan, S., Kell, D. B. & Goodacre, R. 2002. Flow-injection electrospray ionisation mass spectrometry of crude cell extracts for high-throughput bacterial identification. Journal of the American Society for Mass Spectrometry, 13, 118-128.

Vaidyanathan, S., Rowland, J. J., Kell, D. B. & Goodacre, R. 2001. Discrimination of aerobic endospore-forming bacteria via electrospray-ionisation mass spectrometry of whole cell suspensions. Analytical Chemistry, 73, 4134-4144.

Van Dien, S. J. & Keasling, J. D. 1998. A dynamic model of the Escherichia coli phosphate-starvation response. Journal of Theoretical Biology, 190, 37-49.

Wedge, D. C., Allwood, J. W., Dunn, W., Vaughan, A. A., Simpson, K., Brown, M., Priest, L., Blackhall, F. H., Whetton, A. D., Dive, C. & Goodacre, R. 2011. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Analytical Chemistry, 83, 6689-6697.

Winder, C. L., Dunn, W. B., Schuler, S., Broadhurst, D., Jarvis, R., Stephens, G. M. & Goodacre, R. 2008. Global metabolic profiling of Escherichia coli cultures: an evaluation of methods for quenching and extraction of intracellular metabolites. Analytical Chemistry, 80, 2939-2948.

Winder, C. L., Gordon, S. V., Dale, J., Hewinson, R. G. & Goodacre, R. 2006. Metabolic fingerprints of Mycobacterium bovis cluster with molecular type: implications for genotype-phenotype links. Microbiology, 152, 2757-2765.


5.6 Supplementary Information

5.6.1 Supplementary Materials and Methods

5.6.1.1 Sample preparation of Escherichia coli inoculates for metabolic fingerprinting and metabolic profiling

The whole protocol adopted for metabolomics is provided pictorially in Figure S 5.1 and specific details are discussed below:

Following the protocol of AlRabiah et al. (2013; Chapter 3), 49 mL of lysogeny broth (LB) was used. This was prepared by adding 5 g yeast extract (Amersham Life Sciences, Cleveland, USA) to 10 g tryptone (Formedia, Hunstanton, UK) and 10 g sodium chloride (Fisher Scientific Ltd., Loughborough, UK), dissolving these in 1 L of reverse osmosis water and autoclaving at 121 ºC and 15 psi for 15 min. The broth was inoculated with 1 mL of bacterial stock (20% [v/v] glycerol) of each isolate and incubated for 24 h in a shaking incubator at 37 ºC and 200 rpm. The overnight culture of each isolate (1 mL) was diluted with fresh media (49 mL) and incubated for an additional hour at 37 ºC and 200 rpm. These axenic cultures were washed twice with physiological saline (0.9% [w/v] NaCl) and then diluted with the same saline to adjust the bacterial turbidity to a 0.5 McFarland standard (optical density (OD) at 600 nm of 0.1 ± 0.02), measured using a Biomate 5 spectrophotometer (Thermo, Hemel Hempstead, UK), in order to standardise the size of the inocula to be used in subsequent experiments (Figure S 5.1).
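The turbidity adjustment above can be illustrated with a short calculation. The sketch below applies the dilution relation C1V1 = C2V2 under the assumption that OD at 600 nm is proportional to cell concentration over the range used. The function names, the 10 mL final volume and the measured OD of 0.5 are hypothetical values chosen for illustration only.

```python
def dilution_volumes(od_measured, od_target=0.1, final_volume_ml=10.0):
    """Volumes of washed culture and saline needed to reach od_target (C1*V1 = C2*V2).

    Assumes OD is proportional to cell concentration (an assumption, not a
    statement from the protocol) and that the culture is denser than the target.
    """
    if od_measured < od_target:
        raise ValueError("culture is already more dilute than the target OD")
    culture_ml = od_target * final_volume_ml / od_measured
    saline_ml = final_volume_ml - culture_ml
    return culture_ml, saline_ml

def within_tolerance(od, od_target=0.1, tolerance=0.02):
    """True if a measured OD falls inside the stated window of 0.1 +/- 0.02."""
    return abs(od - od_target) <= tolerance

# Hypothetical washed culture reading OD600 = 0.5, diluted to 10 mL at OD600 = 0.1
culture_ml, saline_ml = dilution_volumes(0.5)
```

In practice the diluted suspension would be measured again and accepted only if it falls inside the stated window.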

The next stage was to inoculate 19 mL of LB media with 1 mL of the experimental inocula. Each isolate was prepared in four biological replicates and each 20 mL culture was incubated for 21 h at 37 ºC and 200 rpm. For each biological replicate, the 21 h culture was split for FT-IR spectroscopy, GC-MS and LC-MS to ensure that results were obtained from the same biological cultures. Starting with FT-IR spectroscopy, 20 µL from each replicate was spotted directly on the FT-IR plate (in 3 different wells, and each well was analysed 3 times) following the method of AlRabiah et al. (2013; Chapter 3). For GC-MS and LC-MS, 15 mL from each replicate was collected and quenched according to the procedures developed by Winder et al. (2008). The collected cultures were quenched in 30 mL of 60% cold methanol (-48 ºC) and rapidly mixed, after which the quenched culture was centrifuged at 4800 g and -8 ºC for 10 min. Following this, the supernatant was quickly removed, the remaining bacterial pellets were centrifuged for another 2 min and the residual supernatant was removed. It was possible at this point to sample the quench supernatant to determine whether there had been any leakage of metabolites. The bacterial pellets were stored at -80 ºC overnight, then metabolite extraction was applied following the method of Winder et al. (2008), the only change being that 80% cold methanol was used instead of 100% methanol. The bacterial pellets were suspended in 1 mL of 80% methanol at -48 ºC and put into 2 mL tubes, then flash frozen in liquid nitrogen. After this, they were put on wet ice and, when they were partially defrosted, the samples were thoroughly vortexed for about 30 s.

The cycle of freeze-thawing and vortexing was repeated twice more to ensure that as many intracellular metabolites as possible were extracted from within the cells. The suspensions were centrifuged at 13000 g and -9 ºC for 5 min. The supernatants were collected and placed in clean 2 mL tubes then held on dry ice. A further 500 µL of 80% methanol (-48 ºC) was added to the pellet and the entire process was repeated. The second extraction aliquot was mixed with the first one, which had been held on dry ice, and the combined extract was thoroughly vortexed (Figure S 5.1). For GC-MS, 800 µL of each extract, normalised to equivalent OD by making up with 80% methanol, was spiked with 100 µL of internal standard (0.3 mg/L succinic-d4 acid, malonic-d2 acid and glycine-d5 in HPLC grade water). For LC-MS, 300 µL of each extract (normalised to OD by addition of 80% methanol) was collected. QC samples were created by combining ca. 100 µL from each sample (normalised to OD) and mixing thoroughly. The QC mix was divided into 11 QC samples for LC-MS, each containing 300 µL of the QC mix, and 2 QC samples for GC-MS, each containing 800 µL of the QC mix. All samples were dried for 16 h using a speed vacuum concentrator (Eppendorf 5301, Eppendorf, Cambridge, UK) operated at 30 ºC.

For GC-MS samples, a two-stage chemical derivatisation was used, as many metabolite classes within central metabolism are non-volatile. Prior to GC-MS analysis, carbonyl moieties were substituted by means of methoxyamination, followed by a silylation reaction. Samples taken from -80 ºC storage were put into a speed vacuum concentrator for 30 min to eliminate any remaining condensation. The next stage was to dissolve the extracts in 50 μL of 20 mg/mL O-methoxylamine hydrochloride in pyridine, after which they were vortexed and incubated at 60 ºC for 30 min in a dry-block heater. Then 50 μL of N-methyl-N-trimethylsilyl-trifluoroacetamide (MSTFA) was added, and the extracts were further mixed and incubated at 60 ºC for 30 min. When that was complete, 20 μL of retention index marker solution (0.3 mg/mL docosane, nonadecane, decane, dodecane, and pentadecane in pyridine) was added, followed by 15 min of centrifugation at 15800 g. The resultant supernatant (100 μL) was transferred to 2 mL amber glass GC-MS vials fitted with 200 μL inserts before being analysed. With the GC-MS analytical method employed, it is possible to achieve a throughput of 40 samples a day; therefore, for greater chemical stability of samples, randomised batches of 40 samples per day were derivatised throughout the period of analysis, and QC samples were also derivatised across multiple batches to enable derivatisation (technical) and instrument (analytical) error to be measured.

5.6.1.2 Fourier transform infrared (FT-IR) spectroscopy

Aliquots of 20 µL of the bacterial preparations were spotted directly onto clean 96- well zinc selenide (ZnSe) plates (Bruker Ltd., Coventry, UK) and were dried in an oven for 45 min at 40 ºC (as detailed by AlRabiah et al. (2013; Chapter 3)). A Bruker Equinox 55 infrared spectrometer equipped with a HTX™ module was used for high-throughput screening (HTS) FT-IR spectroscopic analysis using the method of Winder et al. (2006). The spectra were collected in the range of 4000−600 cm-1, and 64 co-adds were taken at 4 cm-1 resolution.

FT-IR data were converted after analysis to tab-delimited files before being analysed in MATLAB 2010a (The MathWorks Inc., Natick, USA). CO2 signals were removed as described by AlRabiah et al. (2013; Chapter 3) and FT-IR data were baseline corrected using the extended multiplicative signal correction (EMSC) algorithm (Martens et al., 2003) before performing multivariate analysis (vide infra).

5.6.1.3 Gas chromatography-mass spectrometry (GC-MS)

GC-TOF/MS was conducted using a LECO Pegasus III TOF/MS (Leco Corp., St. Joseph, MO, USA) operated in GC-MS mode, with a Gerstel MPS-2 autosampler (Gerstel, Baltimore, MD, USA) and an Agilent 6890N GC × GC (operated in GC mode) with a split/splitless injector and Agilent LPD split-mode inlet liner (Agilent Technologies, Stockport, UK). A 30 m × 0.25 mm × 0.25 μm VF17-MS bonded phase capillary column (Varian, Oxford, UK) was used at a constant helium carrier gas flow of 1 mL/min. The temperature program was as follows: 4 min hold at 70 ºC, 20 ºC/min to 300 ºC, 4 min hold. A split ratio of 4:1 was used for sample injections of 1 μL. The operational temperature of the injector was 280 ºC; after 30 s, a 25 mL/min gas saver flow was used, and the transfer line was held at 240 ºC. The mass spectrometer had a source temperature of 220 ºC, was operated at 70 eV ionisation energy, and acquired m/z 45-600 at 20 Hz. The full details of the GC-MS protocol have been published (Begley et al., 2009; Dunn et al., 2011) and these follow the accepted Metabolomics Standards Initiative (MSI) guidelines (Sumner et al., 2007).
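The oven programme above fixes the minimum run time per injection, which can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the daily total uses the throughput of 40 samples per day stated in the sample preparation section while ignoring oven cool-down and injection overheads, which are not given in the text.

```python
def oven_run_time_min(initial_hold_min, start_c, ramp_c_per_min, end_c, final_hold_min):
    """Total oven programme time in minutes for a single linear ramp with two holds."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min

# 4 min hold at 70 ºC, 20 ºC/min to 300 ºC, 4 min hold
run_min = oven_run_time_min(4, 70, 20, 300, 4)    # 4 + 11.5 + 4 = 19.5 min

# Oven time for a batch of 40 samples, excluding cool-down and injection overheads
batch_hours = 40 * run_min / 60                   # 13.0 h
```

On these assumptions a batch of 40 injections occupies about 13 h of oven time, which is in line with the stated daily throughput.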

5.6.1.4 Liquid chromatography-mass spectrometry (LC-MS)

UHPLC-MS analysis was carried out on an Accela UHPLC autosampler system (Thermo-Fisher Ltd., Hemel Hempstead, UK) coupled to an electrospray LTQ-Orbitrap XL hybrid mass spectrometry system (Thermo-Fisher, Bremen, Germany). Analysis was carried out in both positive and negative ESI modes, and the run order was completely randomised to negate any bias. A gradient UHPLC method was used during each run, as previously described by Dunn et al. (2011) and Wedge et al. (2011). 10 µL of the extract was injected onto a Hypersil GOLD UHPLC C18 column (100 mm × 2.1 mm × 1.9 µm, Thermo-Fisher Ltd.) held at a constant temperature of 50 ºC, and a solvent flow rate of 400 µL/min was used to drive the chromatographic separation.

Xcalibur software (Thermo-Fisher Ltd., Hemel Hempstead, UK) was used as the operating system for the Thermo LTQ-Orbitrap XL MS system following the method described by Wedge et al. (2011).

Data processing was initiated by the conversion of the standard UHPLC-MS raw files into the universal NetCDF format via the software conversion tool within Xcalibur. Subsequently, in-house peak deconvolution software containing the XCMS algorithm (http://masspec.scripps.edu/xcms/xcms.php) was used for peak picking as described previously (Dunn et al., 2008; Wedge et al., 2011). The output from this system was a Microsoft Excel based data matrix of mass spectral features with related accurate m/z and retention time pairs. Data from the internally pooled QC samples were then used to correct for instrument drift and for quality control (via application of an in-house MATLAB script (Dunn et al., 2008)). The data matrix was also signal corrected to remove peaks that exceeded the 20% RSD (relative standard deviation) threshold within QC samples across the analytical run. Normalisation of each peak within the samples was achieved using the total peak area, whilst putative identification of metabolite features was performed by applying the PUTMEDID-LCMS set of workflows as previously described (Brown et al., 2011). Lipid identifications can be ambiguous because species with the same m/z may differ in their points of unsaturation, giving multiple isomeric candidates. Multiple adducts of the same lipid can also occur due to the presence of different charged (composite) species (i.e. protonated and sodiated ions).
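The QC-based filtering and normalisation steps described above can be sketched as follows. This is a minimal illustration, assuming NumPy, and it does not reproduce the in-house MATLAB script. The function names and toy peak areas are hypothetical, the sample standard deviation (ddof = 1) is an assumption, and the order shown (filter first, then normalise) simply follows the order of the text.

```python
import numpy as np

def qc_rsd_mask(qc_areas, max_rsd_percent=20.0):
    """Boolean mask of features whose RSD across QC injections is within the threshold."""
    qc = np.asarray(qc_areas, dtype=float)
    mean = qc.mean(axis=0)
    sd = qc.std(axis=0, ddof=1)                 # sample SD (assumption)
    with np.errstate(divide="ignore", invalid="ignore"):
        rsd = 100.0 * sd / mean                 # NaN or inf for zero-mean features
    return rsd <= max_rsd_percent               # NaN compares False, so such features are dropped

def total_area_normalise(areas):
    """Divide each sample (row) by its summed peak area."""
    areas = np.asarray(areas, dtype=float)
    return areas / areas.sum(axis=1, keepdims=True)

# Toy data (hypothetical): rows are injections, columns are features
qc = np.array([[100.0, 10.0, 50.0],
               [105.0, 20.0, 55.0],
               [ 95.0, 30.0, 45.0]])            # RSDs: 5%, 50%, 10%
samples = np.array([[300.0, 15.0, 100.0],
                    [ 50.0, 40.0, 150.0]])

keep = qc_rsd_mask(qc)                          # second feature fails the 20% threshold
normalised = total_area_normalise(samples[:, keep])
```

After normalisation each sample sums to one, so differences in total signal between injections no longer dominate the comparison between isolates.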


[Figure S 5.1 flow diagram. Recoverable panel text:

Culture: streak bacteria on NA plate; collect axenic colonies; 1 mL 20% glycerol stock stored at -20 ºC; 49 mL of LB medium kept for 24 h; 1 mL into 49 mL of LB medium kept for 1 h; centrifuge and wash 2 times with normal saline; measure the OD for normalisation and dilute to the exact concentration (C1V1 = C2V2); 1 mL into 19 mL LB, 21 h incubation.

(1) FT-IR plate (ZnSe): high-throughput screening (HTS) FT-IR spectroscopic analysis was carried out using a Bruker Equinox 55 infrared spectrometer (Bruker Ltd., Coventry, UK) equipped with an HTX™ module. FT-IR spectra were recorded directly from the dried cell biomass in transmission mode.

Quenching and extraction: pre-chilled (-48 ºC) 60% methanol (quick mix); 4800 g (-8 ºC) for 10 min; take 2 mL from supernatant (leakage); remove supernatant and collect the pellet; reconstitute in 1 mL pre-chilled (-48 ºC) 80% methanol; transfer to 2 mL tubes; flash freeze in liquid nitrogen for 1 min; remove samples to wet ice until defrosted; vortex for 30 s; freeze/thaw cycle (3 times); after the 3rd cycle centrifuge at 13000 g (-9 ºC) for 5 min; retrieve supernatant to a clean tube; add 500 µL of pre-chilled 80% methanol to the remaining pellet and repeat the whole procedure; combine the two aliquots, mix and centrifuge; normalise with 80% methanol depending on the OD and split samples for GC and LC; add internal standards (for GC samples); dry for 16 h.

(2) LC-MS: UHPLC-LTQ-Orbitrap XL MS, m/z 100-1000, calibrated following the manufacturer's recommendations; 10 µL injections were run on a water-methanol (0.1% formic acid) reversed-phase gradient (Hypersil Gold C18, 100 × 2.1 mm, 1.9 µm particle size; Thermo-Fisher Ltd).

(3) GC-MS: two-step derivatisation ((1) methoxyamine hydrochloride, (2) MSTFA). GC-TOF/MS profiling was performed with a LECO Pegasus III EI-TOF/MS. Deconvolution of GC-MS profiles was performed within LECO ChromaTOF; metabolite identifications were performed by MS library matching to the NIST05, Max Planck Golm Metabolome Database, and in-house libraries.]

Figure S 5.1: General scheme of sample preparation approach used including: (1) FT-IR analysis of samples directly from the culture; (2) LC-MS; and (3) GC-MS analysis of samples after quenching and extraction using 60% and 80% cold (-48 ºC) methanol, respectively.

5.6.2 Supplementary Results

Table S 5.1: Genetic backgrounds mediating quinolone resistance in the ST131 uropathogenic E. coli (UPEC) isolates used in this study.

Isolate no.   Quinolone resistance mechanism
2             mutations in both gyrA and parC
25            mutations in both gyrA and parC
48            aac(6')-Ib-cr
52            mutation in gyrA
75            mutations in both gyrA and parC
124           aac(6')-Ib-cr
160           aac(6')-Ib-cr
164           mutations in both gyrA and parC
183           mutations in both gyrA and parC
184           mutations in both gyrA and parC
230           aac(6')-Ib-cr

Table S 5.2: Virulence factors for the Escherichia coli isolates used in this study.

Isolate no.     2  25  48  52  75 124 160 164 183 184 230
VF
papAH           0   0   0   0   0   0   0   0   0   0   0
papC            0   0   0   0   0   0   0   0   0   0   0
papEF           0   0   0   0   0   0   0   0   0   0   0
papG II,III     0   0   0   0   0   0   0   0   0   0   0
papG I          0   0   0   0   0   0   0   0   0   0   0
allele-I        0   0   0   0   0   0   0   0   0   0   0
allele-II       0   0   0   0   0   0   0   0   0   0   0
allele-III      0   0   0   0   0   0   0   0   0   0   0
sfa/focDE       0   0   0   0   0   0   0   0   0   0   0
sfaS            0   0   0   0   0   0   0   0   0   0   0
focG            0   0   0   0   0   0   0   0   0   0   0
afa/draBC       0   0   0   1   1   0   1   1   0   0   0
bmaE            0   0   0   0   0   0   0   0   0   0   0
gafD            0   0   0   0   0   0   0   0   0   0   0
nfaE            0   0   0   0   0   0   0   0   0   0   0
fimH            1   1   1   1   1   1   1   1   1   1   1
hlyA            0   0   0   0   0   0   0   0   0   0   0
cnf1            0   0   0   0   0   0   0   0   0   0   0
cdtB            0   0   0   0   0   0   0   0   0   0   0
fyuA            1   1   1   1   1   1   1   1   1   1   1
iutA            1   1   1   1   1   1   1   1   1   1   1
kpsMT II        1   1   1   0   0   0   0   0   1   1   1
kpsMT III       0   0   0   0   0   0   0   0   0   0   0
kpsMT K1        0   0   0   0   0   0   0   0   0   0   0
kpsMT K5        1   1   1   0   0   0   0   0   1   1   1
rfc             0   0   0   0   0   0   0   0   0   0   0
ibeA            0   0   0   0   0   0   0   0   0   0   0
cvaC            0   0   0   0   0   0   0   0   0   0   0
traT            1   1   1   1   1   1   0   0   0   1   0
PAI             1   1   0   1   1   1   1   1   1   1   1
VF score        7   7   6   6   6   5   5   5   6   7   6

Data obtained from Gibreel et al. (2012a; 2012b).

Virulence factors (VF):
Adhesion genes: fimH, papAH, papC, papEF, papG, allele-I, allele-II, allele-III, sfaS, focG, sfa/focDE, afa/draBC, bmaE, nfaE, gafD.
Toxin genes: cnf1, cdtB, hlyA.
Siderophore genes: fyuA, iutA.
Capsule synthesis genes: kpsMT II, kpsMT III, kpsMT K1, kpsMT K5, rfc.
Miscellaneous genes: cvaC, traT, ibeA, PAI.

Table S 5.3: Results of metabolic/biochemical profiling of the Escherichia coli isolates used in this study.

Isolate no.     2  25  48  52  75 124 160 164 183 184 230
MT
APPA            0   0   0   0   0   0   0   0   0   0   0
ADO             0   0   0   0   0   0   0   0   0   0   0
PyrA            0   0   0   0   0   0   0   0   0   0   0
IARL            0   0   0   0   0   0   0   0   0   0   0
dCEL            0   0   0   0   0   0   0   0   0   0   0
BGAL            1   1   1   1   1   1   1   1   1   1   1
H2S             0   0   0   0   0   0   0   0   0   0   0
BNAG            0   0   0   0   0   0   0   0   0   0   0
AGLTp           0   0   0   0   0   0   0   0   0   0   0
dGLU            1   1   1   1   1   1   1   1   1   1   1
GGT             0   0   0   0   0   0   0   0   0   0   0
OFF             1   1   1   1   1   1   1   1   1   1   1
BGLU            0   0   0   0   0   0   0   0   0   0   0
dMAL            1   1   1   1   1   1   1   1   1   1   1
dMAN            1   1   1   1   1   1   1   1   1   1   1
dMNE            1   1   1   1   1   1   1   1   1   1   1
BXYL            0   0   0   0   0   0   0   0   0   0   0
BAIap           0   0   0   0   0   0   0   0   0   0   0
ProA            1   1   1   1   0   1   1   1   1   0   0
LIP             0   0   0   0   0   0   0   0   0   0   0
PLE             0   0   0   0   0   0   0   0   0   0   0
TyrA            1   1   1   1   1   1   1   1   1   1   1
URE             0   0   0   0   0   0   0   0   0   0   0
dSOR            1   1   1   1   1   1   1   1   1   1   1
SAC             1   1   1   1   1   1   1   1   1   1   1
dTAG            0   0   0   0   0   0   0   0   0   0   1
dTRE            1   1   1   1   1   1   1   1   1   1   1
CIT             0   0   0   0   0   0   0   0   0   0   0
MNT             0   0   0   0   0   0   0   0   0   0   0
5KG             0   0   0   1   0   0   0   0   0   0   0
ILATk           1   1   1   1   1   1   1   1   1   1   1
AGLU            0   0   0   0   0   0   0   0   0   0   0
SUCT            1   1   0   1   1   1   1   1   1   1   0
NAGA            0   0   0   0   0   0   0   0   0   0   0
AGAL            1   1   1   1   1   1   1   1   1   1   1
PHOS            1   0   0   0   0   0   1   1   0   1   1
GlyA            0   0   0   0   0   0   1   1   0   0   0
ODC             1   1   1   1   1   1   1   1   1   1   1
LDC             0   1   1   0   1   1   1   1   1   1   1
IHISa           0   0   0   0   0   0   0   0   0   0   0
CMT             1   1   1   1   1   1   1   1   1   1   1
BGUR            1   0   0   1   1   1   0   0   0   1   0
O129R           1   0   1   1   1   1   1   1   1   1   1
GGAA            0   0   0   0   0   0   0   0   0   0   0
IMLTa           0   0   0   0   0   0   1   1   1   0   0
ELLM            1   1   1   1   1   1   1   1   1   1   1
ILATa           0   0   0   0   0   0   1   1   1   0   0

Data obtained from Gibreel et al. (2012a).

Metabolic test (MT) abbreviations (Abb.):

Abb.     Test
APPA     Ala-Phe-Pro arylamidase
AGLTp    Glutamyl arylamidase pNA
CIT      Citrate (sodium)
PyrA     L-Pyrrolidonyl arylamidase
BGLU     β-Glucosidase
AGLU     α-Glucosidase
IARL     L-Arabitol
BXYL     β-Xylosidase
NAGA     β-N-Acetyl-galactosaminidase
dCEL     D-Cellobiose
BAIap    β-Alanine arylamidase pNA
IHISa    L-Histidine assimilation
H2S      H2S production
LIP      Lipase
GGAA     Glu-Gly-Arg arylamidase
BNAG     β-N-Acetyl-glucosaminidase
PLE      Palatinose
ADO      Adonitol
URE      Urease
PHOS     Phosphatase
BGAL     β-Galactosidase
dSOR     D-Sorbitol
GlyA     Glycine arylamidase
dGLU     D-Glucose
SAC      Saccharose/sucrose
ODC      Ornithine decarboxylase
GGT      γ-Glutamyl transferase
dTAG     D-Tagatose
LDC      Lysine decarboxylase
OFF      Fermentation/glucose
dTRE     D-Trehalose
CMT      Coumarate
dMAL     D-Maltose
MNT      Malonate
BGUR     β-Glucuronidase
dMAN     D-Mannitol
5KG      5-Keto-D-gluconate
O129R    O/129 resistance
dMNE     D-Mannose
ILATk    L-Lactate alkalinisation
IMLTa    L-Malate assimilation
ProA     L-Proline arylamidase
SUCT     Succinate alkalinisation
ELLM     Ellman
TyrA     Tyrosine arylamidase
AGAL     α-Galactosidase
ILATa    L-Lactate assimilation
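Binary profiles such as those in Table S 5.3 are typically converted into a pairwise dissimilarity matrix before an ordination such as PCoA. The following is a minimal sketch of that first step only, assuming a simple count of mismatching tests between two isolates; the dissimilarity measure actually used for the PCoA is not stated in this section, and the function names are hypothetical. Only the tests that vary between isolates are entered, since tests with identical results for all 11 isolates contribute no mismatches.

```python
# Isolate order follows the column order of Table S 5.3.
ISOLATES = [2, 25, 48, 52, 75, 124, 160, 164, 183, 184, 230]

# Results copied from Table S 5.3 for the tests that differ between isolates.
VARIABLE_TESTS = {
    "ProA":  [1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0],
    "dTAG":  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "5KG":   [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    "SUCT":  [1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0],
    "PHOS":  [1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1],
    "GlyA":  [0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "LDC":   [0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    "BGUR":  [1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0],
    "O129R": [1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "IMLTa": [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
    "ILATa": [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
}


def profile(isolate):
    """Return the test results of one isolate as a {test: 0/1} dictionary."""
    col = ISOLATES.index(isolate)
    return {test: values[col] for test, values in VARIABLE_TESTS.items()}


def n_differences(a, b):
    """Count the metabolic tests on which isolates a and b disagree."""
    pa, pb = profile(a), profile(b)
    return sum(pa[test] != pb[test] for test in VARIABLE_TESTS)


def distance_matrix():
    """Symmetric matrix of mismatch counts, in the order given by ISOLATES."""
    return [[n_differences(a, b) for b in ISOLATES] for a in ISOLATES]


if __name__ == "__main__":
    for isolate, row in zip(ISOLATES, distance_matrix()):
        print(f"{isolate:>4}", " ".join(f"{d:2d}" for d in row))
```

A matrix of this kind could then be passed to any PCoA routine; a different measure (for example the Jaccard coefficient) would change the distances and possibly the ordination.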

[Plot for Figure S 5.2: LC-MS positive mode vs. metabolic test. Axes: PC 1 (horizontal) and PC 2 (vertical). Red: LC-MS+ve; blue: metabolic test. Points are labelled with isolate numbers.]

Figure S 5.2: Superimposed scatter plots of PCoA scores of the first two components of the metabolic test and Procrustean transformed LC-MS positive mode data.

[Plot for Figure S 5.3: rotated loadings plot. Axes: PC 1 rotated loadings (horizontal) and PC 2 rotated loadings (vertical). Labelled points: 17, 20 and 49.]

Figure S 5.3: Rotated loading plot of GC-MS data showing the highly correlated metabolites, which were associated with metabolic profiling data.

5.6.3 Supplementary References

AlRabiah, H., Correa, E., Upton, M. & Goodacre, R. 2013. High-throughput phenotyping of uropathogenic E. coli isolates with Fourier transform infrared spectroscopy. Analyst, 138, 1363-1369.

Begley, P., Francis-McIntyre, S., Dunn, W. B., Broadhurst, D. I., Halsall, A., Tseng, A., Knowles, J., Goodacre, R. & Kell, D. B. 2009. Development and performance of a gas chromatography−time-of-flight mass spectrometry analysis for large-scale nontargeted metabolomic studies of human serum. Analytical Chemistry, 81, 7038-7046.

Brown, M., Wedge, D. C., Goodacre, R., Kell, D. B., Baker, P. N., Kenny, L. C., Mamas, M. A., Neyses, L. & Dunn, W. B. 2011. Automated workflows for accurate mass-based putative metabolite identification in LC/MS-derived metabolomic datasets. Bioinformatics, 27, 1108-1112.

Dunn, W. B., Broadhurst, D., Begley, P., Zelena, E., Francis-McIntyre, S., Anderson, N., Brown, M., Knowles, J. D., Halsall, A. & Haselden, J. N. 2011. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols, 6, 1060-1083.

Dunn, W. B., Broadhurst, D., Brown, M., Baker, P. N., Redman, C. W. G., Kenny, L. C. & Kell, D. B. 2008. Metabolic profiling of serum using ultra performance liquid chromatography and the LTQ-Orbitrap mass spectrometry system. Journal of Chromatography B, 871, 288-298.

Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Bolton, F. J., Fox, A. J. & Upton, M. 2012a. High metabolic potential may contribute to the success of ST131 uropathogenic Escherichia coli. Journal of Clinical Microbiology, 50, 3202-3207.

Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Fox, A. J., Bolton, F. J. & Upton, M. 2012b. Population structure, virulence potential and antibiotic susceptibility of uropathogenic Escherichia coli from Northwest England. Journal of Antimicrobial Chemotherapy, 67, 346-356.

Martens, H., Nielsen, J. P. & Engelsen, S. B. 2003. Light scattering and light absorbance separated by extended multiplicative signal correction. Application to near-infrared transmission analysis of powder mixtures. Analytical Chemistry, 75, 394-404.

Sumner, L. W., Amberg, A., Barrett, D., Beale, M. H., Beger, R., Daykin, C. A., Fan, T. W. M., Fiehn, O., Goodacre, R., Griffin, J. L., Hankemeier, T., Hardy, N., Harnly, J., Higashi, R., Kopka, J., Lane, A. N., Lindon, J. C., Marriott, P., Nicholls, A. W., Reily, M. D., Thaden, J. J. & Viant, M. R. 2007. Proposed minimum reporting standards for chemical analysis. Metabolomics, 3, 211-221.

Wedge, D. C., Allwood, J. W., Dunn, W., Vaughan, A. A., Simpson, K., Brown, M., Priest, L., Blackhall, F. H., Whetton, A. D., Dive, C. & Goodacre, R. 2011. Is serum or plasma more appropriate for intersubject comparisons in metabolomic studies? An assessment in patients with small-cell lung cancer. Analytical Chemistry, 83, 6689-6697.

Winder, C. L., Dunn, W. B., Schuler, S., Broadhurst, D., Jarvis, R., Stephens, G. M. & Goodacre, R. 2008. Global metabolic profiling of Escherichia coli cultures: an evaluation of methods for quenching and extraction of intracellular metabolites. Analytical Chemistry, 80, 2939-2948.

Winder, C. L., Gordon, S. V., Dale, J., Hewinson, R. G. & Goodacre, R. 2006. Metabolic fingerprints of Mycobacterium bovis cluster with molecular type: implications for genotype-phenotype links. Microbiology, 152, 2757-2765.

6 Chapter Six

General Discussion and Future Work

6.1 Discussion and Future Work

Metabolomics is the holistic qualitative and quantitative study of metabolites in an organism under a defined set of conditions (Fiehn, 2001). Metabolomics has contributed to the advancement of various disciplines including environmental studies, agriculture, microbiology and medical sciences. In the clinical field, metabolomics has an important role in the elucidation of the mechanism of drug action, the molecular etiology of disease and the discovery of biomarkers (Putri et al., 2013). In particular, metabolomics has been applied to the study of infectious diseases such as hepatitis (Rodgers et al., 2009), trypanosomiasis (Vincent et al., 2010) and urinary tract infections (UTIs) (Nevedomskaya et al., 2012).

Urinary tract infections are among the most commonly encountered types of infection, especially in females (Fihn, 2003; Griebling, 2005; Mysorekar and Hultgren, 2006), and are mainly caused by pathogenic E. coli (Kahlmeter, 2003; Ronald, 2003). However, owing to the emergence of drug resistance and the lack of new antibiotic therapies for UTIs, there is a need for further research that applies interdisciplinary approaches, including metabolomics. A more comprehensive understanding of the mode of action of existing therapies, including their off-target effects, can inform the development of new strategies based on newly identified targets. In addition, there is a need for a high-throughput, reproducible, comprehensive and precise method to assess and discriminate between the causes of infection.

This study aimed to develop metabolomics-based approaches to investigate the interaction of antimicrobial agents that target DNA synthesis, either directly or indirectly, with E. coli, and also to characterise and discriminate between pathogenic E. coli isolates from different sequence types and even within the same sequence type (i.e. ST131). The choice of E. coli was informed by the availability of many accessible databases on various metabolic pathways, including KEGG (Kanehisa and Goto, 2000) and EcoCyc (Keseler et al., 2011), and by the involvement of pathogenic strains of this microorganism in urinary tract infections (Kahlmeter, 2003; Ronald, 2003).

Metabolomics can provide insights into the mode of action of drugs, and this was applied to the antibiotic trimethoprim. As a proof of principle, E. coli K-12 was used as a model organism (Chapter 2). In this experiment, E. coli K-12 was challenged with three doses (all below the minimum inhibitory concentration (MIC)) of trimethoprim, a weakly basic antibiotic that selectively inhibits dihydrofolate reductase (DHFR) and is therefore considered to be an indirect inhibitor of DNA synthesis. Although previous metabolomics and fluxomics studies had focused on the effects of trimethoprim challenge on this microorganism (Kwon et al., 2010; Kwon et al., 2008), our study was novel in that it included the effect of urinary pH and used complementary metabolomics techniques. Global snapshots of the bacterial phenotype and untargeted metabolic profiles were generated using FT-IR spectroscopy and GC-MS, respectively.

As most of the drug molecules exert their action in the urinary environment, two pH levels (5 and 7) were used as an additional factor to mimic the normal pH range of human urine, which provided a wider view of drug action at different ionisation states. LC-MS was used to confirm the ability of drug molecules to pass into the cell and exert their effect when they are fully ionised (i.e. at pH 5). FT-IR spectroscopy was used in a preliminary experiment and showed global metabolic changes of the bacteria with high reproducibility. Subsequently, GC-MS was employed to carry out untargeted (global) metabolic profiling to provide a greater understanding of trimethoprim's mode of action and its off-target effects.

The direct effect of trimethoprim, as an inhibitor of DHFR, manifested as a decrease in the detected levels of nucleobases and nucleosides (thymine, uracil, inosine, adenine and guanine). Drug effects were generally stronger at pH 7 than at pH 5, because at pH 7 the ionisation state of the drug molecules gives them higher permeability, allowing greater passage across the bacterial cell wall. Additionally, guanine showed an unexpected response after challenge with the highest dose used in this experiment at pH 5: it was up-regulated, and many amino acids (e.g. tyrosine, phenylalanine and leucine) gave the same response.
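The pH dependence of the ionisation state referred to above can be illustrated with the Henderson-Hasselbalch relationship for a monoprotic weak base. The sketch below is illustrative only: it assumes a pKa of about 7.1 for trimethoprim, which is an approximate literature value and is not reported in this thesis, and the function name is hypothetical.

```python
def fraction_ionised_base(ph: float, pka: float) -> float:
    """Fraction of a monoprotic weak base present in its protonated (ionised) form.

    Henderson-Hasselbalch for a base B with conjugate acid BH+:
        [BH+] / ([B] + [BH+]) = 1 / (1 + 10 ** (pH - pKa))
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumption: approximate literature pKa for trimethoprim (not from this thesis).
ASSUMED_PKA = 7.1

if __name__ == "__main__":
    for ph in (5.0, 7.0):
        ionised = fraction_ionised_base(ph, ASSUMED_PKA)
        print(f"pH {ph:.0f}: {ionised:.1%} ionised, {1 - ionised:.1%} un-ionised")
```

Under this assumed pKa the drug is almost entirely ionised at pH 5, whereas a substantial un-ionised fraction is present at pH 7, which is consistent with the stronger effects observed at pH 7.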

Glutamate, which plays a role in the targeted pathway, was highly up-regulated after antibiotic challenge at pH 7, an observation in line with a previous study (Kwon et al., 2008). This amino acid is involved in the biosynthesis of ornithine and proline, and the same response was observed for these two metabolites. Methionine is one of the indirect products of the folate pathway. Blocking DHFR led to a decrease in the levels of methionine under this condition (i.e. drug challenge at pH 7). In turn, low levels of methionine may result in an accumulation of homoserine (not detected), which acts as an inhibitor of glutamate dehydrogenase; this may provide an additional explanation for the up-regulation of glutamate, which also acts as an osmolyte in certain conditions.

Trehalose, an osmolyte, was highly up-regulated (at pH 7, 0.2 mg/L), which may be explained by its protective role in stress conditions. As a cascade effect of this increase, consumption of glucose, a substrate for trehalose synthesis, was observed, and the same effect was seen for fructose. The levels of compounds related to pyruvate, an end product of glycolysis, including alanine, lactic acid and the detected TCA cycle members (citrate and malate), were up-regulated under this condition, which may reflect extensive sugar consumption under drug stress (Table 6.1).

Table 6.1: A summary of metabolites level changes in E. coli K-12 upon challenge with trimethoprim at either pH 5 or 7.

Columns: Up-regulated at pH 5; Up-regulated at pH 7; Down-regulated at pH 5; Down-regulated at pH 7.

Metabolites (rows): Glutamate, Ornithine, Proline, Thymine, Uracil, Methionine, Inosine, Adenine, Guanine, Histidine, Tyrosine, Phenylalanine, Isoleucine, Leucine, Valine, Tryptophan, Aspartate, Lysine, Nicotinamide, Trehalose, Glucose, Fructose, Alanine, Citrate, Malate, Lactic acid.

[Each metabolite carries one mark in the table, except Guanine and Aspartate, which carry two; the column positions of the marks are not recoverable here.]

The study in Chapter 2 provides a wider view of trimethoprim's action at the pH levels of healthy human urine and shows that there are certain areas in the central metabolism of E. coli that need to be further investigated using approaches such as targeted metabolite quantification and fluxomics-based studies.

FT-IR spectroscopy, carried out in Chapter 2, provided a global snapshot of phenotypic profiles between different classes with high reproducibility. Previously, it has been shown that this analytical tool, combined with multivariate data analysis, can provide a rapid, non-destructive and relatively inexpensive method, compared with traditional biochemical methods, for the identification and differentiation of bacteria down to sub-species level (Maquelin et al., 2002; Mariey et al., 2001; Garip et al., 2007; Goodacre et al., 1996). However, sample preparation for IR measurement can become labour-intensive and time-consuming when analysing a large number of samples. Therefore, we developed a high-throughput method (Chapter 3) that involves simple sample preparation and can run up to 200 samples at the same time and under the same growth conditions. This method was applied to 10 pathogenic E. coli isolates, five of which are members of the ST131 clone.

Sample preparation in this study was modified in two ways. First, Bioscreen plates for micro-culturing were used rather than traditional shake flasks. The Bioscreen approach can be used to grow a large number of isolates and analyse them more rapidly and more easily than traditional methods (e.g. shake flasks in incubators). Second, we found that excluding the sample washing step carried out in previous studies (Guibet et al., 2003; Al-Holy et al., 2006), as well as in Chapter 2 of this thesis, and analysing the biomass directly provided richer information than conventional FT-IR spectroscopy. This new approach was able to discriminate between isolates from different sequence types (ST131 and non-ST131) (Figure 6.1). A certain degree of sub-separation could be seen within the main clusters, depending on the susceptibility of the isolates to quinolones. Explaining the superior results obtained by excluding the washing step may require additional experiments, using either less complex, well-defined media or higher resolution analytical techniques such as mass spectrometry. However, we are confident that this is a high-throughput technique suitable for future identification and characterisation studies of bacteria using FT-IR spectroscopy.

Four isolates from the 10 used in Chapter 3 were subsequently chosen depending on their susceptibility to quinolones and sequence type for a lipidomics study (Chapter 4). These isolates were challenged with ciprofloxacin, a second generation fluoroquinolone that acts as a direct inhibitor of DNA synthesis by targeting topoisomerase enzymes. Doses were chosen depending on the MIC of sensitive isolates and samples were analysed with FT-IR spectroscopy and LC-MS.

In Chapter 4, the approach developed in Chapter 3 was applied to control and stressed bacterial isolates to produce a bacterial metabolic fingerprint for each sample. This approach provided a distinction between isolates from different sequence types (even after challenge with ciprofloxacin), an observation that matches the results in Chapter 3.

Moreover, a reproducible workflow for lipidome analysis based on RPLC-MS was developed in Chapter 4. The workflow was able to identify a range of interesting and potentially clinically important alterations in the levels of lipid species (e.g. PEs, PCs, PAs, PGs, and acylglycerols) in E. coli upon challenge with ciprofloxacin. Based on the Metabolomics Standards Initiative (MSI) guidelines (Sumner et al., 2007), the putative identification level for detected species was categorised as level 2 or 3. However, to achieve further identification and a more satisfactory level of information, there is a requirement to develop an online RPLC-MS/MS and an offline direct infusion MSn method for the identification of lipids across all classes. This will provide more detailed information on the changes that occur due to antibiotic stress and will enable correlation between the identified lipids similar to the study carried out in Chapter 2.

In addition, high similarity was observed for FT-IR spectroscopy and RPLC-MS-ve mode in distinguishing between isolates of different sequence types (Figure 6.1). This observation may be indicative of a common feature between these two techniques and this requires additional investigation using a larger number of samples, which was carried out as part of Chapter 5. This investigation was more comprehensive owing to the application of four metabolomics platforms (FT-IR, GC-MS, RPLC-MS-ve and RPLC-MS+ve) to analyse 11 uropathogenic E. coli isolates from the same sequence type (ST131). The data generated by these platforms were compared to identify any similarities between them. Further, these 11 isolates were comprehensively characterised in previous studies using traditional microbiological methods including virulence factors (VFs) and metabolic tests (MTs) (Gibreel et al., 2012a; Gibreel et al., 2012b), which provided an opportunity to compare the results of this virulence and metabolic characterisation with those from the four analytical techniques. Cluster analysis was used to generate six ordination scores plots: four PC-DFA plots from the analytical tools and two PCoA plots from the microbiological tests. A series of Procrustes transformations was used to compare the patterns of the scores with regard to the spatial arrangement of the 11 bacterial isolates in the PC space.
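The idea behind a Procrustes comparison of two scores plots can be illustrated with a minimal sketch. It assumes two-dimensional scores with the isolates listed in the same order in both configurations; each configuration is centred and scaled to unit size, and the second is then rotated onto the first so that the remaining sum of squared distances can be used as a measure of disagreement. The function names and the example coordinates are hypothetical and are not taken from the analysis in Chapter 5, which may have used a different implementation (for example, one that also permits reflection).

```python
import math


def _centre_and_scale(points):
    """Translate points to zero mean and scale them to unit overall norm.

    Assumes the points are not all identical (the norm must be non-zero).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centred = [(x - cx, y - cy) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centred))
    return [(x / norm, y / norm) for x, y in centred]


def procrustes_2d(reference, target):
    """Rotate the standardised target onto the standardised reference.

    Returns (aligned_target, disparity), where disparity is the sum of squared
    distances between matched points after alignment (0 means identical shape).
    Rotation only: reflection is not considered in this sketch.
    """
    ref = _centre_and_scale(reference)
    tgt = _centre_and_scale(target)
    # Closed-form least-squares rotation angle for two dimensions.
    num = sum(tx * ry - ty * rx for (rx, ry), (tx, ty) in zip(ref, tgt))
    den = sum(tx * rx + ty * ry for (rx, ry), (tx, ty) in zip(ref, tgt))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    aligned = [(c * tx - s * ty, s * tx + c * ty) for tx, ty in tgt]
    disparity = sum((ax - rx) ** 2 + (ay - ry) ** 2
                    for (ax, ay), (rx, ry) in zip(aligned, ref))
    return aligned, disparity


if __name__ == "__main__":
    # Hypothetical scores for four samples from two techniques (illustrative only).
    scores_a = [(0.10, 0.30), (-0.20, 0.05), (0.25, -0.15), (-0.15, -0.20)]
    scores_b = [(1.2, 0.4), (0.1, -0.9), (-0.6, 1.1), (-0.7, -0.6)]
    _, disparity = procrustes_2d(scores_a, scores_b)
    print(f"Procrustes disparity: {disparity:.3f}")
```

A small disparity indicates that two techniques arrange the isolates in a similar pattern; judging whether an observed value is smaller than expected by chance requires a separate significance test, such as a permutation test, which is not shown here.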

In line with findings in Chapter 4, FT-IR and LC-MS-ve produced comparable results in terms of sample classification. Additionally, both approaches were in good agreement with VF tests. On the other hand, GC-MS showed a high similarity with MT tests but not VF tests (Figure 6.1). LC-MS+ve results did not show any statistically significant correlation with the other analytical techniques or microbiological tests. Indeed, the congruence between the microbiological tests and the analytical techniques highlighted in this study emphasises the value of the complementary information provided by these tools which can cover different areas of bacterial physiology. This study can inform the decision of choosing the method to be used for a certain experiment and may indicate that a combination of methods is desirable.

Future work may extend this study to additional analytical tools that play a crucial role in the field of metabolomics, specifically NMR and HILIC-LC-MS. These could add value by providing further information and may exhibit correlation with the analytical tools and microbiological tests used in this study.

In summary, the work undertaken in this thesis provided further insights into antibiotic drug action at the metabolome level and developed methods to advance metabolomics analysis of bacteria further. Whilst this work was restricted to E. coli and involved the use of antibiotics that target DNA synthesis (as a proof of principle), it can be readily applied to study other microorganisms and other classes of antibiotics, which can provide further understanding of drug resistance in infectious disease.

[Figure 6.1 panels. Chapter 3: Bioscreen+FT-IR. Chapter 4: Bioscreen+FT-IR; RPLC-MS -ve. Chapter 5: GC-MS; Bioscreen+FT-IR; RPLC-MS -ve. Legend: ST131 Resistant; ST131 Sensitive; Non-ST131 Resistant; Non-ST131 Sensitive; Metabolic test A; Metabolic test B; Virulence factor A; Virulence factor B.]

Figure 6.1: A summary of the development of a workflow (Chapters 3, 4 and 5) for the discrimination between different pathogenic E. coli isolates.

6.2 References

Al-Holy, M. A., Lin, M. S., Al-Qadiri, H., Cavinato, A. G. & Rasco, B. A. 2006. Classification of foodborne pathogens by Fourier transform infrared spectroscopy and pattern recognition techniques. Journal of Rapid Methods and Automation in Microbiology, 14, 189-200.

Fiehn, O. 2001. Combining genomics, metabolome analysis, and biochemical modelling to understand metabolic networks. Comparative and Functional Genomics, 2, 155-168.

Fihn, S. D. 2003. Acute uncomplicated urinary tract infection in women. The New England Journal of Medicine, 349, 259-266.

Garip, S., Bozoglu, F. & Severcan, F. 2007. Differentiation of mesophilic and thermophilic bacteria with Fourier transform infrared spectroscopy. Applied Spectroscopy, 61, 186-192.

Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Bolton, F. J., Fox, A. J. & Upton, M. 2012a. High metabolic potential may contribute to the success of ST131 uropathogenic Escherichia coli. Journal of Clinical Microbiology, 50, 3202-3207.

Gibreel, T. M., Dodgson, A. R., Cheesbrough, J., Fox, A. J., Bolton, F. J. & Upton, M. 2012b. Population structure, virulence potential and antibiotic susceptibility of uropathogenic Escherichia coli from Northwest England. Journal of Antimicrobial Chemotherapy, 67, 346-356.

Goodacre, R., Timmins, E. M., Rooney, P. J., Rowland, J. J. & Kell, D. B. 1996. Rapid identification of Streptococcus and Enterococcus species using diffuse reflectance-absorbance Fourier transform infrared spectroscopy and artificial neural networks. FEMS Microbiology Letters, 140, 233-239.

Griebling, T. L. 2005. Urologic diseases in America project trends in resource use for urinary tract infections in women. Journal of Urology, 173, 1281-1287.

Guibet, F., Amiel, C., Cadot, P., Cordevant, C., Desmonts, M. H., Lange, M., Marecat, A., Travert, J., Denis, C. & Mariey, L. 2003. Discrimination and classification of Enterococci by Fourier transform infrared (FT-IR) spectroscopy. Vibrational Spectroscopy, 33, 133-142.

Kahlmeter, G. 2003. An international survey of the antimicrobial susceptibility of pathogens from uncomplicated urinary tract infections: the ECO.SENS Project. Journal of Antimicrobial Chemotherapy, 51, 69-76.

Kanehisa, M. & Goto, S. 2000. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 28, 27-30.

Keseler, I. M., Collado-Vides, J., Santos-Zavaleta, A., Peralta-Gil, M., Gama-Castro, S., Muniz-Rascado, L., Bonavides-Martinez, C., Paley, S., Krummenacker, M., Altman, T., Kaipa, P., Spaulding, A., Pacheco, J., Latendresse, M., Fulcher, C., Sarker, M., Shearer, A. G., Mackie, A., Paulsen, I., Gunsalus, R. P. & Karp, P. D. 2011. EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Research, 39, D583-D590.

Kwon, Y. K., Higgins, M. B. & Rabinowitz, J. D. 2010. Antifolate-induced depletion of intracellular glycine and purines inhibits thymineless death in E. coli. ACS Chemical Biology, 5, 787-795.

Kwon, Y. K., Lu, W., Melamud, E., Khanam, N., Bognar, A. & Rabinowitz, J. D. 2008. A domino effect in antifolate drug action in Escherichia coli. Nature Chemical Biology, 4, 602-608.

Maquelin, K., Kirschner, C., Choo-Smith, L. P., Van Den Braak, N., Endtz, H. P., Naumann, D. & Puppels, G. J. 2002. Identification of medically relevant microorganisms by vibrational spectroscopy. Journal of Microbiological Methods, 51, 255-271.

Mariey, L., Signolle, J. P., Amiel, C. & Travert, J. 2001. Discrimination, classification, identification of microorganisms using FTIR spectroscopy and chemometrics. Vibrational Spectroscopy, 26, 151-159.

Mysorekar, I. U. & Hultgren, S. J. 2006. Mechanisms of uropathogenic Escherichia coli persistence and eradication from the urinary tract. Proceedings of the National Academy of Sciences of the United States of America, 103, 14170-14175.

Nevedomskaya, E., Pacchiarotta, T., Artemov, A., Meissner, A., van Nieuwkoop, C., van Dissel, J. T., Mayboroda, O. A. & Deelder, A. M. 2012. 1H NMR-based metabolic profiling of urinary tract infection: combining multiple statistical models and clinical data. Metabolomics, 8, 1227-1235.

Putri, S. P., Nakayama, Y., Matsuda, F., Uchikata, T., Kobayashi, S., Matsubara, A. & Fukusaki, E. 2013. Current metabolomics: practical applications. Journal of Bioscience and Bioengineering, 115, 579-589.

Rodgers, M. A., Saghatelian, A. & Yang, P. L. 2009. Identification of an overabundant cholesterol precursor in hepatitis B virus replicating cells by untargeted lipid metabolite profiling. Journal of the American Chemical Society, 131, 5030-5031.

Ronald, A. 2003. The etiology of urinary tract infection: traditional and emerging pathogens. Disease-a-Month, 49, 71-82.

Sumner, L. W., Amberg, A., Barrett, D., Beale, M. H., Beger, R., Daykin, C. A., Fan, T. W. M., Fiehn, O., Goodacre, R., Griffin, J. L., Hankemeier, T., Hardy, N., Harnly, J., Higashi, R., Kopka, J., Lane, A. N., Lindon, J. C., Marriott, P., Nicholls, A. W., Reily, M. D., Thaden, J. J. & Viant, M. R. 2007. Proposed minimum reporting standards for chemical analysis. Metabolomics, 3, 211-221.

Vincent, I. M., Creek, D., Watson, D. G., Kamleh, M. A., Woods, D. J., Wong, P. E., Burchmore, R. J. S. & Barrett, M. P. 2010. A molecular mechanism for eflornithine resistance in African trypanosomes. PLOS Pathogens, 6, e1001204.
