Supporting Information

Automated High-Throughput Identification and Characterisation of Clinically Important Bacteria and Fungi using Rapid Evaporative Ionisation Mass Spectrometry (REIMS)

Frances Bolt (1)†; Simon JS Cameron (1)†; Tamas Karancsi (2); Daniel Simon (2); Richard Schaffer (2); Tony Rickards (3); Kate Hardiman (1); Adam Burke (1); Zsolt Bodai (1); Alvaro Perdones-Montero (1); Monica Rebec (3); Julia Balog (2); Zoltan Takats (1)*.

(1) Section of Computational and Systems Medicine, Department of Surgery and Cancer, Imperial College London, London, SW7 2AZ, United Kingdom; (2) Waters Research Centre, 7 Zahony Street, Budapest, 1031, Hungary; (3) Department of Microbiology, Imperial College Healthcare NHS Trust, Charing Cross Hospital, London, W6 8RF, United Kingdom

† Joint first authors

* Corresponding Author: [email protected]

This supporting material gives further detail on the experimental methodology employed in this study and includes additional experimental data referred to in the primary manuscript.

Table S1. Taxonomic classification and culture conditions of 25 microbial species analysed using handheld bipolar probe and high-throughput REIMS.

Taxonomic classifications and culture requirements of microbial species analysed in this study. Abbreviations: ‘+’ Gram-stain positive; ‘-’ Gram-stain negative; ‘CBA’ Columbia Blood Agar; ‘Choc’ Chocolate Agar.

| Species | Gram | Domain | Phylum | Class | Order | Family | Media | Temperature | Atmosphere | Duration |
|---|---|---|---|---|---|---|---|---|---|---|
| Candida albicans | N/A | Fungi | Ascomycota | Saccharomycetes | Saccharomycetales | Debaryomycetaceae | CBA | 30 °C | Aerobic | 48 hrs |
| Candida parapsilosis | N/A | Fungi | Ascomycota | Saccharomycetes | Saccharomycetales | Debaryomycetaceae | CBA | 30 °C | Aerobic | 48 hrs |
| Corynebacterium amycolatum | + | Bacteria | Actinobacteria | Actinobacteria | Actinomycetales | Corynebacteriaceae | CBA | 37 °C | Aerobic | 48 hrs |
| Micrococcus luteus | + | Bacteria | Actinobacteria | Actinobacteria | Actinomycetales | Micrococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Enterococcus faecalis | + | Bacteria | Firmicutes | Bacilli | Lactobacillales | Enterococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Enterococcus faecium | + | Bacteria | Firmicutes | Bacilli | Lactobacillales | Enterococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Lactobacillus jensenii | + | Bacteria | Firmicutes | Bacilli | Lactobacillales | Lactobacillaceae | CBA | 37 °C | CO2 | 24 hrs |
| Streptococcus agalactiae | + | Bacteria | Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | CBA | 37 °C | CO2 | 24 hrs |
| Streptococcus pneumoniae | + | Bacteria | Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | CBA | 37 °C | CO2 | 24 hrs |
| Streptococcus pyogenes | + | Bacteria | Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | CBA | 37 °C | CO2 | 24 hrs |
| Staphylococcus aureus | + | Bacteria | Firmicutes | Bacilli | Bacillales | Staphylococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Staphylococcus epidermidis | + | Bacteria | Firmicutes | Bacilli | Bacillales | Staphylococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Staphylococcus haemolyticus | + | Bacteria | Firmicutes | Bacilli | Bacillales | Staphylococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Staphylococcus hominis | + | Bacteria | Firmicutes | Bacilli | Bacillales | Staphylococcaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Clostridium difficile | + | Bacteria | Firmicutes | Clostridia | Clostridiales | Clostridiaceae | CBA | 37 °C | Anaerobic | 48 hrs |
| Enterobacter cloacae | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Escherichia coli | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Klebsiella oxytoca | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Klebsiella pneumoniae | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Morganella morganii | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Proteus mirabilis | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Serratia marcescens | - | Bacteria | Proteobacteria | Gammaproteobacteria | Enterobacteriales | Enterobacteriaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Haemophilus influenzae | - | Bacteria | Proteobacteria | Gammaproteobacteria | Pasteurellales | Pasteurellaceae | Choc | 37 °C | CO2 | 24 hrs |
| Pseudomonas aeruginosa | - | Bacteria | Proteobacteria | Gammaproteobacteria | Pseudomonadales | Pseudomonadaceae | CBA | 37 °C | Aerobic | 24 hrs |
| Stenotrophomonas maltophilia | - | Bacteria | Proteobacteria | Gammaproteobacteria | Xanthomonadales | Xanthomonadaceae | CBA | 37 °C | Aerobic | 24 hrs |

Table S2. Instrument Operational Conditions for Xevo G2-XS Q-ToF Instrument

The operational conditions for the Xevo G2-XS Q-ToF instrument employed in this study are given here.

| Parameter | Setting |
|---|---|
| Scan Time | 1000 ms |
| Scan Mode | Sensitive |
| Mass Analyser | Time of Flight |
| Ionisation Mode | Negative Ion Mode |
| Mass Range | 50 to 2500 m/z |
| Sampling Cone | 80 V |
| Source Offset | 50 V |
| Source Temperature | 100 °C |

Table S3. Individual Species Cross-Validation Scores for All Species Analysis

The individual species-level cross-validation scores for the preliminary all species analysis model are given for handheld bipolar probe REIMS and high-throughput REIMS, both with and without 2-propanol (IPA) infusion. For both REIMS approaches, infusion with 2-propanol improves the average species-level accuracy, although the scores for some individual species fall.

| Species | Handheld Bipolar REIMS, No IPA | Handheld Bipolar REIMS, With IPA | High-Throughput REIMS, No IPA | High-Throughput REIMS, With IPA |
|---|---|---|---|---|
| Average | 84.27% | 90.40% | 88.80% | 89.87% |
| Candida albicans (CALB) | 86.7% | 100.0% | 100.0% | 100.0% |
| Candida parapsilosis (CALP) | 100.0% | 100.0% | 100.0% | 93.3% |
| Corynebacterium amycolatum (CAMY) | 93.3% | 100.0% | 86.7% | 100.0% |
| Micrococcus luteus (MLUT) | 93.3% | 100.0% | 93.3% | 86.7% |
| Enterococcus faecalis (EFAC) | 73.3% | 60.0% | 80.0% | 86.7% |
| Enterococcus faecium (EFAM) | 66.7% | 66.7% | 73.3% | 86.7% |
| Lactobacillus jensenii (LACJ) | 93.3% | 86.7% | 93.3% | 100.0% |
| Streptococcus agalactiae (SAGA) | 80.0% | 93.3% | 73.3% | 93.3% |
| Streptococcus pneumoniae (SPNE) | 86.7% | 100.0% | 73.3% | 93.3% |
| Streptococcus pyogenes (SPYO) | 86.7% | 100.0% | 60.0% | 93.3% |
| Staphylococcus aureus (SAUR) | 93.3% | 100.0% | 100.0% | 93.3% |
| Staphylococcus epidermidis (SEPI) | 60.0% | 80.0% | 86.7% | 73.3% |
| Staphylococcus haemolyticus (SHAM) | 60.0% | 73.3% | 86.7% | 73.3% |
| Staphylococcus hominis (SHOM) | 80.0% | 86.7% | 86.7% | 73.3% |
| Clostridium difficile (CDIF) | 100.0% | 100.0% | 93.3% | 100.0% |
| Enterobacter cloacae (ECLO) | 53.3% | 100.0% | 80.0% | 86.7% |
| Escherichia coli (ECOL) | 73.3% | 46.7% | 86.7% | 60.0% |
| Klebsiella oxytoca (KOXY) | 93.3% | 100.0% | 93.3% | 86.7% |
| Klebsiella pneumoniae (KPNE) | 93.3% | 86.7% | 100.0% | 86.7% |
| Morganella morganii (MMORG) | 80.0% | 86.7% | 93.3% | 93.3% |
| Proteus mirabilis (PMIR) | 93.3% | 100.0% | 100.0% | 93.3% |
| Serratia marcescens (SMAR) | 80.0% | 93.3% | 93.3% | 93.3% |
| Haemophilus influenzae (HINF) | 93.3% | 100.0% | 86.7% | 100.0% |
| Pseudomonas aeruginosa (PAER) | 93.3% | 100.0% | 100.0% | 100.0% |
| Stenotrophomonas maltophilia (SMAL) | 100.0% | 100.0% | 100.0% | 100.0% |
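The average row can be reproduced from the per-species scores. The sketch below is a minimal check for the handheld bipolar probe column without IPA. It assumes each percentage is a count of correctly classified isolates out of 15 (375 isolates across 25 species, an inference from the isolate total quoted for Figure S6 rather than a statement in the table); the counts are back-calculated from the percentages and the variable names are illustrative.

```python
# Correctly classified isolates per species, back-calculated from the
# "Handheld Bipolar REIMS, No IPA" column (for example 86.7% -> 13 of 15).
# ASSUMPTION: 15 isolates per species (375 isolates / 25 species).
ISOLATES_PER_SPECIES = 15

correct_counts = [
    13, 15, 14, 14, 11,  # CALB, CALP, CAMY, MLUT, EFAC
    10, 14, 12, 13, 13,  # EFAM, LACJ, SAGA, SPNE, SPYO
    14, 9, 9, 12, 15,    # SAUR, SEPI, SHAM, SHOM, CDIF
    8, 11, 14, 14, 12,   # ECLO, ECOL, KOXY, KPNE, MMORG
    14, 12, 14, 14, 15,  # PMIR, SMAR, HINF, PAER, SMAL
]


def percent(correct, total):
    """Return the share of correct classifications as a percentage."""
    return 100.0 * correct / total


# Per-species scores, rounded to one decimal place as in the table.
per_species = [round(percent(c, ISOLATES_PER_SPECIES), 1) for c in correct_counts]

# Average over all isolates in the column.
total_isolates = ISOLATES_PER_SPECIES * len(correct_counts)
average = percent(sum(correct_counts), total_isolates)

print(per_species[:3])
print(f"{average:.2f}%")
```

Under that assumption, 316 of 375 isolates are classified correctly, which gives the 84.27% shown in the average row.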

Figure S1. REIMS Interface with T-Piece Set-Up

To allow infusion with leu-enkephalin-containing 2-propanol (IPA), the analyte vapour is mixed with the matrix solvent prior to entry into the REIMS interface. This is accomplished through a T-piece set-up whereby the analyte vapour inlet (a) is positioned between the IPA inlet (b) and the REIMS interface (c), allowing the two streams to mix prior to entry.

Figure S2. Groupings used in Taxonomic Groupings Models Approach

For each level of the taxonomic hierarchy present within the 25 microbial species analysed in this study, a PCA/LDA model was created to measure the cross-validation accuracy within each taxonomic grouping. A total of 13 taxonomic grouping models were constructed for each REIMS approach, with each grouping indicated by a separate colour.

[Figure S2: taxonomic tree of the 25 species with columns Domain, Phylum, Class, Order, Family, and Species; the lineages are those listed in Table S1.]
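One way to read the count of 13 is as the number of branching points in the taxonomy, that is, nodes with at least two child taxa that a model has to separate. The sketch below is an assumed interpretation, not the authors' stated procedure: it builds the tree from the lineages in Table S1, with genus added as the level above species, and counts such nodes. `LINEAGES` and `count_branch_points` are illustrative names.

```python
from collections import defaultdict

# Shared lineage prefixes (Domain;Phylum;Class;...), taken from Table S1.
FUNGI = "Fungi;Ascomycota;Saccharomycetes;Saccharomycetales;Debaryomycetaceae;Candida"
ACTINO = "Bacteria;Actinobacteria;Actinobacteria;Actinomycetales"
LACTO = "Bacteria;Firmicutes;Bacilli;Lactobacillales"
STAPH = "Bacteria;Firmicutes;Bacilli;Bacillales;Staphylococcaceae;Staphylococcus"
GAMMA = "Bacteria;Proteobacteria;Gammaproteobacteria"
ENTERO = GAMMA + ";Enterobacteriales;Enterobacteriaceae"

# One entry per species: Domain;Phylum;Class;Order;Family;Genus;Species.
LINEAGES = [
    FUNGI + ";albicans",
    FUNGI + ";parapsilosis",
    ACTINO + ";Corynebacteriaceae;Corynebacterium;amycolatum",
    ACTINO + ";Micrococcaceae;Micrococcus;luteus",
    LACTO + ";Enterococcaceae;Enterococcus;faecalis",
    LACTO + ";Enterococcaceae;Enterococcus;faecium",
    LACTO + ";Lactobacillaceae;Lactobacillus;jensenii",
    LACTO + ";Streptococcaceae;Streptococcus;agalactiae",
    LACTO + ";Streptococcaceae;Streptococcus;pneumoniae",
    LACTO + ";Streptococcaceae;Streptococcus;pyogenes",
    STAPH + ";aureus",
    STAPH + ";epidermidis",
    STAPH + ";haemolyticus",
    STAPH + ";hominis",
    "Bacteria;Firmicutes;Clostridia;Clostridiales;Clostridiaceae;Clostridium;difficile",
    ENTERO + ";Enterobacter;cloacae",
    ENTERO + ";Escherichia;coli",
    ENTERO + ";Klebsiella;oxytoca",
    ENTERO + ";Klebsiella;pneumoniae",
    ENTERO + ";Morganella;morganii",
    ENTERO + ";Proteus;mirabilis",
    ENTERO + ";Serratia;marcescens",
    GAMMA + ";Pasteurellales;Pasteurellaceae;Haemophilus;influenzae",
    GAMMA + ";Pseudomonadales;Pseudomonadaceae;Pseudomonas;aeruginosa",
    GAMMA + ";Xanthomonadales;Xanthomonadaceae;Stenotrophomonas;maltophilia",
]


def count_branch_points(lineages):
    """Count tree nodes (including the root) that have two or more children."""
    children = defaultdict(set)
    for lineage in lineages:
        ranks = tuple(lineage.split(";"))
        for depth, taxon in enumerate(ranks):
            # Key each node by its full path so that identical names at
            # different ranks (e.g. Actinobacteria) stay distinct.
            children[ranks[:depth]].add(taxon)
    return sum(1 for kids in children.values() if len(kids) >= 2)


print(len(LINEAGES), count_branch_points(LINEAGES))
```

With these lineages the count is 13 branching points for 25 species, which agrees with the number of grouping models quoted in the caption.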

Figure S3. Gram-Stain, Genus, and Species-Level Accuracy for all REIMS Approaches

The Gram-stain, genus, and species-level cross-validation accuracy from an all species PCA/LDA model is shown for handheld bipolar probe REIMS and high-throughput REIMS, both with and without infusion with leu-enkephalin-containing 2-propanol (IPA). Gram-stain level accuracy is comparable between all four REIMS approaches, but substantial differences are evident at the genus and species levels of classification.

[Figure S3: bar chart of cross-validation score (% accuracy) at the Gram stain, genus, and species levels for handheld bipolar probe REIMS and high-throughput REIMS, each with and without IPA.]
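Genus and Gram-stain accuracy can be derived from a species-level model by mapping each predicted species to its coarser label before scoring. The sketch below illustrates that roll-up on four invented predictions. The predictions, the `GRAM` lookup subset, and the function names are assumptions for illustration; the source does not spell out how the coarser scores were computed.

```python
# Gram stain for a few species, taken from Table S1.
GRAM = {
    "Staphylococcus aureus": "+",
    "Staphylococcus epidermidis": "+",
    "Escherichia coli": "-",
    "Klebsiella pneumoniae": "-",
    "Enterobacter cloacae": "-",
}


def genus(species):
    """Return the genus, i.e. the first word of the binomial name."""
    return species.split()[0]


def accuracy(truth, predicted, label=lambda s: s):
    """Share of samples whose true and predicted labels agree under `label`."""
    hits = sum(label(t) == label(p) for t, p in zip(truth, predicted))
    return hits / len(truth)


# Hypothetical cross-validation output (not data from the study).
truth = [
    "Staphylococcus aureus",
    "Staphylococcus epidermidis",
    "Escherichia coli",
    "Klebsiella pneumoniae",
]
predicted = [
    "Staphylococcus aureus",
    "Staphylococcus aureus",   # wrong species, right genus
    "Escherichia coli",
    "Enterobacter cloacae",    # wrong genus, right Gram stain
]

species_acc = accuracy(truth, predicted)
genus_acc = accuracy(truth, predicted, genus)
gram_acc = accuracy(truth, predicted, GRAM.get)
print(species_acc, genus_acc, gram_acc)
```

A misclassification within the same genus or the same Gram-stain group still counts as correct at the coarser level, so accuracy can only stay the same or rise from species to genus to Gram stain.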

Figure S4. Handheld Bipolar Probe REIMS Taxonomic Groupings Probability Matrix

The taxonomic groupings probability matrix for handheld bipolar probe REIMS shows that high levels of accuracy are achieved at the domain, phylum, class, order, and family levels of classification, but that slightly reduced levels of accuracy are achieved at the genus and species levels. Each figure displays the percentage of correct classifications for each class within each constructed model. The overall score is calculated by multiplying the model accuracies for the specific taxonomic groupings of each microbial species.

[Figure S4: probability matrix with columns Domain, Phylum, Class, Order, Family, Genus, Species, and Species Overall Score. The overall score for each species is listed below.]

| Species | Overall Score |
|---|---|
| Candida albicans | 100% |
| Candida parapsilosis | 100% |
| Corynebacterium amycolatum | 100% |
| Micrococcus luteus | 100% |
| Enterococcus faecalis | 91% |
| Enterococcus faecium | 91% |
| Lactobacillus jensenii | 98% |
| Streptococcus agalactiae | 91% |
| Streptococcus pneumoniae | 98% |
| Streptococcus pyogenes | 98% |
| Staphylococcus aureus | 98% |
| Staphylococcus epidermidis | 91% |
| Staphylococcus haemolyticus | 78% |
| Staphylococcus hominis | 91% |
| Clostridium difficile | 99% |
| Enterobacter cloacae | 99% |
| Escherichia coli | 92% |
| Klebsiella oxytoca | 99% |
| Klebsiella pneumoniae | 99% |
| Morganella morganii | 99% |
| Proteus mirabilis | 92% |
| Serratia marcescens | 99% |
| Haemophilus influenzae | 100% |
| Pseudomonas aeruginosa | 100% |
| Stenotrophomonas maltophilia | 100% |
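The overall score described in the caption is a product of the per-model accuracies along a species' lineage. A minimal sketch follows, using illustrative accuracies that are not attributed to any particular level of the figure; `overall_score` is a hypothetical helper name.

```python
import math


def overall_score(model_accuracies):
    """Multiply the accuracies of every grouping model on a species' path."""
    return math.prod(model_accuracies)


# Illustrative path: most models separate their groupings perfectly,
# two reach 0.99, and the final species-level model reaches 0.93.
path = [1.00, 0.99, 0.99, 1.00, 1.00, 0.93]
score = overall_score(path)
print(f"{score:.0%}")
```

Because every factor is at most 1, a single weak model anywhere on the path caps the overall score for that species.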

Figure S5. High-Throughput REIMS Taxonomic Groupings Probability Matrix

The taxonomic groupings probability matrix for high-throughput REIMS shows that high levels of accuracy are achieved at the domain, phylum, class, order, and family levels of classification, but that slightly reduced levels of accuracy are achieved at the genus and species levels. Each figure displays the percentage of correct classifications for each class within each constructed model. The overall score is calculated by multiplying the model accuracies for the specific taxonomic groupings of each microbial species.

[Figure S5: probability matrix with columns Domain, Phylum, Class, Order, Family, Genus, Species, and Species Overall Score. The overall score for each species is listed below.]

| Species | Overall Score |
|---|---|
| Candida albicans | 100% |
| Candida parapsilosis | 100% |
| Corynebacterium amycolatum | 97% |
| Micrococcus luteus | 97% |
| Enterococcus faecalis | 78% |
| Enterococcus faecium | 84% |
| Lactobacillus jensenii | 93% |
| Streptococcus agalactiae | 86% |
| Streptococcus pneumoniae | 93% |
| Streptococcus pyogenes | 93% |
| Staphylococcus aureus | 93% |
| Staphylococcus epidermidis | 100% |
| Staphylococcus haemolyticus | 80% |
| Staphylococcus hominis | 100% |
| Clostridium difficile | 100% |
| Enterobacter cloacae | 92% |
| Escherichia coli | 79% |
| Klebsiella oxytoca | 99% |
| Klebsiella pneumoniae | 99% |
| Morganella morganii | 99% |
| Proteus mirabilis | 99% |
| Serratia marcescens | 92% |
| Haemophilus influenzae | 99% |
| Pseudomonas aeruginosa | 99% |
| Stenotrophomonas maltophilia | 99% |

Figure S6. Comparison of Mass Spectra from Combined or Individual Analytical Repeats

To measure the intra-sample variability of each REIMS approach, an all species PCA/LDA model was created using mass spectra from either combined or individual analytical repeats. Five analytical repeats were taken from each culture plate of the 375 isolates analysed.

[Figure S6: combined vs. individual analytical repeats.]

Figure S7. Intra-Sample Variation Measured through All Species Models

All species analysis PCA/LDA models were constructed to compare the species-level cross-validation accuracy of mass spectra combined from the five analytical repeats with that of mass spectra from each individual analytical repeat. Models were built using both the 50 to 2500 m/z range and the 600 to 900 m/z range, with mass binning to 0.1. Minimal differences in species-level accuracy were observed between combined and individual models in either mass range for both REIMS approaches, suggesting that minimal intra-sample and technical variation exists.

[Figure S7: bar chart of cross-validation score (% accuracy) for four models (50 to 2500 m/z individual, 50 to 2500 m/z combined, 600 to 900 m/z individual, 600 to 900 m/z combined) for handheld bipolar probe REIMS and high-throughput REIMS.]
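Binning to 0.1 means that each peak is assigned to a fixed-width m/z interval before modelling. The sketch below shows one plausible way to do this with summed intensities. The summation rule, the half-open bin edges, and the function name `bin_spectrum` are assumptions, since the caption gives only the bin width and the two mass ranges.

```python
def bin_spectrum(peaks, mz_min, mz_max, width=0.1):
    """Sum peak intensities into fixed-width m/z bins over [mz_min, mz_max)."""
    n_bins = round((mz_max - mz_min) / width)
    binned = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            # The clamp guards against floating-point rounding at the top edge.
            index = min(int((mz - mz_min) / width), n_bins - 1)
            binned[index] += intensity
    return binned


# Hypothetical peak list of (m/z, intensity) pairs, not data from the study.
peaks = [(600.05, 10.0), (600.07, 5.0), (600.15, 2.0), (899.95, 1.0), (950.0, 7.0)]

narrow = bin_spectrum(peaks, 600, 900)   # 600 to 900 m/z range
full = bin_spectrum(peaks, 50, 2500)     # 50 to 2500 m/z range
print(len(narrow), len(full))
print(narrow[0], narrow[1], narrow[-1])
```

At a width of 0.1 the narrow range yields 3000 bins and the full range 24500, so restricting the range cuts the number of model variables roughly eightfold; peaks outside the chosen range, such as the one at m/z 950 here, are simply dropped from the narrow model.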
