Supplementary information for:

A -binding protein produced by Haemophilus haemolyticus inhibits non- typeable Haemophilus influenzae

Roger D. Latham 1, Mario Torrado 2, Brianna Atto 3, James L. Walshe 2, Richard Wilson 4, J. Mitchell Guss 2, Joel P. Mackay 2, Stephen Tristram 3, David A. Gell 1

Materials and Methods

Bacterial collection and culture The origin and method of identification of the bacterial strains has been described previously (1). For revival, subculturing, and enumeration of Haemophilus spp., chocolate agar (CA) was used and incubated for 18–24 h at 35°C in an atmosphere of 5–10% CO2. Isolates were stored at –80°C in 10% w/v sterile skim milk media (SMM).

Preparation of hemin solutions Solutions (1–5 mg mL–1) of ferriprotoporphyrin IX were prepared fresh by dissolution in 0.1 M sodium of either bovine hemin (ferriprotporphyrin IX chloride, Frontier Scientific), or porcine hematin (ferriprotoporphyrin IX hydroxide, Sigma-Aldrich) solid as specified in sections of Materials and Methods; both are referred to as hemin throughout the main text for describing the addition of the compounds to media or buffers (working concentration ~ 1–7 µg mL– 1). When generalising about protoporphyrin IX as a ligand in hemophilin, or in a biological context where oxidation state is not necessarily known, we use the term ‘haem’ for simplicity.

Agar well diffusion assay The agar well diffusion assay was performed as described previously (1), with the following clarifications. Solid media consisted of 18.5 g/L brain heart infusion (Oxoid) solidified with 7.5 g/L Bacteriological Agar (Oxoid), autoclaved at 121°C for 30 min, cooled to 50°C, then supplemented with 1% (v/v) resuspended Vitox® (Oxoid), along with 7.5 mg/L each of NAD and hematin (Oxoid); these media components are at half the normal concentration typically used for the culture of Haemophilus influenzae. 10 mL of agar was dispensed to a 90-mm Petri dish. Indicator strains (NTHi strain NCTC 4560 or NCTC 11315) were prepared by growing for 6–12 h on CA then suspended in Dulbecco’s phosphate buffered saline (DPBS, Gibco) to an absorbance of 1.0, diluting 10-fold with SMM and stored as 100-!L aliquots at –80°C. Thawed aliquots were diluted 100-fold in DPBS and mixed with 5 mL of molten overlay media at a dose predetermined to produce a dense lawn of colonies and immediately poured onto a petri dish of base media. 5-mm diameter circular holes were cut in the agar using a sterile stainless steel cork borer to accept 10–25 !L of test solutions. Plates were left open in a biological safety cabinet for one hour until wells were free of liquid then incubated for 18-24 h at 35°C in a humidified atmosphere containing 5% CO2 and the annular radius of cleared zones was recorded. Clearing zone size was affected by media age (older media giving larger the zone sizes), such that treatments were only compared within the same plate.

Production of native hemophilin Media for hemophilin production was cold filterable tryptone soya broth (TSB; Oxoid) made to the manufacturer specifications and sterilised by filtration through membrane with a pore size of 0.2 µm, then supplemented with HTM supplement (Oxoid) and Vitox® (Oxoid) according to manufacturers specification. Cultures of BW1 or RHH122 were grown on CA for 12-16 hours, suspended in pre-warmed (37°C) broth in a baffled shakeflask to an absorbance of 0.05 (OD600), then incubated with 200 RPM agitation at 37°C for 24 hours. Culture broths were clarified by centrifugation at 7000 " g for 30 minutes. Hemophilin activity was enriched by ammonium sulfate precipitation at 4°C, with the 50–70% saturation cut collected and redissolved in a volume of PBS equal to 1/20th of the initial culture broth volume then dialysed with 3500-Da molecular weight cut off (MWCO) dialysis membrane (Thermofisher) for 24 hours at 4°C against 50 mM Tris-HCl, pH 7.5. Following concentration by ultrafiltration with a 10-kDa MWCO centrifugal filter unit (Merck- Millipore) the samples were separated over a Superose 12 HR 10/300 GL column (GE Healthcare) with a bed volume of 24 mL, in 0.15 M sodium phosphate buffer, pH 7.0. Fractions with peak acitvity were applied to a C4 reversed-phase HPLC column (Symmetry300; Waters Corporation) that was developed with a linear gradient of 5–95% CH3CN, 95–5% H2O, 0.1% trifluoroacetic acid. Fractions were collected manually and lyophilised, then resuspended in aqueous buffer of choice.

RP-HPLC and mass spectrometry RP-HPLC fractions from the H. haemolyticus isolates BW1 or RHH122 that had peak hemophilin activity, or time/volume matched samples from control isolates, were subject to trypsin digestion in batch, or were processed further by Tris-tricine SDS-PAGE, silver staining and in-gel trypsin digestion of individual stained bands, as specified in Results. Silver stain was removed using 30 mM potassium ferricyanide and 100 mM sodium thiosulfate (1:1 mix) and gel pieces were dried by vacuum centrifugation. In-gel trypsin digestion was performed using proteomic grade trypsin (Sigma) as previously described (2). For trypsin digest of HPLC fractions, samples were precipitated in 9 volumes of ethanol, then reduced, alkylated and digested in 100 mM ammonium bicarbonate buffer as previously described (2). Peptide samples were reconstituted in 20 !L HPLC

Buffer (2% CH3CN, 0.05% trifluoroacetic acid) and analyzed by nanoLC-MS/MS using an Ultimate 3000 RSLCnano HPLC and LTQ-Orbitrap XL fitted with nanospray Flex ion source -1 (ThermoFisher Scientific). Tryptic peptides were loaded at 0.05 mL min onto a C18 20 mm ! 75 mm PepMap 100 trapping column then separated on an analytical C18 150 mm ! 75 mm PepMap 100 column nano-column Peptides were eluted in a gradient from 98% buffer A (0.01% formic acid in water) to 40% buffer B (0.08% formic acid in 80% CH3CN and 20% water) followed by washing in 99% buffer B (2 mins) and reequilibration in 98% buffer A for 15 mins. The LTQ-Orbitrap XL was controlled using Xcalibur 2.1 software (ThermoFisher Scientific) and operated in data- dependent acquisition mode where survey scans were acquired in the Orbitrap using a resolving power of 60,000 (at 400 m/z). MS/MS spectra were concurrently acquired in the LTQ mass analyzer on the eight most intense ions from the FT survey scan. Charge state filtering, where unassigned and singly-charged precursor ions were not selected for fragmentation, and dynamic exclusion (repeat count 1, repeat duration 30 sec, exclusion list size 500) were used. Fragmentation conditions in the LTQ were: 35% normalized collision energy, activation q of 0.25, 30-ms activation time and minimum ion selection intensity of 500 counts. The mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE (3) partner repository with the dataset identifier PXD013687. Raw MS/MS spectra were searched against a H. haemolyticus database downloaded from NCBI on Sept 19th 2016 (37, 881 entries) using the Andromeda search engine in the MaxQuant software (version 1.5.1.2). The settings for protein identification by Orbitrap MS/MS included carbamidomethyl modification of cysteine and variable methionine oxidation, two missed trypsin cleavage allowed, mass error tolerances of 20 ppm then 4.5 ppm for initial and main peptide searches, respectively, 0.5 Da tolerance for fragment ions. A false discovery rate of 0.01 was used for both peptide-spectrum matches and protein identification. No peptides corresponding to the first 22 amino acids of hemophilin/EGT80255 were identified in gel slices or RP-HPLC fractions. A non-tryptic cleavage site between residues 22 and 23 coincided with a signal peptide cleavage site predicted by SIGNALP 4.1 (4). Information from the MaxQuant output is compiled in Supplementary Table S5.

Recombinant hemophilin Genomic DNA was extracted from H. haemolyticus strain BW1. Primers NIS-F1 (5'- ATTACATATGCAGGTAGTGGGAAATGTATCA-3') and NIS-R1 (5'- TTATCTCGAGTTAATTTTTAGTACCGCCAAA-3') were used to amplify a DNA fragment corresponding to the hemophilin ORF residues 23–272. which coincided with a predicted signal peptide cleavage site (SIGNALP 4.1)(4), The hemophilin(23–272) fragment was cloned into the NdeI/XhoI sites of pET28a to express hemophilin with an N-terminal hexahistidine tag. A N- terminal truncated version, hemophilin(55–272), was similarly constructed by PCR cloning with primers NIS-F2 5'-ATTACATATGGACAGTAGTATTCCTAATGAT-3' and NIS-R1. H89Q and H119Q mutants were generated by standard overlap PCR using the NIS-F1 and NIS-R1 primers together with NIS-89Q-F 5'-TGGATTTCACAGCTTACAGG-3'; NIS-89Q-R 5'- CCTGTAAGCTGTGAAATCC-3'; NIS-119Q-F 5'-GCCAGATCAGCGTGGCTTAGG-3'; NIS- 119Q-R 5'-CCTAAGCCACGCTGATCTGGC-3'. Sequenced clones were transformed into E. coli strain Rossetta-2 (Novagen), grown in lysogeny broth (LB-Miller) containing 34 µg mL–1 –1 chloramphenicol and 25 µg mL kanamycin; expression was induced with 1 mM isopropyl #-D-1- thiogalactopyranoside for 3 hours with shaking at 37°C. For hemophilin produced for crystallography, hemin chloride was added to 5 µM final concentration one hour after induction. Bacterial cell pellets were collected by centrifugation, resuspended in lysis buffer (0.3 M NaCl, 0.05 M sodium phosphate, 0.02 M imidazole, 100 µM phenylmethylsulfonyl fluoride, pH 7.2), lysed by sonication, and clarified by centrifugation. Ni-affinity resin (Gold Biotechnology or Invitrogen) chromatography was performed by gravity at 4°C. The Ni-affinity column was developed with step-wise isocratic gradients increased from 0.02 to 0.25 M imidazole. Peak hemophilin fractions were dialysed against buffer at 4°C (0.3 M NaCl, 0.025 M Tris·HCl, 0.02 M imidazole, pH 8.0 at 21°C). The His-tag was cleaved by digestion with thrombin (Sigma-Aldrich) in the presence of 2 mM CaCl2 at 37°C for 2 hours; the His-tag was then removed by passing the sample over a second Ni-affinity column. Samples were dialysed at 4°C against 0.02 M sodium phosphate, pH 7, for loading onto cation exchange. Apo (colourless) and holo (orange-red) protein fractions of hemophilin were obtained by cation exchange (UnoS, BioRad), developed with a linear gradient: 0–0.2 M NaCl, 0.02–0.05 M sodium phosphate, pH 7. It was noted that repeated chromatographic separations of the holo protein by Ni-affinity, cation exchange, or SEC (Superose 12 HR 10/300 GL column; GE Healthcare) did not lead to detectible loss of heme, leading us to conclude that apo and holo pools of hemophilin protein were present in E. coli during expression and following cell lysis. Heme could be removed from holo hemophilin by acid acetone extraction using the method of Ascoli et al. (5) or by RP-HPLC (Fig. 2B), to yield apo hemophilin. Apo hemophilin produced in this manner had specific activity in the agar well diffusion assay of NTHi growth inhibition and heme binding properties indistinguishable from the properties of apo protein derived from E. coli lysates. The concentration of apo hemophilin was determined by 3 –1 –1 spectrophotometry using an extinction coefficient, "280 = 25.9 ! 10 M cm at 280 nm wavelength, calculated from amino acid composition. The concentration of the holo protein was determined 3 –1 –1 using an extinction coefficient, "280 = 38.6 ± 3.6 ! 10 M cm at 280 nm determined from area under the curve analysis of RP-HPLC chromatograms of apo and holo hemophilin performed in triplicate. The ~10% error in concentration measurements for the holo protein were below the detection limit in SDS-PAGE comparison of apo and holo samples (e.g., Fig. 2D).

BW1 hemophilin Knockout To confirm the role of hemophilin in generating the anti-NTHi activity of strain BW1, an NIS knockout was constructed using insertional inactivation. A partially assembled WGS of H. haemolyticus strain 11P18 (Seq ID: LCTK01000015, contig 00016) was used to acquire sequence flanking the hemophilin ORF for PCR primer design. These PCR primers, NIS-KO-F (gctagacgtgctgatgtt) and NIS-KO-R (tgttgttcttgtcgttgttg) were then used to generate a 1691 bp fragment using genomic DNA from strain BW1 as template. This 1691 bp fragment containing the hemophilin ORF (bp 700-1518) and a unique BspTI site was cloned in to pGEM-T (Promega) according to the manufacturer’s instructions. An 1132 bp kanamycin resistance cassette, generated by PCR with BspTI tagged primers Kana-Bsp-F (5'-GCGCCTTAAGTAAACCTGAACCAAA-3') and Kana-Bsp-R (5'-GCGCCTTAAGGTCGTCAGTCATAAA-3') using pLS88 (Genbank L23118) as template, was then sub-cloned into the BspTI site using standard methods. The hemophilin ORF containing the kanamycin cassette was then PCR amplified using NIS-KO-F and NIS-KO-R primers and transformed into strain BW1 using the MIV method (6). Transformants were selected on CA supplemented with 50 mg L–1 kanamycin and confirmed by sequencing of the hemophilin region. Sequence verified transformants were tested for the NTHi inhibitory phenotype using an ammonium sulfate extract of a broth culture in a well diffusion assay as previously described (1).

UV-visible spectroscopy UV-visible spectra were recorded on a Jasco V-630 spectrophotometer fitted with a temperature- controlled sample holder (Jasco) and a septum sealed spectrosil quartz cuvette with a path length of 1.0 cm (Starna, Baulkham Hills, Australia). Samples were prepared at final protein concentration 5– 7 !M in 0.1 M sodium phosphate, pH 7.0, unless otherwise stated. Solutions of hemin were prepared fresh by dissolving hemin chloride (Frontier Scientific) to a concentration of ~1 mM in 0.1 M NaOH; for chloride/fluoride binding experiments, hematin was used in place of hemin chloride. Solutions were filtered through a PVDF membrane with a pore size of 0.45 µm (Millipore) and concentrations were determined spectrophotometrically using an extinction coefficient "280 = 58.4 ! 103 M–1 cm–1 in 0.1 M NaOH (7). Ferrous heme was prepared by dilution of hemin into nitrogen– purged buffer with addition of slight molar excess of sodium dithionite; the formation of ferrous heme was monitored spectrophotometrically. All liquid transfer steps were performed under positive N2 pressure using a gas-tight syringe (SGE Analytical Science). O2 and CO adducts of holo – – – – hemophilin were formed by bubbleing O2/CO gas. Ferric ligands, CN , HS , Cl , F were supplied as solutions of KCN, Na2S, NaCl and NaF, respectively.

CD spectropolarimetry For CD, protein samples were exchanged to 25 mM sodium phosphate, 125 mM NaF, pH 7.0 by SEC over a 24-mL Superose 12 column. Samples were prepared at protein concentration 2.8–3.2 !M in a spectrosil quartz cuvette with a path-length of 0.1 cm (Starna). CD spectra were recorded on a Jasco model J-720 spectropolarimeter with a 450-W water-cooled xenon lamp over the spectral range 260–190 nm; the high tension voltage remained < 450 V. Data acquisition and processing was performed in Spectra Manager Version 1.54.03 with the following acquisition parameters: number of accumulations, 5; bandwidth, 1 nm; response rate, 0.5 s; scan speed 50 nm min–1; data pitch 0.5 nm; baseline corrections were performed by subtracting the spectra obtained from matched buffer samples. Prior to sample measurements, the CD amplitude was checked, and if necessary calibrated, using a freshly made solution 1S-(+)-10-camphorsulphonic acid (CSA).

Crystallography and structure analysis For crystallization, hemophilin was concentrated to 45 mg mL–1 in a buffer comprising 30 mM sodium phosphate, 70 mM NaCl, pH 7.3. Hexagonal crystals with distinct orange-red colour were grown by hanging drop method at 18°C in 0.1 M sodium acetate trihydrate pH 4.5, 2 M ammonium sulfate, by addition of 1 µL of protein solution to 1 µL of mother liquor. Crystals were cryoprotected in 25% glycerol before being flash cooled in liquid nitrogen. Diffraction data sets were collected at the Macromolecular Crystallography MX2 beamline, Australian Synchrotron (Clayton, Australia)(8) at a temperature of 100 K using x-ray wavelengths of 1.45866 Å to a resolution of 2.1 Å and 0.95372 Å to a resolution of 1.6 Å. Crystal diffraction images were processed in XDS (9), and data were indexed, scaled and merged using the program AIMLESS in the

CCP4 package (10). Diffraction data collected at 1.45866 Å (8.49986 keV) were phased by single- wavelength anomalous dispersion (11) from the heme iron (above the K-edge, which is 7.1120

KeV) using the PHASER SAD pipeline in CCP4, with solvent density modification implemented using PARROT (12, 13). Automated model building was performed using the BUCCANEER pipeline

(14) in CCP4, followed by iterative rounds of manual building in COOT (15) and refinement with

REFMAC5 (16). Peaks in the anomalous maps were identified at the following atom positions, in order of intensity (peak height/r.m.s): heme Fe (138 #), Met171 S (23.4 #), Met236 S (23.4 #), Fe- coordinating Cl (19.8 #), Met61 S (18.4 #), Met98 S (16.7 #), Met4 S (10.7 #), Met91 S (10.7 #),

SO4 (8.8 #), SO4 (8.7 #), Met88 S (9.3 #), SO4 (8.8 #), Cl (7.6 #). The anomalous scattering coefficient, f", for Fe, Cl, S at 8.5 KeV is 2.9, 0.63 and 0.50, respectively. Initial models built with the low-resolution data and reflections for Rfree calculation (5% of data) were transferred as input for refinement against the 1.6-Å data in REFMAC5. Error estimates for bond length measurements around the heme iron were calculated from Cruickshanks diffraction-data precision indicator (17) multiplied by 21/2 to convert from average coordinate error to bond length error. The volume of the heme-binding cavity was calculated by CASTP (18) and solvent accessible areas were determined using NACCESS (19), based on a 1.4 Å probe radius.

Propagation of H. haemolyticus in growth-limiting hemin conditions Isolates were propagated from SMM stock, followed by two overnight passages on CA at 35°C with 5-10% CO2 prior to experimental manipulation. Exposure to non-growth conditions was minimised by maintaining suspensions and diluents at 37°C in heat block with sand or benchtop incubator. A suspension (~1.0 OD600) of BW1 and BW1-KNOCKOUT was made in TSB from 8-10 h growth on chocolate agar. This suspension was diluted 1:10 in 5mL pre-warmed TSB supplemented (sTSB) with 2% (v/v) Vitox® (Oxoid), and either 15 or 0 µg mL–1 porcine hematin (Sigma-Aldrich) to generate heme- “replete’ and “starved” populations, respectively. Broths were incubated for 14 h at 37°C aerobically without shaking. Resultant starved and replete suspensions were centrifuged at 3000g for 10 min at 37°C and resuspended in fresh, pre-warmed TSB to an

OD600 of 0.5. A 1:10 dilution was made in pre-warmed TSB containing vitox and either 0, 0.94 or 15 µg mL–1 porcine hematin and incubated in a benchtop incubater at 37°C and 220 RPM. Colony counts were performed on all suspensions to confirm initial viability. The OD600 was measured by aliquoting 100 µL of growth into wells of a 96-well plate (Grenier Bio-One) and measured in a plate reader (Infinate 200 PRO, Tecan Life Sciences).

Detection and sequencing of hemophilin in clinical isolates Genbank sequences from H. haemolyticus strains M19107 (AFQN01000044.1) and M28486 (CP031238: region 316000-318200) were aligned and used to design RT-PCR primers to detect hemophilin in genomic DNA from our collection of 100 H. haemolyticus isolates using a method as previously described (20). Primers NIS-F 5'-GGCGTTGAGATATATGACAGTAG-3' and NIS-R 5’-TGTAAGGTGTGAAATCCATTTATCG-3' were used to screen for hemophilin and generated a 126 bp amplicon from position 148 to 273 of the ORF. Primers SEQF 5'- AATCCAGTATTAGTTGTTGATGC-3' and SEQR 5'-CTTGGTTGTTTATTGTTAATGTAG-3' were used for amplifying and sequencing hemophilin and generated an amplicon of 1056 bp that included regions 237 bp upstream and 223 bp downstream of the ORF.

Supplementary tables

Supplementary Table S1. MaxQuant analysis of HPLC fractions from H. haemolyticus isolates BW1, RHH122, BW39 and BWOCT3

This table is supplied is a separate file: Supplementary_Table_S1.xlsx

Supplementary Table S2. X-ray data collection and refinement statistics.

Hemophilin Data collection X-ray source Australian Synchrotron MX2 beamline Detector ADSC Q315r Temperature (K) 100 Wavelength (Å) 0.95372 Space group P3221 a, b, c (Å) 91.043, 91.043, 97.484 $, %, & (°) 90, 90, 120 Resolution range (Å)† 45.52–1.60 (1.63–1.60) Total observed reflections† 1241513 (64897) Unique reflections† 62031 (3086) †,* Rmerge (%) 7.1 (79.8) †,* Rmeas (%) 7.3 (81.8) †,* Rp.i.m (%) 1.6 (17.8) † 23.2 (4.0) † CC1/2 1.000 (0.909) Completeness (%)† 100.0 (100.0) Multiplicity† 20.0 (21.0) Overall B factor from Wilson plot (Å2)† 21.9 Refinement Resolution range (Å)‡ 41.49–1.60 Reflections in working set‡ 58934 Reflections in test set‡ 3056 ‡ Rwork (%) 17.12 ‡ Rfree (%) 18.45 Cruickshank DPI‡ 0.059 No. non-hydrogen atoms Protein 1985 Ligand/ion 84 Water 170 Average B-factors per atom (Å2)§ Protein 27.24 Heme 29.85 Heme Cl ligand 29.98 Other ligand/ion 38.84 Water 36.36 R.m.s. deviations‡ Bond lengths (Å) 0.008 Bond angles (°) 1.45 Ramachandran plot¶ Favoured (%) 99.6 Allowed (%) 0.40 MolProbity score¶ 1.05, 100th percentile PDB code 6om5

Values in parentheses are for the outer shell † Data from AIMLESS †,* + – Reported by AIMLESS, relative to the mean of all I and I ‡ Data from REFMAC5 § Data from BAVERAGE in CCP4 ¶ Data from MOLPROBITY Supplementary Table S3. A comparison of Fe(III)–Cl bond lengths

The table shows heme proteins with chloride ligands from the PDB have Fe–Cl distances in the range 2.34–2.69 Å (excluding pdb 4jet in which the Cl atom in the heme pocket is supposed weakly coordinated, or not coordinated). Fe–Cl bond lengths in small molecule crystals are slightly shorter, 2.22–2.30 Å and close to the theoretical ideal.

Complex Method a Fe–Cl (Å) Ref Hemophilin xrd 2.47 This work Rj DypB D153A (pdb 3vec) xrd 2.34 (21) Rj DypB D153H (pdb 3ved) xrd 2.69 (21) Pa HbV (pdb 3arj) xrd 2.40 (22) Pa HbV (pdb 3ark) xrd 2.43 (22) Pa HbV (pdb 3arl) xrd 2.45 (22) Yp HasA (pdb 4jet) xrd 3.01 b (23) [Fe(PPIX)Cl] c xrd 2.218 (24) [Fe(TpivPP)Cl]– d xrd 2.301 (25) 2+ [FeCl(OH2)5] exafs 2.26 (26) Fe–O (Å) metMb Fe(III)(OH2) pH 9.0 xrd 2.11 (27) metHb Fe(III)(OH2) pH 7.1 xrd 2.15 (28) metMb Fe(III)(OH2) pH 6.8 xrd 2.05 (29) a xrd = single crystal x-ray diffraction, exafs = extended x-ray absorption fine structure b The authors propose the Cl is weakly coordinated, or not coordinated, to Fe in this structure c Hemein choride, PPIX = protoporphyrinIX d TpivPP = tetrakis($,$,$,$-o-pivalamido)phenyl

Supplementary Table S4. Distribution of heme uptake genes in Haemophilus spp

This table is supplied is a separate file: Supplementary_Table_S4.xlsx

Only proteins involved in primary capture of heme from the environment in H. haemolyticus or H. influenzae/NTHi are show in the table; these being either integral outer membrane proteins or secreted proteins. The table contains accession numbers of heme uptake proteins from all 48 H. haemolyticus strains for which sequence data was available (as of December 2018; cells highlighted yellow); 18 H. influenzae strains encoding a hemophilin-like gene (highlighted green), from a total of 700 H. influenzae genomes; and 88 invasive H. influenzae strains collected in Portugal over a 24- year period (30) (highlighted blue). Gene/protein sequences were assigned based on BLAST searches. The table shows that 100% of H. influenzae strains have genes for the TonB dependent outer membrane heme transporters, hup (31) and hemR (32), as well as the hxuCBA genes for heme uptake from (33, 34) (Supplemental Table 4). In addition, 98%, 77% and 30% of strains have at least 1, 2 or 3 genes, respectively, encoding TonB dependent transporters for heme uptake from /haptoglobin (hgpA, hgpB, hgpC (35-37)). Thus, the majority of these H. influenzae strains have highly redundant pathways for accessing heme from a variety of host sources. The number of heme acquisition genes found in H. haemolyticus strains is much more variable. From the 46 available H. haemolyticus assemblies, 70% have hup, only 7% have hemR, and 87%, 54% and 0% have at least 1, 2 or 3 genes with similarity to hgpA/hgpB/hgpC. Notably, no H. haemolyticus strains carry the huxCBA system, which is considered a virulence factor in H. influenzae (38, 39).

Supplementary Table S5. RTPCR screening for hemophilin in a collection of 100 clinical H. haemolyticus isolates.

Isolate RTPCR* Sequencing Nucleotide Activity‡ Activity, PCR* sequence equalised similarity† for culture (%) density‡ BW1 + + 816/816 (100) 5.5 3.75 BW5 + + 804/816 (99) 0 0 BW15 + + 816/816 (100) 2 0 BW18 + + 816/816 (100) 3 0 BW36 + – ND 3 ND CF14 + + 783/816 (96) 0 0 L19 + + 706/830 (85) 0 0 L117 + – ND 0 0 L152 + – ND 3 0 L153 + – ND 3 0 NF4 + + 816/816 (100) 2 0 NF5 + + 816/816 (100) 3.5 2 NF6 + + 783/816 (96) 1 0 NF11 + + 785/816 (96) 0 0 RHH122 + + 816/816 (100) 5.5 3.5

* RTPCR primers NIS-F and NIS-R produced a 126-bp amplicon within the hemophilin ORF; primers SEQF and SEQR produced a 1056 bp amplicon for sequencing; primer sequences are provided in Materials and Methods. Four isolates failed to produce an amplicon with the sequencing primers (presumably due to differences at the primer annealing sites). † Number of matches/total alignment length ‡ Measured as annular radius of the clearing zone (mm) in the well diffusion assay ND, not determined

Supplementary Figures

Fig. S1. H. haemolyticus isolates BW1 and RHH122 produce an NTHi-inhibitory protein. Analysis of fractions obtained by SEC separation of ammonium sulfate precipitates of conditioned medium obtained from stationary phase culture of H. haemolyticus isolates BW39, RHH122, BW1 and BWOCT3. (A) Agar well diffusion assays of SEC column load (L) and eluted fractions (21–35) for strains as annotated. A control well (+) is concentrated media recovered from a culture of H. haemolyticus strain BW1. A negative control (–) is PBS. (B) Tris-glycine buffered SDS-PAGE and Coomassie stain analysis of SEC elution fractions. Fractions with NTHi-inhibitory activity in the agarose well diffusion assay contained a protein of ~30 kDa (arrows) that was absent in matched chromatographic fractions from control strains BW39 and BWOCT3.

Fig. S2. RP-HPLC purification of hemophilin from H. haemolyticus isolate RHH122. (A) Reversed phase HPLC separation of the fraction with peak hemophilin activity obtained from SEC separation of conditioned medium recovered from H. haemolyticus strain RH122. (B) Agar well diffusion assay of fractions from RP-HPLC separation of RHH122 sample in A. NTHi strain 11315 is the indicator strain. Samples from HPLC chromatography load (L) and eluted fractions (1–9) were applied to wells punched in the agar plate. Clear zones around wells indicate inhibition of NTHi 11315. A control (+) is concentrated media recovered from a culture of H. haemolyticus strain BW1. A negative control (–) is PBS. (C) Tris-tricine SDS-PAGE and silver stain analysis of fractions from HPLC separation of RHH122 sample shown in A, and assayed for activity in B; M, molecular weight markers (kDa).

Fig. S3. Recombinant expression of hemophilin in E. coli. Tris-glycine SDS PAGE and Coomassie stain analysis of soluble fractions from expression culture of His6-tagged hemophilin (residues 55–272) and hemophilin (residues 23–272): lane1, hemophilin (residues 55–272) pre- induction lysate; lane 2, hemophilin (residues 23–272) pre-induction lysate; lane 3, hemophilin (residues 55–272) post-induction lysate; lane 4, hemophilin (residues 23–272) post-induction lysate; lane 5, hemophilin (residues 55–272) Ni-affinity purified; lane 6, hemophilin (residues 23–272) Ni- affinity purified; M, molecular weight markers (kDa).

Fig. S4. An H. haemolyticus hemophilin gene knockout does not inhibit NHTi. (A) Agarose well diffusion assay: clockwise from top, recombinant hemophilin holo protein (400 pmole); phosphate buffered saline (PBS); ammonium sulfate (AS) precipitate from BW1 culture medium equivalent to 1 mL of stationary phase culture at OD 0.88; AS precipitate from the BW1 hemophilin deletion strain culture medium equivalent to 1 mL of stationary phase culture at OD 1.1 (BW1 KO). Indicator strain is NTHi strain 11315. (B) RP-HPLC of BW1 (red) or hemophilin KO samples (blue) after AS and SEC purification as described in methods. Fractions were collected by hand, as indicated, for further analysis. The size of the hemophilin peak(s) was variable between experiments: compare Fig. S4B with HPLC traces in Figure 1 and fraction 5 of Figure S2 (note, different elution gradients were employed). We attribute this to variable expression level between experiments. Factors that regulate hemophilin expression level are unknown. (C) Tris-Tricine SDS- PAGE and silver stain analysis of fractions collected as shown in B. Dashed box indicates the position of the 27-kDa hemophilin band present in BW1 fractions, but not in the hemophilin KO samples; M, molecular weight markers (kDa). (D) Agarose well diffusion assays of RP-HPLC samples as annotated in B, C. Activity is detected only in fraction 4 from BW1. The positive control is (+) is purified hemophilin (340 pmole); the negative (–) is PBS buffer.

Fig. S5. Hemophilin reconstituted with ferrous heme can bind O2 and CO. Apo hemophilin (10 !M) was reconstituted with ferrous haem, generated by treating hemin with excess sodium dithionite under nitrogen. Addition of O2 (red traces) or CO (blue traces) produced spectra with distinct Soret, $ and % absorption bands typical of a low-spin 6-coordinate ferrous haem, as evident from a comparison with the spectra of hemoglobin (Hb) and myoglobin (Mb) O2/CO complexes. All spectra were recorded in 0.2 M sodium phosphate buffer, pH 7.0.

Fig. S6. Hemophilin reconstituted with ferric heme can bind CN– and HS–. (Top) In 0.2 M sodium phosphate buffer, pH 7.2 (black traces), apo hemophilin (10 !M) reconstituted with ferric hemin gave a UV-visible spectrum with $ and % absorption bands at 538.5 and 574.5 nm, suggesting a 6-coordinate low-spin Fe(III) with OH– ligand (40). The aqua complex of Mb at pH 7.2 is high-spin Fe(III) with characteristic charge transfer bands at ~500 nm and 630 nm (40). In 0.1 M sodium acetate, 2 M ammonium sulfate, pH 4.5 (green traces), the condition used for crystallization, hemophilin produces a spectrum that is typical of a high-spin aqua complex. At pH 4.5 Mb is unfolded, but in 0.2 M sodium borate pH 9.8 (dashed trace), Mb converts to a mixture of high spin/low spin hydroxo Fe(III) species. The pKa of the acid alkaline transition of Mb is 8.9 (41, – 42). Clearly the pKa of the H2O/OH transition in hemophilin is much lower (pH<7.2). (Middle, bottom) Spectra were recorded in 0.2 M sodium phosphate buffer, pH 7.0. Compared to the unliganded form (black traces), addition of KCN (purple traces) or Na2S (orange traces) to hemophilin produced spectral changes indicating ligand binding, including red-shift in the Soret absorption peak to ~420 nm and the appearance of unresolved $/% bands with absorption maximum ~540 nm, similar to the spectra of CN–/HS– complexes of Mb and Hb, which are low-spin 6- coordinate Fe(III) species (40, 43).

Fig. S7. Identification of heme and a heme-coordinated ligand in electron density, before refinement. The figure shows portions of FO–FC electron density maps contoured at +3 # (green) and –3 # (red) obtained at intermediate stages of structure refinement, after building all residues of the hemophilin polypeptide, but before before placing ligands, waters or other solvent molecules. (Left) Diffraction data collected at an x-ray wavelength of 1.45866 Å to a resolution of 2.1 Å were phased by single-wavelength anomalous dispersion using the PHASER SAD pipeline in CCP4. Automated model building was performed using the BUCCANEER pipeline in CCP4, followed by manual building in COOT to complete the polypeptide (excluding the N-terminal expression tag which was disordered). The Figure shows a portion of the FO–FC map generated by REFMAC5 that clearly identified the position of a porphyrin with a single-atom ligand, distal to the heme- coordinating His119 side chain. (Middle) A data set collected at an x-ray wavelength of 0.95372 Å to a resolution of 1.6 Å was refined against the polypeptide model obtained from the low-resolution data. (Right) The figure shows the result of real-space refinement of Fe protoporphyrin IX and Cl– ion into the FO–FC density using COOT. Pyrrole rings and methine bridges of the porphyrin are annotated according to the Fischer system. For reference (not annotated) methyl groups are in the 2,7,12,18-positions, vinyl groups in the 3,8-positions and propionate groups in 13,17-positions, according to the IUAPC-IUB nomenclature of tetrapyrroles for trivial names and locants.

Fig. S8. NaF is a probe for the presence of ferric heme in hemophilin preparations. As shown in Supplementary Figures S5 and S6, ferrous hemophilin bound to O2 and the ferric hemophilin aqua complex both give UV-visible spectra characteristic of low-spin iron complexes, making it hard to detect low levels of cross-contamination between these forms. To detect contamination of the hemophilin O2 complex with ferric heme we used NaF as follows. (A) Large spectral changes, including appearance of a strong charge transfer band at ~606 nm, occur upon addition of NaF to hemophilin·Fe(III)heme (5 µM), consistent with a shift from low spin Fe(III) complex with axial OH– ligand to a high spin Fe(III) complex with axial F– ligand. (B) A small UV-visible spectral change is observed when NaF (20 mM) is added to hemophilin·Fe(II)heme·O2 (5 µM), freshly purified from E. coli, consistent with the presence of ferric heme in a small proportion of hemophilin molecules (F– being a ligand specific for Fe(III) haems). (C) Binding of F– to metmyoglobin generates a 6-coordinate high-spin complex similar to that for hemophilin·Fe(III)heme, although higher concentrations of NaF are required suggesting hemophilin has higher affinity for an F– ligand than does Mb. (D) Addition of 20 mM NaF to the hemophilin sample prepared for protein crystallization results in substantial changes in the UV-visible spectrum, including a blue shift of the Soret band and appearance of absorption at ~606 nm, suggesting the presence of contaminating Fe(III) heme. Spectra in A–C were recorded in 0.1 M sodium phosphate buffer, pH 7.0, with addition of NaF as indicated. Spectra in D were recorded in 0.1 M sodium phosphate, pH 7.3 in the absence or presence of 20 mM NaF.

Fig. S9. Stability of hemophilin in crystallization buffers. (A) hemophilin·Fe(II)heme·O2 (6 µM) in 0.1 M sodium phosphate pH 7. Spectra recorded at t=0, 168, 312, 2256 hours at room temperature. The uniform decrease in spectral intensity across all wavelengths suggests protein precipitation or loss of heme. (B) hemophilin·Fe(II)heme·O2 (6 µM) in sodium acetate pH 4.5. Spectra recorded at t=0, 5, 72, 168, 312, 2256 hours. Appearance of peaks at ~500 and 632 nm indicate the evolution of high-spin Fe(III) heme. (C) Conditions as B, with addition of 2 M ammonium sulfate; the condition of the crystallization experiment. (D) Samples from the final time point in A–C were adjusted to 20 mM NaF. Formation of a fluoride heme species is indicated by an absorption peak at ~605 nm. A small 605 nm peak at pH 7 confirms that there is little autooxidation at this condition (although there is a decrease in the total soluble hemophilin over time, as shown in A). Both low pH and 2 M ammonium sulfate promote autooxidation.

Fig. S10. Ferric hemophilin binds chloride ions. (A) Titration of NaCl into holo hemophilin (6 µM) reconstituted with hematin (ferriprotoporphyrin IX hydroxide) in chloride-free 0.1 M sodium acetate, 2 M ammonium sulfate, pH 4.5. Spectra recorded with [NaCl] in the range 0.5–25 mM are coloured black, and for [NaCl] in the range 50–2000 mM, spectra are coloured red. (B) Titration as in A, but with buffer comprising 0.1 M sodium phosphate, pH 7. (C) Binding isotherms for chloride calculated from absorbance changes at 574 and 634 nm shown in A, B. Errors are the asymptotic standard errors on the fit parameter Kd (dissociation equilibrium constant) as calculated by GNUPLOT version 4.6.

Fig. S11. Heme sites with chloride ligands in the PDB. Protein crystal structures containing heme with chloride coordinating, or positioned closely, to the heme iron have been deposited for three proteins, other than hemophilin. These are the dye decolourizing peroxidase (DypB) from Rhodococcus jostii (pdb 3vec, shown, and 3ved, not shown) (21), the monomeric hemoglobin component V (HbV) from Propsilocerus akamusi (pdb 3ark, shown, and 3arj and 3arl, not shown) (22), and the hemophore, HasA, from Yersinia pestis (pdb 4jet) (44). All of these examples share with hemophilin an Arg sidechain in the distal heme pocket positioned with the guanadinium group roughly planer with the porphyrin ring and within hydrogen bonding distance of the chloride ligand. Further similarities exist between the hemophilin, DypB and HbV heme sites, in that the distal site Arg side chain also hydrogen bonds with one of the heme 17-propionate groups. A water ligand can also be accommodated with a similar side chain arrangement where the proximal ligand is His. Mutational and structural analyses have confirmed a positive role for the distal site Arg in binding – – – – – to chloride (21) and other ionic ligands such as F , N3 , CN , SCN , OH (45-47); similarly it may also facilitate HS– binding. The heme site structure of hemophilin suggests that binding to ferrous and ferric heme ligands could occur in vivo. Bacteria from a variety of environments use heme proteins to sense CO, O2, NO and HS– (48, 49), and these small ligands also have powerful signalling roles in the mammalian immune system (50-52), making small molecule binding by hemophilin potentially important. On the other hand, hemophilin lacks some features of heme enzymes: for example oxidoreductase heme enzymes such as horseradish peroxidase and cytochrome c peroxidase (53), dye- decolourizing peroxidase (54), and nitric oxide dioxygenase (55) have an acidic Asp or Glu residue hydrogen bonded to the proximal His.

Fig. S12. Structure and topology of hemophilin and similar proteins. Richardson diagrams (top) and topology (bottom) for hemophilin and the three proteins with highest similarity to hemophilin according to DALI searches (56), coloured blue through red from N- to C-terminus. These proteins are: hemoglobin-haptoglobin utilisation protein (HpuA) from Kingella denitrificans (pdb 5ec6, rmsd 2.4 Å over 185 residues) (57), transferrin binding protein B (TbpB) from Actinobacillus suis (pdb 3pqu, rmsd 3.7 Å over 181 residues) (58), and complement factor H binding protein (fHbp; pdb 2w80, rmsd 3.7 Å over 174 residues) from Neisseria meningitides (59). A fourth highly similar protein, as identified by DALI, is Neisseria heparin binding antigen (NHBA; pdb 2lfu, rmsd 4.0 Å over 120 residues) (60) from Neisseria meningitides (not shown). Topology diagrams were produced using PROORIGAMI (61) with manual editing to facilitate comparisons between proteins. Secondary structure elements are numbered to indicate similarity with hemophilin. %-Strands that contribute to the same sheet or barrel structure are enclosed by coloured rectangles. Segments of irregular secondary structure that cross in the diagrams are coloured red/green for unambiguous backbone tracing. The C-terminal %-barrel has an antiparallel meander topology and is an unusual example of an eight-stranded (n=8) %-barrel with a shear value of eight (S=8) indicating that hydrogen bonding around the %-barrel links residue i and i±8 (62) (Fig. 3B). Hemophilin, HpuA, TbpB, fHbp and NHBA (not shown) also share a 2–6 strand %-sheet that packs against the barrel. Remarkably, the precise combination of %-barrel topology (8 strands in a meander topology with a shear value of 8) together with hydrophobic residues packed in the barrel core seems to occur only in the above group of bacterial proteins. A large number of other proteins contain an 8-stranded %- barrel with meander topology, but with different distributions of polar/non-polar residues (integral membrane versus aqueous soluble)(63, 64) and/or shear number (10 or 12)(65-67). Interestingly, the N-terminal region of hemophilin can be interpreted as an opened and distorted 8-stranded %- barrel, with insertions into the loops between the %-3/%-4 strands and between the %-5/%-6 strands forming the docking site for haem, thus providing a hypothetical evolutionary origin for hemophilin through domain duplication and subsequent divergence of an ancestral tandem-barrel structure. We note that the 8-strand %-barrel topology has been adapted to a ligand binding function in other ways, most notably in the lipocalin superfamily, wherein a greater shear produces a barrel with a larger diameter that can accommodate a ligand inside the barrel, for example, a heme group (68) or ferric- siderophore (69).

Fig. S13. Different ligands for members of the family of proteins with a C-terminal n=8 S=8 %–barrel with a hydrophobic core. Hemophilin is secreted from H. haemolyticus strains BW1 and RHH122; HpuA, TbpB, fHbp and NHBA (not shown) are expressed as lipoproteins on the bacterial outer membrane; all have ligand binding functions, but bind very different ligands. The ligand complexes are shown for hemophilin·heme (this work), HpuA in complex with hemoglobin ($% dimer in the crystallographic asymmetric unit) (HpuA·Hb($%); pdb 5ee4) (57), TbpB (N lobe) bound to human transferrin (TbpB·Tf; pdb 3ve1) (70) and fHbp bound to complement control protein (CCP) domains 6–7 of complement factor H (fHbp·fH; pdb 2w80) (59). A different face of the domain engages protein ligand in each case. Note both the N and C lobes of TbpB have structural similarity to hemophilin; the N lobe (shown here) is less similar than the C lobe (shown in Figure S12).

Fig. S14. Activities of hemophilin mutants H89Q and H119Q. (A) Richardson diagram of the N- terminal subdomain of hemophilin showing H89 and H119. (B) UV-visible spectra of hemophilin proteins following Ni-affinity purification. Spectra of hemophilin and hemophilin(H89Q) are identical suggests that the H89Q mutation has no effect on heme binding. The spectrum of hemophilin(H119Q) has no significant absorption above ~300 nm, showing that this protein purifies from E. coli without a heme cofactor, indicating a large drop in heme affinity. Signal is normalised to facilitate comparison. (C) UV-visible spectra of hemophilin proteins prepared by removal of heme in acid acetone and refolding (dashed lines): apo hemophilin (3.1 !M; black), apo hemophilin(H89Q) (3.0 !M; red) and apo hemophilin(H119Q) (2.3 !M; blue). Spectra are also shown after addition of ferric hemin (2.3 !M; solid lines). The similarity of holo hemophilin and hemophilin(H89Q) spectra at all wavelengths, including Soret absorption band at 413 nm and $ and % absorption bands at 538.5 and 574.5 nm, suggests that both proteins bind hemin as low-spin Fe(III) species. Mixing hemin with the H119Q mutant gave a spectrum with peaks at 404, 485 and 602 nm (Fig. S14), indicative of high-spin heme, similar to spectra of Fe(III) heme:protein complexes without an Fe-coordinating side chain (71) and monomeric hemin in methanol:water mixtures (72). All spectra recorded in 0.2 M Tris.HCl, pH 8.0. (D) Agar well diffusion assays of apo hemophilin proteins in duplicate. Protein load is 20 µL at concentrations of: 1.25 µM (a), 2.5 µM (b), 5 µM (c) and 10 µM (d).

Fig. S15 Analysis of area under growth curve (AUC) for data in Figure 5. Groups were compared by Tukey’s multiple comparisons test implemented in the program PRISM: the group means indicated (*) differed at a significance level P $ 0.004; ns = not significant.

Fig. S16 Viability of cultures seeded in Figure 5. Colony counts were performed on all suspensions by plating on CA to demonstrate that the pre-incubation period did not reduce cell viability and that equal numbers of cells were inoculated into the test conditions. Error bars represent ± SEM (n=3).

Fig. S17. The effect of heme-loaded hemophilin on the growth of Haemophilus species. Growth on TSB supplemented with hemin at 0.6 !g/mL (blue, solid line). Growth on TSB supplemented with hemin at 5.6 !g/mL (red, solid line). Growth on TSB supplemented with hemin at 0.6 !g/mL plus holo hemophilin at 7.7 !M (equivalent to 5 !g/mL hemin), giving a total hemin concentration of 5.6 !g/mL (blue dashed line). Error bars are mean ± SD (n=3).

Fig. S18. Addition of hemin neutralises the inhibitory activity of hemophilin. Agar well diffusion assay with wells containing hemophilin from strain BW1 mixed with hemin at variable concentrations (values shown, !g/mL). Positive control (+) contains BW1 hemophilin only. Negative control (–) contains DPBS. Indicator is NTHi strain NCTC 11315.

hi_19P68H7 hi_19P81H1 hi_74P49H2 hh_m19135 hh_m19071 hi_14P26H2 hi_84P21H1 hi_84P23H1 hi_121P25H1 hi_1P22H3 hi_10P129H1 hi_14P6H hi_1P16H5 hi_M11578 hi_1P18H1 hi_124P1H1 hi_94P5H hh_m19161 hi_56P127H1 hh_ccug24149 hh_m19354 hh_m21127 hh_m25342 Hp_C2008001710 hh_m19346 hh_m19080 hh_m19122 hh_m19140 hh_m19164 hh_m19197 hh_m19201 hh_m19528 hh_m26166 hh_m16167 hh_m26173 hh_m26176 hh_m19079 hh_m26161 hh_BW15 hh_BW18 hh_NF4 hh_NF5 hh_m19066 hh_BW1 hh_m19107 hh_m11818 hh_RHH122 hh_m26157 hh_m21639 hq_c860 hq_k068 Hq_mp1 hh_m28486 hh_11p18 hh_BW5 hh_CF14 hh_NF6 hh_NF11 hh_L19

0.02

Fig. S19. Phylogenetic clustering of ORF sequences with homology to hemophilin. Sequences are annotated as species_strain, using the species initials hh (H. haemolyticus), hi (H. influenzae), hq (H. quentini), Hp (H. parainfluenzae). The tree includes all strains from our collection for which hemophilin sequence information was obtained (BW15, BW18, NF4, NF5, BW1, RHH122, BW5, CF14, NF6, NF11, L19) as well as all strains for which a hemophilin homologue was identified on Genbank. Sequences in red text are unique, or representative of clusters of identical ORFs. The amino acid sequence alignment of all unique/representative ORF sequences in shown in Supplemental Fig. S20.

A

B C M79 K45

I125 R82 Q74 L90

H119 L112 M106 W86 T113 K72 L122 P117 M116 I115 A69 M109

Fig. S20. Conservation of hemophilin sequences from strains of H. haemolyticus. (A) Multiple sequence alignment of unique hemophilin variants. Star symbols indicate residues that make contacts with the heme group. (B) Ribbon structure of hemophilin coloured to represent amino acid conservation by physicochemical properties on scale ranging from 0 (dark blue, no conservation) to 1 (dark red, identical). (C) Conservation of residues in contact with the heme group, coloured as in B.

Fig. S21. Models of the putative hemophilin receptor. Structural models of a predicted TonB- dependent transporter ORF immediately upstream of the hemophilin gene in H. haemolyticus strain M28486 (see Supplementary Table S3) were prepared using the PHYRE and I-TASSER web servers. This query sequence was chosen for modelling because the hemophilin gene from M28486 is 100% identical to that in BW1/RHH122. The top ranking template structures identified by both servers were the transferrin binding protein A (TbpA), and the receptor for HasA hemophore (HasR). The top-ranking model (Model 1) from PHYRE (orange) aligned 73% of the query sequence with TbpA (pdb 3v89 (73); 15% sequence identity over 811 residues) with a confidence score of 100%. Model 1 from I-TASSER used the same template but built more residues into the final model (cyan) with a C-score of –2.11 (acceptable range –5 to 2, which higher score indicating higher confidence) and TM score of 0.45±0.15 (TM>0.5 indicates shared topology). Model 2 was based on the HasR receptor using template pdb structures 3csl (PHYRE; 22% sequence identity over 721 aligned residues; orange) or 3ddr (I-TASSER; cyan) (74). Overall, Models 1 and 2 are similar with the greatest confidence in alignment in the transmembrane 22-strand %-barrel structure. It is tempting to speculate that a conserved mechanism of interaction between hemophilin/HpuA/TbpB and their respective TonB dependent receptors, coupled with divergence of hemophilin/HpuA/TbpB ligand binding, might explain how these proteins have evolved to transport heme or iron from different host protein sources to the bacterial periplasm. Once delivered to the periplasm, there are several substrate binding proteins (SBPs) in Haemophilus spp. that can bind promiscuously to haem, peptides, glutathione, and other substrates (75-77), targeting these cargoes to ABC transporters that shuttle them across the plasma membrane.

Supplementary references

1. Latham RD et al. (2017) An isolate of Haemophilus haemolyticus produces a bacteriocin- like substance that inhibits the growth of nontypeable Haemophilus influenzae. Int J Antimicrob Agents 49: 503-506. 2. Wilson R, Belluoccio D, Bateman JF (2008) Proteomic analysis of cartilage proteins. Methods 45: 22-31. 3. Perez-Riverol Y et al. (2019) The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res 47: D442-D450. 4. Bendtsen JD, Nielsen H, von Heijne G, Brunak S (2004) Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 340: 783-795. 5. Ascoli F, Fanelli MR, Antonini E (1981) Preparation and properties of apohemoglobin and reconstituted . Methods Enzymol 76: 72-87. 6. Herriott RM, Meyer EM, Vogt M (1970) Defined nongrowth media for stage II development of competence in Haemophilus influenzae. J Bacteriol 101: 517-524. 7. Dawson RMC, Elliott DC, Elliott WH, Jones KM, Data for biochemical research (Clarendon Press, Oxford, ed. 2nd, 1969), pp. 654. 8. McPhillips TM et al. (2002) Blu-Ice and the Distributed Control System: software for data acquisition and instrument control at macromolecular crystallography beamlines. J Synchrotron Radiat 9: 401-406. 9. Kabsch W (2010) XDS. Acta Crystallogr D Biol Crystallogr 66: 125-132. 10. Winn MD et al. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr 67: 235-242. 11. Hendrickson WA, Teeter MM (1981) Structure of the hydrophobic protein crambin determined directly from the anomalous scattering of sulphur. Nature 290: 107-113. 12. Cowtan K (2010) Recent developments in classical density modification. Acta Crystallogr D Biol Crystallogr 66: 470-478. 13. McCoy AJ et al. (2007) Phaser crystallographic software. J Appl Crystallogr 40: 658-674. 14. Cowtan K (2012) Completion of autobuilt protein models using a database of protein fragments. Acta Crystallogr D Biol Crystallogr 68: 328-335. 15. Emsley P, Cowtan K (2004) Coot: model-building tools for molecular graphics. Acta Crystallogr D Biol Crystallogr 60: 2126-2132. 16. Murshudov GN et al. (2011) REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr D Biol Crystallogr 67: 355-367. 17. Cruickshank DW (1999) Remarks about protein structure precision. Acta Crystallogr D Biol Crystallogr 55: 583-601. 18. Tian W, Chen C, Lei X, Zhao J, Liang J (2018) CASTp 3.0: computed atlas of surface topography of proteins. Nucleic Acids Res 46: W363-W367. 19. Hubbard SJ, Thornton JM (1993) NACCESS. (Department of Biochemistry and Molecular Biology, University College London, London). 20. Latham R, Zhang B, Tristram S (2015) Identifying Haemophilus haemolyticus and Haemophilus influenzae by SYBR Green real-time PCR. Journal of Microbiological Methods 112: 67-69. 21. Singh R, Grigg JC, Armstrong Z, Murphy ME, Eltis LD (2012) Distal heme pocket residues of B-type dye-decolorizing peroxidase: arginine but not aspartate is essential for peroxidase activity. J Biol Chem 287: 10623-10630. 22. Kuwada T et al. (2011) Involvement of the distal Arg residue in Cl– binding of midge larval haemoglobin. Acta Crystallogr D Biol Crystallogr 67: 488-495. 23. Ozaki SI et al. (2014) Spectroscopic studies on HasA from Yersinia pseudotuberculosis. J Inorg Biochem 138: 31-38. 24. Slater AF et al. (1991) An iron-carboxylate bond links the heme units of malaria pigment. Proc Natl Acad Sci U S A 88: 325-329. 25. Schappacher M et al. (1983) Synthesis and spectroscopic properties of a five coordinate tetrafluorophenylthiolato Iron II 'picket fence' porphyrin complex and its carbonyl and dioxygen adducts: Synthetic analogs for the active site of cytochromes P 450. Inorganica Chimica Acta 78: L9-L12. 26. Inada Y, Funahashi S (1999) Equilibrium and structural study of chloro complexes of iron (III) ion in acidic aqueous solution by means of X-ray absorption spectroscopy. Zeitschrift für Naturforschung B 54: 1517-1523. 27. Isogai Y et al. (2018) Tracing whale myoglobin evolution by resurrecting ancient proteins. Sci Rep 8: 16883. 28. Robinson VL, Smith BB, Arnone A (2003) A pH-dependent aquomet-to-hemichrome transition in crystalline horse methemoglobin. Biochemistry 42: 10113-10125. 29. Hersleth HP et al. (2007) Crystallographic and spectroscopic studies of peroxide-derived myoglobin compound II and occurrence of protonated FeIV O. J Biol Chem 282: 23372- 23386. 30. Pinto M et al. (2018) Insights into the population structure and pan-genome of Haemophilus influenzae. Infect Genet Evol 67: 126-135. 31. Morton DJ et al. (2004) Identification of a haem-utilization protein (Hup) in Haemophilus influenzae. Microbiology 150: 3923-3933. 32. Bracken CS, Baer MT, Abdur-Rashid A, Helms W, Stojiljkovic I (1999) Use of heme- protein complexes by the Yersinia enterocolitica HemR receptor: histidine residues are essential for receptor function. J Bacteriol 181: 6063-6072. 33. Cope LD, Thomas SE, Hrkal Z, Hansen EJ (1998) Binding of heme-hemopexin complexes by soluble HxuA protein allows utilization of this complexed heme by Haemophilus influenzae. Infect Immun 66: 4511-4516. 34. Cope LD, Yogev R, Muller-Eberhard U, Hansen EJ (1995) A gene cluster involved in the utilization of both free heme and heme:hemopexin by Haemophilus influenzae type b. J Bacteriol 177: 2644-2653. 35. Jin H, Ren Z, Whitby PW, Morton DJ, Stull TL (1999) Characterization of hgpA, a gene encoding a haemoglobin/haemoglobin-haptoglobin-binding protein of Haemophilus influenzae. Microbiology 145 ( Pt 4): 905-914. 36. Morton DJ, Whitby PW, Jin H, Ren Z, Stull TL (1999) Effect of multiple mutations in the hemoglobin- and hemoglobin-haptoglobin-binding proteins, HgpA, HgpB, and HgpC, of Haemophilus influenzae type b. Infect Immun 67: 2729-2739. 37. Ren Z, Jin H, Morton DJ, Stull TL (1998) hgpB, a gene encoding a second Haemophilus influenzae hemoglobin- and hemoglobin-haptoglobin-binding protein. Infect Immun 66: 4733-4741. 38. Morton DJ et al. (2007) The haem-haemopexin utilization gene cluster (hxuCBA) as a virulence factor of Haemophilus influenzae. Microbiology 153: 215-224. 39. Morton DJ et al. (2009) The heme-binding protein (HbpA) of Haemophilus influenzae as a virulence determinant. Int J Med Microbiol 299: 479-488. 40. Smith DW, Williams RJ (1968) Analysis of the visible spectra of some sperm-whale ferrimyoglobin derivatives. Biochem J 110: 297-301. 41. Brunori M et al. (1968) The transition between 'acid' and 'alkaline' ferric heme proteins. Biochim Biophys Acta 154: 315-322. 42. McGrath TM, La Mar GN (1978) Proton NMR study of the thermodynamics and kinetics of the acid in equilibrium base transitions in reconstituted metmyoglobins. Biochim Biophys Acta 534: 99-111. 43. Bostelaar T et al. (2016) Hydrogen Sulfide Oxidation by Myoglobin. J Am Chem Soc 138: 8476-8488. 44. Kumar R et al. (2013) The hemophore HasA from Yersinia pestis (HasAyp) coordinates hemin with a single residue, Tyr75, and with minimal conformational change. Biochemistry 52: 2705-2707. 45. Draganova EB et al. (2016) Corynebacterium diphtheriae HmuT: dissecting the roles of conserved residues in heme pocket stabilization. J Biol Inorg Chem 21: 875-886. 46. Conti E et al. (1993) X-ray crystal structure of ferric Aplysia limacina myoglobin in different liganded states. J Mol Biol 233: 498-508. 47. Muraki N et al. (2016) Structural Characterization of Heme Environmental Mutants of CgHmuT that Shuttles Heme Molecules to Heme Transporters. Int J Mol Sci 17. 48. Martinkova M, Kitanishi K, Shimizu T (2013) Heme-based globin-coupled oxygen sensors: linking oxygen binding to functional regulation of diguanylate cyclase, histidine kinase, and methyl-accepting chemotaxis. J Biol Chem 288: 27702-27711. 49. Ramos-Alvarez C et al. (2013) Reactivity and dynamics of H2S, NO, and O2 interacting with hemoglobins from Lucina pectinata. Biochemistry 52: 7007-7021. 50. Naito Y, Takagi T, Higashimura Y (2014) -1 and anti-inflammatory M2 macrophages. Arch Biochem Biophys 564: 83-88. 51. Bogdan C (2001) Nitric oxide and the immune response. Nat Immunol 2: 907-916. 52. Toliver-Kinsky T et al. (2019) H2S, a Bacterial Defense Mechanism against the Host Immune Response. Infect Immun 87: doi: 10.1128/IAI.00272-00218. 53. Goodin DB, McRee DE (1993) The Asp-His-Fe triad of cytochrome c peroxidase controls the reduction potential, electronic structure, and coupling of the tryptophan free radical to the heme. Biochemistry 32: 3313-3324. 54. Liu X et al. (2011) Crystal structure and biochemical features of EfeB/YcdB from Escherichia coli O157: ASP235 plays divergent roles in different enzyme-catalyzed processes. J Biol Chem 286: 14922-14931. 55. Shepherd M et al. (2010) The single-domain globin from the pathogenic bacterium Campylobacter jejuni: novel D-helix conformation, proximal hydrogen bonding that influences ligand binding, and peroxidase-like redox properties. J Biol Chem 285: 12747- 12754. 56. Holm L, Rosenstrom P (2010) Dali server: conservation mapping in 3D. Nucleic Acids Res 38: W545-549. 57. Wong CT et al. (2015) Structural analysis of haemoglobin binding by HpuA from the Neisseriaceae family. Nat Commun 6: 10172. 58. Calmettes C et al. (2011) Structural variations within the transferrin binding site on transferrin-binding protein B, TbpB. J Biol Chem 286: 12683-12692. 59. Schneider MC et al. (2009) Neisseria meningitidis recruits factor H using protein mimicry of host carbohydrates. Nature 458: 890-893. 60. Esposito V et al. (2011) Structure of the C-terminal domain of Neisseria heparin binding antigen (NHBA), one of the main antigens of a novel vaccine against Neisseria meningitidis. J Biol Chem 286: 41767-41775. 61. Stivala A, Wybrow M, Wirth A, Whisstock JC, Stuckey PJ (2011) Automatic generation of protein structure cartoons with Pro-origami. Bioinformatics 27: 3315-3316. 62. Murzin AG, Lesk AM, Chothia C (1994) Principles determining the structure of beta-sheet barrels in proteins. I. A theoretical analysis. J Mol Biol 236: 1369-1381. 63. Marassi FM, Ding Y, Schwieters CD, Tian Y, Yao Y (2015) Backbone structure of Yersinia pestis Ail determined in micelles by NMR-restrained simulated annealing with implicit membrane solvation. J Biomol NMR 63: 59-65. 64. Hagn F, Etzkorn M, Raschle T, Wagner G (2013) Optimized phospholipid bilayer nanodiscs facilitate high-resolution structure determination of membrane proteins. J Am Chem Soc 135: 1919-1925. 65. Bommer M, Bondar AN, Zouni A, Dobbek H, Dau H (2016) Crystallographic and Computational Analysis of the Barrel Part of the PsbO Protein of Photosystem II: Carboxylate-Water Clusters as Putative Proton Transfer Relays and Structural Switches. Biochemistry 55: 4626-4635. 66. Cowan SW, Newcomer ME, Jones TA (1990) Crystallographic refinement of human serum retinol binding protein at 2A resolution. Proteins 8: 44-61. 67. Pautsch A, Schulz GE (2000) High-resolution structure of the OmpA membrane domain. J Mol Biol 298: 273-282. 68. Weichsel A, Andersen JF, Champagne DE, Walker FA, Montfort WR (1998) Crystal structures of a nitric oxide transport protein from a blood-sucking insect. Nat Struct Biol 5: 304-309. 69. Goetz DH et al. (2002) The neutrophil lipocalin NGAL is a bacteriostatic agent that interferes with siderophore-mediated iron acquisition. Mol Cell 10: 1033-1043. 70. Calmettes C, Alcantara J, Yu RH, Schryvers AB, Moraes TF (2012) The structural basis of transferrin sequestration by transferrin-binding protein B. Nat Struct Mol Biol 19: 358-360. 71. Gao JL et al. (2018) Structural properties of a haemophore facilitate targeted elimination of the pathogen Porphyromonas gingivalis. Nat Commun 9: 4097. 72. de Villiers KA, Kaschula CH, Egan TJ, Marques HM (2007) Speciation and structure of ferriprotoporphyrin IX in aqueous solution: spectroscopic and diffusion measurements demonstrate dimerization, but not mu-oxo dimer formation. J Biol Inorg Chem 12: 101-117. 73. Noinaj N et al. (2012) Structural basis for iron piracy by pathogenic Neisseria. Nature 483: 53-58. 74. Krieg S et al. (2009) Heme uptake across the outer membrane as revealed by crystal structures of the receptor-hemophore complex. Proc Natl Acad Sci U S A 106: 1045-1050. 75. Vergauwen B, Elegheert J, Dansercoer A, Devreese B, Savvides SN (2010) Glutathione import in Haemophilus influenzae Rd is primed by the periplasmic heme-binding protein HbpA. Proc Natl Acad Sci U S A 107: 13270-13275. 76. Tanaka KJ, Pinkett HW (2018) Oligopeptide-binding protein from nontypeable Haemophilus influenzae has ligand-specific sites to accommodate peptides and heme in the binding pocket. J Biol Chem. 77. Rodriguez-Arce I et al. (2019) Moonlighting of Haemophilus influenzae heme acquisition systems contributes to the host airway-pathogen interplay in a coordinated manner. Virulence 10: 315-333.