Clinical metagenomics in bone and joint infections

Teresa Street
Modernising Medical Microbiology, Nuffield Department of Medicine, University of Oxford, UK
ESCMID eLibrary © by author

Joint Surgery
England, Wales & Northern Ireland: National Joint Registry data 2017

Joint     Total procedures  Number of primary procedures  Number of revisions  % of revisions due to infection/suspicion of infection
Hip       105,306           96,717                        8,589 (8%)           16%
Knee      112,836           106,334                       6,502 (6%)           23%
Shoulder  7,525             6,828                         697 (9%)             17%
Elbow     813               668                           145 (18%)            23%
Ankle     886               769                           117 (13%)            18% low/3% high

Prosthetic joint infection (PJI) diagnosis

• Current gold standard is culture from periprosthetic tissue (PPT) samples collected during surgery (n=5 for optimal results)

• Also culture from sonication fluid, generated by sonicating explanted prostheses/metalwork in saline

• Culture from PPT is relatively insensitive, even with multiple tissues
  – Cause of infection identified in ~65% of cases

• Infections with fastidious organisms, or in patients with prior antimicrobials, are often culture negative

PJI microbiology

Sample/device collected during surgery

[Workflow diagram: routine culture-based processing of three sample types]
• Device/metalwork: sonication, then anaerobic culture (blood agar, 10 days) and aerobic culture (blood & chocolate agar, 5 days)
• Tissue/biopsy/bone: BACTEC bottles, aerobic & anaerobic
• Joint fluid: Gram stain
• Further culture steps shown for the tissue and joint fluid streams: anaerobic culture on blood agar; aerobic culture on blood, chocolate, chromogenic & CNA agar; incubation times of 18-24 hrs, 40-48 hrs and 5 days
• Downstream of culture: Gram stain, catalase/oxidase test, species identification (MALDI-TOF), antimicrobial sensitivity testing (BD Phoenix)

Molecular methods for diagnosing PJI

Specific PCR
• Targets specific bacterial species
• qPCR can be extremely sensitive
• Fast
• Only detects what your primers are designed to find

Broad-range PCR
• Targets 16S rRNA
• Can detect all bacterial species
• Contamination can cause problems
• Need to sequence to confirm species detected

Mass spec
• Helps identify coagulase negative Staphylococcus species
• Still need to culture first

Culture-free diagnostics

Diagnose infection using DNA isolated directly from clinical sample

Metagenomic Whole Genome Sequencing (mWGS)

• Extract DNA/RNA directly from a clinical sample, sequence and compare to a reference genome database to identify pathogens

• This approach has already successfully demonstrated potential:
  – CSF & Leptospira/HSV-1/HHV-3/Balamuthia mandrillaris
  – Brain biopsy & Astrovirus
  – Sputum & TB
  – BAL/ETA/sputum & Lower respiratory infections
  – Urine & UTIs
  – Blood & bacteremia/Ebola/chikungunya/Dengue/HCV
  – Sonication fluid/synovial fluid/PPT & PJI

Study Aims

• Can we improve the microbiological diagnosis of infections associated with orthopaedic devices using whole genome sequencing technologies?

  – Faster diagnosis?
  – Increased sensitivity of diagnosis?
    • Diagnosis in culture negative cases?
  – Diagnosis in patients already treated with antibiotics?

• Can we do this (almost) directly from clinical samples (sonication fluid) without the need for a culture step?

Sonication fluid DNA extracts for sequencing

1. Device removed in surgery and sent to micro lab
2. Device placed in saline and sonicated
3. 40 ml sonication fluid received for processing
4. DNA extracted
5. DNA cleaned and sequencing libraries prepared
6. Libraries sequenced

PJI microbiology

Sample/device collected during surgery

[Workflow diagram, device stream only]
• Device/metalwork: sonication
• Anaerobic culture: blood agar, 10 days
• Aerobic culture: blood & chocolate agar, 5 days
• Downstream of culture: Gram stain, catalase/oxidase test, species identification (MALDI-TOF), antimicrobial sensitivity testing (BD Phoenix)

PJI metagenomic sequencing: Illumina

Sample/device collected during surgery

[Workflow diagram: culture and Illumina sequencing from the same sonicated device]
• Device/metalwork: sonication
• Culture route: anaerobic culture on blood agar (10 days) and aerobic culture on blood & chocolate agar (5 days), followed by Gram stain, catalase/oxidase test, species identification (MALDI-TOF) and antimicrobial sensitivity testing (BD Phoenix)
• Sequencing route: DNA extraction (~4 hours), library preparation (~3 hours), sequencing (54 hrs), data analysis (15 min)

Illumina MiSeq study

Derivation samples
• 72 prosthesis/orthopaedic device sonication fluids
• 81 sonication fluid sequences (including 9 technical replicates) + 8 saline negative controls
• Excluded from analysis: 22 samples (negative control and sample similarly contaminated) and 9 technical replicates
• 50 sonication fluid sequences analyzed: 18 sequences with no growth + 32 sequences with 1 (n=26) or 2 (n=6) species isolated from sonication fluid

Analysis
Filtering thresholds to determine true infection from background or contaminating species (chosen to maximise sensitivity and specificity of sequencing):
1) ≥1150 reads from a single species, OR
2) ≥125 reads from a single species if this is ≥15% of the total bacterial reads

Validation samples
• 59 prosthesis/orthopaedic device sonication fluids
• 59 sonication fluid sequences + 5 saline negative controls
• 12 samples excluded: negative control contaminated
• 47 sonication fluid sequences analyzed: 17 sequences with no growth + 30 sequences with 1 (n=29) or 2 (n=1) species isolated from sonication fluid
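
The two filtering thresholds above can be written as a small function. This is a minimal sketch rather than the study's own code: the function names, the dictionary input, and the choice to take "total bacterial reads" as the sum of the per-species counts supplied are all assumptions made for illustration, and the example counts are invented.

```python
def passes_thresholds(species_reads, total_bacterial_reads,
                      high_reads=1150, low_reads=125, min_fraction=0.15):
    """Return True if a species meets either read-count rule from the slide.

    Rule 1: at least `high_reads` reads from the species.
    Rule 2: at least `low_reads` reads AND at least `min_fraction`
            of the total bacterial reads.
    """
    if species_reads >= high_reads:
        return True
    if total_bacterial_reads <= 0:
        return False
    return (species_reads >= low_reads
            and species_reads / total_bacterial_reads >= min_fraction)


def call_species(read_counts):
    """read_counts: dict mapping species name -> reads assigned to it.

    Assumption: total bacterial reads = sum of all counts in the dict.
    """
    total = sum(read_counts.values())
    return sorted(name for name, n in read_counts.items()
                  if passes_thresholds(n, total))


# Hypothetical counts, for illustration only (not study data)
example = {
    "Staphylococcus aureus": 2000,   # passes rule 1
    "Cutibacterium acnes": 130,      # >=125 reads but only ~6% of total
    "Escherichia coli": 40,          # fails both rules
}
print(call_species(example))
```

In this invented example only Staphylococcus aureus is reported; the Cutibacterium acnes count clears the 125-read floor but falls short of 15% of the bacterial reads, which is the kind of low-level background the second rule is meant to filter.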

                            Derivation samples  Validation samples  Combined
Species-level sensitivity   92% (79-98%)        84% (66-95%)        88% (77-94%)
Species-level specificity*  86% (73-94%)        89% (77-96%)        88% (79-93%)

* Adjusted values. Unadjusted combined specificity 80% (71-88%)

What did we learn from the Illumina study?

1. Human DNA is problematic
2. Negatives are important as contamination can be a problem
3. Need more reads to be able to do more than just identify species


1. Human reads accounted for >90% of reads in 97% of our samples
  – Throwing away >90% of our sequence!


2. Contamination can be introduced at different times & from different places:
  – During sample collection (theatre: flora)
  – During sample processing (laboratory: other lab studies)
  – During sample processing (kits/reagents: the ‘kit-ome’)

Negatives help identify and control for these events, and are vital for interpreting results

[Figure panels: Culture-negative sonication fluid; Saline negative]


3. High human DNA content means there are very few bacterial reads for analysis and we can only identify presence of species
  – No suggestion of antimicrobial sensitivity profiles without more data

Nanopore Sequencing

[Figure labels: DNA strand through nanopore; changes in current; sequence]

Nanopore sequencing: Why the excitement?

• Portable

• Long read lengths

• Real-time analysis of data

Portable

Longer read lengths

Illumina: short reads, low read coverage over reference genome
Nanopore: longer reads, higher read coverage over reference genome

[Figure: reads aligned to a reference genome, annotated with coverage breadth (%) and coverage depth (x)]

Real-time analysis of data
Metagenomics sequencing strategy

[Diagram: metagenomic reads are classified, binned, and aligned to reference genomes]

Real-time analysis of data
Bioinformatics analysis: CRuMPIT (Clinical Real-time Metagenomics Pathogen Identification Test)

[Pipeline diagram: fast5 files are basecalled with Albacore to Fastq file(s); prinseq yields high-complexity Fastq; Centrifuge gives a Tax ID and score per read; minimap2 gives alignment details for the high-complexity Fastq; MongoDB holds the basecalling information, Centrifuge output and alignment details]

Proof of principle: Species identification

Sample  Species identified by culture   Majority species identified by sequencing
229a    Staphylococcus aureus           Staphylococcus aureus
249a    Cutibacterium acnes             Cutibacterium acnes
259a    Staphylococcus epidermidis      Staphylococcus epidermidis
312a    Citrobacter koseri              Citrobacter koseri
335a    Morganella morganii             Morganella morganii
352a    Bacillus species                Bacillus cereus, Bacillus thuringiensis
354a    Arcanobacterium haemolyticum,   Arcanobacterium haemolyticum,
        Enterococcus faecalis           Enterococcus faecalis

Fusobacterium nucleatum is also listed for sample 354a; its column is not recoverable from the extracted layout.

Time to species identification

[Figure: Sample 352a]

How can we improve on this?

Remove more human DNA
  – Detergent-based treatment selectively lyses human cells, then an endonuclease is used to degrade the human DNA before the bacterial cells are lysed*

Sample         Total sequenced bases  % human bases  % bacterial bases
A + depletion  510,369,847            1.3            96.8
A - depletion  550,330,000            99.2           0.8
B + depletion  1,908,060,640          12.9           86.9
B - depletion  2,432,642,558          97.7           2.3

* Depletion protocol from Justin O’Grady and his team at University of East Anglia
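
The effect of depletion in the table above can be summarised as a fold change in the bacterial fraction of sequenced bases. The sketch below is illustrative only: the numbers are copied from the table, while the function names and the use of a simple ratio of percentages as the "fold enrichment" are assumptions, not a metric stated on the slides.

```python
# (total sequenced bases, % human bases, % bacterial bases), from the table
RUNS = {
    ("A", True): (510_369_847, 1.3, 96.8),
    ("A", False): (550_330_000, 99.2, 0.8),
    ("B", True): (1_908_060_640, 12.9, 86.9),
    ("B", False): (2_432_642_558, 97.7, 2.3),
}


def bacterial_bases(total_bases, pct_bacterial):
    """Approximate number of bacterial bases in a run."""
    return total_bases * pct_bacterial / 100.0


def fold_enrichment(sample):
    """Ratio of the bacterial base percentage with vs. without depletion."""
    _, _, with_pct = RUNS[(sample, True)]
    _, _, without_pct = RUNS[(sample, False)]
    return with_pct / without_pct


for sample in ("A", "B"):
    total, _, pct = RUNS[(sample, True)]
    print(sample,
          f"fold enrichment: {fold_enrichment(sample):.1f}",
          f"bacterial bases with depletion: {bacterial_bases(total, pct):,.0f}")
```

On these figures the bacterial fraction rises roughly 121-fold for sample A and about 38-fold for sample B, while the total number of sequenced bases stays of the same order.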

Increase fragment length and sequencing library yield
  – Reducing proportion of human DNA reduces overall DNA yield
  – New library preparation methods allow much lower DNA input for same overall yield

• Rapid PCR barcoding kit allows 10 ng DNA input
  – 6 minutes elongation in PCR allows for longer fragments
  – More PCR cycles increases DNA yield
  – Allows multiplexing of samples together

Good fragment & read lengths

Post-PCR fragment lengths
• Peak at 4901 bp
[Figure: sample intensity (FU) against size (bp)]

Sequence read lengths
• Read length distribution per multiplex
[Figure: read length (bp) by batch]

Species identification

Sample  Species identified by culture  Species identified by sequencing
001     Staphylococcus aureus          Staphylococcus aureus
002     (Staphylococcus aureus)        Staphylococcus aureus
003     Enterobacter cloacae           Enterobacter cloacae
004     Coagulase-neg Staphylococcus   Staphylococcus haemolyticus, Staphylococcus capitis
005     Streptococcus pyogenes         Streptococcus pyogenes
006     Escherichia coli               Escherichia coli
007     Staphylococcus aureus          Staphylococcus aureus
008     Streptococcus dysgalactiae     Streptococcus dysgalactiae
009     Staphylococcus aureus          Staphylococcus aureus
010     Staphylococcus aureus          Staphylococcus aureus
011     Staphylococcus lugdunensis     Staphylococcus lugdunensis
012     Cutibacterium acnes            Cutibacterium acnes

Callouts on the species identification table:
• Sonication fluid <5 cfu S. aureus
• Also identify: C. acnes (skin contaminant?), E. ludwigii (taxonomic misclassification?)
• MALDI-ToF identified Staphylococcus caprae (not in reference genome database)

Genome coverage & depth

Sample  Species identified by sequencing  Reference genome coverage breadth (%)  Reference genome average depth (x)
001     Staphylococcus aureus             89%                                    4.48
002     Staphylococcus aureus             34%                                    1.46
003     Enterobacter cloacae              35%                                    1.99
004     Staphylococcus haemolyticus       13%                                    6.97
004     Staphylococcus capitis            78%                                    16.7
005     Streptococcus pyogenes            93%                                    72.8
006     Escherichia coli                  3%                                     2.44
007     Staphylococcus aureus             94%                                    184
008     Streptococcus dysgalactiae        90%                                    30.2
009     Staphylococcus aureus             95%                                    93.1
010     Staphylococcus aureus             78%                                    18.1
011     Staphylococcus lugdunensis        98%                                    638
012     Cutibacterium acnes               100%                                   330

Callouts on the coverage table:
• Sonication fluid <5 cfu S. aureus
• 250-490 CFU/ml (shown against two samples)

Genome coverage
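
Coverage breadth and average depth, as reported in the table, can be derived from per-position read depth along a reference genome. The sketch below shows one common way to compute them and is not taken from the study's pipeline: the function name and list input are hypothetical, and it assumes that average depth is taken over every reference position (including uncovered ones), which the slides do not specify.

```python
def coverage_stats(per_base_depth):
    """Compute coverage breadth (%) and mean depth (x).

    per_base_depth: list of read depths, one value per reference position.
    Breadth = percentage of positions covered by at least one read.
    Depth   = mean depth over all positions (assumption: zeros included).
    """
    genome_length = len(per_base_depth)
    if genome_length == 0:
        raise ValueError("reference has no positions")
    covered = sum(1 for depth in per_base_depth if depth > 0)
    breadth_pct = 100.0 * covered / genome_length
    mean_depth = sum(per_base_depth) / genome_length
    return breadth_pct, mean_depth


# Toy 10-position "genome", for illustration only
toy_depths = [0, 0, 3, 5, 2, 0, 4, 6, 0, 0]
breadth, depth = coverage_stats(toy_depths)
print(f"breadth {breadth:.0f}%, mean depth {depth:.1f}x")
```

The two measures are independent: a genome can have high depth over a small region (low breadth, as with Staphylococcus haemolyticus in sample 004) or thin but near-complete coverage (as with sample 001).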

[Figure: read coverage by position on chromosome. Sample 012: Cutibacterium acnes, culture-positive]

[Figure: read coverage by position on chromosome. Sample 009: Staphylococcus aureus, culture-positive]

Updated Aims

• Can we improve the microbiological diagnosis of infections associated with orthopaedic devices using whole genome sequencing technologies?

  – Faster diagnosis?
  – Increased sensitivity of diagnosis?
    • Diagnosis in culture negative cases?
  – Diagnosis in patients already treated with antibiotics?

• Can we do this (almost) directly from clinical samples (sonication fluid) without the need for a culture step?
• Can we deduce anything about antimicrobial sensitivities?

Staphylococcus AMR: Presence confers resistance

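Gene coverage percent per sample

Antibiotic  Gene  001   002   004   007  009   010
Penicillin  blaZ  99.8  99.4  100   100  99.9  99.9

Tetracycline genes were covered in a subset of samples (sample columns not recoverable from the extracted slide):
  – tetK: 99.9, 100, 100.0
  – tetL: 84, 83.0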
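
Since presence of these genes is taken to confer resistance, a gene-coverage table like the one above can be turned into a presence/absence call. The following is a minimal sketch under stated assumptions: the blaZ values are copied from the table, but the function names and the 90% cut-off are hypothetical and are not thresholds given on the slides.

```python
# blaZ coverage (%) per sample, copied from the table above
BLAZ_COVERAGE = {"001": 99.8, "002": 99.4, "004": 100.0,
                 "007": 100.0, "009": 99.9, "010": 99.9}


def gene_present(coverage_pct, min_coverage_pct=90.0):
    """Call a gene present when enough of its length is covered by reads.

    min_coverage_pct is a hypothetical cut-off, not a value from the slides.
    """
    return coverage_pct >= min_coverage_pct


def predict_from_gene(gene_coverage, min_coverage_pct=90.0):
    """Map each sample to 'resistant' or 'no determinant detected'."""
    return {
        sample: ("resistant" if gene_present(pct, min_coverage_pct)
                 else "no determinant detected")
        for sample, pct in gene_coverage.items()
    }


penicillin_calls = predict_from_gene(BLAZ_COVERAGE)
print(penicillin_calls)
```

The choice of cut-off matters for partially covered genes: the tetL values of 84 and 83.0 would be called absent at 90% but present at 80%, so any real threshold would need to be set against laboratory sensitivity results.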

Most results match sensitivity profile seen for these drugs in the microbiology lab

Read coverage over important genes occurs rapidly

[Figure: Sample 009: Staphylococcus aureus, culture-positive]

Summary

• Metagenomic sequencing can successfully identify cause of infection in PJI

• Nanopore sequence analysis allows real-time detection of pathogens

• With improvements in DNA extraction and sample preparation we are now sequencing near-whole genomes

• Detection of antimicrobial resistance determinants is possible within hours of the start of sequencing

Limitations of Metagenomic WGS

• Human DNA can be problematic - too few bacterial reads make it difficult to determine:
  – cause of infection above background reads
  – antimicrobial sensitivity profile

• Any contaminating DNA introduced before library preparation will also be sequenced
  – Need filtering methods to identify true infection from background

• Your taxonomic data is only as good as your database
  – Taxonomic misclassification is also a problem

Future work

• Sequence more samples!

• Start to address the background/contamination problem with Nanopore data
  – Current bioinformatics pipeline was developed for low abundance of bacterial reads, so sensitivity was key
  – Many more reads and full genomes may need a different approach
  – Greater numbers of samples will allow us to determine filtering thresholds

• What about AMR determinants in other species?
• Can we apply our methods to different sample types?

Acknowledgements

Colleagues at Modernising Medical Microbiology, University of Oxford:
• Nicholas Sanderson
• David Eyre
• Leanne Barker
• Kevin Cole
• Sarah Hoosdally
• Dona Foster
• Tim Peto
• Derrick Crook
http://modmedmicro.nsms.ox.ac.uk/

Colleagues at the Bone Infection Unit, Oxford University Hospitals NHS Trust:
• Bridget Atkins
• Andrew Brent
• Matt Scarborough
• Maria Dudareva
• Adrian Taylor
• Martin McNally
http://www.ouh.nhs.uk/boneinfection/