Metagenomic Sequencing to Unveil Undetected Microorganisms

Charles Chiu, MD / PhD

Professor, Department of Laboratory Medicine and Medicine / Infectious Diseases Director, UCSF-Abbott Viral Diagnostics and Discovery Center Associate Director, UCSF Clinical Microbiology Laboratory ESCMIDUniversity of ,eLibrary © by author Disclosures

• Abbott Diagnostics (research support for pathogen discovery)

• Amazon.Com (research support for cloud-computing analysis pipeline development)

• Mammoth Biosciences, Inc. (scientific advisory board)

The mNGS assay discussed here is a laboratory-developed test (LDT) validated in the CLIA-certified UCSF Clinical Microbiology ESCMIDLaboratory and is not FDA-approved eLibrary © by author Summary

• Introduction • Clinical mNGS (metagenomic next-generation sequencing) at UCSF • PDAID (Precision Diagnosis of Acute Infectious Diseases) study • Novel target enrichment methods MSSPE CRISPR-Cas based enrichment • Host response NGS in infectious diseases Machine learning based RNA-Seq analyses for prediction of infections • Clinical nanopore sequencing ESCMIDUniversal metagenomic sequencing eLibrary of body fluids © by author Major Diagnostic Challenges in Infectious Diseases Fever / Sepsis 40 – 55% culture-negative, 16 – 30% unknown cause

• Jones, et al. (2010) Clinical Infectious Diseases 50(6):814-820 • Vincent, et al. (2006) Critical Care Med 34(2) • Phua, et al., (2013) Critical Care 2013, 17:R202 Pneumonia • Martin, et al., (2003), NEJM 348(16):1546-54 15 – 62% unknown cause Meningitis / Encephalitis • Van Gageldonk-Lafeber, (2005) CID 41:490-497 • Louie, et al., (2005) CID 41:822-828 40 – 60% unknown cause • Ewig, et al. (2002) Eur Respir J 20:1254-1262 • Jain, et al., (2015) NEJM 373(5) • Glaser, et al., (2006) CID 43:1565-1577 • Vora, et al., (2010) 82:443-451

Failure to obtain a timely diagnosis leads to delayed / inappropriate therapy, ESCMIDincreased mortality, and eLibrary excess healthcare costs © by author ESCMID eLibrary © by author ESCMID eLibrary © by author Nearly All Microbes can be Uniquely Identified by Metagenomic “Shotgun” NGS

Bacteria Viruses

Fungi Parasites ESCMID eLibrary © by author Differential Diagnosis of Sepsis NON-INFECTIOUS NON-BACTERIAL BACTERIAL (“pseudo-sepsis”) (VIRAL, FUNGAL, PARASITIC)

• GI tract source (e.g. hepatitis, colitis) . Arboviral infections

• GU tract source (e.g. urinary tract infection . Viral hepatitis . Hemorrhage

• Pelvic source (peritonitis, abscess) . Viral respiratory infection (e.g. influenza) . Pulmonary embolism • Lower respiratory tract trace (e.g. community-acquired . Enterovirus . Myocardial infarction pneumonia) . Viral Myocarditis . Pancreatitis • Intravascular source (e.g. IV line sepsis) . Filovirus infection (Ebola, Marburg), Lassa . Pseudoaneurysm • Cardiovascular source (e.g. acute bacterial . Bunyaviruses endocarditis, myocarditis) . Systemic vasculitis . Fungal infection (esp. immunocompromised) • Neurological source (e.g. meningococcemia) . Drug-induced hypovolemia . Malaria • Emerging infectious disease / travel infections (e.g. . Amebiasis plaque, leptospirosis, tick-borne infections, brucellosis, typhoid fever, bartonellosis) Mycobacterial disease . Visceral leishmaniasis Cunha BA, Shea KW. Fever in the intensive care (e.g. tuberculosis) . Acute schistosomiasis unit. Infect Dis Clin North • Pneumococcal sepsis (in asplenic patients) . Trypanosomiasis Am.Mar 1996;10(1):185- ESCMID eLibrary209 © by author NGS Library Preparation Protocol

ESCMID eLibraryMiller and Chiu, et al., 2019 (in press) © by author FDA Draft Guidance for NGS ID Validation

ESCMID eLibrary © by author mNGS Assay Validation • Limits of Detection – By 95% probit analysis, • DNA viruses (CMV) – 9.4 copies/mL • RNA viruses (HIV) – 100.7 copies/mL • Bacteria (Klebsiella pneumoniae, Staphylococcus agalactiae) – 8.7, 8.9 copies/mL • Fungi (Cryptococcus neoformans, Aspergillus niger) 0.01, 130.1 copies/mL • Parasite (Toxoplasma gondii) – 55.1 copies/mL • Precision – Inter-run precision for quality control metrics (15.8 – 65.1 % CV) and positive control organisms (27.4 – 153.8 % CV) • Accuracy – 86.1% sensitivity, 97.9% specificity after discrepancy testing • Interference – human host interference at WBC >1000/μL, effect of lipid, heme • Stability – Stability at 4°C x 6 days and after 3 freeze/thaw cycles • Mixing Study – Differentiation between Staphylococcus aureus and Staphylococcus epidermidis with pure cultures and at mixing ratios of 20:80, 50:50, and 80:20 • Clinical Utility – PDAID clinical study ESCMID eLibraryMiller and Chiu, et al., 2019 (in press) © by author SURPI (Sequence-Based Ultra-Rapid Pathogen Identification)

ESCMID eLibraryMiller and Chiu, et al., 2019 (in press) © by author Clinical mNGS Assay for Pathogen Detection in CSF (Heat Map of Read Counts)

ESCMID eLibraryMiller and Chiu, et al., 2019 (in press) © by author Clinical mNGS Assay for Pathogen Detection in CSF (Coverage and Pairwise Identity Plots)

ESCMID eLibraryMiller and Chiu, et al., 2019 (in press) © by author Clinical mNGS Assay for Pathogen Detection in CSF (Clinical Results Report)

ESCMID eLibraryMiller and Chiu, et al., 2019 (in press) © by author The Precision Diagnosis of Acute Infectious Diseases (PDAID) Study

8 in CA and nationwide Enroll/consent patients 204 total completed study CSF collected 40-60% unknown cause Clinical chart review Meningitis/Encephalitis

mNGS assay validated in CLIA lab 86% analytic sensitivity, 98% specificity

Clinical report in Clinical microbial patient EMR sequencing board

Steve Miller,ESCMID MD, PhD Michael Wilson, MD Hannah Sample, BS eLibraryWilson, et al. 2019, in press © by author Study Aims

 Collect CSF specimens from acute meningoencephalitis patients for mNGS testing at UCSF

 Compare the diagnostic yield mNGS assay to standard microbiologic testing

 Interpret clinical mNGS results in the context of a precision medicine board and clinical interface

 Evaluate impact of the mNGS assay on costs and clinical outcomes ESCMID eLibraryWilson, et al. 2019, in press © by author Demographics

• 204 patients

• 44% women

• Mean age: 40 years (7 days – 81 years)

• 41% immunocompromised

• 49% ICU admission

• 34% meningitis alone

• 86% acute presentation ESCMID eLibraryWilson, et al. 2019, in press © by author Primary Outcome: mNGS Versus Direct-Detection Microbiological Testing

• For non-serological diagnosis of infections from CSF, mNGS was 80.0% sensitive and 98.2% specific VERSUS 67.5% sensitivity and 99.4% specificity for conventional, direct CSF testing (culture, antigen testing, PCR) ESCMID eLibraryWilson, et al. 2019, in press © by author Clinical Utility

PDAID Trial Summary Diagnoses (n=58) Clinical Utility

9 cases with direct clinical impact 13 - mNGS only 50% had no • Neisseria sp. – changed antibiotics diagnosis • Nocardia farcinica – changed antibiotics 28% • Candida tropicalis – changed infectious 19 by both antibiotics, discontinued antifungals • Hepatitis E virus – antiviral, stopped liver transplant n = 204 • Enterobacter aerogenes – changed antibiotics • Enterococcus faecalis – changed antibiotics • 26 - All other methods Streptococcus mitis – changed 8% antibiotics • Streptococcus agalactiae – changed 14% Other Autoimmune antibiotics • Streptococcus agalactiae – changed • Clinical false negatives by mNGS (n=26) antibiotics  Diagnosed by serology (n=11)  Diagnosed by samples other than CSF (n=7)  Low pathogen titer in CSF (n=8) ESCMID eLibraryWilson, et al. 2019, in press © by author 58 y/o immunosuppressed woman with fever, headache, nausea/vomiting

• Immunosuppressed, s/p bilateral lung transplant in 2011 • Admitted to in October 2016 with acute meningitis • On history, with 5-years of neurological symptoms (slurred speech, dizziness, leg weakness, first-time seizure in 016) • Fever to 38.3oC, pancytopenic, transaminitis (negative for hepatitis A,B, and C); Ct MRI scans unrvealing • Started on empiric antimicrobials: IV vancomycin, ceftazidime, acyclovir, and voriconzole

ESCMID eLibraryMurkey, et al., 2017, Open Forum Infectious Diseases, 4(3):ofx121 © by author • Lumbar puncture done, showing a lymphocytic pleocytosis

• WBC 10, 88% lymphocytes, protein 29, glucose 48

Murkey, etESCMID al., 2017, Open Forum Infectious Diseases, 4(3):ofx121 eLibrary © by author mNGS Diagnosis: Viral Meningoencephalitis from Hepatitis E, genotype 3A

ESCMID eLibrary © by author • Patient re-admitted with liver failure, placed on liver transplant list • After confirmation and communication of mNGS results, patient treated with ribavirin and clinically improved • This is a likely case of donor-transmitted HEV (positive anti-HEV antibody testing of donor’s serum) We may have spared this patient from having to undergo a liver transplant ($812,500 USD)* Murkey, etESCMID al., 2017, Open Forum Infectious Diseases, 4(3):ofx121 eLibrary*2017 U.S. organ and tissue transplant cost estimates and discussion, Milliman, 2017 © by author CLIA-Validated mNGS Assay at UCSF

ESCMID eLibrary © by author http://nextgendiagnostics.ucsf.edu

ESCMID eLibrary © by author Target Enrichment Methods

• Capture probe NGS enrichment (Metsky, et al., 2019, Nature Biotechnology, 37(2):160168)

• PCR amplicon NGS enrichment (Quick, et al., 2016, Nature, 530:228-232)

• MSSPE enrichment (Theze, et al., 2018, Cell Host Microbe 23(6):855-864)

• CRISPR-Cas based enrichment (Quan, et al., bioRxiv, https://doi.org/10- .1104/426338)

ESCMID eLibraryChiu, et al., Nature Review Genetics, 2019 (in press) © by author Conventional Target Enrichment Methods

Chiu, et al.,ESCMID Nature Review Genetics, 2019 (in press) eLibrary © by author MSSPE (Metagenomic Sequencing with Spiked Primer Enrichment)

More sensitive Less sensitive

More biased More unbiased

Less comprehensive More comprehensive

Tiling Amplicon PCR GENOME SEQUENCING Metagenomic Sequencing + Probe Enrichment

MSSPE PATHOGEN Singleplex / Metagenomic Shotgun DETECTION ESCMIDMultiplex PCR eLibrarySequencing © by author MSSPE (Metagenomic Sequencing with Spiked Primer Enrichment)

ESCMID eLibrary © by author MSSPE (Metagenomic Sequencing with Spiked Primer Enrichment)

Thézé, et al., 2018, Cell Host & Microbe, 23(6):855-864 Deng, etESCMID al., 2019, under review eLibrary © by author Tracking ZIKV Emergence in Mexico and Central America

Julien Theze, PhD

ESCMIDThézé , eLibraryet al., 2018, Cell Host & Microbe, 23(6):855-864 Oliver Pybus, PhD © by author ESCMID eLibrary(with Dr. Asim Ahmed, Boston Children’s Hospital) © by author Arboviral MSSPE Panel (to facilitate viral detection and genome assembly)

• West Nile virus (1,100 genome sequences) • Zika virus (96 genome sequences) • dengue virus, all 4 strains (3,860 genome sequences) • chikungunya virus (281 genome sequences

ESCMID eLibraryDeng, et al., 2019, in preparation © by author POWV polyprotein is >99% complete ESCMID eLibrary(with Dr. Asim Ahmed, BCH) © by author Rapid Multiplexed Identification using CRISPR-Cas and Metagenomic Nanopore Sequencing

Potential Applications: (1) Antimicrobial drug resistance profiling directly from clinical specimens (2) 16S/18S/ITS universal pathogen detection (3) Lyme and tickborne infection diagnosis )) Chen, etESCMID al, 2018, Science, 360(6387):436-439 eLibrary(collaboration with Mammoth Biosciences) © by author FLASH (Finding Low-Abundance Sequences by Hybridization) (~3 hr sample-to-answer turnaround time)

Quan, et al., 2019, bioRxiv, https://doi.org/10.1101/426338, ESCMID eLibrary(collaboration with the Chan-Zuckberberg ) © by author AMR Gene FLASH patient 288 DNA FLASH patient 288 RNA NGS ctrl patient 288 RNA Mechanisms TEM 225 61 0 ESBL β-lactamase tetC 220 31 0 tetracyclin R cat/catl 69 3 0 chloramphenicol R regulator promotes acrD expression cpxR 51 0 0 (efflux) baeS/baeR 32 1 0 efflux pump rpoC 30 412 0 daptomycin R, e.g. Staph aureus rpoB 0 663 0 rifampin/daptomycin gyrA 20 144 0 fluoquinolone R mecA 1 404 1 MRSA plasmid-encoded 18 0 0 chloramphenicol R cat parE 18 16 0 fluoquinolone R gyrB 16 36 0 aminocoumarin R glpT 16 0 0 fosfomycin R mdtF 4 0 0 MDR efflux

ESCMID eLibraryWith Charles Langelier and Emily Crawford, UCSF Biohub © by author Machine Learning-Based RNA-Seq Analyses of Host Response

ESCMID eLibrary © by author Predictive Model from CSF RNA-Seq Data 1. Randomly split dataset to training & test subsets 2. Train a ML model on training subset 80% training 69 viral 36 bacterial model 14 fungal 3. Make predictions on 18 autoimmune testing subset 25 other 20% testing prediction

Features & Targets 4. Compare testing predictions to actual ESCMID eLibrarytarget = accuracy © by author CSF Metagenomic NGS Data

100

Median = 21.99% 75 Mean = 26.30%

%

e

g SD = ± 20.02%

a

r

e

v

o Type

C

e 50 Bacteria

m

o

t Virus

p

i

r

c

s

n

a

r

T 25

0

Bacteria Virus Infection

ESCMIDTranscriptome coverage relatively eLibrary low! Li and Santos, et al., 2019 (manuscript in preparation) © by author Bootstrapping Model by Random Permutations (“Shuffling”) of the Training and Validation Sets

• The best model with 1000X bootstrapping provides a robust estimate of accuracy (sensitivity analysis)

• Over 95% accuracy on average

• This is not a dummy model!

ESCMID eLibraryLi and Santos, et al., 2019 (manuscript in preparation) © by author Learning Curve from Cross-Validation

ESCMID eLibraryLi and Santos, et al., 2019 (manuscript in preparation) © by author Host Response Pathways Corresponding to Viral Infection- Associated Genes

ESCMID eLibraryLi and Santos, et al., 2019 (manuscript in preparation) © by author Summary of Model Performance

• Accuracy was 90.91% precision recall f1-score support viral 88% 100% 93% 14 bacterial 100% 75% 86% 8 avg / total 92% 91% 91% 22

• Confusion matrix (rows = predictions; cols = actual)

viral bacterial

viral 14 0

bacterial 2 6

91% accuracy in discriminating bacterialESCMID from viral infection from CSF eLibraryLi, et al., 2019 (manuscript in preparation) © by author Probability of Viral Infection = 92.39% Probability of Bacterial Infection = 7.61% ESCMID eLibrary © by author Probability of Viral Infection = 86.1% Probability of Bacterial Infection = 13.9% ESCMID eLibrary © by author Probability of Viral Infection = 16.5% Probability of Bacterial Infection = 84.5% ESCMID eLibrary © by author

Clinical Nanopore Metagenomic Sequencing

10 minute batch analysis time minutebatch 10analysis ~

Wayne Deng, PhD Scot Federman, BA Dianna Ng, MD Asmeeta Achari, BS, MS ESCMID eLibraryDeng, et al., 2019, manuscript in preparation © by author Nanopore Sequencing

• Advantages – Real-time sequence analysis (Oxford Nanopore Technology MinION) – Expansion potential (ProMethion, GridION, etc.) – Portable, pocket-size, amenable for field work – Potentially fast turnaround times, key for infectious disease sequencing

• Disadvantages – Cost of sequencing ($500 per flow cell) – Error rates still 8-12% ESCMID– Quality of flow cells can be variable eLibrary © by author ESCMID eLibrary © by author Universal Metagenomic Assay from Body Fluids

Illumina versus Nanopore Sequencing Diagnostic Accuracy

ESCMID eLibraryGu, et al., 2019, manuscript in preparation © by author 2-year old with new-onset seizures and large left brain abscess • History of multiple ear infections and cellulitis • Last month had a febrile seizure s/p antibiotics and tympanic membrane tube placement • Allergic to penicillin  treated with ceftriaxone or azithromycin • Presented with seizures, wobbly gait, decreased use of right hand, hearing difficulties (associated with otitis media) • Brain CT/MRI at OSH  large cyst in L parietal lobe with surrounding clarifications • Exposure history: lives in northern California, no sick contacts, no rural/farm exposures; no undercooked foods; 2 pet dogs, healthy; no time in backyard or digging in dirt • ESCMIDAdmitted for neurosurgical evaluation eLibrary © by author ESCMID eLibrary © by author 2-year with new-onset seizures and large left brain abscess

• Workup for immunodeficiency (lymphocyte subsets, HIV testing) • Extraction of 40 cc of CSF from abscess • Concern for – Anaerobic infection – Balamuthia mandrillaris – Neurocysticercosis – Nocardia spp. – Mycobacterium tuberculosis – Toxoplasmosis – Listeria monocytogenes ESCMID– Non-infectious etiology (malignancy) eLibrary © by author Streptococcus pyogenes abscess

• Direct DNA extraction  nanopore sequencing • First read detected in 40 minutes • Overall turnaround time from sample to answer: 3 hours • 16S universal PCR also positive for S. pyogenes, cultures negative

ESCMID eLibrary © by author ESCMID eLibrary © by author Key Take-Home Messages from Clinical mNGS Testing

• It’s already here – so clinicians and laboratories need to be aware of it • Well-designed studies are needed to address clinical utility and indications • 3 categories of mNGS “misses”: (1) wrong sampling site (e.g. brain biopsy versus CSF), (2) serology diagnosable only (it’s a direct detection method), (3) low pathogen titers (may be addressed in future by reporting/discussion of pathogen detections below formal reporting thresholds) • Turnaround time, cost, and regulatory issues are still major concerns • Clinical metagenomic assays may leverage CRISPR-Cas and MSSPE technologies, host response (for clinical interpretation), and nanopore sequencing in the futureESCMID eLibrary © by author Precision Diagnosis of Acute Infectious Diseases (PDAID) Team

UCLA UCD Romney Humphries, PhD/D(ABMM) UCSF Christopher Polage, MD Jeffrey Klausner, MD/MPH Charles Chiu, MD/PhD (PI) Stuart Cohen, MD Paul Vespa, MD Steve Miller, MD/PhD (Co-I) Czarina Ganzon Jamie Murkey, BS (Site Coordinator) Hannah Sample, BS (Study Nicolle Ocampo Coordinator) Kelsey Zorn, BS (UCSF Site St. Jude Children’s Coordinator) Research Hospital Randall Hayden, MD Joseph DeRisi, PhD (Co-I) UCB Gabriela Maron, MD Michael Wilson, MD (Co-I) Brent Fulton, PhD Anthony Joseph, PhD UCSF Chiu Lab and VDDC Children’s Hospital Los Angeles Scot Federman, BA Matt Massie, BS Jeffrey Bender, MD Andrea Granados, PhD Jennifer Dien-Bard, MD Asmeeta Achari, BS Children’s Hospital Colorado Samia Nacache, PhD Shaun Arevalo, CLS Kevin Messacar, MD Jerome Bouquet, PhD Samuel Dominguez, MD/PhD Guixia Yu, BS Dianna Ng, MD Children’s National Medical Center Quest Diagnostics Wayne Deng, PhD DNAnexus Brittany Goldberg, MD Rick Pesano, MD/PhD Doug Stryke, MS David Shaywitz, MD/PhD Tony Li, BS Roberta DiBiasi, MD John Leake, MD/MPH Guixia Yu, BS Marcus Kinsella, BS Web: https://chiulab.ucsf.edu Wei Gu,ESCMID MD/PhD eLibrarymNGS Testing: https://nextgendiagnostics.ucsf.edu © by author CDPH Acknowledgements Sharon Messenger, PhD Shigeo Yagi, PhD Debra Wadford, PhD UCSF Chiu Lab and VDDC Biohub, UCSF Calla Martyn, BS Emily Crawford, PhD UCSF Neurology Scot Federman, BA Charles Langelier, MD/PhD Michael Wilson, MD Asmeeta Achari, BS Joseph DeRisi, PhD Shaun Arevalo, CLS Blood Systems Research Institute Jerome Bouquet, PhD Michael Busch, PhD Guixia Yu, BS UCLA-DRC / INRB Funding Dianna Ng, MD Anne Rimoin, PhD • NIH R01 HL105701-01 and R21 AI120977 Wayne Deng, PhD Jean-Jacques Muyembe, PhD • UC Discovery Award Doug Stryke, MS Placide Mbala, PhD • Translational Research Institute / NASA Grant Matt Massie, BS Steve Ahuka, PhD • Abbott Pathogen Discovery Award Tony Li, BS Matt Bramble, PhD • California Initiative to Advance Precision Medicine Guixia Yu, BS Russell Williams, PhD • Charles and Helen Schwab Foundation Wei Gu, MD/PhD • George and Judy Marcus Innovation Fund Johns Hopkins University Steve Miller, MD, PhD • National Institutes of Health John Aucott, MD • Sandler Foundation Oxford University, UK Mark Soloski, MD Julien Theze, PhD • Steven and Alexandra Cohen Foundation • UCSF Medical Center (Clinical Laboratories) Oliver Pybus, PhD University of British Columbia William K. Bowes, Jr Foundation Nuno Faria, PhD Patrick Tang, MD, PhD ESCMID eLibraryE-mail: [email protected] © by author