Whole genome sequence analysis of population snapshots at continental scale

Dr. Corinna Glasner
Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
[email protected]

Capacity-Building Workshop: Rapid NGS for Characterization and Typing of Resistant Gram-Negative Bacilli

© by author ESCMID Online Lecture Library

Disclosures

Network Approaches at Both Ends of the Pipeline

• population sampling
• translation into public health knowledge

Part I

Population Snapshots at Continental Scale

Roadmap for Surveillance

Provision of a roadmap for establishing a network of laboratories for active surveillance:

i. a questionnaire survey to identify diagnostic and response gaps
ii. a consensus and standardised laboratory approach for the identification and confirmation of the pathogen of interest
iii. a laboratory capacity-building initiative using a 'train-the-trainer' approach and strict criteria for proficiency
iv. the setting up of a web-based communication tool for data and biological characteristics of isolates
v. a laboratory-based structured survey that will ultimately pave the way for an integrated surveillance and response approach

Structured Surveys

• consistent sampling frame
• scalable (in intensity and extensiveness)
• syndrome or pathogen oriented
• web-based data collection
• multi-centre, multinational, continental
• hierarchical distributed networks

Structured Survey Overview

• Step 1: Recruitment of laboratories and hospitals
• Step 2: Capacity-building workshop
• Step 3: External quality assessment
• Step 4: Sampling of isolates and data collection
• Step 5: Reference identification/confirmation
• Step 6: Submission of data
• Step 7: Data analysis & publication

Structured Survey - Workflow

[Figure: workflow per country, linking Hosp 1 to Hosp 6, Lab 1 to Lab 3, and the National Expert Laboratory. Labelled steps: Step 4: sampling of isolates and data collection; Step 5: reference identification and confirmation of CPE; Step 6: submission of data.]

European Structured Surveys

• Staphylococcal Reference Laboratories (SRL)
  – 1st structured survey
  – 2nd structured survey supported through ECDC tender

• European Survey on Carbapenemase-Producing Enterobacteriaceae (EuSCAPE) supported through ECDC tender

1st SRL Structured Survey

• between September 2006 and February 2007
• 357 laboratories serving 450 hospitals in 26 countries
• 2890 MSSA and MRSA isolates from patients with invasive S. aureus infection
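To make the scale of this survey concrete, the figures on the slide can be turned into simple sampling-intensity ratios. The sketch below uses only the four numbers quoted above; the variable names are illustrative, and the ratios are plain averages that say nothing about how isolates were actually distributed across sites.

```python
# Figures quoted on the slide for the 1st SRL structured survey.
laboratories = 357
hospitals = 450
countries = 26
isolates = 2890

# Simple averages; the real distribution across sites is not given here.
isolates_per_lab = isolates / laboratories
hospitals_per_lab = hospitals / laboratories
labs_per_country = laboratories / countries
isolates_per_country = isolates / countries

print(f"isolates per laboratory: {isolates_per_lab:.1f}")
print(f"hospitals per laboratory: {hospitals_per_lab:.2f}")
print(f"laboratories per country: {labs_per_country:.1f}")
print(f"isolates per country: {isolates_per_country:.0f}")
```

Ratios like these are one way to compare the intensity and extensiveness of different structured surveys on a common footing.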


Grundmann et al., PLoS Med, 2010

1st SRL Structured Survey


www.spatialepidemiology.net/SRL-Maps

1st SRL Structured Survey

1st SRL Structured Survey

Genomics and Discrimination

[Figure: timescale axis running from 100s+ years through decades, years and weeks/months to hours/days. Amount of genetic change runs from high to low along this axis; discrimination required runs from low to high.]

Short-Term vs. Long-Term

→ Short-Term/Local Epidemiology

[Figure: isolate from index case, isolates from secondary cases, and unrelated isolates.]

Short-Term vs. Long-Term Epidemiology

→ Long-Term/Global Epidemiology

[Figure: related isolates and unrelated isolates.]
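The distinction between related and unrelated isolates drawn on these slides can be illustrated with pairwise SNP distances. The sketch below is a minimal example under stated assumptions, not the method used in the lecture: the sequences are invented toy alignments, the function names are hypothetical, and the threshold of 2 SNPs is arbitrary. Real cut-offs depend on the species, the timescale, and the sequencing pipeline.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous
    base ('N') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def classify_pairs(
    isolates: Dict[str, str], threshold: int
) -> Dict[Tuple[str, str], str]:
    """Label each pair 'related' if its SNP distance is <= threshold."""
    labels = {}
    for (name1, seq1), (name2, seq2) in combinations(isolates.items(), 2):
        distance = snp_distance(seq1, seq2)
        labels[(name1, name2)] = "related" if distance <= threshold else "unrelated"
    return labels


# Toy aligned sequences (hypothetical data, 12 bp each).
isolates = {
    "index": "ACGTACGTACGT",
    "secondary": "ACGTACGTACGA",
    "unrelated": "TCGAACCTAGGA",
}

# Hypothetical cut-off, chosen only for this illustration.
labels = classify_pairs(isolates, threshold=2)
for pair, label in labels.items():
    print(pair, label)
```

For short-term, local questions the threshold has to be small, because little genetic change accumulates over days or weeks; for long-term, global questions larger distances remain informative.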

spatio-temporal epidemiology of isolates on a global scale

Part II

Whole Genome Sequencing

The Invasion of WGS

[Figure: collage of overlapping first pages of journal articles on bacterial whole-genome sequencing. The body text is not recoverable; legible titles include:]

• How clonal are Neisseria species? The epidemic clonality model revisited (Tibayrenc and Ayala)
• Whole-genome sequencing reveals the effect of vaccination on the evolution of Bordetella pertussis (Xu et al.)
• Phylogeographical analysis of the dominant multidrug-resistant H58 clade of Salmonella Typhi identifies inter- and intracontinental transmission events
• A high-resolution genomic analysis of multidrug-resistant hospital outbreaks of Klebsiella pneumoniae (Chung The et al., EMBO Mol Med (2015) 7: 227–239)
• Overview of molecular typing methods for outbreak detection and epidemiological surveillance (Sabat et al., Euro Surveill. 2013;18(4):pii=20380)
• Bioinformatics in bacterial molecular epidemiology and public health: databases, tools and the next-generation sequencing revolution (Carriço et al., Euro Surveill. 2013;18(4):pii=20382)
• Monitoring meticillin resistant Staphylococcus aureus and its spread in Copenhagen, Denmark, 2013, through routine whole genome sequencing (Bartels et al., Euro Surveill. 2015;20(17):pii=21112)
• Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance: a retrospective cohort study (Walker et al., Lancet Infect Dis 2015)
• Clinical value of whole-genome sequencing of Mycobacterium tuberculosis (Takiff and Feo)
• Comparison of Next-Generation Sequencing Systems (Liu et al., Journal of Biomedicine and Biotechnology, Volume 2012, Article ID 251364)
• Genome Sequences of 64 Non-O157:H7 Shiga Toxin-Producing Escherichia coli Strains (Toro et al., Genome Announc 3(5):e01067-15)
• Rapid Pneumococcal Evolution in Response to Clinical Interventions (Croucher et al.)
• Transforming clinical microbiology with bacterial genome sequencing (Didelot et al.)
resistance; clinicaldecode is20520, a discipline life bloodstream laboratories.Finland. These mysteries, that6Department infections; make focusesIntroduction WGSbetter carbapenemases; ofon crops, hasMedical multiple-step detectalready Microbiology, pathogens, earned process have notoriety and takes improve the largelyfrom Dr discriminatory lifeHoward days qualities. due E Taki (for toff , resistance Laboratorio the isola power to- “last needed resort” to antimicro- distinguish erence chromosome of S. pneumoniae ATCC on July 16, 2015 environmentalEscherichia bacteria, coli and has been used to address of the strains had been identifi ed originally but the other trees obtained from different software or phylogenetic trees Borstel, Germany (T A Kohl PhD, on July 16, 2015 NGS systems are typically represented by SOLiD/Ionglobal7 ref.S. TorrentTyphi 7). population Pioneering PGM fromclinically is Life MLEE highly Sciences, to clonalpredict studies Genome and drug showed likely resistance, Analyzer/HiSeq originated that, although drug from 2000/MiSeqsusceptibilityrd de this aGenética unique Molecular, from ,lineage or to CMBC, identify separated drug by 151phenotypes SNPs from that the cannot nearest yet neighboring be genetically CapitalcomprisedA Medicalcommon inhabitant University, all MRSA of the Beijing (nKlebsiella =rapidly 100069,1–11 300)been pneumoniae identifiedcharacterizing used P. R. China. ;for nosocomial thein StatepathogenCopenhagen characterisation infections Key Laboratorysamples to direct ofWhole Medicinalbacterial the genome treptococcustion Chemical isolates by sequencing culture, Biology, bials pneumoniae species suchall (WGS) epidemiologically identification asis a3 is highly generationexpected re-and susceptibilityThe cephalosporinstounrelated firsttrans- recognized isolates. and example carbapenems, Ideally, was Pneumococcal such a 700669 (3)and,byusingthecriteriadescribedin various aspects of tuberculosis. 
This Review a willcommon pathogennoted ancestor in has a that subsequenta drastically moved into sputum restricted the human sample, ecological population a treatment niche, several infecting IVIC,non-H58 Caracasobtained 1020A, cluster, Venezuela with which nonphylogenetic consisted exclusively methods of isolates to see whetherfrom Fiji these S Feuerriegel PhD, gutsIllumina, of many animals, and but GS FLX Titanium/GS* Junior fromn Roche. 1982, the Beijing first Genomics Shigapredicted. toxin-producing Institute This approach (BGI),Escherichia which could possesses coli be integrated(STEC) the world’s types into biggest routine and for thediagnostic presence workfl of virulence ows, phasing genes (stx out, eae phenotypic,andexhA drug-). focusNankai in onUniversity the theAdvances first potential 300457, five in Tianjin, months clinicaltypingSubject P.management usefulnessR.methodologiesin of CategoriesChina. several 2013. These of Moreover, ofChromatin, largeindividual WGSauthors have outbreaks Epigenetics,only contributed infectedfailurebeen because humans, the Genomicspatients wouldin equallydriv- Europe it exhibits &have(diagnos-form Functionalto and,this considerable beentheof work. in Genomics;databases practicecombinogenic testing thefalsely Correspondence near geneticfor designatedof rapidly which forclinical diversity, human biological growing hasmethod microbiology as significantly nasopharyngeal suggesting abacteria, can data,htaki ff discriminate @ivic.gob.ve narrowed,such andanddifferent as Molecular datainfectionEscherichia or approaches analysis, veryin Epidemiology some closely settings give Network related convergent completely clone isolates 1 (PMEN1),results; (iii)comparisonsHarris et al.(15),S 39,107 Niemann polymorphicPhD); Centre for sites were somesequencing strains can cause capacity, serious has multiple NGS systems includingserotype, 137E. 
HiSeqcoli O157:H7,susceptibility 2000, 27 wasSOLiD, testing associated one Ionwhile Torrentwith reporting mild PGM, to drug severeone resistance MiSeq,Forty-eight and early. one strains (75%) carried stx ,and41strains(64%)carried Tuberculosis, National Institute and requestsMRSAing for of materials force staphylococcal in should theMicrobiology, tic fieldbe future, microbiology) addressed protein of Virology molecularis tolikely A & Q.H. Host and (spa I to Pathogen(email: to)-type epidemiologyreplace monitor [email protected]) Interaction 304 currently the epidemiology of controlused or ranging L.W.typing [1,2]. commensalcoli (email: )Advances meth- fromto wanglei@months andtheremoved, respiratoryin creation(for WGSto slow-growing reveal the technology pathogen of therapeutic mathematical person-to-person esti- bacteria, and optionsan extendedS. such and pneumoniae for as strainstatis- the treatment lineage transmission, typically of1 MDR identified which as identified within the PMEN1 lineage. Maximum for tuberculosisfood poisoning, treatment as reminded and control. thatreinfection.N. meningitidis undergoesS extensive genetic recombination. 454 sequencer. We have accumulated extensiveA full experience list of authors in and sample affiliations handling, appears sequencing, at the end of andthe paper. bioinformatics analysis. In this for Communicable Diseases, of infectious diseasehuman (public disease health and microbiology). outbreaksmated (1). SinceMycobacterium to be then, responsible itsK. diversity pneumoniaetuberculosis for a has global, beenorinfections to burden producestx2. of (Munoz-PriceAtotalof25strains(39%)carriedboth fullbeing typing sequence etfor al , type2013). (ST) 81 and serotype 23F,stx genes as likelihood (Table 1). analysis produced a phylogeny with a nankai.edu.cn)Pulmonary(t304),by thepaper, 2011pathogens. tuberculosis sequenceor technologiesoutbreak S.Z. 
(email: in typeTheis [email protected]) ofgenerally thesedevelopment (ST)odologies systems 6 diagnosed had arebeen of reviewed,due bymolecular associated tothe its andHowever, ultimateAlthough first-handmethodolo- with the resolution. dataWGS natural frommultiplexing is populations moreextensive However,tical precise models experience onWGS of for thisdesktop-based to discriminating is species computational summarizedis important consistently and WGS toolsanalyzed to machines develop and to data strategies has mining to prevent further Johannesburg, South Africa Receivedwidely 21 studied December (2 2014;). However,Funding accepted Wellcomealmost not23 March only 15 2015; millionTrust,O157:H7 published National cases has ofonline been Institute invasive 11 impli- May diseaseof 2015; HealthTwenty-five doi:10.1038/ng.328 inThis Research, paperwell as strainsresults exhibiting Medical from1 (39%) the resistanResearch Arthur carriedce M. to SacklerCouncil, fourmultiple Colloquium different antibiotics,and the of subtypes theEuropean Nationalhigh of proportion Academy Union.eae: of of homoplasic sites (23%) and a presenceaGermany. continuous of gies,Mycobacterium and neonatal more tuberculosis recently Applications wardis still outbreak ofbacilli too DNAin epidemiology in laborious sequencing in the Copenhagen exhibitrelapse andinclude a methods fromstrong time-consuming detecting reinfection, linkage reduced out disequilibriumtechniques.- MIRU-VNTR sequencing toany obtainpathogen) (LD), or cost( IS6110-basedFIG. i.e., spread.1) . to nonrandom approximately At the same time EUR it 100 must be rapid, inexpensive, (S V Omar MSc, N Ismail FCPath); discuss the advantages and specifics associated withcated each with sequencing human diseases, system. At2000 as last,Ͼ (4001 applications). Since STEC the serotypes 1970s, of NGS the are havesusceptibility summarized. also ofbeta1, theSciences, epsilon,including“In the gamma1, penicillin. Light of Evolution and The theta2. 
genome IX: Clonal The sequence Reproduction: plasmid-borne of the AlternativesweakexhA togene correlation Sex,” held between the date of a strain’s breaks, monitoring trendsassociation in infection between and identifying genotypes atIdeally, different all loci of the (8). information To reconcile that is necessary– for Centre for Scientific RepoRts | 5:12888patient’s | DOi: starting10.1038/srep12888 sputum,to complementin and 2011, response 41 t304 and to therapyusefulisolatesimprove is data gaugedcollectedphenotypic in by routine inrestriction identificationthe surveillance.city fragmentCopyright per Also, genome. length © Walker a largely polymorphism Whileet al. Open detailed1 Access highly (RFLP) bioinformaticsarticle reproducible, distributedJanuary and analysisunder 9 10, easy 2015, the attermsto the perform Arnold of CC-BY. and and Mabel interpret Beckman Center of theisolation National and Acad- its distance from the root of the tree 1 Thethe Hospitalemergence for Tropical of new Diseases,been threats.these found Wellcome Ongoing two to observations,Trust be responsible developments Major Overseas which forpneumococcal Programme, them atboth first (3 ). appearindividual Oxford In population the University mutually United treatment to Clinical States, antibiotics inconsistent, Research and six haspublic Unit,was de-detected Hoemies health Chifirst of Minh Sciences identifiedprotec in City,52 and strains Vietnam- Engineering member (81%). of in TheIrvine, the clone, information CA. The isolated complete we in programreport a and here video will recordi- Human Genetics, University of the disappearancebetweenmethods, 2010 of bacilli andwas from 2012accompanied the unresolved were sputum also by after included.the question 2 generationor analysis Isolates is how of could genomelarge accurately remains sequencesIn identify bacterial a challenge, must reinfection molecular commercial or[1,2]. 
relapse epidemiology, When software typing and is bioinformatics applied web- for continuous surveil- 2 2 Patan Academy of Health632 Sciences,serogroups, Wellcome O26, Trust O45, Major O103, Overseascreased, O111, Programme, largely O121, Oxford as and a consequence University O145, cause Clinical of the70% Research emergencehelp Unit,ngs Kathmandu,better ofhospitalVOLUME most understand presentations Nepal in Barcelona 47 | NUMBER the are evolution in available 1984, 6 | JUNE revealed on of the these2015 NAS that emergent websiteNATURE it had at www.nasonline.org/GE foodborne(PearsonNETICS correlation,Oxford,N Oxford,=222, UK R =0.05,p = in DNA-sequencing technologiesMaynard are Smith likely15 etto al.affect (5) the proposed tion thewould epidemic be gained clonality in a single model, step. In principle, the 3 months of antibiotic treatment.3 The If the Wellcome patient’s Trust Sangersputum Institute, in Hinxton,most cases. Cambridge, Thus, UK uncertainty remains over whether ILE_IX_Clonal_Reproduction. (P Bradley MPhil, Z Iqbal DPhil, from 2013amounts found of datato be and ofbe t304,the examined need ST6 to(n=14) develop for epidemiologicalwere ways com- of stor-Introduction characterisation.basedanddrove solutions spread the of a creation feware In multidrug-resistantnow availablelance, of online the clones for databases respective performing (2). acquired for methodopportunities mul- microbial a Tn5252 must-type to yield integrativecharacterise results and conjugativeother with loci as 0.001) predictive (fig. S1),of on October 5, 2015 by guest which suggested that variation www.sciencemag.org of non-O157which states STEC that diseases the species (4). Although under study some undergoesof these serotypes occasionalpathogens (non-O157 STECs) and improve the accuracy of food- www.sciencemag.org 1. 
Introduction 4 Addenbrookediagnosis ’ands Hospital, monitoring Cambridge, of UK all pathogens,applications, including and togenome evaluate sequence the recently16 of an isolate introduced contains PGMall, or nearly all, C L C Ip PhD, R Bowden PhD); is clearpared of bacilliing to the and after 41 analysing earlier receiving isolates. the antibiotics, them. coming In Simultaneously, the but years, study, the bouts theuse isolates of lessonsof advances the clonal more WHO’s learntpropagation expensivetilocus target fromtyping sequence WGS in is currently to an dataend is otherwise justifi the typing (e.g. tuberculosis ed. antibioticrecombining andHowever,adequate screening epidemic resistance pop- stability by for 2035.Author gene-based profiles,overelement contributions:resistance time (ICE) phage to M.T. thator allow andnot. carries F.J.A.2,7,8 implementation wroteTo a linearized assess the paper. whether chloram- datawas from primarily whole- arising through incorporation of 5 Centreviruses, for bacteria, Tropical Medicine, fungihave and Nuffield parasites, been Department described but(personal for of Clinical geneticallythis genomeReview Medicine, (5 machines)Oxford),of many the University, information more and Oxford, strains third-generation required UK still have to directrelated sequenc- treatment trace-back and to investigations outbreaks caused by these patho- German Center for Infection sputum is later positive for bacilli, or if the bacilli were England plans1 to routinely obtain WGS on The authors declare no conflict of interest. (Deoxyribonucleicof clonalin computing complex acid) DNA (CC)6 allowedMRC was22used Centre were demonstrated the for molecular Outbreak examined development Analysisnot methods as beenulation in and the detail, Modelling,assigned of structure. special- will Department to allowMultidrug-resistant serogroups“Epidemicantibiotic us of InfectiousThetyping, to Wellcomeor” condenseclones,resistance sequenced. 
Disease serotyping Trust tuberculosis which Epidemiology,Sanger on Inst are theitute, orof greatly School poses WellcomeWGS otherefficient of favored data Public Trust the phenotypic Genome Health, infection[3].gens. greatest Imperialphenicol information), controlgenome College, resistance London, sequencing measures. plasmidUK can and Moreover, be a Tnused916 -typeclinically a el- toimported predict both DNA andResearch, not throughBorstel Site, steady Borstel, accumu- we focus on bacterial pathogens toing demonstrate technologies the andinform applications. public health All of measures. these2 aspects Indeed, will it beis becoming never cleared from the sputum,7 Thethis London raises School the question of Hygiene andbyall Tropical both M tuberculosis Medicine, positive London,and clinical purging UKCampus, isolates Hinxton, selection, in the Cambridge UK are beginning CB10 repeatedly 1SA, UK. in Department repre- ofThis articleement is with a PNAS a tetM Directtetracycline-r Submission. esistance gene (3). lation of base substitutions.Germany (S Feuerriegel, As these strains are as1 this CC has been shown to include theShiga hospital- toxin (Stx) isobstacle a cytotoxin to similar success, to Shigella with an dysenteriae estimatedtoxin 480 000 Nucleotide cases drug sequence resistance accession and drug numbers. susceptibility,The draftwe characterised genome geneticDepartment materialised of algorithmsbyStatistics, Oswald 8 Theodore forCentrelikely the image forchanges WGS Immunity, Avery analysis, that data Infection in arise into1944. 
data andfrom epidemiologically Evolution, Its sharingthe17 adoption University and of of routine Edinburgh, usefulthe Edinburgh, clear analytic informa- that UK rapid, methodologies inexpensivetyping method genome for gel-based sequencing that is going (BOX molecular 1)to be used in international of whether the initial bacteria were not eff ectively treated sented2016, in suggesting naturaldescribed populations that in this Infectious the paper. potential as Disease repeated Most Epidemiology, benefi1 data multilocus ts and justify Imperial conclusions genotypes College, its St Mary's are1 fromTo whomThis correspondence lineage was should subsequently be addressed. Email:found [email protected]. to be closely related, sequencesS Niemann); Institute acquired of by recombi- University of Oxford, *Corresponding author. Tel:type +84 8 1 39239210 (6); the; two E-mail: main [email protected] gene invariants 2013 arealone. Shiga Phenotypic toxin 1 (stx drug-susceptibility)and3 sequences forthe these genetic 64 STEC variation strains in are a large available training in GenBank set of samples and doubleacquired helicalintegration, strand epidemic structure and MRSA whole-genome fortion.composed mining (EMRSA-15) On this the sequencing. of basis, everfour clone. baseslarger we Finally, have amounts reviewed 18 The current Campus,typing sequence-basedholds Norfolkand techniques newthe Place, potential London and typing W2networks to 1PG,the replace 1 UK.study method Institute shouldmany and for complex staphylococ- analysis produce multi- of data phylo- that are portable (i.e. or the patient was newly infected† with another strain decreasingindependent cost. 
users who have extensive first-hand experiencepresent in Africa, Asia, and America (4–8)and, nation could beBiomedical identified Engineering, as loci with a high 1 South Parks Road, Oxford These authors contributedShiga equally toxin to this 2 work (stx ), whichtesting damage forMedical Mycobacterium intestinal Microbiology, epithelial National tuberculosis Reference cells and Center can kid- for take Strepto-are many listed inandTable validated 1. the fi ndings by predicting phenotypes in was determinedall MRSAof accumulated byST8012,13 James were D. also Watsondata.‡ molecularBacterial further In and this pathogens Francis analysed, review,typing Crick account methods we as in will repre- for much discussfor2 outbreak of the cal world- proteindetectiongeneticfaceted inference A and procedures (spa )-typing models. easilythat are ofused transferrable meticillin-resistant to characterize between a path- different systems) and Department of Engineering of M OX1 tuberculosis 3TG, UK. . Diff erentiatingJoint senior between authors these in these typical NGS systems in BGI (Beijing Genomicsby the late 1990s, was estimated to be causing density of polymorphisms. These events were re- neys,www.pnas.org/cgi/doi/10.1073/pnas.1502900112 causing hemorrhagicweeks, colitisandcocci, access (HC) University andto Hospital,the hemolytic-uremic necessary RWTH Aachen, laboratory Pauwelsstrasse syn- facilities 30, in2,3 an independentPNAS | dataset.July 21, 2015 | vol. 112 | no. 29 | 8909–8913 Science, University of Oxford, 1953,2 leading to the central dogmawide burden of molecular of infection. biology. For patients with bacterial ogen after it has4 been isolated by culture . However, sentativesWellcomehow Trust bioinformatics ofCentre an for important epidemiological accompanied community-acquired the surveillance changes MRSA Institute). 
in of bac- bacterialStaphylococcus pathogens aureusin (MRSA)that can has be been easily performed accessedalmost1 40% of via penicillin-resistant an open source pneumococcal web- constructed onto the phylogeny and, by using an possibilities is crucial for assessment of the cure rate of Microevolution:countries tracing52074 with transmission Aachen,the greatest Germany. diseaseRespiratory and burden Diseases is Branch,oftenACKNOWLEDGMENTS Cen-scarce. Oxford, UK (K E Niehaus MS, In mostHuman cases, Genetics, genomic University DNAinfections, defined the the crucial species dromesteps are and (HUS), to grow respectively an isolate from (7,ters8 ).a for Other Diseasethere virulence are Control substantial and factors Prevention, carriedchallenges Atlanta, by GA to 30333, be overcome, and new treatmentin Europe.terial regimens. Overall molecular the analysis epidemiology.clinical identified practice, We 85 willaiming spadefi discuss-types ning to give outbreaks Before the an since overview talking In2003 this about at of review the their the MRSA NGS we aimKnowledge systems,based to provide database, we Center, would a perspective orDepartment like diseasea client-server in the on USA the database (9). Following connected the introduc- iterative algorithmD A (Clifton14), DPhil) an alignment; Microbiology and tree Downloaded from of Oxford, Roosevelt Drive, STECs include intiminAlthough (eae)andplasmid-borneenterohemolysinUSA. genotypic5Respiratory assays and Meningeal are faster Pathogens and Research haveThe Unit, diag- study wasMethods supported by the FDA Foods Program Intramural Funds individuals,and 35benefits which STs makes from for 17 thepublic ª CCs. DNA2015specimen, health specificThe WGS sequence Authors. confirmedtoof identifyadvantages Publishedspecialised fundamental its under thespecies, the and related-online to terms to disadvantages. 
determine of totyping the review CC BY its4of.0 thepatho- license Clinical historybioinformaticssuccess Microbiology, of DNA will depend tools sequencing via thaton at thethe Hvidovre have briefly.development 19Internet. been In Hospital applied 1977,EMBOAdditionally, of thetion Molecular genomic [4]. ofin a theIn heptavalentMedicine afield typing Vol conjugate7 method| No 3 | 2015 polysaccharide used227 for based on verticallyServices, inherited Public Health base substitutions In theOxford Rapid OX3 Evaluation 7BN, UK. of Moxifl oxacin in Tuberculosis(ehxA ),Traditional both of which strain-typingnostic contribute usefulnessNational to techniques severe Institute in disease both for Communicable (IS6110 high-income in humans RFLP, Diseases ( 6 and, of9 ). the low-income Nationaland the ORISE Sample Fellowship selection Program. and processing the research on the structures andgenic functions potential of and cells to test and its the susceptibility20 to antimi- knowledge21 and analytical methods requiredvaccine to extract (PCV7) in many countries since 2000, alone was generated.England, London, UK (REMoxTB)ness3NIHR Oxforddrugdatabases of epidemiologically Biomedicaltrial 50 andpatients algorithms had linked positive allowing t304 cultures neonatal for spoligotyping,real-time out-Frederick data and January SangerMIRU-VNTRHealth2–4of bacterial 2013,developed Laboratory ) identify this Service molecular DNA Sanger and genotypes University sequencingsurveillance sequencingepidemiology. ofof Witwatersrand, technology should method We rely will was on explore an internationally stand- Sixty-four STECcountries, cultures were these grown assays aerobically screen a overnightsmall6 number in of geneticThe findings We and included conclusions 3651 in thisM report tuberculosis are those complex of theauthors genome (G Kapatai PhD); Public Health decoding of life mysteries [1]. DNAcrobial sequencing drugs. 
Together, technologies14 this information facilitates theJohannesburg and 2000,interpret South Africa.this informationSamsung Medical correctly. Centre, whichIndeed, includes the capsule type 23F as one of its sev- From this analysis (Fig. 1), a total of 57,736 after 17breakResearch weeksanalysis isolates. Centre, of treatment. and Several visualisation. Bryant non-outbreak and The colleagues impact relatedtryptic of Mpatients soy the tuberculosis agar newwhich (TSA) and loci isolatesat was 37°C, replacedcommonly based and andtheir on then defi byassociated chain-termination applications DNAne WGS outbreaks. was with of extracted all drug in However,ardised MRSA publicresistance, method using isolatesthe nomenclature, health, (also butand are known do and documenting not not we necessary sequencesand rou- it represent should how from thebe the official applicable UK, position Sierra for of Leone, the Food South and Drug Africa, England National couldJohn help Radcliffe biologists Hospital, and healthspecific care providersand rational in treatment a broad of patients. For publicSungkyunkwan application University of School new of Medicinesequencing and Asia technologies Pacific en antigens, will be a decrease in the frequency of se- single-nucleotide polymorphisms (SNPs) were comparedhad isolates WGSdisruptive with closely next-generation analysis related by to mycobacterial thesequencing neonatalDNeasy methodologiesthese isolates blood techniquesas and Sanger tissuedesigned cantinely sequencing),kit be (Qiagen, Foundationthey toinaccurate produce identify have Valencia, for and Infectious when 24 Walterchanged or CA). full exclude Disease,tracing Gilbert Librariesgenomes a and Seoul, broadroutes resistance developed Southdiscussing were twicerange Korea. byAdministration. another7 De- aof other possible week. bacterial Germany, From avenues species. and Uzbekistan, There should representing also all seven global Mycobacterial Reference Oxford OX3 9DU, UK. 
health purposes, knowledge also needs to be gained highly disruptive, and we predict that it willrotype take 23Fmany invasive disease and carriage has been identified, 50,720Laboratory, (88%) Queen of which Mary’s were intro- range of applications such as molecular cloning, breeding, 5,6 We thank Lili Fox Vélez for editorial9 support. interspersedsuggesting4 will repetitive be unrecognised evaluated, unit-variable and community number we will tandem look chains ahead of transmission trans- intosequencing these orWGS determining technologypartment for data, future of Molecular script-based basedwhether research Cell on Biolo isolates chemical gy,and bioinformatics Sungkyunkwan development with modification University programmesin ofthe field. Nuffield Department of prepared using the Nexteramechanisms. XT kit (Illumina, Culture-based San Diego, drug-susceptibility CA) and 1 ng8 testing clades (appendix 1). We did phenotypic drug-susceptibility School of Medicine and finding pathogenic genes, andabout comparative the relatedness and evolution of the pathogen to other strainsSchool ofyears Medicine, to Suwontransform 440-746,22,23 clinical South Korea. microbiologyRespiratory laboratoriesobserved (10). However, this has been accompanied duced by 702 recombination events. This gives a repeatsClinical (MIRU-VNTR) Medicine, University to distinguish treatment failure minor diDNAff erencesthus and subsequentremains belong tothe an gold-standard cleavage outbreak. at specific assay WGS for bases. can testing Because resistance. of its testing at reference laboratories in each of the countries missionnovel and challenges. insufficientof the epidemiological same species toof investigate genomic data. Only DNA. transmission Sequencesare routes wereusedand Systemic obtained tofully. 
[Figure: collage of overlapping article pages on bacterial whole-genome sequencing and typing, taken from Eurosurveillance, The Lancet Infectious Diseases (Vol 15, September 2015), Nature Reviews Genetics (Volume 13, September 2012), Science (Vol 331, 28 January 2011) and Genome Announcements (Volume 3, Issue 5, September/October 2015). The page text is not recoverable from the extraction.]

Whole Genome Sequencing

• most discriminatory method
• reproducible, unambiguous and comparable on a local, national and international scale
• costs are dropping rapidly
• allows us to address both short- and long-term epidemiology

→ All-in-One Method

Phylogenetic Trees

Single Nucleotide Polymorphisms (SNP)

ACTCGTGCTGCTGGC
ACTGGTGCCGCTGGC
ACTGGTGCCGCAGGC

[Figure: tree labelled with the sequences ACTTGTAC, ACTTGTTC and ACTTCTAC]
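The SNP comparison behind this slide can be sketched as a count of differing positions between aligned sequences. This is a minimal illustration, not the pipeline used in the studies cited: the function names `snp_distance` and `pairwise_snps` and the labels `s1` to `s3` are invented here, and only the three 15-base sequences come from the slide.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_snps(seqs):
    """Return {(name1, name2): SNP count} for every pair of named sequences."""
    return {
        (n1, n2): snp_distance(s1, s2)
        for (n1, s1), (n2, s2) in combinations(seqs.items(), 2)
    }


# The three 15-base sequences shown on the slide; the labels are hypothetical.
slide_seqs = {
    "s1": "ACTCGTGCTGCTGGC",
    "s2": "ACTGGTGCCGCTGGC",
    "s3": "ACTGGTGCCGCAGGC",
}

distances = pairwise_snps(slide_seqs)
for pair, count in distances.items():
    print(pair, count)
```

A matrix of such pairwise counts is one possible input for distance-based tree building; real analyses would first align reads to a reference and filter the variant calls.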

S. aureus Epidemic MRSA15 – ST22

S. aureus ST22

Holden et al., Genome Res, 2013

S. aureus Epidemic MRSA15 – ST22

Holden et al., Genome Res, 2013

Predicting Antibiotic Resistance

Köser et al., N Engl J Med, 2012

Population Snapshots at Continental Scale
Whole Genome Sequencing

2nd SRL - Structured Survey

• between January and July 2011 (5 years later)
• 350 laboratories serving 453 hospitals in 26 countries
• 3753 MSSA and MRSA isolates from patients with invasive S. aureus infection

Grundmann et al., Euro Surv, 2014

The EuSCAPE project

EuSCAPE - Questionnaire Survey

Glasner et al., Euro Surv, 2013

EuSCAPE - Evolution of CPE

[Figure: chart of the number of countries (y-axis, 0 to 40) at each stage in 2010 and 2013 (x-axis, years). Legend: not available, Stage 0, Stage 1, Stage 2a, Stage 2b, Stage 3, Stage 4, Stage 5.]

Grundmann et al., 2010, Euro Surv; Glasner et al., 2013, Euro Surv

EuSCAPE – Structured Survey

• between November 2013 and April 2014
• 357 laboratories serving 455 hospitals in 36 countries
• 1397 K. pneumoniae or E. coli isolates suspected to be non-susceptible to carbapenems, and 1306 susceptible control isolates

Population Genomics of Klebsiella pneumoniae

• Sequencing of 288 K. pneumoniae isolates
  – Global
    • Australia
    • USA/Caribbean
    • Vietnam (Ho Chi Minh and Hanoi)
    • Singapore
    • Laos
  – Multi-Hosts
    • human
    • bovine
    • mouse
    • sea porpoise, etc.

• “non-structured” collection

Genomic Diversity by Host

1. Global dissemination.
2. Klebsiella is ubiquitous, diversity is huge, everything is everywhere.

Isolates are coloured by host.

Holt et al., PNAS, 2015

Population structure: KpI - KpIV

• SNP analysis on 1743 core genes
• Over 175,120 SNPs were identified
• 4 distinct phylogenetic groups were identified

[Background: page E3575 of Holt et al., PNAS, 2015. Points legible in the excerpt:]

• Multidrug-resistant (MDR) K. pneumoniae, most notoriously clonal complex (CC) 258 carrying the K. pneumoniae carbapenemase (KPC) gene, is identified as an urgent threat to human health.
• 288 K. pneumoniae isolates were sequenced and compared with publicly available whole-genome sequences for a further 40 isolates.
• 1,743 core genes, encoded in 1.48 Mbp of sequence, were conserved in all 328 genomes; 175,120 SNPs were identified within these genes.
• Split network and maximum likelihood (ML) phylogenetic analysis of these SNPs identified four phylogroups with 100% bootstrap support: KpI, KpII-A, KpII-B and KpIII.
• The pangenome comprised 29,886 unique protein-coding sequences, and the gene accumulation curve revealed an open pangenome.
• KpI, KpII and KpIII shared 1,888 "common" genes present in ≥95% of genomes from each phylogroup; each genome carried a median of 3,817 additional accessory genes (median 5,705 genes per genome).
• Most accessory genes were rare: 66% were found in ≤5% of genomes and one third in only one genome.
• Mean pairwise nucleotide divergence was ∼0.5% within a phylogroup and 3–4% between phylogroups (calculated across the core genes); the two KpII-A isolates were 1.8–1.9% divergent from KpII-B.
• It has been proposed that the phylogroups be redesignated as distinct species: K. pneumoniae (KpI), K. quasipneumoniae (KpII) and K. variicola (KpIII).

Fig. 1. The phylogroups and pangenome of K. pneumoniae. (A) Split network of 328 K. pneumoniae genomes with phylogroups highlighted (scale bar: 1,000 SNPs). (B) Pangenome accumulation curves (unique protein count against number of K. pneumoniae genomes). (C) PCA based on the presence of common (5–95% prevalence) accessory genes (axes PC1 and PC4).
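The prevalence bands quoted from Holt et al. (genes present in ≥95% of genomes, rare genes found in ≤5%, and the 5–95% band used for the PCA) can be illustrated with a small sketch. The helper `classify_genes`, the genome names and the gene names below are hypothetical toy inputs; only the 5% and 95% cut-offs are taken from the text.

```python
from collections import Counter


def classify_genes(presence, low=0.05, high=0.95):
    """Label each gene by the fraction of genomes that carry it.

    presence: dict mapping genome name -> set of gene names (toy input).
    Returns a dict mapping gene -> 'common' (prevalence >= high),
    'rare' (prevalence <= low) or 'intermediate' (anything in between).
    """
    n_genomes = len(presence)
    counts = Counter(gene for genes in presence.values() for gene in genes)
    labels = {}
    for gene, count in counts.items():
        prevalence = count / n_genomes
        if prevalence >= high:
            labels[gene] = "common"
        elif prevalence <= low:
            labels[gene] = "rare"
        else:
            labels[gene] = "intermediate"
    return labels


# Toy data: 20 invented genomes; geneA in all, geneB in one, geneC in half.
genomes = {f"g{i}": {"geneA"} for i in range(20)}
genomes["g0"].add("geneB")
for i in range(10):
    genomes[f"g{i}"].add("geneC")

labels = classify_genes(genomes)
print(labels)
```

In practice the presence/absence table would come from annotated assemblies rather than hand-written sets, and the thresholds may be applied per phylogroup as the excerpt describes.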

Holt et al., PNAS, published online June 22, 2015, E3575

KpII & KpIII

[Figure, panels A and B. Legend: Source (human, bovine, plant, ant, sea water); Location (Australia, Canada, US, Laos, Singapore, Vietnam); Phenotype (hypermucoid, ESBL+).]

KpII: 50% Infection
• 39% Invasive
• 50% Nosocomial
• 0 Deaths

KpIII (K. variicola): 63% Infection
• 40% Invasive
• 42% Nosocomial
• 0 Deaths

Centre for Genomic Pathogen Surveillance

• shared sequence and metadata repository for global public health

• pathogen surveys and collation of existing datasets

• collective interpretation by public health/scientific community

Centre for Genomic Pathogen Surveillance

• static datasets – good baseline population

• addition of data in real-time from laboratories with sequencing capabilities

• prediction of resistance / virulence determinants
• tree construction and real-time tree interaction

Species of Interest in cGPS

C. difficile, V. cholerae, MTB, Campylobacter, S. aureus, E. coli, K. pneumoniae, S. pneumoniae, Shigella, Salmonella, N. gonorrhoeae, Acinetobacter spp., etc. ... and many more

[Background: page 3 of 8 of Holt et al., PNAS Early Edition. Points legible in the excerpt:]

• Between-phylogroup nucleotide conservation, at 96%, is at the level commonly used as a cutoff for species differentiation; together with differences in gene content, this supports treating KpI, KpII and KpIII as separately evolving populations that can be considered separate species.
• The collection was comprised mainly (87%) of KpI; all sequence types reported in the literature as linked to hospital outbreaks or pyogenic liver abscess belong to KpI (including CC258 and CC23).
• KpII (K. quasipneumoniae) was found almost exclusively in humans and was generally associated with colonization (50%) or HA infection (25%); no KpII or KpIII isolates were linked to liver abscess or the death of a patient.
• Almost half of the KpIII isolates were of bovine origin, compared with 20% of KpI isolates (odds ratio 5.2; P = 0.001; Fisher's exact test).
• The nif nitrogen-fixing operon was present in all KpIII (K. variicola) genomes, in only one KpI genome (a bovine mastitis isolate) and in half of the KpII-B genomes.
• Among 283 KpI genomes (247 newly sequenced and 36 publicly available), 91,898 core genome SNPs were identified; the ML phylogeny showed a deep-branching, star-like population structure.
• The KpI genomes were divided into 157 distinct phylogenetic lineages; median divergence was 0.46% between lineages (range 0.04–0.61%) and 0.02% within a lineage (range 0–0.08%).

Fig. 2. Population structure of the K. pneumoniae KpI phylogroup. (A) Phylogeny of core gene SNPs. Branch colors indicate bootstrap support according to the legend provided in the figure. Black leaves indicate bovine isolates. Lineages with more than one genome are highlighted in alternating colors and labeled by sequence type. Rh, rhinoscleromatis. (B) Rarefaction curves show the accumulation of KpI lineages in each country, labeled with Simpson's diversity index (1-D) on a scale of 0–1 (0 = no diversity, i.e., all isolates are in the same lineage; 1 = total diversity, i.e., every isolate is in a different lineage): US (0.98), Vietnam (0.97), Australia (0.95), Laos (0.93), Indonesia (0.89), Singapore (0.86).

Holt et al., PNAS Early Edition, p. 3 of 8
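The Fig. 2 caption describes Simpson's diversity index (1-D) as running from 0 (all isolates in the same lineage) to 1 (every isolate in a different lineage). A minimal sketch of one way to compute it follows. The finite-sample formula is an assumption, chosen because it reproduces those two extremes; the function name `simpson_diversity` and the sequence-type labels are hypothetical and are not data from the study.

```python
from collections import Counter


def simpson_diversity(lineages):
    """Return 1 - D for a list of lineage labels, one label per isolate.

    D = sum(n_i * (n_i - 1)) / (N * (N - 1)) is the probability that two
    isolates drawn without replacement belong to the same lineage.
    Assumption: the source does not state which form of the index it used.
    """
    n_total = len(lineages)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(lineages).values()
    d = sum(n * (n - 1) for n in counts) / (n_total * (n_total - 1))
    return 1 - d


# Hypothetical lineage labels for four isolates.
sample = ["ST23", "ST23", "ST258", "ST15"]
print(round(simpson_diversity(sample), 3))
```

With this form, a country whose isolates nearly all fall in different lineages scores close to 1, which is the pattern the per-country values in the caption (0.86 to 0.98) suggest.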

cGPS Applications

http://phylocanvas.net

Demonstration Tomorrow!

http://microreact.org

http://wgsa.net (WGSA)

Acknowledgments

The SRL working group
The EuSCAPE working group

University Medical Center Groningen
Prof. Hajo Grundmann

Imperial College London & Wellcome Genome Campus
Dr. David Aanensen and the cGPS Team

Wellcome Trust Sanger Institute (WGC)
Dr. Christine Boinett
Dr. Nick Thomson
Pathogen Genomics Group

Thank you very much for your attention!