Whole Genome Epidemiological Typing of Escherichia coli
Downloaded from orbit.dtu.dk on: Oct 05, 2021

Kaas, Rolf Sommer
Publication date: 2014
Document Version: Publisher's PDF, also known as Version of record

Citation (APA):
Kaas, R. S. (2014). Whole Genome Epidemiological Typing of Escherichia coli. Technical University of Denmark.

Rolf Sommer Kaas
PhD Thesis 2014

Supervisors and Funding

This thesis was written in collaboration with three institutions: DTU Food, DTU Center for Biological Sequence analysis (CBS), and Statens Serum Institute. The main supervisor was Frank Møller Aarestrup (DTU Food). The co-supervisor for the first half of the PhD was David W. Ussery (CBS), and the co-supervisor for the second half was Ole Lund (CBS). The PhD was supported by the Center for Genomic Epidemiology (www.genomicepidemiology.org), grant 09-067103/DSF from the Danish Council for Strategic Research.

Table of Contents

Supervisors and Funding
Table of Contents
Acknowledgements
List of original articles
List of original articles not included in PhD
Summary
Danish Summary
Problem Statement
E. coli
  Taxonomy
  Ecology
  Pathogenic Classification
  Epidemiology & Clinical importance
  Pathogenesis
Typing of Escherichia coli
  Serotyping
  Pulsed-Field Gel Electrophoresis (PFGE)
  Multi Locus Sequence Typing (MLST)
  Next generation sequencing (NGS) in epidemiology
Defining a gene
The E. coli genome
Whole genome typing
  Single Nucleotide Polymorphism (SNP) analysis
  K-mer, nucleotide difference (ND), and gene-by-gene
  Defining clones
Future perspectives, challenges & Conclusion
  Conclusion
References
Articles

Acknowledgements

The most important person to thank is of course my awe-inspiring wife Chilie Maria Sommer Kaas.
Chilie gave birth to our lovely daughter halfway through this PhD, and having a baby and an absent-minded, busy husband cannot have been easy. But she has remained supportive and even managed to travel with me on my external research stay in the United States, so that I didn’t have to leave my daughter, then only one year old, for several months.

I want to thank Frank M. Aarestrup for including me in his wild ambitions. In the beginning I felt like we were running really fast, down a path completely covered in impenetrable fog, towards a goal we weren’t sure existed. However, as time has passed, the path remained, the goal became clearer, and the fog slowly started to lift. I am truly excited to do research with Frank, and exciting research requires a leap of faith from time to time.

I am also really thankful that Ole Lund stepped in as co-supervisor after David left for Oak Ridge National Laboratory. Ole has many great ideas, and some of them even have to do with science. Ole has been key to the development of several of the bioinformatic methods.

As mentioned, David Ussery left for Oak Ridge, but he still deserves huge thanks for including me in exciting projects and, not least, for his dedication to his students.

I also want to thank Rene Hendriksen and Henrik Hasman for including me in several very exciting research projects, which have resulted in several of the publications not included in this PhD.

Whenever administrative tasks seemed confusing or overwhelming, Vibeke Hammer stepped in and made everything better. Thanks to Vibeke for relieving me of many administrative headaches.

Thank you also to Carsten Friis, who was the first real bioinformatician in Frank’s group. Carsten took very good care of me when I started here and made sure to introduce me to all the right people.

A huge thanks also goes to Mette Christiansen and Maria Seier-Petersen, who also received me with open arms and made me feel right at home.
Mette and Maria showed me that there was more to a PhD than studying and writing.

I would also like to thank Marlene Hansen, who always claims that she can’t contribute anything to my PhD, but nonetheless has. Marlene has provided me with great articles to read and also gives great feedback, both professionally and personally, which is an important skill when you share an office.

This leads me to Pimlapas “Shinny” Leekitcharoenphon, to whom I also owe many thanks for all her help with several articles and for always spreading a good mood in the office.

As a bioinformatician I am most dependent on the people I see the least during my workday: the technicians. Many thanks go to all the technicians who made sure that there were actually sequence data for me to work with.

Thanks to Katrine Joensen and Ea Zankari for being my surrogate office colleagues when Shinny and Marlene were absent.

Finally, thanks to all of department G for providing a great working environment.

List of original articles

I. Kaas RS, Friis C, Ussery DW, Aarestrup FM (2012) Estimating variation within the genes and inferring the phylogeny of 186 sequenced diverse Escherichia coli genomes. BMC Genomics 13:577.

II. Leekitcharoenphon P, Kaas RS, Thomsen MCF, Friis C, Rasmussen S, Aarestrup FM (2012) snpTree - a web-server to identify and construct SNP trees from whole genome sequence data. BMC Genomics 13(Suppl 7):S6.

III. Kaas RS, Leekitcharoenphon P, Aarestrup FM, Lund O (2014) Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms. PLoS ONE 9(8): e104984.

IV. Kaas RS, Rasmussen S, Scheutz F, Lund O, Aarestrup FM (2014) Investigation of methods to define Escherichia coli outbreak strains based on whole genome sequence data from 10 different outbreaks. Manuscript for submission to: J Clin Microbiol.

List of original articles not included in PhD

Evaluation of whole genome sequencing for outbreak detection of Salmonella enterica.
Leekitcharoenphon, Pimlapas; Nielsen, Eva M.; Kaas, Rolf Sommer; Lund, Ole; Aarestrup, Frank Møller.