Barcoded Transposon Directed Insertion-Site Sequencing (Tradis); a Tool to Improve Our Understanding of the Functional Genomics of Streptococcus Equi Subsp
Total Page:16
File Type:pdf, Size:1020Kb
Barcoded Transposon Directed Insertion-site Sequencing (TraDIS); a tool to improve our understanding of the functional genomics of Streptococcus equi subsp. equi Amelia Rose Louise Charbonneau Department of Veterinary Medicine University of Cambridge This thesis is submitted for the degree of Doctor of Philosophy Lucy Cavendish College December 2018 Barcoded Transposon Directed Insertion-site Sequencing (TraDIS); a tool to improve our understanding of the functional genomics of Streptococcus equi subsp. equi Amelia Rose Louise Charbonneau Strangles, caused by Streptococcus equi (S. equi) remains the most frequently diagnosed infectious disease of horses and is a cause of significant welfare and economic cost. Vaccine research has been limited by the time taken to make mutations in individual genes to determine their role in the disease process. However, the development of transposon directed insertion-site sequencing (TraDIS) technologies provides an opportunity to simultaneously determine the importance of every gene in S. equi under disease relevant conditions, significantly enhancing the capacity to identify new vaccine targets. In this project, a novel barcoded TraDIS technique was designed, which identified that 19.5 percent of the S. equi genome is essential to basic survival in rich medium in vitro, 73.4 percent of genes being non-essential, with the remainder either not defined or of an ambiguous assignment. Comparative analysis revealed that more than 83 percent of the essential gene set of S. equi was concordant with the essential genomes of S. pyogenes and S. agalactiae, highlighting the close genetic relationships between these important pathogenic bacteria. Barcoded libraries were exposed to hydrogen peroxide (H2O2) and whole equine blood, to simulate the interaction with the equine immune system. Sequencing of surviving mutants enabled identification of genes important to S. equi under these conditions in vitro. Fifteen and 36 genes were implicated in the survival of S. equi in H2O2 and whole equine blood, respectively. Results were validated by generating deletion mutant strains in 4 of the genes (pyrP, mnmE, addA and recG). Mutant strains were exposed to H2O2 or whole equine blood and surviving bacteria measured over time. An additional 2 deletion strains in eqbE and hasA, generated prior to this project, were also utilised. Barcoded TraDIS is proposed to reduce the effects of stochastic loss commonly seen in similar datasets, enhancing the ability to resolve differences in the fitness of mutants. To determine the in vivo capabilities of barcoded TraDIS, 12 Welsh mountain ponies were each infected with 2 of 3 barcoded libraries. Viable mutants were recovered and sequenced from the abscess material of infected lymph nodes and data analysed both exploiting (barcoded analysis; BC) and disregarding (per animal analysis; PA) the input library barcodes. Exploiting the barcodes enables output data to be combined on a per input library basis, as opposed to a per animal basis as is traditionally completed in comparable in vivo transposon library studies. From the BC analysis, sequencing identified 368 genes required for fitness. Mutations in a further 85 genes conferred a fitness advantage in vivo. In the PA analysis, only 97 genes required for fitness were identified, which were all similarly identified in the barcoded analysis. No genes in which an insertion conferred a fitness advantage were identified in the PA analysis. To validate these results and confirm the benefit of applying a barcoded technique, 12 genes required for fitness were selected, plus 1 control gene where transposon insertions did not alter fitness, for tagged allelic replacement mutagenesis and repeat challenge in vivo. Seven genes required for fitness in both methods of analysis were selected, plus an additional 5 genes uniquely identified by the BC analysis. All deletion mutants appeared to be attenuated in vivo, however the control mutants and wild-type S. equi did not behave as expected, confounding statistical analysis. Thirty-nine percent (14/36) and 60 percent (9/15) of fitness genes identified in the whole equine blood and H2O2 TraDIS screens, respectively, were also identified as being required for in vivo fitness. Nine consensus genes were identified as required in all 3 experiments. Comparison of the genes implicated in in vivo survival of S. equi to those in S. pyogenes in a non-human primate model of necrotising myositis and in a mouse model of subcutaneous infection, uncovered a set of 23 pan-species fitness genes. Eighteen genes were also commonly identified between the S. equi in vivo data and S. pyogenes ex vivo in human saliva, alluding to the potential genes required by S. equi to survive in the nasopharynx before translocation to the local lymph nodes. The data presented in this thesis provide an unprecedented insight into the mechanisms employed by S. equi to cause disease in the natural host. The data also shed light on the pan-streptococcal pathways important for virulence that are likely to be important for future development of novel therapeutics and vaccines. Declaration I hereby declare that the contents of this dissertation are the result of my own work except where specific reference is made to the work of others and includes nothing which is the outcome of work done in collaboration except where specified in the text. This work has not been submitted in whole or in part for consideration for any other degree or qualification in this, or any other university. This dissertation contains fewer than 60,000 words excluding appendices, bibliography, tables and figures. Signed:______________________________________________________________ Date:_________________________________________________________________ Amelia Rose Louise Charbonneau Acknowledgements I would like to thank Dr Andrew Waller, for his outstanding guidance and encouragement throughout my time at the Animal Health Trust. He has been an incredibly supportive and considerate supervisor who has believed in me since day 1. His enthusiasm and optimism have made this PhD a rewarding and fulfilling experience that I will remember fondly. I would like to thank other members of the Animal Health Trust family, Carl Robinson for his advice and assistance in particular with generating the validation mutants used in this thesis and overall guidance when it came to troubleshooting experiments. Dr Oliver Forman assisted in the initial TraDIS runs and taught me most of what I know about sequencing. I am also grateful to Catriona Mitchell for her support in the laboratory but also for the emotional support and friendship that was completely invaluable, especially over the last year. All of the in vivo work completed in this PhD would not have run as smoothly as it did, without her assistance. I would also like to thank my Animal Health Trust mentor, Sally Ricketts, for keeping an eye on me and meeting up for chats about my progress. I would also like to thank 2 people in particular at the Wellcome Sanger Institute, Dr Amy Cain and Dr Matthew Mayho for their interest in my project and for the help with experimental design. Sequencing of the in vivo samples at the Sanger would not have been possible without Dr Matthew Mayho. Dr Lars Barquist also provided invaluable advice regarding the analysis of the in vivo data and helped implement the TraDIS scripts. Prof. Duncan Maskell and Prof. James Leigh have also provided me with a wealth of advice and have spent many hours discussing data and experiments with me. Regular meetings with both Duncan and James always made me feel good about my research and spurred me on for the next experiment. I am incredibly grateful for their encouragement and interest in my project. Last, but not least, I would like to acknowledge my amazing network of family and friends that have supported me thought my studies. In particular, my partner Richard Armstrong, has seen me through the good and the bad. He has always believed in me and has encouraged me through this PhD, to make it the best piece of research I could produce. Abstract Strangles, caused by Streptococcus equi (S. equi) remains the most frequently diagnosed infectious disease of horses and is a cause of significant welfare and economic cost. Vaccine research has been limited by the time taken to make mutations in individual genes to determine their role in the disease process. However, the development of transposon directed insertion-site sequencing (TraDIS) technologies provides an opportunity to simultaneously determine the importance of every gene in S. equi under disease relevant conditions, significantly enhancing the capacity to identify new vaccine targets. In this project, a novel barcoded TraDIS technique was designed, which identified that 19.5 percent of the S. equi genome is essential to basic survival in rich medium in vitro, 73.4 percent of genes being non-essential, with the remainder either not defined or of an ambiguous assignment. Comparative analysis revealed that more than 83 percent of the essential gene set of S. equi was concordant with the essential genomes of S. pyogenes and S. agalactiae, highlighting the close genetic relationships between these important pathogenic bacteria. Barcoded libraries were exposed to hydrogen peroxide (H2O2) and whole equine blood, to simulate the interaction with the equine immune system. Sequencing of surviving mutants enabled identification of genes important to S. equi under these conditions in vitro. Fifteen and 36 genes were implicated in the survival of S. equi in H2O2 and whole equine blood, respectively. Results were validated by generating deletion mutant strains in 4 of the genes (pyrP, mnmE, addA and recG). Mutant strains were exposed to H2O2 or whole equine blood and surviving bacteria measured over time. An additional 2 deletion strains in eqbE and hasA, generated prior to this project, were also utilised. Barcoded TraDIS is proposed to reduce the effects of stochastic loss commonly seen in similar datasets, enhancing the ability to resolve differences in the fitness of mutants.