Nring Thesis Corrected 161020.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
University of Bath PHD Investigating the genome of Bordetella pertussis using long-read sequencing Ring, Natalie Award date: 2020 Awarding institution: University of Bath Link to publication Alternative formats If you require this document in an alternative format, please contact: [email protected] General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal ? Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. Download date: 10. Oct. 2021 Investigating the genome of Bordetella pertussis using long-read sequencing submitted by Natalie Ring for the degree of Doctor of Philosophy of the University of Bath Department of Biology and Biochemistry July 2020 ORCID ID: 0000-0002-8971-8507 COPYRIGHT Attention is drawn to the fact that copyright of this thesis rests with the author. A copy of this thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with the author and that they must not copy it or use material from it except as permitted by law or with the consent of the author. This thesis may be made available for consultation within the University Library and may be photocopied or lent to other libraries for the purposes of consultations with effect from…………………..(date) Signed on behalf of the Faculty of Science……………………….. i Acknowledgements Firstly, many thanks to Dr Stefan Bagby and Dr Andrew Preston for their supervision, guidance, and patience throughout the last three (and then some more) years, as well as for coming up with the project idea and finding funding for it in the first place. On that note, I am indebted to the University of Bath and Oxford Nanopore Technologies for funding this project, and to ONT for letting me present (and paying my transport and accommodation fees!) at several of their international meetings. Thanks also to the Microbiology Society and Genetics Society for funding travel to two conferences in 2018, and to Bath’s Department of Biology and Biochemistry for funding my trip to the International Bordetella Symposium in 2019. My first first-author research paper (Chapter 2) would not have been possible without the bucket load of data I produced at ONT HQ in summer 2017. I am very grateful to ONT for the provision of many flow cells and reagents during that time, and to Dan Fordham and Phill James for sharing their lab and expertise with me. Additional ONT thanks go to Dan Turner, my industrial supervisor. Thanks to the Nanopore Group in Santa Cruz for helping Stefan get started with the R7 MinION sequencing in 2015, and for one final R9.4 run which helped me at the beginning of my PhD in 2016. Likewise, acknowledgement is due to Robert Chapman and Joshua Spear, Bagby Lab final year project students in 2015 and 2016, who performed some of the original nanopore sequencing of Bordetella pertussis, laying the groundwork for this PhD project. Josh Quick and Nick Loman at the University of Birmingham hosted me for several days over the summer of 2018, helping me to extract and sequence some (fairly) ultra-long B. pertussis DNA from various strains for Chapter 4, even if my assembly masterplan didn’t work in the end. Thanks to them for their time, as well as for being nanopore gurus in general. Many thanks go to Audrey and colleagues at New Zealand’s Institute of Environmental Science and Research for selecting, preparing and shipping their B. pertussis samples around the world, after being contacted out of the blue by us in May 2018. Without them, Chapter 3 wouldn’t exist! Thanks also to Ayla, Mohsina, Mareike and everyone else who has come and gone from 4S 0.30 for their support and comradery, for watching Crufts and various sports with me during quiet moments, and for listening to me moan when things weren’t working more often than was probably interesting! Massive thanks to Iain MacArthur and Stacy Ramkissoon for their endless help and moral support during my ventures into “proper” microbiology lab work, and particularly to Stacy for her friendship and positivity since Day One (including so many lunches, WhatsApp gossips, crazed phone calls and trips to Disneyland!). To my parents, sister and brother-in-law, friends from back home, and parents-in-law, thanks for providing much needed distraction through trips to Auckland, Mayrhofen, Cornwall and beyond, as well as for all your support and, again, listening to more moaning than was probably strictly necessary. Thanks also to my nephew Arthur for being an endless source of entertainment and pride since November 2017, and acknowledgment in advance to KulaRing number 2, due September 2020. I will be forever grateful to my husband and the most patient person I know, Jamie McBrien. Thanks for letting me drag you back to Bath (and now to Edinburgh), for listening to roughly a thousand different iterations of essentially the same presentation, and for your help with the VBA code used in Chapter 5 – finally, a use for your least useful skill! Lastly, thanks to McBRing junior for unknowingly keeping me company whilst I finished this thesis - we look forward to meeting you in November 2020! ii Abstract Whooping cough, the respiratory disease caused by the bacterium Bordetella pertussis, has been resurgent for the last thirty years. Several reasons have been suggested for this resurgence, including increased awareness and improved diagnosis techniques, waning immunity conferred by the whooping cough vaccine, and genetic shifts of circulating bacteria away from the vaccine strains. These genetic changes may have been accelerated by the switch in many countries from a whole cell vaccine to an acellular vaccine containing just one to five B. pertussis antigens. Aside from certain key genes, however, variation between B. pertussis strains appears to be very limited on the level of single genes. Instead, in recent years, a picture of genome level inter-strain variation has been emerging, beginning with the revelation that genomic rearrangements, mediated by the numerous insertion sequence elements in the B. pertussis genome, are common. Investigations of whole genome variation, alongside classic molecular epidemiological studies, may therefore be important to our understanding of how genetic changes in B. pertussis are contributing to whooping cough resurgence. Many genome level changes may be observable only in closed genome sequences. The highly repetitive B. pertussis genome, which contains up to 300 identical copies of a 1,000 bp insertion sequence, has traditionally been difficult to resolve to the single-contig level, with most genomes assembled using Illumina sequencing data consisting of at least as many contigs as there are insertion sequence copies. Here, I first define a sequencing and data processing pipeline, utilising nanopore long-read and Illumina short-read sequencing to enable the assembly of accurate, closed B. pertussis genome sequences. Using this hybrid sequencing pipeline, I then investigate the genomes of 66 B. pertussis strains isolated in New Zealand between 1982 and 2008. New Zealand commonly sees a higher rate of incidence of whooping cough than most other countries, and no isolates from the country had previously been sequenced. Several of the genomic features of the New Zealand isolates match those observed in many other countries, including a selective sweep from strains carrying the ptxP1 allele to the ptxP3 allele and a recent rapid increase in the number of strains which are unable to produce pertactin, one of the antigens usually included in the acellular vaccine. Nonetheless, the data also indicate that the strains circulating in New Zealand might be more genetically similar than those circulating in other countries, particularly in recent years, and particularly during whooping cough outbreaks. This strain screen is the first of its kind to use nanopore sequencing, and to include traditional analysis of genotypes with analysis of genome level variation, such as rearrangements and copy number variations. Next, I attempt to investigate an ultra-long genomic duplication identified whilst testing the hybrid assembly pipeline on five UK B. pertussis strains. This work ultimately shows that the complexity of the B. pertussis genome can make in vitro studies into the links between genotype and phenotype difficult. Finally, I use the closed genome sequences of every B. pertussis strain sequenced with long read technologies to investigate any recent changes in filamentous haemagglutinin, another of the antigens included in the acellular vaccine. Studies of the genes coding for this antigen have typically been limited by its length and repetitive nature, which have hindered attempts to assemble its whole sequence. This work reveals a homopolymeric locus which may be prone to slippage and which, under selective pressure, could therefore lead to an increase in the numbers of strains which are deficient in this vaccine antigen. Overall, the work in this thesis demonstrates how long-read sequencing can reveal previously unstudied or intractable aspects of B. pertussis biology, along with defining an affordable method for using nanopore long-read sequencing to assemble and study closed B. pertussis genomes. iii Conflict of Interest Statement This PhD was 50% funded by Oxford Nanopore Technologies.