Imperial College London
London Institute of Medical Sciences

Patterns of Horizontal Gene Transfer into the Geobacillus Clade

Alexander Dmitriyevich Esin

September 2018

Submitted in part fulfilment of the requirements for the degree of Doctor of Philosophy of Imperial College London

For my grandmother, Marina. Without you I would have never been on this path. Your unwavering strength, love, and fierce intellect inspired me from childhood and your memory will always be with me.

Declaration

I declare that the work presented in this submission has been undertaken by me, including all analyses performed. To the best of my knowledge it contains no material previously published or presented by others, nor material which has been accepted for any other degree of any university or other institute of higher learning, except where due acknowledgement is made in the text.

The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or transmit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work.

Abstract

Horizontal gene transfer (HGT) is the major driver behind rapid bacterial adaptation to a host of diverse environments and conditions. Successful HGT is dependent on overcoming a number of barriers on transfer to a new host, one of which is adhering to the adaptive architecture of the recipient genome. My aim was to investigate how HGT gain is spatially patterned, both on arrival and in long term maintenance.
I chose to focus on HGT into a model group of Bacillaceae that includes Geobacillus spp., not only to avoid ambiguity associated with aggregate analyses, but also because observed biases could enhance our ability to engineer this emerging chassis. In this thesis I first present my methodology for detecting HGT into a specific group; I augmented existing approaches to improve computational tractability while deriving a stringent set of horizontally transferred (HT) gene predictions. In the second results chapter, I assess the predicted HGTs in the context of previous work to find that they are highly consistent, justifying my detection approach. Finally, I dissect the topology of HT genes across Geobacillus genomes to find three large zones of contiguous HGT enrichment. I find that this patterning is driven by gene function, with metabolic genes clustering towards the terminus. Interestingly, the HGT-rich origin-proximal zones, home to many HT genes involved in membrane biogenesis, overlap with the section of the chromosome trapped within the nascent endospore during sporulation. Similar functional enrichment patterns are found in other spore-forming Bacilli, but not those unable to sporulate. This suggests that HGT flow into Geobacillus genomes may be spatially constrained, at least in part, by the sporulation program. In the final chapter, I discuss the implications of this research in a bioengineering context, and suggest possible future directions to confirm the link between HGT topology and sporulation.

Acknowledgements

First and foremost, I must thank my supervisor Toby. You provided me with this opportunity to study for a PhD with you, despite me wearing a suit to my interview. I could not have been more lucky with a supervisor and mentor. Thank you for your astute insights and your always timely reminders to stop focusing on the small things and look at the bigger picture.
Your immeasurable patience with me throughout the four years has been nothing short of extraordinary.

The Molecular Systems group - thank you for all your support and days of cumulative patience hearing out my, often confusing, ramblings. Thank you Jelena for always being ready to help. Maria, Shivani, and Val - despite making me explode with kittens, you have been awesome and always there for me, thank you. Antoine and Jake, you have brought so much energy to the lab - keep the Costa del Cutter tradition strong and be the wizards you know you can be. Voracious Vas, thank you for always keeping an eye out for me, even if it led to many awkward stares - you are now in charge of the charm!

Thank you also to everyone in our neighbouring CRG group. From winning the rounders trophy as the “Boris’ Babes”, to my frequent and unashamed theft of your coffee, you’ve made the last four years so much better.

Leonie, the figures in this thesis are dedicated to you. Without your mentorship on all things colourful this publication would not look half as good, thank you.

Tom Ellis and members of Ellis Lab, thank you for your many keen observations and suggestions, and for always making me feel like part of the group.

Hugh, Julian, Ketan - my friends and, most importantly, partners in tea and spades. It is an incredible privilege to know that you are always there, whether for a walk, a chat, or an eerily lifelike impersonation of a certain aquatic mammal!

The finest co-workers I could have asked for, Alex and James. It just would have never been the same without you. For the games, the drinks, the risotto, the all-nighters arguing over CAPS lock, the snakes, and the krabs. Life may take us sideways, but I will always find you in the oven.

My family. There is no warmer embrace to come back to than yours. Nicholas and Catherine, you have never failed to brighten my day, even when I did not know I needed it.
Max, you have been an incredible and loving force driving me to succeed since I can remember; you have taught me so much about so many things, I could not be more proud to be your son. Mama, you, more than anyone, have made all of this possible. You have done so much for me that gratitude cannot be put into words. I love you all.

Alena, you have been my rock. Through my ups, and my downs, my distractions, and my laziness you have encouraged, supported, humoured, and pushed me. From the time I applied, to the end of my PhD, you have taken every step with me and I cannot imagine this journey without you. Stay magical, my little Zebra.

Contents

Abstract
Acknowledgements
List of Figures
List of Tables
Abbreviations
List of Publications
1 Introduction
    1.1 Gene turnover in bacterial genomes
    1.2 HGT in evolution: preference to pattern
        1.2.1 Function
        1.2.2 Form follows function
        1.2.3 HGT: a matter of time
        1.2.4 Space: the final frontier
    1.3 Perspectives for synthetic biology
    1.4 The GPA clade
    1.5 Aims of this thesis
2 Clade-specific HGT inference
    2.1 Introduction
        2.1.1 Compositional and phylogenetic approaches
        2.1.2 Specific steps in phylogenetic analysis
        2.1.3 Chapter aims
    2.2 Methods
    2.3 Results
        2.3.1 Two genomic datasets
        2.3.2 Genes to proteins, proteins to homologues
        2.3.3 Homologues to gene families
        2.3.4 Gene families to gene trees
        2.3.5 Reference tree reconstruction and local correction
        2.3.6 HGT inference with Mowgli
        2.3.7 Refining HGT and vertical gene history predictions
        2.3.8 Determining the age of transfer
        2.3.9 RAxML vs FastTree
    2.4 Discussion
3 Assessment of HGT into GPA
    3.1 Introduction
        3.1.1 Function and nucleotide content as hallmarks
        3.1.2 Temporal patterns of HGT into GPA
        3.1.3 Sources of genes transferred into GPA
    3.2 Methods
    3.3 Results
        3.3.1 Function of HT and vertical genes
        3.3.2 Nucleotide content of HT and vertical genes
        3.3.3 Temporal dynamics of gene flow into GPA
        3.3.4 Sources of HGT into GPA
    3.4 Discussion
4 Topology of HT genes in GPA genomes
    4.1 Introduction
    4.2 Methods
    4.3 Results
        4.3.1 Gene history spatially patterns GPA genomes
        4.3.2 Inversions around origin-terminus axis drive symmetry
        4.3.3 HGT zones encompass genomic islands
        4.3.4 HGT zones are equally selective of incoming genes
        4.3.5 HGT zones are patterned by gene function
        4.3.6 The Near Origin zone and sporulation
    4.4 Discussion
5 Discussion
    5.1 Clade-specific HGT inference
    5.2 Assessment of HGT into GPA
    5.3 Topology of HT genes in GPA genomes
    5.4 Evolutionary lessons for synthetic biology
    5.5 Future directions
        5.5.1 HGT domain topology across prokaryotes
        5.5.2 Spore-specific genetic innovation in GPA clade
References
Appendix A Supplementary Figures
Appendix B Supplementary Tables
Appendix C Other Publications

List of Figures

1.1 The GPA clade in the Bacillaceae family
2.1 Phylogenetic context in HGT inference
2.2 Clustering of paralogues
2.3 Gene family size reduction
2.4 Reference tree refinement
2.5 GPA-to-GPA transfers and refining the recipient branch
2.6 GPA assignment to genomospecies groups
2.7 Comparison of RAxML and FastTree methods