This Thesis Has Been Submitted in Fulfilment of the Requirements for a Postgraduate Degree (E.G
Total Page:16
File Type:pdf, Size:1020Kb
This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use: This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author. The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author. When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given. Evolutionary Analyses of Plasmodium Genomes Oscar A. MacLean Submitted for the degree of Doctor of Philosophy Institute of Evolutionary Biology University of Edinburgh 2019 Declaration The composition of and work contained within this thesis is my own work, except where explicitly stated otherwise. Oscar A. MacLean i Abstract Plasmodium is the genus responsible for causing malaria in humans, killing more than 400,000 individuals a year. Species from the genus are also capable of infecting different primate, avian and rodent species. In this thesis I use previously published genomic data to investigate several aspects of Plasmodium evolution. I compare mutation rates and the strength of selective constraint across different subgenera of Plasmodium. I find little to no conservation of mutation rate across the genus, and no evidence that these mutation rate changes are driven by selective optimisation. I analyse patterns of polymorphism in human-infecting and ape-infecting Plasmodium species, looking for signatures of natural selection. Using this data, I find evidence for a breakdown in the efficacy of purifying selection in the recent evolutionary history of human-infecting species, but not in ape-infecting species. I suggest this breakdown is caused by recent population expansions in the human- infecting populations. I investigate different methods to study the historic selective regime of individual genes without being confounded by this recent breakdown in the efficacy of purifying selection. I find strong evidence for ongoing GC biased gene conversion in human-infecting Plasmodium vivax from polymorphism patterns, and find evidence for this process in the evolutionary history of species from other subgenera. I suggest the differential strength of this process across the genus is responsible for the wide range of genome GC contents found across different Plasmodium species. Additionally, using the same polymorphism data, I find little evidence for ongoing selection on codon usage in Plasmodium falciparum. ii Lay Summary The Plasmodium are a group of different species which are responsible for causing the disease malaria in humans, which kills more than 400,000 individuals a year. Different Plasmodium species can infect various primate, bird and rodent species. In this thesis I use previously generated and published DNA sequencing data to investigate several aspects of the evolutionary history of Plasmodium. I compare the rate of DNA mutations across these Plasmodium species, finding little to no conservation of mutation rate across the diversity of these species. I analyse patterns of genetic diversity in populations of human-infecting and ape-infecting Plasmodium species, and find evidence for a reduction in the strength of natural selection in the recent history of two human-infecting Plasmodium species. I find strong evidence for a distinct mutation process called GC biased gene conversion in one human-infecting Plasmodium species, but not another. This process mimics the action of natural selection by increasing the frequency of G/C DNA bases at the expense of A/T bases in the population. I suggest this process is responsible for the very uneven and differently biased usage of A/T/G/C DNA bases observed across different Plasmodium species. iii iv Acknowledgements A huge thanks to Paul Sharp for four years of ideas, guidance and constructive criticism. Thanks also to Lindsey Plenderleith for providing help, snippets of Perl code, and many USB sticks with data on them. Thanks to Billy Palmer and Jarrod Hadfield for guiding me through implementing the SnIPRE methodology in mcmcglmm. Thanks to Tom Booker and Ben Jackson for help running software to generate and analyse site frequency spectra. Thanks to Frank Venter for useful preliminary analyses on mutation rate variation. Thanks to Jenni Howe for putting up with my (all too common) moaning and erratic working hours. v Contents . Section 1 - General introduction ........................................................................... 1 1.1 Introduction to malaria .................................................................................................. 1 1.2 Parasite classification ..................................................................................................... 1 1.3 Lifecycle Introduction to malaria .................................................................................... 2 1.4 Six human-infecting Plasmodium species ...................................................................... 3 1.5 P. falciparum and the Laverania subgenus .................................................................... 4 1.6 P. vivax and the Plasmodium subgenus ......................................................................... 5 1.7 P. ovale ........................................................................................................................... 6 1.8 P. malariae ...................................................................................................................... 7 1.9 What are the key problems in malariology? .................................................................. 7 1.10 Origin of P. falciparum in humans ................................................................................ 9 1.11 P. vivax evolutionary history and origin in humans ................................................... 15 1.12 Identifying recent selection ........................................................................................ 16 1.13 Previous evolutionary genetics work ......................................................................... 19 1.14 The distribution of fitness effects............................................................................... 23 1.15 Plasmodium genomic complexity Identifying recent selection.................................. 24 1.16 Thesis aims ................................................................................................................. 27 Chapter 2 - Investigating relative mutation rates across the Plasmodium genus ...29 2.1 Introduction .................................................................................................................. 29 2.2 Methods ....................................................................................................................... 36 2.3 Results .......................................................................................................................... 36 2.4 Discussion ..................................................................................................................... 43 Chapter 3 - Investigating patterns of polymorphism and identifying signatures of selection in Plasmodium falciparum ................................................................51 3.1 Introduction .................................................................................................................. 51 3.2 Methods ....................................................................................................................... 53 3.3 Results .......................................................................................................................... 56 3.4 Discussion ..................................................................................................................... 69 Chapter 4 - AT content variation across the Plasmodium genus ...........................75 4.1 Introduction .................................................................................................................. 75 vi 4.2 Methods ....................................................................................................................... 78 4.3 Results .......................................................................................................................... 79 4.4 Discussion ..................................................................................................................... 89 Chapter 5 – General discussion ......................................................................... 105 5.1 What has been found? ............................................................................................... 105 5.2 What is the significance of these findings? ................................................................ 106 5.3 Limitations and future directions ............................................................................... 109 5.4 General conclusions from results ............................................................................... 111 Reference list.................................................................................................... 115 Appendices