ENDOGENOUS RETROVIRUSES in PRIMATES Katherine Brown Bsc
Total Page:16
File Type:pdf, Size:1020Kb
ENDOGENOUS RETROVIRUSES IN PRIMATES Katherine Brown BSc, MSc Thesis submitted to the University of Nottingham for the degree of Doctor of Philosophy July 2015 Abstract Numerous endogenous retroviruses (ERVs) are found in all mammalian genomes, for example, they are the source of approximately 8% of all human and chimpanzee genetic material. These insertions represent retroviruses which have, by chance, integrated into the germline and so are transmitted vertically from parents to offspring. The human genome is rich in ERVs, which have been characterised in some detail. However, in many non-human primates these insertions have not been well- studied. ERVs are subject to the mutation rate of their host, rather than the faster retrovirus mutation rate, so they change much more slowly than exogenous retroviruses. This means ERVs provide a snapshot of the retroviruses a host has been exposed to during its evolutionary history, including retroviruses which are no longer circulating and for which sequence information would otherwise be lost. ERVs have many effects on their hosts; they can be co-opted for functional roles, they provide regions of sequence similarity where mispairing can occur, their insertion can disrupt genes and they provide regulatory elements for existing genes. Accurate annotation and characterisation of these regions is an important step in interpreting the huge amount of genetic information available for increasing numbers of organisms. This project represents an extensive study into the diversity of ERVs in the genomes of primates and related ERVs in rodents. Lagomorphs (rabbits and hares) and tree shrews are also analysed, as the closest relatives of primates and rodents. The focus is on groups of ERVs for which previous analyses are patchy or outdated, particularly in terms of their evolutionary history and possible transmission routes. A pipeline has been developed to comprehensively and rapidly screen genomes for ERVs and phylogenetic analysis has been performed in order to characterise these ERVs. Almost 200,000 ERV fragments, many of which have not previously been characterised, were identified using this pipeline, distributed across six retroviral genera and 33 vertebrate genomes. These fragments were used to investigate several areas of interest: the potential origin of primate ERVs, in rodents or other hosts; the ERV content of the less well-studied primates; the endogenous lentiviruses; mammalian endogenous epsilonretroviruses and the origin of pathogenic gibbon ape leukaemia virus. Laboratory study was used to complement the bioinformatics analysis where appropriate. This analysis had several interesting outcomes. First, a novel endogenous member of the lentivirus genus of retroviruses, which are rarely found in an endogenous form, was identified in the bushbaby Galago moholi. This ERV may represent an ancient ancestor of modern human immunodeficiency virus (HIV), as it is the oldest member of the lentivirus genus (the genus which HIV belongs to) that has been identified in a primate living on the African mainland, alongside the primate hosts where the HIV pandemic originated. This ERV appears to have been transmitted between G. moholi and two species of Malagasy primate in the last five million years, many millions of years after these species have had any contact, suggesting that the virus has been transmitted from one host to another via a third, vector species. (Hart et al., 1996) Gibbon ape leukaemia virus was responsible for leukaemia and lymphoma in several gibbon colonies during the 1970s and has since then been thought of as a circulating pathogen in this species. Using a combination of techniques we have established that this virus is not a common pathogen of modern gibbons and identified a route through which a single cross species transmission event from a rodent may have resulted in all known cases of this disease worldwide. We have also identified endogenous epsilonretroviruses, usually considered to be viruses of fish and amphibians, in all screened species of primates. Based on these results, there is an ancient evolutionary relationship between epsilonretroviruses and primates. As these viruses once had the potential to infect primates and are currently widespread in fish, this result raises questions about the pathogenic potential of these viruses. Many other ERVs were identified in primates, rodents and related species and we propose a classification scheme for these viruses and use this scheme as a basis to explore the ERV content of these hosts. Using this technique, previously unknown ERVs which are recombinant and which have the potential to produce active viral particles have been identified. ii List of Published Papers BROWN, K., EMES, R. D. & TARLINTON, R. E. 2014. Multiple Groups of Endogenous Epsilon-Like Retroviruses Conserved across Primates. Journal of Virology, 88, 12464- 12471. BROWN, K., MORETON, J., MALLA, S., ABOOBAKER, A. A., EMES, R. D. & TARLINTON, R. E. 2012. Characterisation of retroviruses in the horse genome and their transcriptional activity via transcriptome sequencing. Virology, 433, 55-63. TARLINTON, R. E., BARFOOT, H. K. R., ALLEN, C. E., BROWN, K., GIFFORD, R. J. & EMES, R. D. 2013. Characterisation of a group of endogenous gammaretroviruses in the canine genome. The Veterinary Journal, 196, 28-33. iii Acknowledgements I would like to thank the following people for their contribution to this work: My supervisor Dr. Rachael Tarlinton for her guidance, support, friendship and constant reassurance over the last four years. Prof. Ed. Louis for supervision during the early stages of this project and advice throughout. Dr. Li Li for prior work on the prosimian lentiviruses and Dr. Liz Bailes, who was involved in the planning stages of the project and sadly passed away before the work was underway. Dr. Richard Emes for development of the Exonerate pipeline used to generate the majority of these results and general bioinformatics advice. Prof. John Brookfield, Dr. Beth Hellen and the University of Nottingham HPC team for advice on computing, bioinformatics and data analysis. Dr. Christian Roos at the German Primate Centre, Dada Gottelli at the Zoological Society of London and Mads Frost Bertelsen at Copenhagen Zoo for kindly providing primate samples. BIAZA for providing a letter of support for this project. Frank Wessely for answering a huge number of bioinformatics questions on a daily basis in the early days of my PhD. My friends at the University of Nottingham Vet School for keeping me sane and making me laugh throughout this PhD, especially Mansi, Frank, Donna, Jasmine, James, Tim and Laura. I will never hear the words robot or funnel again without laughing. Isaac Malter for entertainment during my thesis pending period. My mum, dad and brother for emotional (and occasionally financial!) support and practical help. And Dad and Olly for proofreading while sending me photos of monkeys. The University of Nottingham School of Veterinary Medicine and Science and School of Biology for funding, plus travel funding from the University of Nottingham Graduate School, the Genetics Society and the Society for General Microbiology. iv Table of Contents Scope of Thesis _____________________________________________________________ 1 Chapter 1. Introduction ________________________________________________________ 5 1. 1. Classification, Structure and Life Cycle of Exogenous Retroviruses ________________ 5 1.1.1. Classification _______________________________________________________ 6 1.1.2. Structure _________________________________________________________ 10 1.1.2.1. Viral Particles __________________________________________________ 10 1.1.2.2. Genome Structure, Genes and Proteins _____________________________ 11 1.1.2.3. Long Terminal Repeats ___________________________________________ 12 1.1.3. Life Cycle _________________________________________________________ 15 1.1.3.1. Receptor Binding _______________________________________________ 15 1.1.3.2. Entry _________________________________________________________ 17 1.1.3.3. Transport to the Nucleus and Reverse Transcription ___________________ 17 1.1.3.4. Reverse Transcription ___________________________________________ 18 1.1.3.5. Nuclear Import _________________________________________________ 21 1.1.3.6. Integration ____________________________________________________ 21 1.1.3.7. RNA Synthesis __________________________________________________ 23 1.1.3.8. Translation ____________________________________________________ 23 1.1.3.9. Assembly, Packaging and Release __________________________________ 25 1.1.3.10. Maturation ___________________________________________________ 26 1.1.4. Accessory Proteins __________________________________________________ 26 1.1.4.1. Betaretroviruses ________________________________________________ 27 1.1.4.2. Lentiviruses____________________________________________________ 27 1.1.4.3. Deltaretroviruses _______________________________________________ 29 1.1.4.4. Epsilonretroviruses _____________________________________________ 29 1.1.4.5. Spumaviruses __________________________________________________ 30 1. 2. Exogenous Retroviruses, Disease and Host Defences __________________________ 32 1.2.1. HIV, SIV and FIV ____________________________________________________ 32 1.2.1.1. Naturally Infected Hosts _________________________________________ 32 1.2.1.2. Progression to AIDS _____________________________________________ 33 v 1.2.1.3. Treatment _____________________________________________________