Artificial Intelligence, Bioinformatic and Systems Biology Approaches to Understanding Genome Evolution and Viral Control
Hamid Alinejad Rokny (Hamid Alinejad-Rokny)
(M.Sc. Computer Science-Artificial Intelligence)

A thesis in fulfilment of the degree of Doctor of Philosophy (Bioinformatics and Systems Biology)

Supervisors: Miles Davenport, Diako Ebrahimi and Vanessa Venturi

School of Medical Sciences
Faculty of Medicine
The Kirby Institute

November 2017

Originality statement

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’

Signed:
Date: 20 October 2017

Copyright Statement

‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only).
I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.’

Signed:
Date: 20 October 2017

Abstract

Human immunodeficiency virus (HIV) continues to be a major global health problem. Decades of research have still not produced a successful vaccine, and many aspects of the virus and of HIV infection continue to challenge researchers. In this thesis I have addressed several key questions about HIV biology and infection using a computational biology approach. Macaque models of HIV infection play an important role in HIV research, and yet the viral peptides presented by MHC Class-I and recognised by cytotoxic T-lymphocytes (CTL) in macaques are not well characterised. I developed an in-house bioinformatics pipeline to investigate novel CTL epitopes and their associated patterns of escape mutations in pigtailed macaques. I identified new potential CTL epitopes and numerous novel non-synonymous point mutations and regions of non-synonymous mutation associated with specific MHC-I haplotypes. I also investigated the nature and distribution of APOBEC3-induced hypermutation signatures and whether this information provides clues about HIV inhibition by different APOBEC3 enzymes such as APOBEC3G and APOBEC3F. I developed a new method for hypermutation detection. I also used a novel approach to identify preferential patterns of G-to-A mutation for both APOBEC3G and APOBEC3F. The source of CpG depletion in the HIV genome is another aspect of HIV biology that is not well understood. My bioinformatics analyses suggest that the methylation mechanism may be responsible for depletion of CpG dinucleotides in the HIV genome.
Importantly, the results showed that viral genome adaptation to the host CpG machinery is a highly specific pattern that is only observed in HIV and its simian counterpart, SIV. Finally, the thesis reports a meta-analysis approach to investigate the effect of gene expression level on the clonal expansion of latently infected cells during HIV treatment. The results of this analysis show that, among the HIV proviruses that integrated into genes, those integrated into poorly expressed genes are more likely to become clonally expanded.

In conclusion, my findings shed light on several important components of the host immune system and their roles in viral control and viral evolution. These findings have implications for the future design of immunotherapies and vaccines against HIV.

Acknowledgements

Firstly, I would like to thank my beautiful wife Hoda, for her love and generous support in all my endeavours, and particularly, her support throughout my studies.

I would like to thank my supervisors, Professor Miles Davenport, Assistant Professor Diako Ebrahimi and Associate Professor Vanessa Venturi, for their complete support and invaluable guidance throughout my candidature. Over the past three and a half years they went above and beyond to ensure I was equipped for the research I was conducting. I am truly thankful for the advice and general counsel they have provided throughout my studies, and for the opportunities they provided me to be directly involved in interdisciplinary collaboration.

I would also like to thank all of my experimental collaborators. In particular, I thank Professor Stephen Kent and Associate Professor Sarah Palmer and the researchers in their laboratories, who provided the experimental data necessary for several projects during my PhD candidature.

I need to acknowledge the other group members, alongside whom I have worked these past three years, for all their support and advice.
In particular, I thank Janka Petravic, Deborah Cromer, Mykola Pinkevych, Andrew Grimm, Alexey Martyushev, Adeshina Adekunle, Arnold Reynaldi and David Khoury.

Finally, I thank the Lord my God who has carried me every step of the way.

Table of Contents

Originality statement
Copyright statement
Abstract
Acknowledgements
List of figures, tables and abbreviations
    Figures
    Tables
    Abbreviations
Publications during candidature
Chapter 1: General introduction and scope of thesis
    Author contributions to thesis Chapter 1
    The virus
    HIV types and subtypes
    The viral life cycle
        Binding to host cells.
        Reverse transcription.
        Integration into host DNA.
        Transcription and translation.
        Viral assembly and budding.
    SIV, an animal model of HIV infection
    HIV disease progression
    Immune response activation
        HIV immune escape.
    Viral latency
        Vaccine and antiretroviral therapy.
    APOBEC protein
        Host restriction factor APOBEC3 versus HIV Vif protein.
        HIV hypermutation.