Dutta2020 Redacted.Pdf
Total Page:16
File Type:pdf, Size:1020Kb
This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use: This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author. The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author. When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given. The Genomic and Transcriptomic Landscape of the Indian Water Buffalo (Bubalus bubalis) Prasun Dutta (Student ID: s0928794) A Thesis submitted for the degree of Doctor of Philosophy The University of Edinburgh 2019 Declaration of Authorship I declare that this thesis has been composed solely by myself and that it has not been submitted, in whole or in part, for any other degree or professional qualification except as specified. Except where it is stated otherwise by reference or acknowledgment, the work presented is entirely my own. Prasun Dutta August, 2019 i ii Abstract The water buffalo (Bubalus bubalis) is one of the most important domesticated species in India providing milk, meat, hide and draft power. At over 100 million animals, India has the highest number of water buffalo in the world, however, the species is found across the globe, including Europe where the Mediterranean subspecies is farmed. Despite the importance of this domesticated bovid, there are limited high-resolution genomic and transcriptomic analyses across these animals. The aim of this thesis was to use whole genome and RNA sequencing data to characterise regulatory variation and genome evolution in the water buffalo. Specifically, I explored the presence of regulatory variation in macrophages of water buffalo in the form of allele- specific expression (ASE) and investigated signatures of selection and breed divergence across water buffalo breeds. Water buffalo are exposed to a range of important pathogens, many of which that are zoonotic in nature. Differences in regulatory variation between animals have been shown to underlie some of the diversity in response to these pathogens. Macrophages are among the first cells of the innate immune system to act against a pathogen through its recognition, phagocytosis and destruction playing an important role in host disease susceptibility. Regulatory variants acting in macrophages are thus important candidates for explaining differences in disease susceptibility to infectious diseases among water buffalo. To detect the presence of regulatory variation, I used whole genome sequencing and RNA-seq data in 4 Mediterranean water buffalo to identify ASE in macrophage expressed genes. The analysis revealed that regulatory variation does exist in macrophage expressed genes which could be reliably detected as ASE signature. To understand the impact of domestication and how water buffalo have evolved I used whole genome sequencing data from 81 animals spanning seven distinct breeds. I identified the population structure of these breeds and explored how gene flow has shaped their genomes. I also characterised the signatures of putative selection between breeds. Sites identified included genes linked to milk iii production, coat colour and body size and interestingly a number of these overlapped those found to be under selection in other domesticated species suggesting some extent of convergent domestication. In this thesis, I consequently undertook one of the first high-resolution evolutionary and regulatory variation analyses of an important domesticated species, Bubalus bubalis. The results from this study are likely to be invaluable to inform future studies of how regulatory variants may confer tolerance to water buffalo pathogens as well as the impact of domestication on its genome. iv Lay summary The water buffalo is one of the most important domesticated animals in India providing milk, meat, hide and draft power. Their dung is used as fertilizer and fuel. India has the highest number of water buffalo in the world (over 100 million) which represents around 57% of the global buffalo population. In India, buffalo are economically important livestock and diseases affecting them directly impact the livelihood of Indian farmers and disturb the economic market of the country. Many of the diseases are zoonotic in nature which means that they affect species beyond the water buffalo, such as humans and livestock that may be in physical proximity. Many organisms, including the water buffalo, are capable of fighting against disease causing pathogens. There are cells in the body known as ‘macrophages’ that act as a defence system against a pathogen. Macrophages contain genes which produce proteins that play a central role in this defence. The difference in regulation of protein production in macrophages makes the defence system of some buffalo strong against the pathogen, while some buffalo produce less protein, making their defence system weak. In this thesis, I have studied the possible existence of this regulation in the water buffalo, which will allow us to comment on the differences between various water buffalo defence systems, and help in the production of healthier buffalo in the future. I have also worked with genomic data from different breeds of water buffalo from India and Europe, and have tried to understand how domestication has brought about differences in their DNA. I have tried to understand if those differences are responsible for the diversity in physical features (such as body size and coat colour, milk production and nutritive content) among buffalo and if there are any advantageous genomic differences that allow some water buffalo breeds to adapt better to their environment and survive among disease causing pathogens, as opposed to others. Thus, in this thesis, I have tried to evaluate if regulation of certain genes exists in cells which confer tolerance to the water buffalo against pathogens, and how domestication has brought differences in different breeds of water buffalo. The v knowledge produced in this thesis will be important for future work on pin- pointing the reason of the gene regulation, thereby paving a pathway to understand disease tolerance in water buffalo in future research. It will also be useful in understanding how domestication has led to the evolution of its genome which will allow researchers to concentrate on their characteristics of interest such as reproduction, milk production and defence system in the water buffalo. vi Acknowledgements This PhD project would not have been possible without the help and support of many people at various stages of the research. Firstly, I would like to express my gratitude towards my principal supervisor, Prof David Hume for introducing me to the world of macrophages and his constant inputs and guidance. His constant motivation and mentorship kept me afloat and on my toes through the whole process. I would like to thank my additional supervisors, Dr James Prendergast and Prof Eileen Wall who made sure I meet my deadlines. I would like to especially thank James, who was always there to discuss my queries on population genetics and bioinformatics and was patient even when I barged into his office without prior notification. My PhD project was made possible with the immense help from my senior colleagues in ‘buffalo team’- Dr Rachel Young, Dr Stephen Bush (Steve) and Lucas Lefevre. All the bioinformatics analysis work would not have been possible without the work done by Rachel, Lucas and everyone else who was involved in sample collection, wet-lab experiments and sequencing. Specifically, Rachel and Lucas led the larger buffalo atlas project that provided the data for my PhD. Steve established and ran the bioinformatics pipeline that built the complete buffalo gene expression atlas. An extended thanks to Steve, for guiding me with various bioinformatics techniques/methods during the start of my PhD. I am also thankful to Dr Andrea Talenti, Dr Mazdak Salavati and Anirudh Patir who were not directly involved in my PhD project but were always there for enriching discussions about bioinformatics/statistical methodologies and population genetics concepts. The research process would have been very difficult without the support of my family and friends. I am grateful to my parents for believing in me when I started this journey. Kudos to my wife Shweta, who dealt with all my crankiness and mood swings during the last phase of my PhD, boosted my confidence about my abilities and made me understand that taking breaks is important. Last but not the least, a big thank you to all my friends in Edinburgh whose company made this journey extremely fruitful and enjoyable. vii viii Table of Contents Declaration of Authorship ................................................................................. i Abstract .......................................................................................................... iii Lay summary .................................................................................................. v Acknowledgements ......................................................................................