Borne, Richard (2019) Natural Killer (NK) Cell Receptor Expression in Cattle: Development of a Pipeline for Accurate Quantification of RNA Sequencing (RNA-Seq) Data
Total Page:16
File Type:pdf, Size:1020Kb
Borne, Richard (2019) Natural Killer (NK) cell receptor expression in cattle: development of a pipeline for accurate quantification of RNA Sequencing (RNA-Seq) data. PhD thesis. https://theses.gla.ac.uk/71955/ Copyright and moral rights for this work are retained by the author A copy can be downloaded for personal non-commercial research or study, without prior permission or charge This work cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given Enlighten: Theses https://theses.gla.ac.uk/ [email protected] Natural Killer (NK) cell receptor expression in cattle: development of a pipeline for accurate quantification of RNA Sequencing (RNA-Seq) data Richard Borne BSc. (Hons) Submitted in fulfilment of the requirements for the Degree of PhD College of Medical, Veterinary & Life Sciences University of Glasgow September 2018 2 Abstract Cattle have undergone significant gene expansion within two of the major NK receptor encoding complexes, the leukocyte receptor complex (LRC) and natural killer complex (NKC). This expansion has resulted in a number of highly similar genes densely packed within each complex. The genes can be highly polymorphic and encode for fundamentally important receptors for controlling the functional response of NK cells. Understanding the nature of transcription of the LRC and NKC genes is required to confirm existing gene models, predicted functional status and also their distribution in cell and tissue types. RNA-Seq offers the potential for high-resolution analysis of LRC/NKC gene transcription. For this to occur, an analysis pipeline needed to be developed to account for the high sequence similarity and resulting multi-mapping of reads. In this project an analysis pipeline, UniMMap, has been created that utilises the concept of mappability to weight the contribution of multi-mapping reads to the total read count. UniMMap was utilised to assess transcription of LRC/NKC genes in a number of RNA-Seq datasets. Transcription in peripheral blood mononuclear cells (PBMCs) and NK cells from two cattle was compared. Genes predicted to be non-functional based on the reference assembly were found to be consistently transcribed. Transcription was also compared between multiple immune cell types. CD8+ T cells were shown to transcribe killer-cell immunoglobulin-like receptor (KIR) at an equivalent level to NK cells. RNA-Seq data from multiple cattle tissues was analysed to produce an LRC/NKC gene atlas. Transcription of at least one LRC and NKC gene was observed in every tested tissue. Total transcription of the LRC/NKC genes was found to be highest in the lymph node, mammary gland and lung tissue. To assess the utility of UniMMap in a controlled study, as well as in a different species, the transcription of LRC/NKC genes were investigated in goats from a peste des petits ruminants virus (PPRV) vaccination study. A subset of the genes were found to respond differently in PBMCs between unvaccinated and vaccinated animals. This supports a role for these genes during PPRV infection. UniMMap has facilitated analysis of transcription of LRC/NKC genes from multiple species, cell/tissue types and RNA-Seq library types. The consistently observed transcription of predicted non-functional genes suggests the presence of functional alleles or possibly a role in regulating transcription of other genes 3 within each complex. The changes observed in their transcription during infection, coupled with a similar distribution among tissues and immune cell types, suggests a functional similarity with their human counterparts. 4 Table of Contents Abstract .............................................................................................. 2 List of figures and tables .......................................................................... 9 Acknowledgements ............................................................................... 13 Author’s declaration .............................................................................. 14 List of abbreviations .............................................................................. 15 1 Chapter 1. Introduction ....................................................................... 16 1.1 Natural cells ................................................................................. 16 1.1.2 Cell surface phenotype of NK cells ................................................. 16 1.1.3 Major histocompatibility complex class I .......................................... 17 1.1.4 Mechanisms of NK cell killing ....................................................... 18 1.1.5 Protection against tumour cells by NK cells ...................................... 18 1.1.6 Pregnancy and NK cells .............................................................. 19 1.1.7 NK cells as immune regulators ...................................................... 19 1.2 NK cell education .......................................................................... 20 1.3 NK receptors in model organisms ....................................................... 22 1.3.1 The killer cell immunoglobulin-like receptors .................................... 22 1.3.2 The killer cell lectin-like receptors ................................................ 25 1.4 Cattle NK cells .............................................................................. 26 1.4.1 NK receptors in cattle ................................................................ 26 1.4.2 KIR in cattle and goats ............................................................... 28 1.4.3 KLR in cattle and goats .............................................................. 32 1.5 Transcription and expression of NK receptors ......................................... 33 1.5.1 Transcriptional regulation of KIR and KLR ......................................... 34 1.5.2 Stability of the NK receptor repertoire ............................................ 35 1.5.3 Expression of NK receptors on other immune cells .............................. 36 1.5.4 Tissue specific NK cells .............................................................. 37 1.6 RNA sequencing............................................................................. 38 1.6.1 Illumina RNA sequencing ............................................................. 38 1.6.2 Long read RNA-Seq .................................................................... 40 1.6.3 RNA-Seq read mapping ............................................................... 40 1.6.4 Multi-mapping reads .................................................................. 41 1.6.5 RNA-Seq of NK cell receptors ....................................................... 42 1.7 Aims of the project ........................................................................ 43 2 Chapter 2. Overcoming multi-map problems for RNA-Seq data ....................... 45 2.1 Introduction ................................................................................. 45 2.2 Methods ...................................................................................... 50 5 2.2.1 Custom genome construction ....................................................... 50 2.2.2 Custom transcriptome construction ................................................ 50 2.2.3 RNA-Seq read simulation ............................................................. 50 2.2.4 Mappability calculations ............................................................. 51 2.2.5 RNA-Seq read mapping ............................................................... 51 2.2.6 UniMMap read counts ................................................................. 51 2.2.7 Length analysis of datasets on the Sequence Read Archive (SRA) ............. 52 2.3 Results ....................................................................................... 53 2.3.1 Multi-mapping is highly prevalent in the LRC and NKC .......................... 53 2.3.2 Maximum Illumina sequence length does not resolve mappability ............ 57 2.3.3 The Sequence Read Archive (SRA) contains a large number of Bos taurus RNA- Seq datasets ................................................................................... 58 2.3.4 Accuracy of weighted counts is impacted by uneven read coverage .......... 62 2.3.5 Masking of non-functional KIR in the genome has little impact on functional KIR multi-mapping ............................................................................ 65 2.3.6 Grouping the functional KIR into gene families reduces total multi-mapping 65 2.3.7 Filtering alignments on alignment score or mappability reduces multi- mapping ........................................................................................ 67 2.3.8 UniMMap performs better than standard weighted counts for the LRC but is inaccurate for the NKC ....................................................................... 71 2.3.9 A precise annotation is crucial to the accuracy of UniMMap ................... 73 2.3.10 Mapping to the transcriptome results in a substantial accuracy increase for UniMMap counts ............................................................................... 73 2.3.11 UniMMap can accurately quantify transcription of KIR gene families ........ 74 2.3.12