Dissecting the Genetics of Inflammatory Bowel Disease
Total Page:16
File Type:pdf, Size:1020Kb
Dissecting the genetics of Inflammatory Bowel Disease Heather Elding January 2015 Thesis submitted for the degree of Doctor of Philosophy to Research Department of Genetics, Evolution and Environment University College London (UCL) 1 I, Heather Elding confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. 2 Abstract Inflammatory Bowel Disease (IBD) can be classified into two main subtypes: Crohn’s Disease (CD) and Ulcerative Colitis (UC). The aim of this study is to identify the genetic contribution to the susceptibility to IBD. In the first part of the study, I focused on Crohn’s Disease, the subtype that shows the greatest heritability. Using both pooled and sub-phenotype data, followed by replication, the results reveal substantial genetic heterogeneity and the total number of confirmed CD susceptibility loci was increased from 71 published by others to 200. This was achieved by analyzing the data using a multi- marker approach and high-resolution genetic maps in Linkage Disequilibrium Units. In the second part of the study, I focused on Ulcerative Colitis. The results show that although UC has a lower reported heritability, many loci were also found for Ulcerative Colitis. Some of these overlap with those found for Crohn’s Disease. 3 Acknowledgements I would firstly like to thank my primary Ph.D supervisor Dr. Nikolas Maniatis for providing me with this opportunity and for his expert guidance and significant support throughout my time at UCL. Dr. Nikolas Maniatis’s challenging ideas, creativity, and moral support have greatly helped me in shaping my thoughts as a researcher and for successfully carrying out my PhD research. I would also deeply like to thank my secondary Ph.D supervisor Professor Dallas M. Swallow for her invaluable and exemplary scientific expertise and constant support throughout my Ph.D. Professor Dallas M. Swallow has been, and will be, a great inspiration on both a professional and personal level and I have learnt a lot and made significant progress under her supervision, both as a scientist and as an individual. I would also like to thank Dr. Winston Lau for his patience at always taking time to teach me the novel skills and tools that were crucial for carrying out my Ph.D and for always being there to answer my questions. I would like to thank Dr. Adrien Rieux for providing valuable feedback on my Ph.D Thesis, Dr. Toby Andrew, Dr. Doug Speed and Dr. Kaustubh Adhikari for their opinions on certains aspects of the analysis carried out in this thesis work and Karim Boustani who, as a summer student, has contributed with invaluable work. 4 I would also like to thank my parents Josette and Mario Elding and my sister Davinia Elding for supporting me throughout this journey. I would like to thank my aunt Maria Jordanov for always making sure I always had a plain-sailing stay in London and for her support and my aunt Anna Christie for always being there in time of need. I would sincerely like to show my great appreciation to the Departmental Graduate Tutors Dr. Julia Day, Professor Kevin Fowler and Dr. Lazaros Foukas for their great help and understanding as well as their emotional support during very difficult times and life obstacles that I have encountered during my Ph.D. Last, but definitely not least, I would like to thank all my friends, particularly Arjunan Rajasingam, Leanne Grech, Adrien Rieux, Florent Lassalle, Pascale Gerbault, Anke Liebert, my colleagues and friends at UCL Genetics Institute and all the people in my office in the Darwin Building for making my time at UCL an unforgettable happy time and for always being there for me. 5 Table of Contents Abstract 3 Acknowledgements 4 CHAPTER I. GENERAL INTRODUCTION 10 I. Inflammatory Bowel Disease: the story of two diseases. 10 II. An overview of the immune response and Inflammatory Bowel Disease 16 III. The gut microbiota and inflammatory bowel disease 18 IV. Genetic Studies 20 i. Association Studies 25 ii. Linkage Disequilibrium 27 iii. Crohn’s Disease and Ulcerative Colitis GWASs 32 V. Another approach- Linkage Disequilibrium Unit Maps 35 VI. General Aims of Study 43 CHAPTER II. DISSECTING CHROMOSOME 16: GENETIC AND PHENOTYPIC STRATIFICATION PROVIDES INSIGHT INTO CROHN’S DISEASE. 45 I. Introduction 45 II. Methods 48 i. Subjects and Methods 48 ii. Genetic analysis 49 iii. Genetic and Phenotypic stratification 55 III. Results 57 I. NOD2 and CYLD region 57 II. Genetic Heterogeneity within the NOD2 and CYLD region 65 III. CDH3/CDH1 and IRF8 67 IV. Discussion 75 CHAPTER III. NOVEL GENE REGIONS IDENTIFIED FOR CROHN’S DISEASE. 82 I. Introduction 82 II. Methods 83 i. Genetic Analysis 83 ii. Phenotypic Stratification 84 iii. Previously reported and novel significant gene-regions 85 iv. Meta-analysis of shared locations 85 v. Significant gene-regions and Gene Ontology 86 III. Results 87 iv. Identifying the previously reported CD intervals 87 v. Genetic Heterogeneity within the previously reported CD intervals 88 vi. 134 novel gene-regions identified 90 IV. Discussion 94 CHAPTER IV. NOVEL GENE REGIONS IDENTIFIED FOR ULCERATIVE COLITIS. 104 I. Introduction 104 II. Methods 107 i. Subjects 107 ii. Quality Control- WTCCC 2 UC Case-Control Dataset 110 a. Principal Components Analysis 113 b. Relatedness Analysis- Identity-by-Descent and Identity-by-State Analysis 118 c. SNP filtering and individual missingness 119 iii. Quality Control Analysis- NIDDK UC datasets 122 6 iv. Genetic Analysis 138 v. Replications and meta-analysis of shared locations 139 vi. Significant gene-regions identified 140 III. Results 141 i. Identifying the previously reported UC intervals. 141 i. 138 novel gene-regions identified 149 IV. Discussion 160 CHAPTER V. GENERAL DISCUSSION AND FUTURE WORK. 166 i. The story so far. 166 a. Genetic and phenotypic heterogenetiy. 167 b. Interactions and Epistatic effects. 174 ii. Considerations for future work involving the LDU mapping approach 176 iii. Future work- Inflammatory Bowel Disease meta-analysis and beyond. 179 Bibliography 191 Appendix I- Different LD metrics 203 Appendix II- BioMart Analysis from the Crohn’s Disease GWAS described in Chapter III. 205 Appendix III- WTCCC Phase II Ulcerative Colitis Cases, NBS Controls and 1958BC Controls- Quality Control and Data manipulation prior to analysis flowchart. 254 Appendix IV- NIDDK Ulcerative Colitis Cases and Crohn’s Disease Controls- Quality Control and Data manipulation prior to analysis flowchart. 255 Appendix V- BioMart Analysis from the Ulcerative Colitis GWAS described in Chapter IV 256 7 Table of Figures FIGURE 1. THE HUMAN GUT AND STRUCTURE OF THE SMALL-INTESTINAL EPITHELIAL ............................................................................................................................................................. 13 FIGURE 2. INTERACTION OF THE MAJOR IBD-CONTRIBUTORY FACTORS. ........................ 15 FIGURE 3. REPRESENTATION OF THE NOD2 GENE AND THE RELATIVE POSITIONS OF THE THREE COMMON CD-ASSOCIATED NOD2 MUTATIONS (RS2066844- R702W, RS2066845- G9608R AND RS5743293- 3020INSC). ................................................... 23 FIGURE 4. THE RELATIONSHIP BETWEEN LD AND PHYSICAL DISTANCE ........................... 37 FIGURE 5. EXPECTED PAIRWISE ASSOCIATION PER SNP INTERVAL. ................................... 40 FIGURE 6. PARTITIONING THE CHROMOSOME. ...................................................................................... 50 FIGURE 7. LDU MAPPING OF THE ASSOCIATIONS DETECTED IN THE NOD2, CYLD AND COMBINED WINDOWS. ....................................................................................................................... 61 FIGURE 8. LOCALISATION WITHIN THE CDH3/CDH1 REGION. ...................................................... 70 FIGURE 9. LOCALISATION WITHIN THE IRF8 REGION. ........................................................................ 73 FIGURE 10. LOCALISATION WITHIN THE IL1RL1/IL18RAP REGION. ........................................... 90 FIGURE 11. LOCALISATION WITHIN THE ERBB4 REGION. ............................................................... 91 FIGURE 12. PRINCIPAL COMPONENTS ANALYSISN OF WTCCC PHASE II CASE- CONTROL DATASET.......................................................................................................................................114 FIGURE 13. PRINCIPAL COMPONENTS ANALYSIS OF MERGED HAPMAP3 AND WTCCC2 UC CASE-CONTROL DATASETS. ....................................................................................117 FIGURE 14. IDENTITY-BY-DESCENT ANALYSIS OF THE MERGED WTCCC2 UC CASE- CONTROL DATASET BASED ON PAIRWISE KINSHIP COEFFICIENTS. .......................119 FIGURE 15. MULTIDIMENSIONAL SCALING ANALYSIS, BASD ON IBS DISTANCES OF THE MERGED HAPMAP3 AND WTCCC2 UC CASE-CONTROL DATASET. .................121 FIGURE 16. PRINCIPAL COMPONENTS ANALYSIS OF NIDDK IBDO UC CASES GENOTYPED ON THE ILLUMINA HUMANHAP 550V.3 PLATFORM. ................................123 FIGURE 17. PRINCIPAL COMPONENTS ANALYSIS OF NIDDK GRU UC CASES GENOTYPED ON THE ILLUMINA HUMANHAP 550V.3 PLATFORM. ................................124 FIGURE 18. PRINCIPAL COMPONENTS ANALYSIS OF PARKINSON'S DISEASE GWAS CONTROL DATA GENOTYPED ON THE ILLUMINA HUMANHAP 550V.3 PLATFORM. .......................................................................................................................................................................................125 FIGURE 19. PRINCIPAL COMPOENENTS ANALYSIS OF NIDDK