This Thesis Has Been Submitted in Fulfilment of the Requirements for a Postgraduate Degree (E.G
Total Page:16
File Type:pdf, Size:1020Kb
This thesis has been submitted in fulfilment of the requirements for a postgraduate degree (e.g. PhD, MPhil, DClinPsychol) at the University of Edinburgh. Please note the following terms and conditions of use: This work is protected by copyright and other intellectual property rights, which are retained by the thesis author, unless otherwise stated. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis cannot be reproduced or quoted extensively from without first obtaining permission in writing from the author. The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the author. When referring to this work, full bibliographic details including the author, title, awarding institution and date of the thesis must be given. Post-GWAS functional characterisation of colorectal cancer risk loci Li Yin Ooi Thesis submitted for the degree of Doctor of Philosophy School of Medicine and Veterinary Medicine The University of Edinburgh 2016 Dedication I dedicate this thesis to Euan, for giving mummy a good run of nights, which made the writing of this thesis possible. i ABSTRACT Large bowel cancer, or colorectal cancer (CRC) is the third most common cause of cancer worldwide and the fourth biggest cause of cancer mortality. Twin studies have shown that the heritable contribution is ~35%, with ~5% of cases due to rare, high- penetrance mutations. In the last decade, the use of genome-wide association studies on large, well-characterised case-control cohorts of CRC has facilitated the identification of over 25 common genetic variants that carry with them an increased predisposition to colorectal cancer, invoking the common-disease common variant paradigm. As almost all of these variants lie within non-coding regions, the underlying causal mechanism is to-date poorly understood for the majority of these loci, and it is thought that they mediate risk by influencing gene expression levels. To test this hypothesis, an agnostic approach that utilises expression quantitative trait loci (eQTL) analysis was first carried on 115 normal colorectal mucosa samples and 59 peripheral blood mononuclear cells (PBMC). As these heritable variation on gene expression are likely to be subtle, there is a strong emphasis on the technical methodology to minimise experimentally-induced non-biological variations, including the extraction of high-quality RNA from primary tissue, the selection and validation of reference genes for normalisation of gene expression quantification, as well as internal validation of the samples and data processing. Thereafter, the association between the 25 CRC risk variants and the expression of their cis-genes were examined systematically, demonstrating that ten of these variants are also tissue-specific eQTLs. This intermediate phenotype strongly suggests that they confer risk, at least in part, by modifying regulatory mechanisms. One of the best eQTL associations (Xp22.2) is investigated in further detail to reveal a novel indel polymorphism (Indel24) at the distal promoter region of target gene SHROOM2 that influenced both transcript abundance and CRC risk more than the original tagging SNP. Functional verification with gene reporter assays indicated that Indel24 displays differential allelic control over transcriptional activity. Further in silico analysis and mutations to the reporter gene constructs provided evidence that Indel24 modulates transcription by modifying the spacing between CCAAT motifs and the iii consequent binding affinity of NF-Y transcription factor. siRNA depletion of NF-Y was associated with a reduction in transcriptional activity of the Indel24 gene construct as well as endogenous SHROOM2, which is strongly supportive of the interaction between Indel24 and NF-Y in the transcriptional activation of SHROOM2. Preliminary evidence is suggestive of SHROOM2 being expressed at the top of the intestinal epithelial crypt and playing a role in cell cycle regulation. Hypothesis-driven approaches can also be of utility in demonstrating functionality of CRC risk variants, complementing the hypothesis-free approach of eQTL analysis. Guided by a recently discovered gene-environment interaction between the 16q22.1 risk variant and circulating vitamin D levels, the influence of the rs9929218 SNP on CDH1 gene expression was examined, in relation to the expression of putative regulatory genes derived from in silico analysis and studies of other target genes. Although there was no direct association between rs9929218 and CDH1 expression, there were multiple two-way interactions that were together suggestive of rs9929218 influencing the VDR/FOXO4 regulation of CDH1. This provides functional support for the mechanism underlying the epidemiological observation of the gene- environment interaction between 16q22.1 and vitamin D, and demonstrates a candidate-based approach in deciphering the link between genetic locus and CRC susceptibility. In summary, the research presented in this thesis has validated the experimental rationale of utilising expression studies of normal colorectal mucosa to hone in on the molecular mechanisms and susceptibility genes underlying the association between common genetic variation and CRC risk. iv LAY ABSTRACT Large bowel cancer is the third most common cause of cancer worldwide and the fourth biggest cause of cancer deaths. Large-scale comparative studies of people with and without colorectal cancer have shown that there are inherited genetic factors that predispose one to the disease. These genetic factors are present at varying frequencies in the general population with varying effects on disease risk; rare genetic mutations have a big impact on the lifetime chances of developing the disease, whereas common normal DNA sequence differences have a smaller influence on disease susceptibility. Although the risk conferred by these common DNA differences are individually modest, collectively they have a significant influence on the risk of developing the disease. How these variants lead to the development of large bowel cancer is poorly understood, and this study seeks to shed light on the underlying mechanisms. Understanding how these heritable factors lead to disease is important as not only will it improve our understanding of how cancer develops, it will also inform the design of preventative and therapeutic strategies. By analysing the cells of the human large bowel and blood, this study demonstrates that some of these common genetic differences linked to large bowel cancer do not alter the function of genes, but instead influence the levels of gene products that are expressed. Further investigation of one of the genetic variants with the strongest influence on gene expression identifies the underlying molecular mechanism and the gene it influences (known as SHROOM2). The research presented in this thesis also presents a framework of investigation into the function of this gene in the large bowel, and how differences in its expression could lead to large bowel cancer. Finally, this thesis describes the investigation of the molecular mechanism underlying the synergistic effect of DNA variation and vitamin D levels on the risk of developing large bowel cancer. This is an important aspect to address as it is known that large bowel cancer arises from a combination of genetic and environmental factors, and a clearer understanding of this complex relationship will ultimately be of public health benefit. v ACKNOWLEDGEMENTS First and foremost, I cannot be more thankful to my supervisors Dr Susan Farrington and Professor Malcolm Dunlop for their dedicated support during the course of this PhD. You have both been inspiring in different yet complementing ways. Thank you for teaching me the principles of good research, for offering kind encouragement, for critically analysing my work, for protecting my time, and for acknowledging my efforts. Special thanks to Susan for tirelessly proofreading my thesis! A round of applause must also go to Miss Farhat Din, for her selfless encouragement and advice throughout, and for helping with the organoid set-up. I would like to give my sincerest thanks to all the patients who have kindly donated tissue for my research, without which none of this research would be possible. The Western General surgical, pathology, Tissue Governance and ECMC teams have facilitated the clinical aspects of sample collection and for that I am very grateful. I would also like to express my gratitude to Marion Walker, Stuart Reid, Asta Valenciute and Maria Timofeeva for their continued help, technical support as well as friendship. Many thanks goes to Marion for sorting out my samples and putting up with my last-minute orders, as well as always laughing at my jokes and my life’s little disasters. Particular thanks to Stuart and his brilliant cloning green fingers, his help has facilitated a lot of my research and the time working with him has been a pleasure. I have learnt the importance of being robust (in experimental technique and life in general) from Asta, thank you for always sharing antibodies as well as having me over at dinner parties! Special thanks go to Maria for helping me with statistical genetics, and for the company during the period of writing-up. Many thanks to everyone else in the lab and the wider unit that have assisted with various aspects of this work, and who have endured my pestering without complaint. An extra special thank you to my family