
Claude Fiifi Hayford A Computational Analysis of Motif- Negative VDR-DNA Interaction and Possible Explanations of Its Mechanism of Interaction Master’s Thesis in Molecular Medicine Trondheim, June 2014 Supervisor: Professor Finn Drabløs Subject Supervisor: Kjetil Klepper Norwegian University of Science and Technology Faculty of Medicine Department of Cancer Research and Molecular Medicine Abstract Transcription factor binding to DNA has generally been assumed to be as a result of the recognition of sequence-specific motifs in the transcription factor binding sites. Recent studies have however shown several examples of transcription factor binding where no recognizable motif was identified. The exact mechanisms by which these associations occur remain unclear, although several explanations have been put forward. By employing MotifLab, a computational tool for the analysis of regulatory regions and data from the ENCODE project, this study examines the properties of these motif-positive and motif- negative regions and how they relate or differ from each other to get a better understanding of the factors that lead to transcription factor binding in these cases. Two well-described Vitamin D receptor datasets where there is a mixture of binding sites both with and without a clear motif is utilised. The results showed that there are differences between motif-negative regions and motif- positive regions in terms of DNA accessibility, the type of regions in which either type of binding takes place, as well as in the types of motifs that are overrepresented in each case. These findings suggest that VDR binding in motif-negative regions does employ a mechanism different from its sequence-specific binding; however a clear elucidation of this mechanism has not been possible. iii iv Acknowledgement I would like to thank my supervisor, Professor Finn Drabløs for giving me the opportunity to work on this interesting project, for the guidance, feedback and insightful discussions during the course of the project and also for the motivation on occasions when I felt I had hit a brick wall. I would also like to extend my sincere appreciation to Kjetil Klepper, my co-supervisor for all the help with respect to using MotifLab. Thank you for taking the time to answer my questions. I know I was not always clear what I meant but you were patient and tried to help as best as you could. Finally, I want to thank all those who in one way or the other believed and supported me through the whole master’s program. To my family and friends I say a big thank you for the unflinching support. v vi Table of Contents Abstract ................................................................................................................................... iii Acknowledgement .................................................................................................................... v List of abbreviations and labels ............................................................................................. ix 1. Introduction ...................................................................................................................... 1 1.1. Genes and gene regulation .......................................................................................... 1 1.2. The role of epigenetic modifications in gene regulation ............................................. 2 1.3. Transcription factors and their role in gene regulation ............................................... 3 1.4. TF’s and cis-regulatory modules ................................................................................. 4 1.5. Identification of TF binding sites ................................................................................ 4 1.5.1. Systematic Evolution of Ligand by Exponential Enrichment (SELEX) ............. 5 1.5.2. DNAse I footprinting ........................................................................................... 5 1.5.3. Chromatin Immunoprecipitation (ChIP) .............................................................. 7 1.5.4. Computational analysis of TF binding sites......................................................... 8 1.5.5. Use of additional information to guide binding site prediction and motif discovery ........................................................................................................................... 11 1.6. Workbenches for analysing biological data .............................................................. 11 1.7. Gene Regulation by the Vitamin D Receptor (VDR) ............................................... 12 1.8. Context and aims of the study ................................................................................... 15 2. Methods ........................................................................................................................... 19 3. Results and Discussion ................................................................................................... 29 3.1. VDR binds to an RXR-VDR-like motif ..................................................................... 29 3.2. VDR binds in HOT (High Occupancy of Transcription Factor) regions .................. 33 3.3. Differences in enriched DNA feature regions in VDR- positive versus negative sequences .............................................................................................................................. 37 3.4. DNA feature regions not discriminative for VDR/VDR-like binding sites .............. 40 3.5. Enrichment of Sp1-like motif in VDR-negative sequences ...................................... 41 3.6. VDR- positive and negative sequences have differences in enriched motifs ........... 43 3.7. Weaker VDR binding may be supported by other transcription factor activity/response elements .................................................................................................... 44 3.8. VDR binding and cluster formation is Cohesin and CTCF independent .................. 46 3.9. VDR ChIP-Seq regions not enriched for direct interaction partners ........................ 48 3.10. Identification of putative CRMs modulating VDR action ........................................ 49 vii 3.11. Motif position distribution ........................................................................................ 50 4. Conclusion ....................................................................................................................... 55 4.1. MotifLab as a workbench for regulatory region analysis ......................................... 55 4.2. Suggestions for future work ...................................................................................... 56 References ............................................................................................................................... 59 Appendix ................................................................................................................................. 65 viii List of abbreviations and labels CCDS Human genome high confidence gene annotations from the Consensus Coding Sequence project ChIP-Seq Chromatin Immunoprecipitation followed by high throughput sequencing Conservation Evolutionary conservation information from multiple alignment of 44 vertebrate species CpG Islands Genomic regions where CpGs are present at significantly higher levels than the rest of the genome as a whole CRM Cis regulatory module DNaseHS_hotspots DNAse 1 hypersensitivity hotspots from several cell lines indicating general chromatin accessibility and active cis- regulatory sequences DR3 Direct repeat with a spacer of three nucleotides ENCODE Encyclopaedia of DNA elements Enriched_Heikkinen Statistically enriched motifs in Heikkinen dataset Enriched_Ramagopalan Statistically enriched motifs in Ramagopalan dataset EnsemblGenes Gene predictions generated by Ensembl FAIRE-Seq Nucleosome depleted regions of the genome identified using Formaldehyde Assisted Isolation of DNA Regulatory Elements GM12878 Cell line derived from B-lymphocyte from International HapMap Project (similar to that used by Ramagopalan) H3K27ac Acetylation of histone 3 on lysine 27 H3K4me1 Mono-methylation of histone 3 on lysine 4 H3K4me3 Tri-methylation of histone 3 on lysine 4 H3K9ac Acetylation of histone 3 on lysine 9 JSVDR (Classic VDR) Jaspar_Core representation of the VDR motif JSVDR_positive Collection of sequences bearing one or more instances of the JSVDR K562 Cell line derived from patient with chronic myelogenous leukaemia (similar to that used by Heikkinen) MEME Multiple Expectation Maximization for Motif Elicitation MEMEVDR RXR-VDR motif representation derived from de novo motif discovery using MEME NR4A2_positive Collection of sequences bearing one or more instances of the motif for NR4A2 PSSM Position specific scoring matrix PWM Position weight matrix Repeat327 Repeat regions detected by Repeatmasker 3.2.7 RXR Retinoid X receptor SINE Small interspersed nuclear element ix TF Transcription factor TFBS Transcription factor binding site TFBS_ChIP-Seq General regions bound by transcription factors identified by ChIP-Seq TFBS_observed Motifs identified from motif scanning TSS Transcription start site VDR Vitamin D receptor VDR_negative Sequences without any RXR-VDR or VDR-like motif VDR_positive Sequences bearing either RXR-VDR or NR4A2 motif VDRE Vitamin D response element x 1. Introduction 1.1. Genes and gene regulation Genes are regions of nucleic acids (DNA) in the genome said to be the basic units of heredity in living organisms that direct the manufacture
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages92 Page
-
File Size-