BiQ Analyzer: Visualization and Quality Control for DNA Methylation Data from Bisulfite Sequencing

[Cover image: BiQ Analyzer logo, "bisulphite sequencing simplified"]

Computational Epigenetics
Bioinformatic methods for epigenome prediction, DNA methylation mapping and cancer epigenetics

PhD Thesis by Christoph Bock

Dissertation for the academic degree of Doctor of Natural Sciences (Dr. rer. nat.) in Computer Science, Faculties of Natural Sciences and Technology, Universität des Saarlandes, by Christoph Bock, submitted in May 2008

Date of submission: 31 May 2008
Reviewers: Prof. Dr. Thomas Lengauer, Ph.D.; Prof. Dr. Jörn Walter; Prof. Dr. Martin Vingron
Date of colloquium: 2 October 2008
Dean of the faculty: Prof. Dr. Joachim Weickert
Chair of the colloquium: Prof. Dr. Gerhard Weikum
Minute taker: Dr. Mario Albrecht

CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENT
ABSTRACT
KURZFASSUNG (German abstract)

Part A. Introduction into Computational Epigenetics
  A-1 Outline
  A-2 Two facets of epigenetic inheritance
  A-3 Mechanisms of epigenetic regulation
  A-4 Generation, low-level processing and quality control of epigenetic data
  A-5 Epigenome data analysis
  A-6 Epigenome prediction: inferring epigenetic states from the DNA sequence
  A-7 Cancer epigenetics: toward improved diagnosis and therapy
  A-8 Outline of the remainder of this thesis

Part B. Epigenome Prediction
  B-1 Outline
  B-2 Predicting DNA methylation based on the genomic DNA sequence
  B-3 EpiGRAPH: A user-friendly tool for advanced (epi-)genome analysis and prediction
  B-4 CpG island mapping by epigenome prediction
  B-5 An optimization-based approach to CpG island annotation

Part C. DNA Methylation Mapping
  C-1 Outline
  C-2 BiQ Analyzer: Visualization and quality control for DNA methylation data from bisulfite sequencing
  C-3 Insights from computational analysis of high-resolution DNA methylation data
  C-4 Inter-individual variation of DNA methylation and its implications for large-scale epigenome mapping

Part D. Cancer Epigenetics
  D-1 Outline
  D-2 Relevance of the methyl-CpG binding protein MeCP2 for Polycomb recruitment in cancer cells
  D-3 Optimizing a DNA-methylation-based biomarker of chemotherapy resistance for use in clinical settings

Part E. Conclusion and Outlook
  E-1 Outline
  E-2 Conclusion
  E-3 Outlook

Part F. List of Publications

Part G. References

LIST OF TABLES

TABLE 1. METHODS FOR GENOME-WIDE MAPPING OF EPIGENETIC INFORMATION
TABLE 2. LARGE-SCALE EPIGENOME MAPPING PROJECTS
TABLE 3. DNA-RELATED ATTRIBUTES DIFFER SIGNIFICANTLY BETWEEN METHYLATED AND UNMETHYLATED CPG ISLANDS
TABLE 4. THE PREDICTIVE POWER OF ATTRIBUTE CLASSES DIFFERS REMARKABLY; CONTROL EXPERIMENTS CONFIRM THE CHOICE OF THE PREDICTION METHOD
TABLE 5. TWELVE CPG ISLANDS WERE ANALYZED EXPERIMENTALLY TO VALIDATE OUR PREDICTIONS
TABLE 6. LIST OF DEFAULT ATTRIBUTES INCLUDED IN EPIGRAPH
TABLE 7. PREDICTION PERFORMANCE FOR DNA METHYLATION AND PROMOTER ACTIVITY AT CPG ISLANDS
TABLE 8. A SUBSET OF CPG ISLANDS EXHIBIT HIGHLY SIGNIFICANT OVERLAP WITH MULTIPLE EPIGENETIC MODIFICATIONS SIMULTANEOUSLY
TABLE 9. PREDICTION PERFORMANCE FOR THE DISTINCTION BETWEEN CPG ISLANDS THAT OVERLAP WITH A PARTICULAR EPIGENETIC MODIFICATION AND THOSE THAT DO NOT
TABLE 10. PERFORMANCE COMPARISON BETWEEN THE COMBINED EPIGENETIC SCORE AND THE CPG ISLAND LENGTH
TABLE 11. FUNCTIONS FOR COMPUTATIONAL SIMULATION OF EXPERIMENTAL METHODS FOR DNA METHYLATION MAPPING
TABLE 12. CORRELATION BETWEEN HIGH-RESOLUTION IMPROVEMENT AND ITS POTENTIAL PREDICTORS
TABLE 13. STATISTICAL EVALUATION OF CANDIDATE BIOMARKERS ASSESSING MGMT PROMOTER METHYLATION

LIST OF FIGURES

FIGURE 1. CARRIERS OF EPIGENETIC INFORMATION: DNA AND NUCLEOSOME
FIGURE 2. PREDICTED DNA STRUCTURE DIFFERS IN THE NEIGHBORHOOD OF METHYLATED CPG ISLANDS COMPARED WITH THEIR UNMETHYLATED COUNTERPARTS
FIGURE 3. OUTLINE OF EPIGRAPH'S SOFTWARE ARCHITECTURE
FIGURE 4. DOCUMENTATION OF EPIGRAPH ANALYSES IN THE X-GRAF FORMAT
FIGURE 5. RESULTS SCREENSHOT OF EPIGRAPH'S MACHINE LEARNING MODULE QUANTIFYING THE PREDICTABILITY OF ULTRACONSERVED ELEMENTS BASED ON DIFFERENT GROUPS OF (EPI-)GENOMIC ATTRIBUTES
FIGURE 6. RESULTS SCREENSHOT OF EPIGRAPH'S STATISTICAL ANALYSIS MODULE IDENTIFYING SIGNIFICANT GENE REGULATORY DIFFERENCES BETWEEN ULTRACONSERVED ELEMENTS THAT ARE RESTRICTED TO MAMMALS AND THOSE THAT ARE ALSO PRESENT IN BIRDS
FIGURE 7. EPIGRAPH RESULTS SCREENSHOTS INDICATING THAT PROMOTERS OF MONOALLELICALLY EXPRESSED GENES ARE ENRICHED WITH REPRESSIVE HISTONE MODIFICATIONS AND CAN BE PREDICTED BIOINFORMATICALLY
FIGURE 8. RESULTS SCREENSHOT OF EPIGRAPH'S DIAGRAM GENERATION MODULE HIGHLIGHTING

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    164 pages