Computational Analysis of the ENCODE Datasets and Other Related Epigenetic Explorations

Computational Analysis of the ENCODE Datasets and Other Related Epigenetic Explorations

Computational analysis of the ENCODE datasets and other related epigenetic explorations . Ved Topkar Harvard College class of 2016 Gunawardena Lab Harvard Medical School Department of Systems Biology 13 August 2013 . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 1 / 1 Introduction Presentation Goals FULL understanding of discussed material Ask questions along the way! . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 2 / 1 Introduction Outline 1. Molecular biology in a jiffy 2. A case study Hypothesis formulation Analyzing data 3. More examples . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 3 / 1 Introduction Outline 1. Molecular biology in a jiffy 2. A case study Hypothesis formulation Analyzing data 3. More examples . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 3 / 1 Introduction Outline 1. Molecular biology in a jiffy 2. A case study Hypothesis formulation Analyzing data 3. More examples . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 3 / 1 Molecular Biology Essentials The Cell . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 4 / 1 Molecular Biology Essentials The Central Dogma . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 5 / 1 Molecular Biology Essentials Transcriptional Regulation . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 6 / 1 Molecular Biology Essentials Transcriptional Access . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 7 / 1 Background Epigenetics and Gene Expression Things beyond just the base pairs in DNA matter ! gene expression . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 8 / 1 Background The Question Analyze the ENCODE dataset . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 9 / 1 Background The Question Analyze the ENCODE dataset . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 9 / 1 Background . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 10 / 1 Background ENCODE (Overview) . Overview . National Human Genome Institute: Encyclopedia of DNA Elements (ENCODE) Nearly 600 collaborating labs post HGP . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 11 / 1 Background The Data Set . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 12 / 1 Background Data Types Raw signals Raw signal peak calling outputs (e.g. PeakSeq results) Relatively course-grain peak data . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 13 / 1 A simple example TFBS The Game Plan . Can we reduce transcription factor binding landscapes into categories? . Scan across genome, looking for promoters Bin promoters appropriately Score binding at each promoter Clustering analysis . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 14 / 1 A simple example TFBS RefSeq . Overview . Curated database of genes New versions released as frequently as Firefox Includes pseudogenes, haplotype variations, and predicted genes . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 15 / 1 A simple example TFBS Defining Promoter? Only upstream from TSS? Incredibly far regulatory regions? Intronic regulation? Post termination regulatory elements? 1000 bp upstream . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 16 / 1 A simple example TFBS Defining Promoter? Only upstream from TSS? Incredibly far regulatory regions? Intronic regulation? Post termination regulatory elements? 1000 bp upstream . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 16 / 1 A simple example TFBS Defining Promoter? Only upstream from TSS? Incredibly far regulatory regions? Intronic regulation? Post termination regulatory elements? 1000 bp upstream . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 16 / 1 A simple example TFBS Defining Promoter? Only upstream from TSS? Incredibly far regulatory regions? Intronic regulation? Post termination regulatory elements? 1000 bp upstream . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 16 / 1 A simple example TFBS Defining Promoter? Only upstream from TSS? Incredibly far regulatory regions? Intronic regulation? Post termination regulatory elements? 1000 bp upstream . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 16 / 1 A simple example TFBS Binning How do we quantitatively analyze promoter presence? Break promoter regions into bins for a finer metric? Do we give weights to bins as a function of their position? Single, unweighted 1000 bp bin of counts . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 17 / 1 A simple example TFBS Binning How do we quantitatively analyze promoter presence? Break promoter regions into bins for a finer metric? Do we give weights to bins as a function of their position? Single, unweighted 1000 bp bin of counts . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 17 / 1 A simple example TFBS Binning How do we quantitatively analyze promoter presence? Break promoter regions into bins for a finer metric? Do we give weights to bins as a function of their position? Single, unweighted 1000 bp bin of counts . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 17 / 1 A simple example TFBS Binning How do we quantitatively analyze promoter presence? Break promoter regions into bins for a finer metric? Do we give weights to bins as a function of their position? Single, unweighted 1000 bp bin of counts . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 17 / 1 A simple example Survey of Promoters Forward vs. Backward Strand . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 18 / 1 A simple example Survey of Promoters TF Binding Frequency . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 19 / 1 A simple example Survey of Promoters Histogram of promoter binding . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 20 / 1 A simple example Survey of Promoters Promoter/TFBS intersections . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 21 / 1 A simple example Survey of Promoters Computational Efficiency This was an exercise in program optimization Original algorithm took about 5 days, optimized/parallelized algorithm took just a few hours . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 22 / 1 A simple example Survey of Promoters Clustering The unsupervised grouping of information such that groups have similar elements that are dissimilar from elements in other groups . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 23 / 1 A simple example Clustering analysis Clustering . Minkowski Distance (p = 2 ! Euclidean Distance) . ! 1 Xn p p jxi − yi j . i=1 . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 24 / 1 A simple example Clustering analysis Clustering . Simple Linkage Clustering . D(X ; Y ) = min(d(x; y)) 8 (x; y)ϵ(X ; Y ) . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 25 / 1 A simple example Clustering analysis Clustering (simple linkage exhibiting chaining) . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 26 / 1 A simple example Clustering analysis Clustering . Complete Linkage Clustering . D(X ; Y ) = max(d(x; y)) 8 (x; y)ϵ(X ; Y ) . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 27 / 1 A simple example Clustering analysis Clustering (complete linkage) . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 28 / 1 A simple example Clustering analysis Computational Efficiency Pairwise distance calculations require a LOT of RAM . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 29 / 1 A simple example Clustering analysis 2-way clustering heatmap . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 30 / 1 A simple example Clustering analysis 2-way clustering heatmap (with random data arrays) . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 31 / 1 A simple example Roadmap Dataset Analysis Histone Modifications (Roadmap Dataset) . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 32 / 1 A simple example Roadmap Dataset Analysis A quick correlation test R = 0.042 . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 33 / 1 A simple example Roadmap Dataset Analysis Conclusions There are numerous methods of promoter binning that prove useful for complexity reduction Unsupervised clustering of preprocessed promoter data yield results with biological significance . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 34 / 1 A simple example Roadmap Dataset Analysis Next Steps Incorporation of nucleosome enrichment, methylation, and histone modification data in a more meaningful way Further refining of cluster analysis pipeline to uncover more unknown biology . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 35 / 1 A simple example End Thanks! Jeremy Gunawardena and the HMS Department of Systems Biology! My collaborator Tobias Ahsendorf! PRISE and Greg Llacer! The lovely PRISE staff! This audience! . Ved Topkar (Harvard College) Epigenetic analysis at promoters 13 August 2013 36 / 1.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    46 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us