Epigenomic Analysis Reveals DNA Motifs Regulating Histone Modifications in Human and Mouse

Vu Ngo(a,1), Zhao Chen(b,1), Kai Zhang(a), John W. Whitaker(b), Mengchi Wang(a), and Wei Wang(a,b,c,2)

(a) Graduate Program of Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA 92093-0359; (b) Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359; and (c) Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359

Edited by Steven Henikoff, Fred Hutchinson Cancer Research Center, Seattle, WA, and approved January 3, 2019 (received for review August 6, 2018)

Histones are modified by enzymes that act in a locus, cell-type, and developmental stage-specific manner. The recruitment of enzymes to chromatin is regulated at multiple levels, including interaction with sequence-specific DNA-binding factors. However, the DNA-binding specificity of the regulatory factors that orchestrate specific histone modifications has not been broadly mapped. We have analyzed 6 histone marks (H3K4me1, H3K4me3, H3K27ac, H3K27me3, H3K9me3, H3K36me3) across 121 human cell types and tissues from the NIH Roadmap Epigenomics Project as well as 8 histone marks (with addition of H3K4me2 and H3K9ac) from the mouse ENCODE Consortium. We have identified 361 and 369 DNA motifs in human and mouse, respectively, that are the most predictive of each histone mark. Interestingly, 107 human motifs are conserved between the two species. In human embryonic cell line H1, we mutated only the found DNA motifs at particular loci, and the significant reduction of H3K27ac levels validated the regulatory roles of the perturbed motifs. The functionality of these motifs was also supported by the evidence that histone-associated motifs, especially H3K4me3 motifs, overlap with expression quantitative trait loci (eQTL) SNPs in cancer patients significantly more than the known and random motifs do. Furthermore, we observed possible feedback to control chromatin dynamics, as the found motifs appear in the promoters or enhancers associated with various histone modification enzymes. These results pave the way toward revealing the molecular mechanisms of epigenetic events, such as histone modification dynamics and epigenetic priming.

epigenomics | cis-regulatory elements | locus specificity | chromatin dynamics | CRISPR

Significance

How the locus-specific histone modifications are achieved is not fully understood. One of the contributing mechanisms is that DNA binding molecules recognize specific sequences and their binding recruits or stabilizes the histone modification enzyme complexes. Comprehensive identification of such sequence patterns is the first step toward revealing possible regulatory grammar for establishing histone modifications. In this study, we have cataloged the DNA motifs tightly associated with six and eight important histone modifications in human and mouse, respectively. We show that mutating the found motifs at particular loci led to significant reduction of the histone modification levels. These histone-associated motifs, especially H3K4me3 motifs, overlap with expression quantitative trait loci (eQTL) SNPs in cancer patients significantly more than known motifs do, further suggesting their regulatory roles. We also found possible feedback loops mediated by these motifs, implicating their possible roles in histone modification dynamics and epigenetic priming.

Author contributions: V.N., J.W.W., and W.W. designed research; V.N., Z.C., K.Z., and J.W.W. performed research; M.W. contributed new reagents/analytic tools; V.N., Z.C., K.Z., and J.W.W. analyzed data; and V.N., Z.C., and W.W. wrote the paper with contribution from all authors.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Published under the PNAS license.

1 V.N. and Z.C. contributed equally to this work.

2 To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1813565116/-/DCSupplemental.

Published online February 12, 2019.

3668–3677 | PNAS | February 26, 2019 | vol. 116 | no. 9 | www.pnas.org/cgi/doi/10.1073/pnas.1813565116

Histone modifications play key roles in many biological processes. Mammalian genomes contain histone-modifying enzymes that are responsible for modifying histone tails by adding or removing chemical groups, such as methyl and acetyl groups. The placement of histone modifications is precisely regulated to ensure that specific regulatory elements and genes are correctly activated or repressed in a given cell type, environment, or developmental stage. Understanding the mechanisms that regulate locus-specific modification in a cell-state–dependent manner is critical toward uncovering the grammar of epigenetic regulation.

A possible mechanism to establish or maintain locus-specific histone modification is through binding of sequence-specific proteins or noncoding RNAs, which recruit or enhance the modifying enzymes’ binding to a particular locus. Other factors can contribute to this specificity, such as DNA methylation, chromatin accessibility, and 3D chromatin contacts. Because histone modifications are wiped out and reestablished in the zygote, the information encoded in the DNA sequence is pivotal to initiate the process of locus-specific histone modifications. Despite the existence of other contributing factors, it is still critical to comprehensively catalog the sequence motifs that can provide locus-specific guidance for the enzymatic functions, which can be the first step toward fully decoding the mechanisms regulating locus specificity of histone modifications. Furthermore, if particular DNA motifs are associated with histone modifications in many and diverse cell types, they are likely important or even causally related to histone modifications. An analogy is that a transcription factor (TF) recognizes the same DNA motif but its binding sites are cell-type–dependent. However, if we identify all motifs enriched in the TF binding sites across a large and diverse set of cell types, the most common motif is likely the one recognized by the TF. Histone modifications are more complicated than a single TF binding, and one histone mark can be regulated by multiple factors recognizing different motifs. Therefore, a comparative analysis across diverse cell types/tissues is critical.

Recently, machine learning approaches have proven to be useful in understanding epigenetic processes. For example, a support vector machine has been used to predict the impact of SNPs on DNase I sensitivity in their native genomic context (1). Histone modifications have been predicted solely from knowledge of TF binding, both at promoters and at potential distal regulatory elements, using a logistic regression-based classifier (2), or by using k-mer features to train a logistic regression model that distinguishes peak sequences from flanking regions (3). Our previous work also demonstrated that DNA motifs are predictive of histone modifications and DNA methylation in five cell types (4). All of these works have suggested the possibility of deciphering the grammar encoded in the genome regulating epigenetic modifications, but the scope of the previous studies is still limited.

Furthermore, because the protein sequences of many histone-modifying enzymes are conserved, it would also be interesting to investigate whether the regulatory grammar that controls the placement of histone modification is conserved. However, a direct comparison between the human and mouse genome is unlikely to identify these motifs because they may be dispersed in the overall nonconserved genomic regions. A strategy to circumvent this difficulty is to uncover the DNA motifs associated with the same histone modification patterns in different species and then compare the similarities between them to assess their conservation.

Here, we present a comprehensive survey of histone modification-associated motifs in a large set of diverse cell types and tissues in both human and mouse (5, 6). Comparative analyses have revealed that 107 motifs are conserved between human and mouse. Furthermore, in the human embryonic stem cell line H1, mutating the motifs led to significant perturbation of the H3K27ac levels. We also [...]

[...] position weight matrices (PWMs) are then generated by first picking a top k-mer and enriched k-mers similar to itself to construct a “seed” PWM, which is then extended by adding more enriched k-mers that are a few base pairs shifted from the original one. The motifs are then further ranked and filtered based on how well they differentiate the foreground from the background using LASSO (least absolute shrinkage and selection operator) logistic regression. The final set of motifs is then evaluated by random forest.

Epigram was individually applied to each dataset (see Materials and Methods for details). For each histone modification in each sample, Epigram found DNA motifs that discriminate enrichment peaks of the mark under consideration from a background of regions that do not overlap with any peak of the six histone modifications. Importantly, the background has the same GC content, number of regions, and sequence lengths as the foreground to avoid inflated prediction results caused by simple features or an unbalanced dataset (4). In our previous paper (4), we performed several additional analyses to remove confounding [...]
