Large-Scale Discovery of Enhancers from Human Heart Tissue
LETTERS

Dalit May1, Matthew J Blow1,2, Tommy Kaplan3,4, David J McCulley5, Brian C Jensen6,8, Jennifer A Akiyama1, Amy Holt1, Ingrid Plajzer-Frick1, Malak Shoukry1, Crystal Wright2, Veena Afzal1, Paul C Simpson5,7, Edward M Rubin1,2, Brian L Black5, James Bristow1,2, Len A Pennacchio1,2 & Axel Visel1,2

1Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA. 2United States Department of Energy Joint Genome Institute, Walnut Creek, California, USA. 3Department of Molecular and Cell Biology, California Institute of Quantitative Biosciences, University of California, Berkeley, Berkeley, California, USA. 4School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel. 5Cardiovascular Research Institute, University of California, San Francisco, San Francisco, California, USA. 6Division of Cardiology, University of California, San Francisco, San Francisco, California, USA. 7Cardiology Division, Virginia Medical Center, San Francisco, California, USA. 8Current address: Division of Cardiology, University of North Carolina, Chapel Hill, North Carolina, USA. Correspondence should be addressed to L.A.P. ([email protected]) or A.V. ([email protected]).

Received 13 April; accepted 20 October; published online 4 December 2011; doi:10.1038/ng.1006

NATURE GENETICS ADVANCE ONLINE PUBLICATION

© 2011 Nature America, Inc. All rights reserved.

Development and function of the human heart depend on the dynamic control of tissue-specific gene expression by distant-acting transcriptional enhancers. To generate an accurate genome-wide map of human heart enhancers, we used an epigenomic enhancer discovery approach and identified ~6,200 candidate enhancer sequences directly from fetal and adult human heart tissue. Consistent with their predicted function, these elements were markedly enriched near genes implicated in heart development, function and disease. To further validate their in vivo enhancer activity, we tested 65 of these human sequences in a transgenic mouse enhancer assay and observed that 43 (66%) drove reproducible reporter gene expression in the heart. These results support the discovery of a genome-wide set of noncoding sequences highly enriched in human heart enhancers that is likely to facilitate downstream studies of the role of enhancers in development and pathological conditions of the heart.

Heart disease is a leading cause of morbidity and mortality in both children and adults and is strongly influenced by genetic factors1–5. Genome-wide association studies indicate that variation in noncoding sequences, including distant-acting transcriptional enhancers, affects susceptibility to many types of human disease6–10. However, the possible role of enhancers in heart disease has been difficult to evaluate because of the lack of a catalog of human cardiac enhancers.

Mapping of enhancer-associated epigenomic marks via chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-Seq) represents a conservation-independent strategy to discover tissue-specific enhancers11–14. It has previously been shown that genome-wide binding profiles of the enhancer-associated coactivator protein p300 in mouse heart tissue can correctly predict the genomic location of heart enhancers in the mouse genome15. However, the sequences identified by this approach tended to be poorly conserved in evolution, suggesting that mouse-derived ChIP-Seq data sets are of limited value for accurate annotation of heart enhancers in the human genome.

To generate genome-wide maps of predicted cardiac enhancers in the human genome, we determined the occupancy profiles of two enhancer-associated coactivator proteins in human fetal (gestational week 16) and adult heart. We performed chromatin immunoprecipitation with a pan-specific antibody that recognizes both p300 and the closely related CBP coactivator protein16–18. Massively parallel sequencing and enrichment analysis19 of the aligned sequences from fetal heart tissue identified 5,047 p300/CBP-bound regions (peaks) throughout the genome that were located at least 2.5 kb from the nearest transcription start site (TSS; Fig. 1a–c, Supplementary Fig. 1, Supplementary Table 1 and Online Methods). In an equivalent analysis, we identified 2,233 regions from adult human heart.

Nearly half of the adult human heart enhancer candidates (1,082; 48%) coincided with candidate enhancers found in fetal human heart. In addition, many peaks identified in one of the samples exhibited read densities above background but below the peak significance threshold in the other sample. In total, 4,257 of fetal peaks (84%) and 2,113 of adult peaks (95%) showed significantly or subsignificantly increased read densities in the adult or fetal data sets, respectively. This remarkable overlap in data from the two samples suggests that many cardiac p300/CBP-binding sites are maintained from prenatal stages of heart development into adulthood (Fig. 1b and Supplementary Fig. 2). These results indicate that thousands of distal p300/CBP-binding sites (candidate enhancers) exist in fetal and adult human heart tissue.

[Figure 1 panel contents. (a) Fetal human heart (16 weeks): ChIP-Seq (p300/CBP), 10,059 peaks, 5,047 distal bound regions (candidate enhancers). Adult human heart: ChIP-Seq (p300/CBP), 4,485 peaks, 2,233 distal bound regions (candidate enhancers). (b) Overlap diagram: adult (2,233), fetal (5,047), shared 1,082. (c) Read-depth tracks (0 to 50) for fetal and adult human heart, scale 20 kb, showing hs1763 and INPP5A.]

Figure 1 ChIP-Seq identification of candidate enhancer regions from human fetal and adult heart. Human fetal heart was obtained at gestational week 16, and adult heart tissue was obtained from the septum of an adult failing heart. (a) Overview of strategy and results of ChIP-Seq analysis. In total, 5,047 regions from fetal heart and 2,233 from adult heart were significantly enriched in p300/CBP-binding sites and were considered as candidate human heart enhancers (distal: ≥2.5 kb from the nearest transcript start site; proximal or promoter associated: <2.5 kb from the nearest TSS). (b) Overlap of candidate enhancers identified in fetal and adult heart tissues. (c) ChIP-Seq profiles of p300/CBP in the genomic region of the tested hs1763 element (thin black bar). Thick black bars indicate two regions significantly enriched for p300/CBP binding in introns of the INPP5A gene. The thin black line represents a read depth of 10; maximum read depth shown is 50.

Tissue-specific enhancers typically act over distances of tens or hundreds of kilobases9; therefore, authentic cardiac enhancers are expected to be detectably enriched in the larger genomic vicinity of genes that are expressed and functional in the heart. To assess whether cardiac enhancers were localized in this manner, we examined the cardiac expression and function of genes located near the regions identified by ChIP-Seq. First, we compared the genome-wide set of candidate enhancers to genome-wide gene expression data from human heart tissue. We observed a 4.7-fold enrichment in cardiac p300/CBP-binding peaks within 2.5–10 kb of the TSSs of genes highly expressed in fetal human heart (P < 1 × 10^−40, binomial distribution), with significant enrichment up to 200 kb away from promoters (cumulative fold enrichment of 2.6; P < 0.001, binomial distribution; Fig. 2a). We also observed enrichment for human adult candidate enhancers when the binding of RNA polymerase II to gene promoters was used as a complementary approach to identify active genes (Supplementary Note and Supplementary Figs. 3 and 4). In contrast, no enrichment of p300/CBP-binding sites was observed near genes highly expressed in other tissues (Fig. 2b). Second, to examine whether candidate cardiac enhancers were enriched near genes with known cardiac functions, we per-

are associated with a variety of cardiac diseases, including conduction disorders (potassium channels: KCNQ1 and KCNE2), cardiomyopathies (muscle structural proteins: MYL2, ACTN2 and TNNC1) and congenital defects (cardiac transcription factors: GATA4, NKX2-5 and TBX5) (Supplementary Table 3). To facilitate the study of genetic variation in candidate enhancers, we identified 1,546 known SNPs within these 81 candidate enhancers (Supplementary Table 3). In addition, we identified genetic variants in 11 candidate enhancers that have been linked to changes in gene expression in human cells22 (Supplementary Table 4). Although in-depth studies will be required to clarify whether these sequence variants affect the in vivo activity of individual enhancers, these data highlight the usefulness of a genome-wide set of heart enhancers as an entry point for the functional exploration of how noncoding enhancer variants contribute to cardiac disease.

It was previously shown that embryonic mouse heart enhancers tend to be poorly conserved in evolution15, raising the possibility that many human heart enhancers might also be under weak evolutionary constraint. Comparison with previously published mouse enhancer data sets confirmed that human candidate heart enhancer sequences also tend to be under relatively weak evolutionary constraint13,15,23 (Supplementary Fig. 6). Moreover, the vast majority of human peaks identified in this study (86%) were not predicted in previous computational screens combining evolutionary conservation with motif-based prediction of heart enhancers24. These observations emphasize the limitations of comparative genomic and computational methods for the discovery