© 2011 Nature America, Inc. All rights reserved. Correspondence should Correspondence be addressed to L.A.P. or ([email protected]) A.V. ([email protected]). Virginia Medical Center, San Francisco, California, USA. San Francisco, San Francisco, California, USA. USA. California, USA. 1 the in genome. enhancers heart of annotation accurate for value limited of are sets data ChIP-Seq mouse-derived that suggesting evolution, in conserved to poorly be tended by approach this identified sequences genome mouse the in enhancers heart of location the genomic predict can correctly tissue in p300 heart mouse binding genome-wide profiles of coactivator the enhancer-associated cover tissue-specific enhancers dis to strategy conservation-independent a represents sequencing (ChIP-Seq) parallel massively with coupled immunoprecipitation chromatin via marks epigenomic enhancer-associated of Mapping of to of the lack of enhancers. a because human cardiac catalog evaluate difficult been has disease heart in enhancers of role possible the disease human of types many to susceptibility affects enhancers, transcriptional distant-acting including sequences, ing noncod in variation that indicate studies association Genome-wide factors genetic by influenced strongly is and adults and children both in mortality and morbidity of cause leading a is disease Heart and downstream enriched discovery reporter enhancer we disease. near predicted from identified we accurate distant-acting the Development Edward M Rubin Amy Holt Dalit May Large-scale discovery of enhancers from human heart tissue Nature Ge Nature Received 13 April; accepted 20 October; published online 4 December 2011; Genomics Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA.

tested used

dynamic pathological

4

School School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel. fetal

To

an

genome-wide

in

assay of 65 ~6,200 function, and n

further

implicated

1 1 epigenomic etics human

, , Ingrid Plajzer-Frick , , Matthew J Blow a

control studies 3 of

Department Department of Molecular and Cell Biology, California Institute of Quantitative Biosciences, University of California, Berkeley, Berkeley, California, and

expression transcriptional genome-wide adult

and these

conditions

candidate

function validate

ADVANCE ONLINE PUBLICATION ONLINE ADVANCE

heart

1 these

observed

human of , of

2 human

, , Brian L Black

in

the tissue-specific map enhancer

heart

enhancers in elements

role their

of

the heart

11– of enhancer of set

sequences

enhancers. that the

1

development, human the

of

4 heart.

of 1 . . It has previously been shown that in in vivo ,

2

tissue. human discovery enhancers

43

noncoding , , Tommy Kaplan

heart. were

that 1

6 (66%)

, , Malak Shoukry

Division Division of Cardiology, University of California, San Francisco, San Francisco, California, USA. These gene sequences heart 5

enhancer in

, , James Bristow

Consistent is

markedly To heart

a

likely

expression

generate approach 8

results transgenic enhancers, drove

in Current Current address: Division of Cardiology, University of North Carolina, Chapel Hill, North Carolina, USA. function

sequences

depend development 1

to 5

directly . However, the the However, . activity,

6–

reproducible enriched

support facilitate with 1

0 an

. However, .

3

and and by

, 4 on

mouse

, , David J McCulley

1 highly

their

, Crystal Wright , Crystal

1

,

2

the

, Len , A Len Pennacchio 1– d

5 o - -

. i

: 1 0 the closely related CBP coactivator protein coactivator CBP related and closely p300 the both recognizes that antibody pan-specific a with tation We 16) and week immunoprecipi heart. adult chromatin performed (gestational two in human fetal coactivator of enhancer-associated profiles occupancy the determined we genome, human the genes highly expressed in fetal human heart ( heart human in fetal expressed highly genes of TSSs the of kb 2.5–10 within peaks p300/CBP-binding cardiac in enrichment 4.7-fold a We observed tissue. heart expression human from gene data genome-wide to the enhancers genome- near the candidate of set compared located we wide First, genes ChIP-Seq. of by identified function examined regions and we manner, expression this cardiac in the localized were enhancers cardiac genes that are expressed and functional in the heart. of To vicinity assess whether genomic larger the in enriched detectably be to expected kilobases of hundreds tissue. heart human adult and fetal in exist enhancers) (candidate sites binding Fig. ( adulthood into development heart of stages prenatal from are maintained sites p300/CBP-binding cardiac many that suggests samples two the from data in overlap remarkable increased read densities in the adult or subsignificantly fetal or data sets, respectively.significantly showed This (95%) peaks adult of 2,113 significance peak and in (84%) sample. the In other of threshold peaks 4,257 the total, fetal below but background above densities read exhibited samples the of one in identified peaks many addition, In (1,082; candidates 48%) coincided with candidate enhancers found enhancer in fetal human heart. heart human adult the of half heart. human Nearly adult from regions 2,233 identified we analysis, lent 1 Fig. (TSS; site start nearest the regions p300/CBP-bound the genome that at 5,047 were throughout 2.5 kb located least from (peaks) identified tissue heart fetal from analysis enrichment and sequencing . 1 0 2 To generate genome-wide maps of predicted cardiac enhancers in in enhancers cardiac predicted of maps To genome-wide generate Tissue-specific enhancers typically act over distances of tens or or tens of distances over act typically enhancers Tissue-specific United United States Department of Energy Joint Genome Institute, Walnut Creek, 3

8 2 / , , n ). These results indicate that thousands of distal p300/CBP- distal of thousands that indicate results These ). g Supplementary Table 1 Supplementary . 1 5 Cardiovascular Research Cardiovascular Institute, University of California, 0 2 0 , , Veena Afzal 6 5 , , Brian C Jensen 1 , 2 & Axel Visel 9 ; therefore, authentic cardiac enhancers are are enhancers cardiac authentic therefore, ; 1 , , Paul C Simpson and Online Methods). In an equiva an In Methods). Online and 6 , 8 , , Jennifer A Akiyama 1 , 2 1 Fig. 1 Fig. 9 of the aligned sequences sequences aligned the of Fig. 1a Fig. 16– P b < 1 × 10 and and 1 8 – 7 . Massively parallel parallel Massively . Cardiology Cardiology Division, 5 c , , , s r e t t e l 7 Supplementary Supplementary Supplementary Supplementary ,

−40 , binomial , binomial

1 ,  - -

© 2011 Nature America, Inc. All rights reserved. chance ( chance by expected be would what over 4.5-fold of enrichment an genes, cardiac these of 30 of kb 50 within located are that enhancers date candi heart fetal human 81 identified We binomial distribution; nificantly enriched near these genes ( found that candidate heart enhancers were sig and of diseases heart diagnosis in the genetic orders. We compiled a panel of 73 genes used ferent types of congenital and adult heart dis genomic vicinity of genes associated with dif focused on candidate enhancers located in the dicted heart enhancers to cardiac diseases, we function enhancers. heart they human that as idea the supporting system, in cardiovascular the functional and expressed are that genes near enriched are heart human the in binding p300/CBP by identified enhancers candidate ( annotation Ontology by and Gene studies deletion mouse in tions func cardiovascular to linked been have that genes with associated func of analysis enrichment annotations gene tional statistical per we unsupervised an functions, formed cardiac known with genes near enriched tissues were other in expressed highly ( genes near observed was sites to approach 3 Figs. complementary ( a genes as active used identify was promoters gene to II polymerase RNA of binding the when enhancers candidate adult distribution; 2.6; of enrichment fold (cumulative promoters from away kb 200 to up enrichment significant with distribution), is 50. shown depth read maximum of 10; depth a read represents line black thin The of the in introns binding p300/CBP for enriched significantly regions two indicate bars black Thick bar). black (thin element hs1763 tested of the region genomic in the of p300/CBP ( TSS). nearest the kb from <2.5 associated: or promoter proximal site; start transcript nearest (distal: enhancers heart human as candidate considered were and sites binding in p300/CBP- enriched significantly were heart adult from 2,233 and heart fetal from regions 5,047 ( heart. failing of an adult septum the from obtained was tissue heart adult and 16, week at gestational obtained was heart fetal Human 1 Figure s r e t t e l  genes These Methods). Online see tribution; b Fig. 2 Fig. Table Table ) Overlap of candidate enhancers identified in fetal and adult heart tissues. ( tissues. heart adult and in fetal identified enhancers of candidate ) Overlap To assess the potential relevance of pre of relevance potential the assess To b a (2,233) Adult b 1

and ). Second, to examine whether candidate cardiac enhancers enhancers cardiac candidate whether examine to Second, ). 1,082 ChIP-Seq identification of candidate enhancer regions from human fetal and adult heart. heart. adult and fetal human from regions enhancer of candidate identification ChIP-Seq P and and < 7 × 10 × 7 < 4 Fetal humanheart Adult humanheart Supplementary Table 2 Table Supplementary ). In contrast, no enrichment of p300/CBP-binding p300/CBP-binding of enrichment no contrast, In ). Fig. 2 Fig. (16 weeks) (5,047) Fetal −15 a Supplementary Fig. 5 ). We also observed enrichment for human human for enrichment observed also We ). , hypergeometric dis hypergeometric , Supplementary Note Supplementary 2 0 . Candidate cardiac enhancers were indeed were indeed enhancers cardiac . Candidate INPP5A 50 50 c a 0 0 ) Overview of strategy and results of ChIP-Seq analysis. In total, In total, analysis. of ChIP-Seq results and of strategy ) Overview (p300/CBP) (p300/CBP) P ChIP-Seq ChIP-Seq < 0.05, ). These results indicate that that indicate results These ). ). ------

and and relative to other human tissues (see Online Methods). Error bars indicate 95% confidence intervals. confidence 95% indicate bars Error Methods). Online (see tissues human other to relative ( overexpressed are that genes near (black) regions random ( heart. 2 Figure P a < 0.001, binomial binomial 0.001, < Fetal humanheart Number of Adult humanheart Supplementary Supplementary

peaks or regions 10,059 peaks 4,485 peaks a 100

, 50 b Human p300/CBP candidate enhancers are enriched near genes expressed in human human in expressed genes near enriched are enhancers candidate p300/CBP Human 0 ) Frequency of human fetal heart candidate enhancers (red) compared to matched matched to compared (red) enhancers candidate heart fetal human of ) Frequency 10 20 Distance toTSSofnearestfetalheart

30 hs1763 40 50 overexpressed gene(kb) 2 1 - - 60 - 70 2,233 distalboundregions 20 kb 80 5,047 distalboundregions candidate enhancers (distal ChIP-Seq peaks; Using identical methods as for human heart tissue, we identified 6,564 similar to that of human fetal heart at gestational week 16 (refs. developmental the progression and when gene expression profile of the mouse heart is 2, broadly day postnatal at tissue heart mouse in human of annotation heart enhancers. We accurate performed ChIP-Seq for p300/CBP-bindingthe sites for sets data ChIP-Seq derived of mouse- the usefulness we re-examined enhancers, candidate heart genome. human the in enhancers of heart discovery for the methods emphasize the limitations of comparative genomic and computational with prediction motif-based of heart enhancers previous computational screens combining evolutionary conservation in predicted not were (86%) study this in identified peaks human of con straint evolutionary weak relatively under be to tend also sequences enhancer heart candidate human that confirmed sets data enhancer mouse published previously with Comparison evo constraint. weak lutionary under be also might enhancers heart human many that (candidate enhancers) (candidate enhancers) c

90 profiles ) ChIP-Seq ≥ 2.5 kb from the kb from 2.5

100 of human conservation sequence In of limited light apparently the 110 INPP5A 120 130 13 ,

140 15 gene. gene. 150 , 2

160 3 ( 170 Supplementary Fig. 6 Fig. Supplementary

a 180

DVANCE ONLINE PUBLICATION ONLINE DVANCE 190 200 cardiac disease. cardiac to contribute variants enhancer entry noncoding an how of as exploration functional the enhancers for point heart of set wide genome- a of usefulness the highlight data in the vivo affect variants sequence these whether clarify to required be will studies in-depth human in expression cells gene in changes to linked been have that enhancers candidate in 11 variants genetic we identified addition, candidate 81 ( these enhancers within SNPs known 1,546 identified we enhancers, candidate variation in genetic of study the facilitate To NKX2 disorders channels: cardiac (potassium conduction of including variety a diseases, with associated are served in evolution in served mouse heart enhancers tend to be poorly con defects defects (cardiac transcription factors: MYL2 cardi b Number of peaks or regions embryonic that shown previously was It 2 100 o 2 50 -5 and myopathies (muscle structural proteins: a ( , , activity of these enhancers, activity individual 0 ) or underexpressed ( underexpressed ) or ACTN2 Supplementary Table 4 Table Supplementary 10 20 Distance toTSSofnearestnon-heart

30 TBX5 upeetr Tbe 3 Table Supplementary

40 majority vast the Moreover, ). 50 and overexpressed gene(kb) 60 ) (

70 Matched randomregions Fetal heartp300/CBPpeaks 1 Supplementary Supplementary Table Supplementary Table 3 TNNC1 80 5 , raising the possibility possibility the raising , KCNQ1

90 2 4 100 . . These observations

Nature Ge Nature 110 b 120 ) in fetal heart heart fetal ) in ) and congenital congenital and ) 130

140 and

150 Although ). 160

170 KCNE2 GATA4 n 180 25,26). 25,26). etics

190 In ). 200

1 ), ), ). - - - ) ,

© 2011 Nature America, Inc. All rights reserved. Although the tested human sequences in the present study were on were study present the in sequences human tested the Although ref. in described sets data to (compared assay this in tested previously sequences enhancer heart candidate and ( lines cell human genes or from by derived near predictions heart-expressed sequences fre the than quency of higher significantly was sequences human tested the Fig. ( heart the included that ( tissues expression in the heart or in the vasculature, either exclusively in these and ( conservation functional and evolutionary of categories all from sequences of sampling a were regions selected The set. data heart fetal the in to p300/CBP binding high-confidence with sequences on focusing genes, nearby the of knowledge without selected were enhancers Candidate (E11.5). 11.5 day embryonic at mice transgenic in enhancers heart human candidate 65 tested We of patterns the define to approach effective an be to shown previously was that assay enhancer mouse transgenic a used we ChIP-Seq, by predicted samples. tissue human on directly ChIP-Seq performing of value the highlight and enhancers heart of human identification for the obstacle a major represent ture architec enhancer genome-wide in differences lineage-specific that suggest observations these approach, the of limitations technical to may candidate enhancers have in missed been the mouse data set due Note ( set data mouse the from predicted been have not mouse the could peaks in human of majority the of site location the whereas genome, orthologous the at peaks significant with cided coin candidates enhancer heart human of fetal 21% only that found We sets. enhancer candidate fetal human the to them compared and a significant enrichmentandhadabinomialfoldof Ontology (GO)showinghighlysignificantenrichmentofcardiovascularsystem–relatedterms.Onlytermsthatshowed genes implicatedincardiovascularsystem–relatedphenotypes.Bottom,thetopfivetermsbiologicalprocessGene top tenenrichedMouseGenomeInformaticsphenotypeontologyterms Unsupervised enrichmentanalysis 5. Regulationofheartcontraction 4. Vasculature development 3. Bloodvesseldevelopment 2. Heartlooping 1. Bloodvesselmorphogenesis Top enrichedGOterms 10. Abnormalcardiacmusclecontractility 9. Pericardialeffusion 8. Anemia 7. Abnormalrenalglomerulusmorphology 6. Decreasedcardiacmusclecontractility 5. Abnormalmyocardialtrabeculaemorphology 4. Thinventricularwall 3. Abnormalepicardiummorphology 2. Abnormalvitellinevasculature 1. Abnormalcardiacmusclemorphology Top enrichedphenotypes heart enhancers T Nature Ge Nature bp (3,719 sequences mouse tested previously the than longer average The onlynon-cardiovasculartermsretrievedbythisanalysis. able able 1 To further validate and characterize the human heart enhancers enhancers heart human the characterize and validate further To 6 Supplementary Table 8 Table Supplementary 4 , , ). In total, 43 of 65 tested sequences (66%) drove reproducible reproducible drove (66%) sequences tested 65 of 43 total, In ). and and Supplementary Fig. 7 Fig. Supplementary

N a T op op enriched annotations of putative target genes near the candidate human = 28; 43%) or as part of a reproducible compound pattern pattern compound reproducible a of part as or 43%) 28; = in vivo n Supplementary Table 7 Table Supplementary in vivo in etics heart enhancers identified on heart enhancers identified the basis of conserved activity of human and mouse enhancers mouse and human of activity

P < 0.001, 0.001, < ADVANCE ONLINE PUBLICATION ONLINE ADVANCE N 2 = 15; 23%) ( 23%) 15; = 0 ofannotatedgenesintheproximityp300/CBP-bindingdistalregions.Top, the and Online Methods). Although some some Although Methods). Online and ) and similar to that of mouse-derived mouse-derived of that to similar and ) χ 2 test; test; a ). The The ). Supplementary Figs. 8 8 Figs. Supplementary Fig. 15; 15; in vivo in Supplementary Tables 5 Supplementary Supplementary Fig. 9 Fig. Supplementary

3 Binomial FDR Binomial FDR , selected examples in in examples selected , ≥ 3.76 ×10 2.30 ×10 7.73 ×10 5.19 ×10 2.37 ×10 4.35 ×10 7.44 ×10 1.14 ×10 3.34 ×10 2.79 ×10 7.74 ×10 3.43 ×10 5.23 ×10 1.13 ×10 3.32 ×10 2 wereconsidered. validation rate of rate validation Supplementary Supplementary

2 9 showinghighlysignificantenrichmentof Q Q −25 −26 −27 −28 −30 −19 −20 −20 −21 −21 −22 −22 −27 −31 −44 value value 13 , 15 and , 27 , 2 9 8 ). ). - - - .

Binomial foldenrichment Binomial foldenrichment P test; exact Fisher’s two-tailed with calculated was subcategory each for comparison Pairwise peaks). non-alignable in or background above not binding p300/CBP −, background; above but subsignificant or significant binding p300/CBP (+, mouse the to conservation binding by enhancers phastCons −, > 350; phastCons (+, ( ( mice. transgenic 3 Figure conservation binding and sequence ( their of regardless these animals, from hearts in activity examined reproducible showed and enhancers eight E11.5 at enhancers heart their as validated enhanc been candidate had heart human ( adult ers as were and heart considered human adult therefore (63%) the in E11.5 at p300/CBP by tested bound enhancers also are candidate 65 (P28). the 28 of day 41 postnatal total, at In mice transgenic in the heart in human enhancers adult active be to predicted sequences studied also we b Supplementary Table 11 Table Supplementary a > 0.05 in all cases. cases. all in > 0.05 ) Proportion of reproducible enhancers by extent of sequence constraint constraint sequence of extent by enhancers reproducible of ) Proportion Negative tissues only Positive enhancersinother and othertissues Positive enhancersinheart Positive enhancersinheart 5 (8% upeetr Fg 11 Fig. Supplementary n vivo in In vivo 2.9 2.0 2.0 7.5 2.2 2.1 2.9 2.0 2.1 2.3 2.3 2.6 5.7 2.2 2.1

(26% of testedelements In vivo In ) 17 (23% enhanceractivity ) 15 activity patterns in 4-week-old transgenic mice. All All mice. transgenic 4-week-old in patterns activity ) testing of predicted human heart enhancer activities in activities enhancer heart human predicted of testing

a ) ) In vivo In (43% 28 ) for for noncoding sequences that are highly enriched human of population a identify tissue heart human from derived sets data ChIP-Seq the that suggests E11.5 at screen first-pass this Fig. ( sequences mouse enhancer and heart human in sites binding factor transcription same the of enrichment to able attribut partly least at be may assay this in enhancers heart human conserved weakly χ ( differ conservation of classes the ent between elements tested the of activity reproducible the in differences cant Table 10 reproduc ( patterns the reporter of on ibility effects minor only had incompatibilities species human-mouse tial had overall robust cardiac activity, and poten enhancers heart candidate human predicted versus 1,648 bp; beyond prenatal stages of development, development, of stages prenatal beyond can be used to predict enhancer activity of the 65 tested elements. elements. tested 65 the of activity enhancer 2 test). The reproducible activity of even even of activity reproducible The test). To examine whether human ChIP-Seq data ). To explore the spatial activity of these these of activity spatial the explore To ). bona bona fide

10 b constraint Sequence ). We selected eight sequences that that sequences eight selected We ). Percent of elements in rate validation considerable The ). 100 ). Of note, we did not observe signifi 20 40 60 80 ≤ 0 350). ( 350). human heart enhancers. human heart N = 41 – + Supplementary Supplementary Table 9 c N ) Proportion of reproducible reproducible of ) Proportion = 24 in vivo conservation Binding c s r e t t e l heart enhancers Fig. 3 Fig. Supplementary Supplementary Supplementary Supplementary 100 20 40 60 80 0 N b =49 – + ; ; P = 0.9, 0.9, =

N ), ), the =16  ------

© 2011 Nature America, Inc. All rights reserved. Genomics (retrieved October 2010), 2010), October (retrieved Genomics algorithm, Peak fitting URLs. disorders. heart human in sequences enhancer eluc distant-acting for critical be to expected are and genome an add human to the annotation study of functional layer this defined from experimentally results The enhancers. human conserved the report correctly to of those to similar sufficiently are factors transcription mouse that ing in assay, this in the had mouse activity robust cardiac con suggest served not sequences with enhancers heart human even studies, In our mouse. the and human the between conserved functionally nor evolutionarily neither are enhancers heart many that showed tissue authentic heart mouse from data ChIP-Seq are with sets data epigenomic derived sequences these of vivo in majority the that showed enhancer- we the adult and for fetal tissue human heart on ChIP-Seq p300/CBP proteins performing associated by enhancers heart human putative thousand several of set genome-wide a identify ep to an used we study, present the In challenge. develo active are that vivo in sequences noncoding human of location the genomic predict correctly study this in obtained sets data ChIP-Seq the ( examined 12 Fig. sequences Supplementary eight the of seven for heart P28 concord and embryonic the in overall activity enhancer of regions remarkable the between a ance observed we heart, Although the P28 of and subregions E11.5 patterns. distinct in expression staining drive enhancers reporter different reproducible annotating hearts, P28 from sections representative longitudinal and from embryos sections whole-mount transverse examined we enhancers, aorta. Ao, artery; PA, pulmonary tract; OFT, outflow atrium; right RA, atrium; left LA, RV, ventricle; ventricle; right for stained were specimens All P28. at heart of section longitudinal and P28 at heart stained whole-mount E11.5, at heart of section histological and close-up embryo, 4 Figure s r e t t e l  h a c a r Deciphering the gene-regulatory architecture required for for required architecture gene-regulatory the Deciphering v a (intergenic) CIDE 8/8 hs1751 (intergenic) OTOS 7/7 hs1750 r VISTA Enhancer Browser, Browser, Enhancer VISTA p enhancers in the developing and adult heart. adult and developing the in enhancers d er ehnes Ntby cmaio o tee human- these of comparison Notably, enhancers. heart

ment and function of the human heart represents a pressing a pressing represents heart human of the function and ment . e In vivo In A d - - GPC1 TUBB6 u 11– / h 1 o activity of human cardiac enhancers in embryonic and 4-week-old transgenic mice. ( mice. transgenic 4-week-old and embryonic in enhancers cardiac human of activity LacZ 4 m . Through transgenic mouse reporter experiments, experiments, reporter mouse transgenic . Through RA RA e RV RA RV RV ; gene expression in human non-heart tissue, tissue, non-heart human in expression gene ; RV enhancer reporter activity (dark blue). Element ID, reproducibility in E11.5 embryos and flanking genes are indicated. LV, indicated. are genes left flanking and embryos E11.5 in reproducibility ID, Element blue). (dark activity reporter enhancer OF OF T h T LA LV ). These results further support the idea that that idea the support further results These ). LV LV t DA t LA LA p : / / e i s e n l cis a RV h RV b t - RA t . RA o regulatory function of poorly poorly of function regulatory p r : h / g / t / t e s p n o LV : h f / LV t / LA a w c n a LA a i c r r genomic approach approach genomic e i d e dating the role of of role the dating r i / . o g l RA b g r RV l i e RA . z g RV n z o o l y Fig. Fig. v m PA / Ao / ; ; Cardio- ; Grizzly Grizzly ; i c ex vivo ex LV s 4 . LV m and and LA e LA d - - - .

The The authors declare no competing interests.financial manuscript. the of writing the to contributed authors All analysis. data performed and materials and reagents P.C.S. provided B.L.B. and analysis. data and experiments performed I.P.-F., T.K.,A.H., M.J.B., D.M., J.A.A., D.J.M., B.C.J., C.W. M.S., V.A. and experiments. the J.B., L.A.P. A.V. and designed E.M.R., and D.M., of conceived DE-AC02-05CH11231, University of California). Department of JointEnergy Genome Institute (Department of Contract Energy was performed at Lawrence NationalBerkeley Laboratory and at the United States was supported by the NHLBI and the Department of Veterans Affairs. Research Foundation for Cardiac Research and a grant from the NHLBI P.C.S.(HL096836). for Cardiovascular UniversityDisease, of San California, Francisco (UCSF) B.C.J. was supported by the GlaxoSmithKline Research and Education Foundation Molecular OrganizationBiology long-term (EMBO) fellowships.postdoctoral (NHLBI, HL64658 and HL89707). D.M. and T.K. were supported by European B.L.B. was supported by grants from the National Heart, Lung, and InstituteBlood by a grant funded by the National Research Institute (HG003988). D. Dickel for commentscritical on the manuscript. L.A.P. and A.V. were supported S. Deutsch for help in humanretrieving genetic data and C. Attanasio and The authors thank R. Hosseini and S. Phouanenavong for technical support, Note: Supplementary information is available on the Tables 8 7, Supplementary in provided numbers accession the under Browser VISTA Enhancer Complete (GSE32587). base data GEO NCBI the in deposited been have experiments ChIP-Seq numbers. Accession online the at http://www.nature.com/naturegenetics/. paper the of in ­version available are references associated any and Methods Me database, H h C AUTH A t ck O d b G t p MPETI t - (intergenic) FAM78A NUP214 13/14 hs1963 (intronic C1orf198 12/12 hs1766 : no U / ho / O 1 w wled R R C 3 w 3 d NG h ) _ w - ON s t P . t a FI p l u g f TRIBUTI : f / s N me y / RA RV _ a RV RA RV m A RA w 2 DVANCE ONLINE PUBLICATION ONLINE DVANCE RV N OF n w . e OF t CIAL CIAL I t i ts w s r T OF T s i a . All raw sequences and processed data from from data processed and sequences raw All LV x n u ON – LV T LA LV . e c d LA c - b LV ) From left to right: whole-mount stained E11.5 E11.5 stained whole-mount right: to left ) From o N S m i m . TERESTS n i x and / l m A t in vivo in u u RV . r RA n RV 10 t e h i - h RA d / . . s a g u data sets are available from the the from available are sets data t o p a v LV - p / s LV LA o s e i r t t LA N . t e a / s a p d / t t u G o - r w r e e e

n G n RA s e e RV RV u l Nature Ge Nature o n T l e RA t a e t s d i s . c z t s s s website. i / Ao p / LV d . ; ; GeneTests Ao e m LV LA o n _ etics

d a t a - / © 2011 Nature America, Inc. All rights reserved. 4. 3. 2. 1. reprints/index.html. http://www.nature.com/ at online available is information permissions and Reprints Published online at http://www.nature.com/naturegenetics/. Nature Ge Nature 14. 13. 12. 11. 10. 9. 8. 7. 6. 5.

Bruneau, B.G. The developmental genetics of congenital heart disease. cardiovascular controlling mechanisms Genetic S. Bhattacharya, & J. Bentham, D. Lloyd-Jones, Hoffman, J.I., Kaplan, S. & Liberthson, R.R. Prevalence of congenital heart disease. Visel, A., Rubin, E.M. & Pennacchio, L.A. Genomic views of distant-acting enhancers. Schunkert, H. R. McPherson, A. Helgadottir, M.E. Pierpont, Nature disease. artery coronary for loci disease. heart infarction. myocardial Pediatrics. of Academy American the Cardiac by endorsed Young: Congenital the in Heart Association Disease Cardiovascular on Council Committee, the American Defects from statement scientific a (2008). 943–948 development. Association. Heart American the from J. Heart Am. i H. Xi, A. Visel, Heintzman,N.D. Chen, X. A. Visel, chromatinregulatory structures humantheingenome. Nature genome.humanpromotersenhancerstheand in cells. stem embryonic in network mice. in interval risk

t al. et 461 457 et al. t al. et t al. et n , 199–205 (2009). 199–205 , , 854–858 (2009). 854–858 , Identification and characterization of cell type–specific and ubiquitous type–specific Identification characterization cell and and of etics

147 Integration of external signaling pathways with the core transcriptional Ann. NY Acad. Sci. Acad. NY Ann. et al. Science agtd eein f h 92 nncdn crnr atr disease artery coronary non-coding 9p21 the of deletion Targeted ChIP-seq accurately predicts tissue-specific activity of enhancers. activity tissue-specific predicts accurately ChIP-seq et al. et t al. et et al. et et al. et etal. , 425–439 (2004). 425–439 , Large-scale association analysis identifies 13 new susceptibility

Genetic basis for congenital heart defects: current knowledge: current defects: heart congenital for basis Genetic A common allele on 9 associated with coronary with associated 9 chromosome on allele common A cmo vrat n hoooe p1 fet te ik of risk the affects 9p21 chromosome on variant common A Heart disease and stroke statistics—2010 update: a report a update: statistics—2010 stroke and disease Heart Nature ADVANCE ONLINE PUBLICATION ONLINE ADVANCE Distinct andpredictive chromatin signatures oftranscriptional Science 316 , 1488–1491 (2007). 1488–1491 ,

464

316 Nat. Genet. Nat. , 409–412 (2010). 409–412 ,

Cell 1123 , 1491–1493 (2007). 1491–1493 ,

Circulation Circulation 133 , 10–19 (2008). 10–19 , , 1106–1117 (2008). 1106–1117 ,

43 , 333–338 (2011). 333–338 , Nat. Genet.Nat.

115 121 PLoSGenet. , 3015–3038 (2007). 3015–3038 , , e46–e215 (2010). e46–e215 ,

39

, 311–318(2007).,

3 , e136(2007)., Nature

451 , 27. 26. 25. 24. 23. 22. 21. 20. 19. 18. 17. 16. 15. 28. 29.

Kothary, R. Kothary, S.R. Coppen, ventricles the of structure and development The R.H. Anderson, & D.J. Henderson, Narlikar,L. A. Siepel, A.S. Dimas, M. Ashburner, C.Y. McLean, A.P. Capaldi, The Y. Nakatani, & B.H. Howard, V., Russanova, R.L., Schiltz, V.V., Ogryzko, R. Eckner, p300 E1A-associated R. Eckner, & D.M. Livingston, W.R., Sellers, Z., Arany, M.J. Blow, Pennacchio, L.A. Pennacchio, lk, .. Bl, .. Epg JT, ai, .. Rcado, .. h Mouse The J.E. Richardson, & J.A. Kadin, J.T., Eppig, C.J., Bult, J.A., Blake, expressed in neural tube. neural in expressed heart. foetal human (2003). and heart mouse heart. human the in 381–392(2010). genomes. yeast manner. type–dependent Consortium. Ontology regions. Hog1. MAPK the by a (1996). 953–959 of properties acetyltransferases. histone are CBP and p300 coactivators with transcriptional protein a reveals (p300) adaptor. transcriptional protein 300-kD associated (1994). 799–800 coactivators. of family conserved a to belong CBP CREB-associated and Genet. Nat. sequences. (2009). genotypes:phenotypes. Database Genome Nat. Biotechnol. Nat. et al. et etal. et al. et

Nature t al. et et al. et 42 et al. et t al. et et al. et t al. et t al. et , 806–810 (2010). 806–810 , Evolutionarily conserved elements in vertebrate, insect, worm, and worm, insect, vertebrate, in elements conserved Evolutionarily Genome-wide discovery of human heart enhancers. Genome Res. Genome Molecular cloning and functional analysis of the adenovirus E1A- adenovirus the of analysis functional and cloning Molecular A transgene containing transgene A ChIP-Seq identification of weakly conserved heart enhancers. heart conserved weakly of identification ChIP-Seq et al. et Common regulatory variation impacts gene expression in a cell a in expression gene impacts variation regulatory Common

Comparison of connexin expression patterns in the developing the in patterns expression connexin of Comparison tutr ad ucin f tasrpinl ewr activated network transcriptional a of function and Structure 444 RA ipoe fntoa itrrtto of interpretation functional improves GREAT : tool for the unification of biology. The Gene biology. of unification the for tool ontology: Gene Nat. Genet. Nat. Pediatr. Cardiol. Pediatr. Nat. Genet. Nat.

, 499–502 (2006). 499–502 , Genes Dev. Genes In vivo In

Science 28 Nature , 495–501 (2010). 495–501 ,

15 enhancer analysis of human conserved non-coding conserved human of analysis enhancer

, 1034–1050 (2005). 1034–1050 ,

40

335 325 25

8 , 1300–1306 (2008). 1300–1306 ,

, 869–884 (1994). 869–884 , , 25–29 (2000). 25–29 , , 435–437 (1988). 435–437 , 30 , 1246–1250 (2009). 1246–1250 , , 588–596 (2009). 588–596 , lacZ o. el Biochem. Cell. Mol. uli Ais Res. Acids Nucleic inserted into the into inserted s r e t t e l

37 dystonia GenomeRes. 242 D712–D719 , cis 121–127 , -regulatory is locus Cell Cell

87 77 20  , , ,

© 2011 Nature America, Inc. All rights reserved. tissues tissues (average expression across ten non-heart tissues; see URLs). Genes that non-heart human and CardioGenomics) samples; heart human ischemic 32 fetal normal weeks five at ~20 samples heart across expression average GSE1789, fetal set human data from (GEO data heart expression gene available publicly using identified were genes expressed Differentially in heart. the or suppressed overexpressed genes near regions random matched of frequency the with enhancers heart human of candidate by the frequency comparing was determined in the heart genes.system–related cardiovascular near enhancers candidate of Enrichment project ENCODE the of part as generated were that lines cell human nine of each from samples DNA input from reads sampled randomly million 1 combining by generated was sample generic DNA input a human comparison, For scripts. custom using performed was enhancers heart adult candidate of vicinity the in samples ChIP-Seq heart fetal human of from reads from sample. Enrichment a human heart ChIP-Seq fetal second enhancers. cardiac distant-acting candidate represent peaks remaining The peaks. as defined were TSS nearest the TSSs of known genes in the UCSC database artifacts. likely as analyses further from removed were repeats in originated reads tributing con of the >50% where or regions duplications segmental regions, telomeric regions, centromeric contigs, chromosomal unassembled to mapped Bound regions Grizzly. by called as occupancy, maximal of basis the on scored sorted were and Peaks regions. single into merged were kb 1 than less or peaks of of two the either gap length the of than less a with separating peaks Grizzly URLs) (see algorithm fitting Peak Grizzly modified a and 100) = size shift settings except: bw = 300; by of from MACS identified the peaks intersection resolution. at 25-bp calculated was reads extended of genome-wide coverage Cumulative length. fragment DNA sequenced estimated the for account to bp 300 to extended were alignments and excluded, were reads (BWA) using Aligner mm9) Burrows-Wheeler 37, the build NCBI and hg18 36, build (NCBI genomes reference data. ChIP-Seq of Processing Abcam). ab817, state Signaling Technology) that recognizes these proteins in their active, acetylated 40 using immunoprecipitated and (Diagenode) Bioruptor a using sheared was Chromatin homogenizer. Polytron a using homogenized were tissues frozen and homogenizer, dounce glass a in dissociated were samples fresh dehyde, previously Committee. Welfare Research and Animal Laboratory National Berkeley Lawrence the by approved and reviewed protocols with accordance National Laboratory. in work at All animal Berkeley was performed Lawrence tissue samples were reviewed and approved by human the Human involving Subjects study Committee this of procedures All mice. CD-1 15 from approximately pooled and isolated was tissue heart mouse P2 consent. informed full and with laws and federal state applicable with compliance in Inc. ABR, from 16 week was obtained from gestational tissue human −80 °C. Fresh heart fetal explanted cleaned rapidly of all epicardial fat, flash the frozen in liquid nitrogen and stored and at were Samples solution. cardiectomy, physiologic ice-cold in immediately before placed was heart antegrade perfused was solution cardioplegic Cold surgery. before recipient transplant the from obtained was the with UCSF at approval of the UCSF Committee for transplant Human Full Research. of informed consent time the at removed heart a from obtained was 31%) fraction ejection years; 45 age male; failing; (ischemic, tissue heart preparation. and collection tissue heart mouse and Human O Nature Ge Nature are differentially expressed in the heart were identified by comparison with the NLIN To assess the reproducibility of p300/CBP ChIP-Seq data, we performed performed we data, ChIP-Seq p300/CBP of reproducibility the Toassess prox their to according classified were regions p300/CBP-bound identified were regions p300/CBP-bound of sets data High-confidence Tissue samples for were processed ChIP and DNA as described sequencing µ 30 l of anti–acetyl-CBP/p300 antibody (rabbit polyclonal no. 4771, Cell Cell 4771, no. polyclonal (rabbit antibody anti–acetyl-CBP/p300 of l Enrichment of candidate human cardiac enhancers near genes expressed , 3 1 or 5 5 or E E 13 M , n 1 5 E etics µ , with a few modifications. After cross-linking in 1% formal 1% in cross-linking After modifications. few a with , T g of anti–RNA polymerase II antibody (mouse monoclonal monoclonal (mouse antibody II polymerase anti–RNA of g HO D S P = 1 × 10 3 6 ), human adult heart (average expression across across expression (average heart adult human ), Sequence reads (36 bp) were aligned to the the to aligned were bp) (36 reads Sequence −5 ; ; mfold = 10,30; local = 20,000; off-auto– 3 2 . Repetitively mapped and duplicate duplicate and mapped Repetitively . 3 3 . . Regions within 2.5 kb of the 1 1 34 (version 1.4, with default , 3 5 . Adult human human Adult i mity to to mity 1 9 - - .

TMEM43 FKRP TGFB3 MYBPC3 AGT disease: heart in involved be ChIP-Seq). p300/CBP for as method same the using called were peaks II II polymerase (RNA TSS) nearest polymerase from away kb RNA (<2.5 peaks near proximal regions random matching of frequency the with of by human enhancers candidate heart the frequency comparing determined was peaks proximal II RNA polymerase near enhancers p300/CBP candidate (GREAT) tool annotations of enrichment regions genomic the using performed was enhancers heart fetal human p300/CBP candidate iterations. 1,000 over regions domized of ran frequency average the with by comparison determined was enhancers heart of candidate Enrichment above. described procedure failed peak-filtering the that regions excluding chromosome, same the on location random a to enhancer candidate each moving by generated were sets data randomized in matched the heart, expressed near genes differentially enhancers candidate of frequency expected Tothe respectively. determine heart, versus non-heart and non-heart versus heart in expression of ratio the by ranked genes 1,000 top the as identified were heart the in suppressed and overexpressed Genes less than 100 in both heart and non-heart were excluded from analysis. further non-heart tissue data set. Genes with MAS5.0 normalized expression values of Grizzly Peak fitting algorithm but were subsequently removed from the final final the from removed subsequently were but algorithm fitting Peak Grizzly the by called initially were that hs1959) and hs1948 (hs1913, elements three score. The 65 peak high-scoring Grizzly the data set of 5,047 candidate fetal human heart enhancers on the basis of their Transgenic mouse enhancer assay. described as performed was tissues heart mouse and human from peaks p300/CBP acetylated 500 top the of analysis analysis. site binding factor Transcription human genome. mouse the in 4, locus homologous unique class no with peaks and binding; p300/CBP no with regions mouse to mapped peaks human 3, class reads); extended overlapping 11 of threshold enrichment, coverage (subsignificant levels background genome-wide exceeded but peaks p300/CBP as called not were mouse the in regions homologous whose the to mapped genome mouse peaks human 1, class classes: four of one in enhancers date vivo in tool liftOver the using to (mm9) enhancers genome mouse candidate the human the aligned we enhancers, heart candidate conservation. and alignments Human-mouse maximum peak the around centered whole interval ( the genomic either 1-kb the overlapping or element peak phastCons highest-scoring the of score the assigned were enhancers Candidate sequences. human genomic to mouse or sequences genome vertebrate of alignments multiple from tified (phastCons elements conserved with comparison by evaluated was Sequence constraint analyses. enhancers. heart individuals 75 of types cell three for SNPs (eQTL) loci trait quantitative expression 1,637 and human gene was in SNPscells expression retrieved tional primary affecting Browser Genome UCSC the of function table the using server Galaxy the using genes disease–associated heart of upstream and downstream kb 50 regions flanking the with intersected was enhancers heart human candidate FKTN SGCD Supplementary Table 5 Supplementary The The GeneTests URLs) (see was database used to 73 identify genes known to of proximity the in annotation gene of analysis enrichment Unsupervised The list of SNPs and their locations was retrieved from dbSNP build 130 130 build dbSNP from retrieved was locations their and SNPs of list The , ACTN2 , , , , , , occupancy of p300/CBP in the mouse and placed the human candi human the placed and mouse the in p300/CBP of occupancy , , NKX2-5 TNNI3 TRIM32 ACTC1 , , , , MYL2 AGTR1 , 37 RYR2 , , , , , , , 3 , , ≤ CFC1 , , PLN CAPN3 8 NOTCH1 TBX5 2 2 kb from a mouse p300/CBP peak; class 2, human peaks peaks human 2, class peak; p300/CBP mouse a from kb 2 . , , 2 were intersected with the 5,047 candidate fetal human human fetal candidate 5,047 the with intersected were DNAJC19 , SLC29A3 , , , , GJA1 , , TTN ABCC9 , , ). TPM1 , , , , LAMP2 , , EYA4 CASQ2 , , , DES Evolutionary constraint of candidate enhancers SCN5A VCL , , , , PKP2 TCAP , , , , We selected regions for DSP , DYSF , , LDB3 , , FLNA GBA , , , , MYL3 , , TMPO , , in in vivo JUP PRKAG2 13 , , , , , , ANKRD1 1 JAG1 , , LMNA binding site binding factor Transcription , , 5 TAZ SGCA . , , , , TNNC1 –tested sequences also include also sequences –tested SGCG To identify human-specific human-specific identify To , , and and , , KCNE2 , , GATA4 , , 4 TNNT2 0 ACE , , , . We then analyzed the the analyzed then We . CSRP3 MYH6 , , DMD CAV3 33 , , doi:10.1038/ng.1006 , 3 , , DSC2 in vivo , , 9 CHD7 2 . The list of func of list The . KCNE1 , , 0 , . The data set of of set data The . PSEN2 , , . Enrichment of Enrichment . , , KCNQ1 MYH7 SGCB , , testing from DSG2 , , TMEM70 , , , , 2 , , , , PPARG ACTA1 , 3 MYOT PSEN1 ANO5 ) iden ) , , TTR 2 2 - - - - , , , , , , , ,

© 2011 Nature America, Inc. All rights reserved. 31. 30. methods. standard using sectioned and X-Gal with stained were Hearts age. of weeks 4 at sacrificed were mice transgenic F0 that except above, out were as assays carried detailed enhancer mouse The transgenic set. data at score and adult in human enhancer E11.5 the the candidate heart peak staining cardiac LacZ of reproducibility the on based selected were elements methods. standard using sectioned and paraffin in reproducible considered ing from independent transgenic integration events of the same construct were described in vector as described from human genomic DNA and cloned into the p300/CBP peaks ( the DNA kb of of ~3.7 genomic human flanking consisting regions candidate Enhancer process. filtering by set an data increased-stringency genome-wide doi:10.1038/ng.1006 The activity of eight selected elements was tested at 4 weeks of age. These These age. of weeks 4 at tested was elements selected eight of activity The Supplementary Table 12 Table Supplementary

Thompson, P.R.Thompson, therapy failure heart Novel K. Hasegawa, & Y.,M. T.,Sunagawa, Fujita, Morimoto, loop. compound, natural a by cardiomyocytes in curcumin. pathway transcriptional targeting Nat. Struct. Mol. Biol. Mol. Struct. Nat. 4 1 . Only patterns observed in result at embryos . different three least observed Only patterns Circ. J. Circ. Supplementary Table 10 27 et al. et

, 74 4 1 . Genomic coordinates of amplified regions are reported , 1059–1066 (2010). 1059–1066 , Regulation of the p300 HAT domain via a novel activation novel a via domain HAT p300 the of Regulation 4 2 . For histological analysis, embryos were . embryos embedded analysis, For histological . Transgenic mouse embryos were generated as as generated were embryos mouse Transgenic .

11 , 308–315 (2004). 308–315 , ) were amplified by PCR (Clontech) Hsp68 - promoter - LacZ reporter - 42. 41. 40. 39. 38. 37. 36. 35. 34. 33. 32.

Visel, A., gene Minovitsky, S., Dubchak, I. & Pennacchio, human L.A. VISTA Enhancer Browser— Scanning E.M. Rubin, & V. Afzal, I., Ovcharenko, M.A., Nobrega, A.S. Hinrichs, S.T.Sherry, for approach comprehensive a Galaxy: J. Taylor, & A. Nekrutenko, J., Goecks, D. Blankenberg, A. Conti, J. Ernst, E. Birney, R.M. Kuhn, Burrows-Wheeler with alignment read short accurate and Fast R. Durbin, & H. Li, (2007). enhancers. human tissue-specific of database a enhancers. long-range for deserts Res. Acids Res. sciences. life the in research computational transparent and reproducible, accessible, supporting 2010). (Wiley, experimentalists. trisomy. 21 (2007). chromosome 268 with fetuses human of heart the in types. cell project. pilot ENCODE the by (2007). genome human the of Res. Acids transform.

29 , 308–311 (2001). 308–311 , et al. et et al. et

Bioinformatics Nature 34 37 t al. et et al. et t al. et Genome Biol. Genome , D590–D598 (2006). D590–D598 , (2009). D755–D761 , Mapping and analysis of chromatin state dynamics in nine human nine in dynamics state chromatin of analysis and Mapping et al. et Altered expression of mitochondrial and extracellular matrix genes matrix extracellular and mitochondrial of expression Altered

dbSNP: the NCBI database of genetic variation. genetic of database NCBI the dbSNP: h US Gnm Bosr aaae udt 2009. update Database: Browser Genome UCSC The Identification and analysis of functional elements in1% elements offunctional analysis and Identification ur Poo. o. Biol. Mol. Protoc. Curr. 473 t al. et The UCSC Genome Browser Database: update 2006. update Database: Browser Genome UCSC The , 43–49 (2011). 43–49 ,

25 Galaxy: a web-based genome analysis tool for for tool analysis genome web-based a Galaxy:

11 , 1754–1760 (2009). 1754–1760 , , R86 (2010). R86 , Science

Chapter 19, Unit 19.10.1–19.10.21 19.10.1–19.10.21 Unit 19, Chapter 302 , 413 (2003). 413 , Nucleic Acids Res. Acids Nucleic Nature Nature Ge Nature M Genomics BMC

447

35 Nucleic Acids Nucleic 799–816 , , D88–D92 , n Nucleic Nucleic etics

8 ,