
Mapping cell type-specific transcriptional enhancers using high affinity, lineage-specific Ep300 bioChIP-seq

Running title: Mapping cell type-specific enhancers

Pingzhu Zhou1*, Fei Gu1*, Lina Zhang2, Brynn N. Akerberg1, Qing Ma1, Kai Li1, Aibin He1,3, Zhiqiang Lin1, Sean M. Stevens1, Bin Zhou4, and William T. Pu1,5
*contributed equally to this study

1Department of Cardiology, Boston Children’s Hospital, 300 Longwood Ave, Boston, MA 02115, USA
2Department of Biochemistry, Institute of Basic Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
3Institute of Molecular Medicine, Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
4State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
5Harvard Stem Cell Institute, Harvard University, 1350 Massachusetts Avenue, Suite 727W, Cambridge, MA 02138, USA

Abstract

Understanding the mechanisms that regulate cell type-specific transcriptional programs requires developing a lexicon of their genomic regulatory elements. We developed a lineage-selective method to map transcriptional enhancers, regulatory genomic regions that activate transcription, in mice. Since most tissue-specific enhancers are bound by the transcriptional co-activator Ep300, we used Cre-directed, lineage-specific Ep300 biotinylation and pulldown on immobilized streptavidin followed by next generation sequencing of co-precipitated DNA to identify lineage-specific enhancers. By driving this system with lineage-specific Cre transgenes, we mapped enhancers active in embryonic endothelial cells/blood or skeletal muscle. Analysis of these enhancers identified new heterodimer motifs that likely regulate transcription in these lineages. Furthermore, we identified candidate enhancers that regulate adult heart- or lung-specific endothelial cell specialization. Our strategy for tissue-specific biotinylation opens new avenues for studying lineage-specific protein-DNA and protein-protein interactions.

Keywords: transcriptional enhancer; endothelial cell; angiogenesis; Ep300

Introduction

The diverse cell types of a multicellular organism share the same genome but express distinct expression programs. In mammals, precise cell type-specific regulation of gene expression depends on transcriptional enhancers, non-coding regions of the genome required to activate expression of their target genes (Visel et al., 2009a; Bulger and Groudine, 2011). Enhancers are bound by transcription factors and transcriptional co-activators, which then contact RNA polymerase II engaged at the promoter, stimulating gene transcription. Enhancers are nodal points of transcriptional networks, integrating multiple upstream signals to regulate gene expression. Because enhancers do not have defined sequence or location with respect to their target genes, mapping enhancers is a major bottleneck for delineating transcriptional networks. Recently, chromatin immunoprecipitation of enhancer features followed by sequencing (ChIP-seq) has been used to map potential enhancers. DNase hypersensitivity (Crawford et al., 2006; Thurman et al., 2012), H3K27ac (histone H3 acetylated on lysine 27) occupancy (Creyghton et al., 2010; Nord et al., 2013), or H3K4me1 (histone H3 mono-methylated on lysine 4) occupancy (Heintzman et al., 2007) are chromatin features that have been used to identify cell type-specific enhancers. While most enhancers are DNase hypersensitive, DNase hypersensitive regions are often not active enhancers (Crawford et al., 2006; Thurman et al., 2012). H3K27ac is enriched on cell type-specific enhancers (Creyghton et al., 2010; Nord et al., 2013), but may be a less accurate predictor of enhancers than other transcriptional regulators (Dogan et al., 2015). Chromatin occupancy of Ep300, a transcriptional co-activator that catalyzes H3K27ac deposition, has been found to accurately predict active enhancers (Visel et al., 2009b). However, antibodies for Ep300 are marginal for robust ChIP-seq, particularly from tissues, leading to low reproducibility, variation between antibody lots, and inefficient enhancer identification (Gasper et al., 2014).

Mammalian tissues are composed of multiple cell types, each with its own lineage-specific transcriptional enhancers. Thus, defining lineage-specific enhancers from mammalian tissues requires strategies that overcome the cellular heterogeneity of these tissues, particularly when the lineage of interest comprises a small fraction of the cells in the tissue. Past efforts to surmount this challenge have taken the strategy of purifying nuclei from the cell type of interest using a lineage-specific tag. For instance, nuclei labeled by lineage-specific expression of a fluorescent protein have been purified by FACS (Bonn et al., 2012). This method is limited by the need to dissociate tissues and recover intact nuclei, and by the relatively slow rate of FACS and the need to collect millions of labeled nuclei. To circumvent the FACS bottleneck, cell type-specific overexpression of tagged SUN1, a nuclear envelope protein, has been used to permit affinity purification of nuclei (Deal and Henikoff, 2010; Mo et al., 2015). Although this mouse line was reported to be normal, SUN1 overexpression potentially could affect cell phenotype and gene regulation (Chen et al., 2012). Chromatin from isolated nuclei is then subjected to ChIP-seq to identify histone signatures of enhancer activity. However, as noted above, histone signatures may less accurately predict enhancer activity compared to occupancy by key transcriptional regulators (Dogan et al., 2015).

Here, we report an approach to identify murine enhancers active in a specific lineage within a tissue. We developed a knock-in allele of Ep300 in which the protein is labeled by the bio peptide sequence (de Boer et al., 2003; He et al., 2011). Cre recombinase-directed, cell type-specific expression of BirA, an E. coli enzyme that biotinylates the bio epitope tag (de Boer et al., 2003), allows selective Ep300 ChIP-seq, thereby identifying enhancers active in the cell type of interest. Using this strategy, we identified thousands of endothelial cell (EC) and skeletal muscle lineage enhancers active during embryonic development. Extending the approach to adult organs, we defined adult EC enhancers, including enhancers associated with distinct EC gene expression programs in heart compared to lung. Analysis of motifs enriched in EC or skeletal muscle lineage enhancers predicted novel transcription factor motif signatures that govern EC gene expression.

Results

Efficient identification of enhancers using Ep300fb bioChIP-seq

We developed an epitope-tagged Ep300 allele, Ep300fb, in which FLAG and bio epitopes (de Boer et al., 2003; He et al., 2011) were knocked into the C-terminus of endogenous Ep300 (Fig. 1A-B and Figure 1 -- figure supplement 1A). Transgenically expressed BirA (Driegen et al., 2005) biotinylates the bio epitope, permitting quantitative Ep300 pull down on streptavidin beads (Fig. 1C). We have not noted abnormal phenotypes. Heart development and function are sensitive to Ep300 gene dosage (Shikama et al., 2003; Wei et al., 2008), yet Ep300fb/fb homozygous mice survived normally (Fig. 1D) and Ep300fb/fb hearts expressed normal levels of Ep300 and had normal size and function (Figure 1 -- figure supplement 1B-E). These data indicate that Ep300fb is not overtly hypomorphic.

To evaluate Ep300fb-based mapping of Ep300 chromatin occupancy, we isolated embryonic stem cells (ESCs) from Ep300fb/fb; Rosa26BirA/BirA mice. We then performed Ep300fb biotin-mediated chromatin precipitation followed by sequencing (bioChIP-seq), in which high affinity biotin-streptavidin interaction is used to pull down Ep300 and its associated chromatin (He et al., 2011). Biological duplicate sample signals and peak calls correlated well (93.6% overlap; Spearman r=0.96; Fig. 2A-B). We compared the results to publicly available Ep300 antibody ChIP-seq data generated by ENCODE (overlap between duplicate peaks 77.8%; r=0.91; Fig. 2A-B). Ep300 bioChIP-seq identified 48963 Ep300-bound regions ("p300 regions") shared by both replicates, compared to 15281 for Ep300 antibody ChIP-seq (Fig. 2A,C). The large majority (89.6%) of Ep300 regions detected by antibody were also found by Ep300 bioChIP-seq, and Ep300 signal was substantially stronger using bioChIP-seq (Fig. 2A,C,D). These data indicate that Ep300fb bioChIP-seq has improved sensitivity compared to Ep300 antibody ChIP-seq for mapping Ep300 chromatin occupancy in cultured cells.
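As an illustration of how a replicate overlap percentage of this kind can be computed, the sketch below counts the fraction of peaks in one set that overlap at least one peak in another set. The overlap criterion (at least 1 bp, half-open coordinates), the function names, and the toy coordinates are assumptions made for this example; they are not taken from the analysis pipeline used in this study.

```python
# Minimal sketch: fraction of peaks in `query` that overlap any peak in `reference`.
# Peaks are (chromosome, start, end) tuples with half-open coordinates [start, end).
# The >= 1 bp overlap rule is an assumption for illustration only.

def overlaps(a, b):
    """Return True if two intervals on the same chromosome share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def overlap_fraction(query, reference):
    """Fraction of query intervals overlapping at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)

# Hypothetical toy peak sets for two replicates.
rep1 = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 120)]
rep2 = [("chr1", 150, 260), ("chr2", 300, 400)]

print(overlap_fraction(rep1, rep2))
print(overlap_fraction(rep2, rep1))
```

Note that the fraction is not symmetric: it depends on which set is treated as the query, which is why overlap percentages are reported relative to a named peak set. For genome-scale data, a sorted or indexed interval structure would replace the quadratic scan used here.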

Identification of tissue-specific enhancers using Ep300fb bioChIP-seq

We used Ep300fb/+; Rosa26BirA/+ mice to analyze Ep300fb genome-wide occupancy in embryonic heart and forebrain. We performed bioChIP-seq on heart and forebrain from embryonic day 12.5 (E12.5) embryos in biological duplicate (Fig. 3A-B). There was high reproducibility (83% and 93%, respectively) between biological duplicates (Fig. 3B and Figure 3 -- figure supplement 1A). In comparison, published Ep300 antibody ChIP-seq from E11.5 heart and forebrain (Visel et al., 2009b) had lower signal-to-noise and yielded few peaks when analyzed using the same peak detection algorithm (MACS2 (Zhang et al., 2008)). Using the originally published peaks, antibody-based Ep300 ChIP-seq yielded 9.5-fold or 3.0-fold fewer Ep300 regions in heart and forebrain, respectively (Fig. 3B). These regions overlapped 58.7% and 64.7% of the Ep300fb bioChIP-seq regions, suggesting that the epitope-tagged allele has superior sensitivity and specificity for mapping Ep300-bound regions in tissues, as it does in cultured cells.

We compared Ep300fb regions from forebrain and heart (Suppl. File 1). Only a minority of Ep300fb regions (8.9% for heart and 31.3% for brain) were common between tissues (Fig. 3A). Viewing Ep300fb bioChIP-seq signal at genes selectively expressed in heart or brain confirmed robust tissue-specific differences that overlapped enhancers with known tissue-specific activity (Figure 3 -- figure supplement 1B). Genes neighboring the Ep300fb-occupied regions specific to heart or forebrain were enriched for gene ontology (GO) functional terms relevant to the respective tissue (Fig. 3C). These results reinforce the conclusion that Ep300 occupies tissue-specific enhancers and indicate that this conclusion was not a consequence of insensitive detection of Ep300-occupied regions in earlier studies (Visel et al., 2009b).

p300 is a histone acetyltransferase, and one of its enzymatic products is histone H3 acetylated on lysine 27 (H3K27ac). We compared the genome-wide signal of Ep300fb and H3K27ac in E12.5 heart and forebrain (Fig. 3 -- figure supplement 1C and data not shown). There was high correlation between biological replicates (r = 0.98). Ep300 was also well correlated with H3K27ac (r = 0.64), independently validating the Ep300fb bioChIP-seq data. The previously published Ep300 antibody ChIP-seq data (Visel et al., 2009b) were less well correlated to H3K27ac (r = 0.37), although the correlation was highly statistically significant (P<0.0001). Interestingly, 26.4% and 52.9% of heart and brain H3K27ac regions were shared between tissues (Fig. 3 -- figure supplement 1D) compared to 8.9% and 31.3% for Ep300fb heart and brain regions, respectively (Fig. 3A), suggesting that Ep300fb occupancy is more tissue-specific.

We analyzed the prediction of active enhancers by our Ep300fb bioChIP-seq data. The VISTA Enhancer database (Visel et al., 2007) contains thousands of genomic regions that have been tested for tissue-specific enhancer activity using an in vivo transient transgenic assay. 185 tested regions had heart activity and 130 (70%) of these overlapped Ep300fb regions that were reproduced in both biological duplicates. In comparison, only 105 (57%) of these regions overlapped the regions previously reported to be bound by Ep300 using antibody ChIP-seq.

Recently, human and mouse Ep300 and H3K27ac ChIP-seq data from fetal and adult heart were combined to yield a “compendium” of heart enhancers, with the strength of ChIP-seq signal used to provide an “enhancer score” ranging from 0 to 1 that correlated with the likelihood of regions covered in the VISTA database to show heart activity (Dickel et al., 2016). We compared our heart Ep300 regions to this compendium. Overall, 9438/72508 (13%) regions in the prenatal compendium overlapped with the Ep300 heart regions (Fig. 3 -- figure supplement 2A). However, the overlap frequency increased markedly for regions with higher enhancer scores (Fig. 3 -- figure supplement 2B). For example, if one considers the 3571 compendium regions with an enhancer score of at least 0.4 (corresponding to a validation rate in the VISTA database of ~25%), 2647 (74.1%) were contained within the heart Ep300 regions, and 63/68 (92.6%) regions with a score of at least 0.8 (validation rate ~43%) overlapped a heart Ep300 region. Thus, heart compendium regions that are more likely to have in vivo heart activity are largely covered by heart Ep300 regions. On the other hand, 10752 (53%) heart Ep300 regions were not covered by the compendium, suggesting that this database is incomplete, potentially as a result of its use of incomplete antibody-based Ep300 ChIP-seq data.

p300 antibody ChIP-seq was one of the criteria used to select some of the test regions in the VISTA Enhancer database; as an independent test free of this potentially confounding effect, we searched the literature for other heart enhancers that were confirmed using the transient transgenic assay. We identified 40 additional heart enhancers. Of these, 24 (60%) intersected the Ep300fb regions common to both replicates. In comparison, only 6/40 (15%) intersected the regions detected previously by Ep300 antibody ChIP-seq. Few heart enhancers were found in the regions unique to Ep300 antibody ChIP-seq (11/185 VISTA and 2/40 non-VISTA), compared to regions unique to Ep300fb bioChIP-seq (36/185 VISTA and 20/40 non-VISTA). We conclude that Ep300fb ChIP-seq predicts heart enhancers with sensitivity that is superior to antibody-mediated Ep300 ChIP-seq.

Other chromatin features have been used to predict transcriptional enhancers. We compared the accuracy of Ep300 bioChIP-seq to other chromatin features for heart enhancer prediction. To map accessible chromatin, we performed ATAC-seq (assay for transposase-accessible chromatin followed by sequencing) on E12.5 cardiomyocytes. E12.5 heart ChIP-seq data for modified histones (H3K27ac, H3K4me1, H3K4me2, H3K4me3, H3K9ac, H3K27me3, H3K9me3, H3K36me3) were obtained from publicly available datasets (see Supplemental Methods). Using a machine learning approach and the VISTA enhancer database as the gold standard, we evaluated the accuracy of each of these chromatin features, compared to Ep300 bioChIP-seq, for predicting heart enhancers (Figure 3 -- figure supplement 3). This analysis showed that Ep300 bioChIP-seq was the single most predictive chromatin feature (area under the receiver operating characteristic curve (AUC) = 0.805). ATAC-seq and H3K27ac also performed well (AUC=0.749 and 0.747, respectively), whereas H3K4me1 was poorly predictive (AUC=0.589). Combining Ep300 bioChIP-seq with ATAC-seq improved predictive accuracy (AUC=0.866), equivalent to the value obtained by performing predictions with all of the chromatin features (AUC=0.862). These analyses indicate that, of the features tested, Ep300 is the best single factor for enhancer prediction.
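To make the AUC values above concrete, the following sketch computes the area under the ROC curve for a single feature using its rank-based definition: the probability that a randomly chosen validated enhancer receives a higher score than a randomly chosen negative region, with ties counted as one half. This only illustrates what the metric measures; the machine learning model used in the study is described in the Supplemental Methods and is not reproduced here, and the scores and labels below are hypothetical.

```python
# Minimal sketch of ROC AUC via pairwise comparison of positives and negatives.
# scores: per-region signal for one chromatin feature (hypothetical values).
# labels: 1 if the region is a validated enhancer, 0 otherwise (hypothetical values).

def roc_auc(scores, labels):
    """AUC = P(score of a positive > score of a negative), counting ties as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative region")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 0, 1, 0, 0]
print(roc_auc(example_scores, example_labels))
```

An AUC of 0.5 corresponds to a feature that ranks enhancers no better than chance, and 1.0 to a feature that ranks every validated enhancer above every negative region.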

Cre-activated, lineage-specific Ep300fb bioChIP-seq

In vivo biotinylation of Ep300fb requires co-expression of the biotinylating enzyme BirA. We reasoned that Ep300fb bioChIP-seq could be targeted to a Cre-labeled lineage by making BirA expression Cre-dependent. Therefore we established Rosa26fsBirA, in which BirA expression is contingent upon Cre excision of a floxed-stop (fs) cassette (Fig. 4A). In preliminary experiments, we showed that Rosa26fsBirA expression of BirA was Cre-dependent (Fig. 4 -- figure supplement 1A), as was Ep300fb biotinylation (Fig. 4B). When activated by Cre driven from Tek regulatory elements (Tg(Tek-cre)1Ywa/J; also known as Tie2Cre), BirA was expressed in endothelial and blood lineages (Fig. 4C), consistent with this Cre transgene's labeling pattern. Thus, Rosa26fsBirA expresses BirA in a Cre-dependent manner.

Next, we compared Ep300fb bioChIP-seq from embryos when driven by Tie2Cre (endothelial and blood lineages) (Kisanuki et al., 2001), Myf5tm3(cre)Sor/J (referred to as Myf5Cre; skeletal muscle lineage) (Tallquist et al., 2000), or Tg(EIIa-cre)C5379Lmgd/J (also known as EIIaCre; ubiquitous) (Williams-Simons and Westphal, 1999). For the Tie2Cre and EIIaCre samples, we used E11.5 embryos, a stage with robust angiogenesis. For Myf5Cre, we used E13.5 embryos, when muscle lineage cells are in a range of stages in the muscle differentiation program, spanning muscle progenitors to differentiated muscle fibers. BioChIP-seq from biological triplicates showed high within-group correlation, and lower between-group correlation, demonstrating the strong effect of different Cre transgenes in directing Ep300fb bioChIP-seq (Fig. 4D). Viewing the bioChIP-seq signals in a genome browser confirmed lineage-selective signal enrichment. For example, Tie2Cre drove high Ep300fb bioChIP-seq signal at a Gata2 intronic enhancer with known activity in endothelial and blood lineages (Fig. 4E, top panel). There was less signal at this region in Myf5Cre and EIIaCre samples. At the skeletal muscle-specific gene Myod, Myf5Cre drove strong Ep300fb bioChIP-seq signal at a known distal enhancer (Goldhamer et al., 1992), as well as a second Ep300-bound region about 12 kb upstream from the transcriptional start site.

To identify lineage-selective regions genome-wide, we filtered for regions with called peaks in which the lineage-specific Cre (Tie2Cre or Myf5Cre) Ep300fb signal was at least 1.5 times the ubiquitous Cre (EIIaCre) Ep300fb signal (Fig. 4 -- figure supplement 1B-C). This led to identification of 2411 regions with enriched signal in Tie2Cre (Ep300-T-fb) and 1292 regions with enriched signal in Myf5Cre (Ep300-M-fb), compared to 17382 regions with Ep300fb occupancy detected with ubiquitous biotinylation (Ep300-E-fb; Suppl. File 1). We analyzed the biological process gene ontology terms enriched for genes neighboring these three sets of Ep300 regions (Fig. 4F). Ep300-T-fb regions were highly enriched for functional terms related to angiogenesis and hematopoiesis, whereas the Ep300-M-fb regions were highly enriched for functional terms related to skeletal muscle. Together, these results indicate that our strategy for Cre-driven, lineage-specific Ep300fb bioChIP-seq successfully identifies regulatory regions that are associated with lineage-relevant biological processes.
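The lineage-selectivity filter described above (a called peak plus lineage-specific signal at least 1.5 times the ubiquitous EIIaCre signal) can be sketched as follows. The record layout, field names, and signal values are hypothetical, and the sketch assumes the two signals have already been normalized to comparable units; how that normalization was done is not specified in this section.

```python
# Minimal sketch of the fold-enrichment filter, under the assumptions stated above.
# Each region is a dict with hypothetical fields:
#   name              - region identifier
#   has_peak          - True if a peak was called in the lineage-specific sample
#   lineage_signal    - normalized Ep300fb signal with the lineage Cre (e.g. Tie2Cre)
#   ubiquitous_signal - normalized Ep300fb signal with the ubiquitous Cre (EIIaCre)

def lineage_enriched(regions, min_ratio=1.5):
    """Return names of regions with a called peak and lineage/ubiquitous >= min_ratio."""
    selected = []
    for region in regions:
        if not region["has_peak"]:
            continue
        # Comparing by multiplication avoids dividing by a zero ubiquitous signal.
        if region["lineage_signal"] >= min_ratio * region["ubiquitous_signal"]:
            selected.append(region["name"])
    return selected

toy_regions = [
    {"name": "r1", "has_peak": True, "lineage_signal": 9.0, "ubiquitous_signal": 4.0},
    {"name": "r2", "has_peak": True, "lineage_signal": 5.0, "ubiquitous_signal": 4.0},
    {"name": "r3", "has_peak": False, "lineage_signal": 8.0, "ubiquitous_signal": 1.0},
    {"name": "r4", "has_peak": True, "lineage_signal": 3.0, "ubiquitous_signal": 2.0},
]
print(lineage_enriched(toy_regions))
```

In this toy input, r3 is excluded despite its high ratio because no peak was called, which mirrors the requirement that both conditions hold.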

Functional validation of enhancer activity of lineage-specific Ep300fb regions.

We next set out to validate the in vivo enhancer activity of the regions with Cre-driven lineage-enriched Ep300fb occupancy. If a substantial fraction of the Ep300-T-fb regions have transcriptional enhancer activity, then genes neighboring these enhancers should be expressed at higher levels in Tie2Cre lineage cells. To test this hypothesis, we used Tie2Cre-activated translating ribosome affinity purification (Heiman et al., 2008; Zhou et al., 2013) (T2-TRAP) to obtain the gene expression profile of Tie2Cre-marked cell lineages. Using this lineage-specific expression profile, we then compared the expression of genes neighboring Ep300-T-fb, Ep300-M-fb, and Ep300-E-fb regions. Ep300-T-fb neighboring genes were more highly expressed compared to Ep300-E-fb neighboring genes (P < 10^-38, Mann-Whitney U-test; Fig. 5A). In contrast, there was no significant difference between Ep300-M-fb and Ep300-E-fb neighboring genes (Fig. 5A). This result held regardless of the maximal distance threshold used to find the gene nearest to an Ep300 region (Fig. 5 -- figure supplement 1A). We also compared the expression of genes with and without an associated Ep300-T-fb region. Ep300-T-fb-associated genes were more highly expressed than non-associated genes (Fig. 5 -- figure supplement 1B). Some Ep300-T-fb-associated genes were not detected within actively translating transcripts (Fig. 5 -- figure supplement 1C). This suggests that in some cases Ep300 enhancer binding is not sufficient to drive gene expression; other contributing factors likely include imprecision of the enhancer-to-gene mapping rule, and regulation at the level of ribosome binding to transcripts. Together, our data are consistent with Ep300-T-fb regions being enriched for enhancers that are active in the Tie2Cre-labeled lineage.

To further evaluate the enhancer activity of Ep300-T-fb regions, we searched the literature and the VISTA Enhancer database (Visel et al., 2007) for genomic regions with endothelial cell activity validated by transient transgenesis (Table 1). Of 40 positive regions identified, 19 (47.5%) overlapped with Ep300-T-fb regions. Next, we used the transient transgenic assay to test the lineage-selective enhancer activity of 20 additional Ep300-T-fb regions. We selected regions that neighbored genes with EC-selective expression (T2-TRAP more than 10-fold enriched over input RNA) and with known or potential relevance to angiogenesis. Of the 20 tested Ep300-T-fb regions, 8 drove reporter gene activity in at least 3 embryos in a vascular pattern, in both whole mounts and histological sections, and 2 additional regions drove reporter gene activity in a vascular pattern in 2 embryos (Fig. 5B-D; Table 2; Fig. 5 -- figure supplements 2-3). In retrospect, two of the positive enhancers (Eng; Mef2c) had been described previously (Table 2) (Pimanda et al., 2006; De Val et al., 2008). In some of these cases, there was also activity in blood cells, consistent with Tie2Cre activity in both blood and endothelial lineages. Additionally, one enhancer of Lmo2 was active in blood cells but not ECs. Thus a substantial fraction (9/20; 45%) of regions identified by lineage-selective, Cre-directed bioChIP-seq have appropriate and reproducible in vivo activity.

Arterial and venous ECs have overlapping but distinct gene expression programs, yet only three artery-specific and no vein-specific transcriptional enhancers have been described (Wythe et al., 2013; Robinson et al., 2014; Becker et al., 2016; Sacilotto et al., 2013). We examined histological sections of the transient transgenic embryos to determine if a subset are selectively active in ECs of the dorsal aorta or cardinal vein. We identified an enhancer, Sema6d-enh, with activity predominantly in ECs that line the dorsal aorta but not the cardinal vein (Fig. 5E, Table 2). A second enhancer, Sox7-enh, also showed selective activity in the dorsal aorta, although this was only reproduced in two embryos. Both were also active in the endocardium and endocardial derivatives of the cardiac outflow tract (Fig. 5 -- figure supplement 3). We also identified two enhancers with activity at E11.5 predominantly in ECs that line the cardinal vein and not the dorsal aorta (Ephb4-enh and Mef2c-enh; Fig. 5E, Table 2, and Fig. 5 -- figure supplement 3). Interestingly, a core 44-bp region of Mef2c-enh had been previously reported to drive pan-EC reporter expression at E8.5-E9.5 (De Val et al., 2008). This enhancer’s activity pattern may be dynamically regulated at different developmental stages, as has been described previously for two artery-specific enhancers (Robinson et al., 2014; Becker et al., 2016). Further analysis of these enhancers will be required to confirm the artery- and vein-selective activity patterns that we observed, and to better characterize their temporospatial regulation.

Collectively, our data show that we have developed and validated a robust method for identification of transcriptional enhancers in Cre-marked lineages. Using this strategy, we discovered thousands of candidate skeletal muscle, EC, and blood cell enhancers. Based on our validation studies, we expect that a majority of these candidate regions have in vivo, cell type-specific transcriptional enhancer activity.

Transcription factor binding motifs enriched in skeletal muscle and EC/blood enhancers.

p300 does not bind directly to DNA. Rather, transcription factors recognize sequence motifs in DNA and subsequently recruit Ep300. The transcription factors and transcription factor combinations that direct enhancer activity in skeletal muscle, blood, and endothelial lineages are incompletely described. To gain more insights into this question, we searched for transcription factor binding motifs that were over-represented in the candidate enhancer regions bound by Ep300 in skeletal muscle or blood/EC lineages. Starting from 1445 motifs for transcription factors or transcription factor heterodimers, we found 173 motifs over-represented in Ep300-T-fb or Ep300-M-fb regions (false discovery rate < 0.01% and frequency in Ep300 regions greater than 5%). Clustering and selection of representative non-redundant motifs left 40 motifs that were enriched in either Ep300-T-fb or Ep300-M-fb, or both (Fig. 6A). Many closely related motifs were independently detected by de novo motif discovery (Fig. 6 -- figure supplement 1A). Analysis of our T2-TRAP RNA-seq data identified genes expressed in embryonic ECs that potentially bind to these motifs (Suppl. File 2).

The GATA and ETS motifs were the most highly enriched motifs in the Tie2Cre-marked blood and EC lineages (Fig. 6A). Interestingly, GO analysis of the subset of regions positive for these motifs showed that GATA-containing regions are highly enriched for hematopoiesis and heme synthesis, consistent with the critical roles of GATA1/2/3 in these processes (Bresnick et al., 2012). On the other hand, ETS-containing regions were highly enriched for functional terms linked to angiogenesis, also consistent with the key roles of ETS factors in angiogenesis (Wei et al., 2009) and our prior finding that the ETS motif is enriched in dynamic VEGF-dependent EC enhancers (Zhang et al., 2013). The Ebox motif, recognized by bHLH proteins, was the most highly enriched motif in the skeletal muscle lineage, consistent with the important roles of bHLH factors such as Myod and Myf5 in skeletal muscle development (Buckingham and Rigby, 2014). Interestingly, the Ebox motif was also highly enriched in Tie2Cre-marked cells, and genes neighboring these Ebox-containing regions were functionally related to both angiogenesis and heme synthesis. bHLH-encoding genes such as Hey1/2, Scl, and are known to be important in blood and vascular development.

The database used for our motif search included 315 heterodimer motifs that were recently discovered through high throughput sequencing of DNA concurrently bound by two different transcription factors (Jolma et al., 2015). This allowed us to probe for enrichment of heterodimer motifs that may contribute to enhancer activity in blood, EC, and skeletal muscle lineages. One heterodimer motif that was highly enriched in Ep300-T-fb regions (and not Ep300-M-fb regions; referred to as Tie2Cre-enriched motifs) was ETS:FOX2 (AAACAGGAA), composed of a tail-to-tail fusion of Fox (TGTTT) and ETS (GGAA) binding sites (Fig. 6A). GO analysis showed that this motif was closely linked to vascular biological process terms (Fig. 6A). This motif was previously found to be sufficient to drive enhancer EC activity during vasculogenesis and developmental angiogenesis (De Val et al., 2008), validating that our approach is able to identify bona fide, functional heterodimer motifs. Interestingly, we discovered two additional ETS-FOX heterodimer motifs, which were also highly enriched in Ep300-T-fb regions and also linked to vascular biological process terms: ETS:FOX1 (GGATGTT), consisting of a head-to-tail fusion between ETS and FOX motifs, with the ETS motif located 5' to the FOX motif (Fig. 6A, arrows over motif logo), and ETS:FOX3 (TGTTTACGGAA), a head-to-tail fusion with the FOX motif located 5' to the ETS motif. Other heterodimer motifs that were enriched in Ep300-T-fb regions and, to our knowledge, were previously unrecognized as regulatory elements in ECs were ETS:TBox, ETS:HOMEO, and ETS:Ebox. Similar analysis of Ep300-M-fb regions identified highly enriched Ebox-containing heterodimer motifs including Ebox:Hox, Ebox:HOMEO, and ETS:Ebox (“Myf5Cre-enriched motifs”).
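The orientation of the ETS:FOX composite motifs can be checked directly from the core sequences quoted above. The sketch below shows that the ETS:FOX2 sequence is the reverse complement of the FOX core (TGTTT) followed by the ETS core (GGAA), and that ETS:FOX3 is the FOX core followed by a 2-bp spacer and the ETS core. It uses exact string matching on the consensus sequences given in the text, which is a simplification: the motif search in this study used position weight matrices. The helper names and the test sequence are hypothetical.

```python
# Minimal sketch: composing and locating the ETS:FOX consensus motifs by exact match.
# Consensus strings are taken from the text; everything else is illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

FOX_CORE = "TGTTT"
ETS_CORE = "GGAA"

# ETS:FOX2: FOX site on the opposite strand, immediately followed by the ETS site.
ets_fox2 = reverse_complement(FOX_CORE) + ETS_CORE
# ETS:FOX3: FOX site, a 2-bp spacer (AC in the quoted consensus), then the ETS site.
ets_fox3 = FOX_CORE + "AC" + ETS_CORE

def find_sites(sequence, motif):
    """Return sorted (0-based start, strand) pairs for exact motif matches on either strand."""
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(pattern, start + 1)
    return sorted(hits)

print(ets_fox2)
print(ets_fox3)
print(find_sites("CCAAACAGGAATT", ets_fox2))
```

Scanning both strands matters because a composite site is equally functional when it lies on the minus strand of the reference sequence.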

To assess functional significance of these motifs, we examined their evolutionary conservation. Using PhastCons genome conservation scores for 30 vertebrate species (Siepel et al., 2005), we measured the conservation of sequences matching Tie2Cre-enriched motifs within the central 200 bp of Ep300-T-fb regions. Whereas randomly selected 12 bp sequences from these regions exhibited a distribution of scores heavily weighted towards low conservation values, sequences matching Tie2Cre-enriched motifs showed a bimodal distribution consisting of highly conserved and poorly conserved sequences (Fig. 6B). The conservation of individual heterodimer motifs such as ETS:FOX1-3 confirmed that they shared this bimodal distribution that included deeply conserved sequences (Fig. 6 -- figure supplement 1B). These findings indicate that a subset of sequences matching Tie2Cre-enriched motifs, including the novel heterodimer motifs, are under selective pressure. Myf5Cre-enriched motifs within the center of Ep300-M-fb regions similarly adopted a bimodal distribution that includes a subset of motif occurrences with high conservation (Fig. 6B and Fig. 6 -- figure supplement 1B). This analysis supports the biological function of Tie2Cre- and Myf5Cre-enriched motifs in endothelial cell/blood or skeletal muscle enhancers, respectively.

To further functionally validate the transcriptional activity of these heterodimer motifs, we measured their enhancer activity using luciferase reporter assays. Three repeats of enhancer fragments containing motifs of interest were cloned upstream of a minimal promoter and luciferase. The constructs were transfected into human umbilical vein endothelial cells (HUVECs), and transcriptional activity was measured by luciferase assay, normalized to the enhancerless promoter-luciferase construct (Fig. 6C). The well-studied ETS:FOX2 motif from an endothelial Mef2c enhancer (De Val et al., 2008) robustly stimulated transcription to about the same extent as the SV40 enhancer. As expected, mutation of the ETS:FOX2 motif markedly blunted its activity, supporting the specificity of this assay. Interestingly, the alternative ETS:FOX3 motif uncovered by our study was at least as potent in stimulating transcription as the previously described ETS:FOX2 motif. The other novel heterodimer motifs tested, ETS:HOMEO2 and ETS:Ebox, likewise supported strong transcriptional activity, as did the SOX motif, whose enrichment and associated GO terms were highly EC-selective.

380 To further assess and compare the transcriptional activity of these heterodimer motifs, we

381 cloned 3x repeated motifs upstream of luciferase in a consistent DNA context that had

382 minimal endogenous enhancer activity (Fig. 6D). The ETS motif alone only weakly stimulated

383 luciferase expression, whereas the previously described ETS:FOX2 motif robustly activated

384 luciferase expression. Interestingly, both of the newly identified, alternative ETS:FOX motifs

385 (ETS:FOX1 and ETS:FOX3) were more potent activators than ETS:FOX2. The other

386 heterodimer motifs tested, ETS:HOMEO2, ETS:Ebox, and ETS:Tbox, also demonstrated

387 significant enhancer activity.

388 Together, unbiased discovery of cell type specific enhancers coupled with motif analysis

389 identified novel transcription factor signatures that are likely important for gene expression

390 programs of blood, vasculature, and skeletal muscle.

391 Organ-specific EC enhancers

392 ECs in different adult organs have distinct gene expression programs that underlie

393 organ-specific EC functions (Nolan et al., 2013; Coppiello et al., 2015). For example, heart

394 ECs are adapted for transport of fatty acids essential for fueling oxidative phosphorylation in

395 the heart (Coppiello et al., 2015). On the other hand, lung ECs are adapted for efficient

396 transport of gas, but possess specialized tight junctions to minimize transit of water (Mehta et

397 al., 2014). Nolan et al. recently profiled gene expression in ECs freshly isolated from 9 different

398 adult mouse organs (Nolan et al., 2013). Clustering the 3104 genes with greater than 4-fold

399 difference in expression across this panel identified genes with selective expression in ECs

400 from a subset of organs (Fig. 7 -- figure supplement 1A). 240 genes were preferentially

401 expressed in ECs from heart (and skeletal muscle) compared to other organs, whereas 355

402 genes were preferentially expressed in ECs from lung (Fig. 7 -- figure supplement 1B). One

403 cluster contained genes co-enriched in lung and brain, including many tight junction genes

404 such as claudin 5.
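
The 4-fold selection step can be illustrated with a short Python sketch. It is not the authors' code: the function name, the pseudocount, the max/min definition of fold difference, and the toy expression values are all assumptions made for illustration.

```python
def variable_genes(expr, min_fold=4.0, pseudocount=1.0):
    """Return genes whose max/min expression across organs exceeds min_fold.

    expr: {gene: {organ: expression}} on a linear scale (assumed layout).
    The pseudocount is an assumption that avoids division by zero.
    """
    selected = []
    for gene, by_organ in expr.items():
        hi = max(by_organ.values()) + pseudocount
        lo = min(by_organ.values()) + pseudocount
        if hi / lo > min_fold:
            selected.append(gene)
    return selected


# Hypothetical expression values for two made-up genes.
expr = {
    "GeneA": {"heart": 90.0, "lung": 9.0, "brain": 12.0},
    "GeneB": {"heart": 20.0, "lung": 25.0, "brain": 22.0},
}
print(variable_genes(expr))
```

With these toy values only GeneA passes the filter; genes selected this way would then be handed to a clustering step.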

405 We asked if our strategy of lineage-specific Ep300fb bioChIP-seq would allow us to identify

406 enhancers linked to organ-specific EC gene expression. For these experiments, we used

407 Tg(Cdh5-cre/ERT2)1Rha (also known as VEcad-CreERT2) (Sorensen et al., 2009) to drive

408 Ep300fb biotinylation in ECs (Fig. 7 -- figure supplement 2); unlike Tie2Cre, this transgene does

409 not label hematopoietic cells when induced with tamoxifen in the neonatal period. We isolated

410 adult (8 wk old) heart and lung from Ep300fb/+; VEcad-CreERT2+; Rosa26fsBirA/+ mice and

411 performed bioChIP-seq. Triplicate experiments were highly reproducible (Fig. 7A).

412 Inspection of Ep300fb ChIP-seq signals from lung and heart ECs suggested that

413 VEcad-CreERT2 successfully directed Ep300fb enrichment from ECs. For example, Tbx3 and

414 Meox2, transcription factors selectively expressed in lung and heart ECs, respectively, were

415 associated with Ep300fb-decorated regions in the matching tissues (Fig. 7 -- figure supplement

416 3). Interestingly, Meox2 has been implicated in directing expression of heart EC-specific genes

417 and in the pathogenesis of coronary artery disease (Coppiello et al., 2015; Yang et al., 2015).

418 On the other hand, Tbx2/4 and Myh6, genes expressed in non-ECs in lung and heart, were not

419 associated with regions of Ep300fb enrichment (Fig. 7 -- figure supplement 3).

420 To obtain a broader, unbiased view of the adult EC Ep300fb bioChIP-seq results, regions

421 bound by Ep300fb in either heart (14251) or lung (22174) were rank-ordered by the heart to

422 lung Ep300fb signal ratio (Suppl. File 1). Regions were grouped into deciles by this ratio, with

423 the most heart-enriched regions in decile 1 and the most lung-enriched regions in decile 10.

424 Next, for genes neighboring regions in each decile, we compared expression in heart

425 and lung. Genes neighboring decile 1 regions (greater Ep300fb signal in heart) had

426 higher median mRNA transcript levels in heart than lung (Fig. 7B). In contrast, genes neighboring

427 decile 10 regions (greater Ep300fb signal in lung) had higher median mRNA transcript levels in

428 lung than heart. We then focused on genes with selective expression in heart or lung. More

429 decile 1 Ep300fb regions were associated with genes with heart-selective expression than

430 lung-selective expression, and more decile 10 Ep300fb regions were associated with genes

431 with lung-selective expression (in each case, P<0.0001, Fisher's exact test; Fig. 7C). These

432 results are consistent with greater heart- or lung- associated enhancer activity in Ep300fb

433 decile 1 or 10 regions, respectively.
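
The rank-and-bin step described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the pseudocount added to each signal, and the toy signal values are hypothetical.

```python
def assign_deciles(regions, pseudocount=1.0):
    """Rank regions by heart/lung signal ratio and bin them into deciles.

    regions: list of (name, heart_signal, lung_signal) tuples.
    Returns {name: decile}; decile 1 holds the most heart-enriched regions
    and decile 10 the most lung-enriched regions.
    The pseudocount is an assumption that keeps the ratio finite.
    """
    ranked = sorted(
        regions,
        key=lambda r: (r[1] + pseudocount) / (r[2] + pseudocount),
        reverse=True,
    )
    n = len(ranked)
    return {name: (i * 10) // n + 1 for i, (name, _, _) in enumerate(ranked)}


# Toy data: 20 made-up regions whose heart/lung ratio falls steadily.
toy = [("r%d" % i, float(20 - i), float(i + 1)) for i in range(20)]
deciles = assign_deciles(toy)
print(deciles["r0"], deciles["r19"])
```

Genes neighboring the regions in each decile can then be compared for heart versus lung expression, as done for Fig. 7B.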

434 We analyzed the GO terms that are over-represented for genes neighboring Ep300fb decile

435 1 or 10 regions. Both sets of regions were highly enriched for GO terms related to vasculature

436 and blood vessel development (Fig. 7 -- figure supplement 4). Analysis of disease ontology

437 terms showed that genes neighboring decile 1 regions (greater Ep300fb signal in heart) were

438 significantly associated with terms relevant to coronary artery disease, such as "coronary heart

439 disease", "arteriosclerosis", and "myocardial infarction", whereas genes neighboring decile 10

440 regions (greater Ep300fb signal in lung) were less enriched. Conversely, genes neighboring

441 decile 10 regions were selectively associated with hypertension and cerebral artery disease.

442 To gain insights into transcriptional regulators that preferentially drive heart or lung EC

443 enhancers, we searched for motifs with differential enrichment in decile 1 compared to decile

444 10. Using decile 1 regions as the foreground sequences and decile 10 regions as the

445 background sequences, we detected significant enrichment of the TCF motif in regions

446 preferentially occupied by Ep300 in heart ECs (Fig. 7D), consistent with prior work that found

447 that a Meox2:TCF15 complex promotes heart EC-specific gene expression (Coppiello et al.,

448 2015). Other motifs that were significantly enriched in decile 1 (heart) compared to decile 10

449 regions were FOX, ETS, ETS-FOX and ETS-HOMEO motifs. We performed the reciprocal

450 analysis to identify lung-enriched motifs. One motif enriched in decile 10 (lung) compared to

451 decile 1 regions was the TBOX motif. Interestingly, TBX3 is a lung EC-enriched transcription

452 factor (Nolan et al., 2013). Other motifs over-represented in decile 10 compared to decile 1

453 regions were EBox, ETS:Ebox, and EBOX:HOMEO motifs. Thus differences in transcription

454 factor expression or heterodimer formation in heart and lung ECs may contribute to differences

455 in Ep300 chromatin occupancy and enhancer activity between heart and lung.
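
The foreground-versus-background comparison can be illustrated with a hypergeometric tail probability. Homer reports its own enrichment statistics; the Python sketch below is only a simplified stand-in, and the function name and the toy counts are assumptions.

```python
from math import comb


def hypergeom_enrichment(fg_hits, fg_total, bg_hits, bg_total):
    """Upper-tail hypergeometric P-value for motif enrichment.

    Treats the foreground as fg_total regions drawn without replacement from
    the pooled foreground + background set, and returns the probability of
    seeing at least fg_hits motif-containing regions in that draw.
    """
    pool = fg_total + bg_total
    hits = fg_hits + bg_hits
    upper = min(hits, fg_total)
    tail = sum(
        comb(hits, k) * comb(pool - hits, fg_total - k)
        for k in range(fg_hits, upper + 1)
    )
    return tail / comb(pool, fg_total)


# Hypothetical counts: the motif occurs in 30 of 100 foreground regions
# and in 10 of 100 background regions.
p_value = hypergeom_enrichment(30, 100, 10, 100)
print(p_value)
```

Swapping the foreground and background sets gives the reciprocal analysis.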

456 Discussion

457 Here we show that Cre-mediated, tissue-specific activation of BirA permits high affinity

458 bioChIP-seq of factors with a bio epitope tag. Furthermore, we show that combining this

459 strategy with the Ep300fb knockin allele permits efficient identification of cell type-specific

460 enhancers. The bio epitope has been knocked into a number of different transcription factor

461 loci (He et al., 2012; Waldron et al., 2016; Jackson Labs 025982 (Rbpj), 025980 (Ep300),

462 025983 (Mef2c), 025978 (Nkx2-5), 025979 (Srf), and 025977 (Zfpm2)), and these alleles can

463 be combined with Cre-activated BirA to permit lineage-specific mapping of transcription factor

464 binding sites. Cell type-specific protein biotinylation will also be useful for mapping

465 protein-protein interactions in specific cell lineages.

466 Using this technique, we identified thousands of candidate skeletal muscle and EC

467 enhancers and showed that many of these candidate enhancers are likely to be functional.

468 Furthermore, we showed that the technique can identify tissue-specific enhancers in postnatal

469 tissues, and identified novel candidate enhancers that regulate organ-specific endothelial gene

470 expression. These enhancer regions will be a valuable resource for future studies of

471 transcriptional regulation in these systems.

472 Large scale identification of tissue-specific enhancers will facilitate decoding the

473 mechanisms responsible for cell type-specific gene expression in development and disease.

474 By analyzing these candidate regulatory regions, we revealed novel transcriptional

475 regulatory motifs that likely participate in skeletal muscle development, angiogenesis, and

476 organ-specific EC gene expression. Our recovery of a previously described FOX-ETS

477 heterodimer binding site in EC enhancers (De Val et al., 2008) validates the ability of our

478 approach to detect sequence motifs important for angiogenesis, and suggests that these

479 motifs provide important clues to the transcriptional regulators that interact with them. Our

480 study identified significant enrichment of two new FOX-ETS heterodimer motifs in which the

481 FOX and ETS sites are in different positions and orientations. In their original study, Jolma and

482 colleagues already demonstrated that these alternative FOX-ETS motifs are indeed bound by

483 both FOX and ETS family proteins, implying that these proteins are able to collaboratively bind

484 DNA in diverse configurations, potentially with DNA itself playing an important role in stabilizing

485 the heterodimer (Jolma et al., 2015). Enrichment of these motifs in Ep300-T-fb, and their

486 over-representation neighboring genes related to angiogenesis, suggests that these alternative

487 FOX-ETS configurations are functional. We also recovered FOX-ETS motifs from Ep300

488 regions in adult ECs (and preferentially in heart ECs), suggesting that these motifs continue to

489 be important in maintenance of adult vasculature. However, direct support of these inferences

490 will require further experiments that identify the specific proteins involved and that dissect their

491 functional roles in vivo.

492 Additional novel heterodimer motifs, such as ETS:EBox, ETS:HOMEO, and ETS:Tbox,

493 were enriched in EC enhancers from both developing and adult ECs. This suggests that these

494 motifs, and potentially the protein heterodimers that were reported to bind them (Jolma et al.,

495 2015), are important for vessel growth and maintenance. These potential TF combinations, like

496 the FOX-ETS combination, may act as a transcriptional code to create regulatory specificity at

497 individual enhancers and their associated genes. Experimental validation of these hypotheses

498 will be a fruitful direction for future studies.

499 Limitations

500 A limitation of our current protocol is the need for several million cells to obtain robust

501 ChIP-seq signal. Optimization of chromatin pulldown and purification through application of

502 streamlined protocols or microfluidics (Cao et al., 2015), combined with use of improved library

503 preparation methods that work on smaller quantities of starting material, will likely overcome

504 this limitation. Another limitation of our strategy is that Ep300 decorates many, but not all,

505 active transcriptional enhancers (He et al., 2011), and therefore Ep300 bioChIP-seq will not

506 comprehensively detect all enhancers. Our strategy does not directly permit profiling of other

507 chromatin features in a Cre-targeted lineage. This limitation could be overcome by developing

508 additional proteins labeled with the bio-epitope. For instance, by bio-tagging histone H3,

509 non-tissue restricted ChIP for a feature of interest could be followed by sequential high affinity

510 histone H3 bioChIP. Finally, our technique's lineage specificity is dependent on the properties

511 of the Cre allele used (Ma et al., 2008), and users must be cognizant of the cell labeling

512 pattern of the Cre allele that they choose.

513 Materials and Methods

514 Mice

515 Animal experiments were performed under protocols approved by the Boston Children's

516 Hospital Animal Care and Use Committee (protocols 13-08-2460R and 13-12-2601). Ep300fb

517 mice were generated by gene targeting in embryonic stem cells. Targeted ESCs

518 were used to generate a mouse line, which was bred to homozygosity. This line has been

519 donated to Jackson labs (Jax 025980). The Rosa26fsBirA allele was derived from the previously

520 described Rosa26fsTRAP mouse (Zhou et al., 2013) (Jax 022367) by removal of the frt-TRAP-frt

521 cassette using germline Flp recombination. The Rosa26BirA (Driegen et al., 2005)

522 (constitutive; Jackson Labs 010920), Tie2Cre (Kisanuki et al., 2001) (Jackson Labs 008863),

523 Myf5Cre (Tallquist et al., 2000) (Jackson Labs 007893), EIIaCre (Williams-Simons and

524 Westphal, 1999) (Jackson Labs 003724), Rosa26mTmG (Muzumdar et al., 2007) (Jackson Labs

525 007576) and VEcad-CreERT2 (Sorensen et al., 2009) (Taconic 13073) lines were described

526 previously.

527 Transient transgenics

528 Candidate regions approximately 1 kb in length were PCR amplified using primers listed in

529 Table 3 and cloned into a gateway-hsp68-lacZ construct derived from pWhere (Invivogen) (He

530 et al., 2011). Constructs were linearized by PacI and injected into oocytes by Cyagen, Inc. At

531 least 5 PCR positive embryos were obtained per construct. Regions were scored as positive

532 for EC activity if they displayed an EC staining pattern in whole mount and validated in

533 sections, using previously described criteria (Visel et al., 2009). For scoring purposes, we

534 required that an EC pattern was observed for at least three different embryos, although we

535 also describe results for an additional two regions with activity observed in two different

536 embryos. Embryos were analyzed at E11.5 by whole mount LacZ staining. After whole mount

537 imaging, embryos were embedded in paraffin and sectioned for histological analysis.

538 Histology

539 Embryos were collected in ice-cold PBS and fixed in 4% paraformaldehyde overnight. For

540 immunostaining, cryosections were stained with HA (Cell Signaling #3724, 1:100 dilution) and

541 PECAM1 (BD Pharmingen, 553371, 1:200 dilution) antibodies and imaged by confocal

542 microscopy (Olympus FV1000).

546 Ep300fb/fb;Rosa26BirA/BirA ES cell derivation and bioChIP

547 ESCs were derived as described previously (Bryja et al., 2006). Five week old

548 Ep300fb/fb;Rosa26BirA/BirA female mice were hormonally primed and mated overnight with eight

549 week old Ep300fb/fb;Rosa26BirA/BirA male mice. Uteri were collected at 3.5 dpc and embryos

550 were flushed out under a microscope. After removal of zona pellucida by treating the embryos

551 with Tyrode’s solution (Sigma, T1788), the embryos were cultured in ESC medium (DMEM

552 with high glucose, 15% ES cell-qualified FBS, 1000 U/mL LIF, 100 µM non-essential amino

553 acids, 1 mM sodium pyruvate, 2 mM glutamine, 100 µM β-mercaptoethanol, and

554 penicillin/streptomycin), supplemented with 50 µM PD98059 (Cell Signaling Technology,

555 #9900) for 7-10 days. The outgrowth was dissociated with trypsin, and the cells were cultured

556 in ESC medium in 24 well plates to obtain colonies, which were then clonally expanded. Five

557 male ESC lines were retained for further experiments. Pluripotency of these ESC lines was

558 confirmed by immunostaining for pluripotency markers Oct4, , and SSEA, and they were

559 negative for mycoplasma.

560 The five Ep300fb/fb;Rosa26BirA/BirA ESC lines were cultured in 150 mm dishes to 70-80%

561 confluence. Crosslinking was performed by adding formaldehyde to 1% and incubating at room

562 temperature for 15 min. Chromatin was fragmented using a microtip sonicator (QSONICA

563 Q700). The chromatin from 3 ESC lines was pooled for replicate 1 and from the other 2 ESC

564 lines for replicate 2. Ep300fb and bound chromatin were pulled down by incubation with

565 streptavidin beads (Life Technologies #11206D).

566 Cell culture and luciferase reporter assays

567 Candidate motif sequences were synthesized as oligonucleotides (Table 3) and cloned into

568 plasmid pGL3-promoter (Promega) between MluI and XhoI. HUVECs (Lonza) were

569 cultured to 50-60% confluence in 24-well plates and transfected in triplicate with 1 µg luciferase

570 construct and 0.5 µg pRL-TK internal control plasmid, using 5 µl jetPEI-HUVEC (Polyplus).

571 After 2 days, cells were analyzed using the dual luciferase assay (Promega). Luciferase

572 activity was measured using a 96-well plate-reading luminometer (Victor2, Perkin Elmer). Results

573 are representative of at least two independent experiments.
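
The normalization implied by a dual luciferase assay can be sketched in Python as follows. This is an illustrative calculation, not the authors' analysis script: it assumes that firefly signal is divided by the Renilla (pRL-TK) signal in each well and that the mean ratio is then expressed relative to the enhancerless promoter-luciferase construct. The function names and readings are hypothetical.

```python
from statistics import mean


def mean_ratio(wells):
    """Mean firefly/Renilla ratio over replicate wells of (firefly, renilla)."""
    return mean(firefly / renilla for firefly, renilla in wells)


def relative_activity(sample_wells, control_wells):
    """Sample activity as fold change over the enhancerless control construct."""
    return mean_ratio(sample_wells) / mean_ratio(control_wells)


# Made-up triplicate readings (arbitrary luminescence units).
control = [(1000.0, 500.0), (1100.0, 550.0), (900.0, 450.0)]
enhancer = [(8000.0, 400.0), (9000.0, 450.0), (10000.0, 500.0)]
print(relative_activity(enhancer, control))
```

Dividing by the internal control first corrects for well-to-well differences in transfection efficiency before constructs are compared.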

574 Western blotting

575 Immunoblotting was performed using standard protocols and the following primary

576 antibodies: GAPDH, Fitzgerald, 10R-G109A (1:10,000 dilution); Ep300, Millipore, 05257

577 (1:2,000 dilution); BirA, Abcam, ab14002 (1:1,000 dilution).

578 Tissue collection for ChIP and bioChIP

579 bioChIP-seq was performed as described previously (He et al., 2014; He and Pu, 2010),

580 with minor modifications. We used lower amplitude sonication to avoid fragmentation of Ep300

581 protein.

582 p300 and H3K27ac in E12.5 heart and forebrain. E12.5 embryonic forebrain and ventricle

583 apex tissues were isolated from Swiss Webster (Charles River) females crossed to

584 Ep300fb/fb;Rosa26BirA/BirA males. Cells were dissociated in a 2-mL glass dounce homogenizer

585 (large clearance pestle, Sigma P0485) and then cross-linked in 1% formaldehyde-containing

586 PBS for 15 min at room temperature. Glycine was added to a final concentration of 125 mM to

587 quench formaldehyde. Chromatin isolation was performed as previously described (He and Pu,

588 2010). 30 forebrains or 60 heart apexes were used in each sonication. Conditions were titrated

589 to achieve sufficient fragmentation (mean fragment size 500 bp) while avoiding degradation of

590 Ep300 protein. We used a microtip sonicator (QSONICA Q700) at 30% amplitude and a cycle

591 of 5 s on and 20 s off for 96 cycles in total. Sheared chromatin was precleared by incubation

592 with 100 µl Dynabeads Protein A (Life Technologies, 10002D) for 1 hour at 4 °C. For Ep300

593 bioChIP, 2/3 of the chromatin was then incubated with 250 µl (for forebrains) or 100 µl (for

594 heart apexes) Dynabeads M-280 Streptavidin (Life Technologies, 11206D) for 1 hour at 4 °C.

595 The streptavidin beads were washed and bound DNA eluted as described (He and Pu, 2010).

596 For H3K27ac ChIP, the remaining 1/3 of the chromatin was incubated with 10 µg (forebrain) or 5 µg

597 (heart apex) H3K27ac antibody (ActiveMotif #39133) overnight at 4 °C. Then 50 µl or 25 µl

598 Dynabeads Protein A were added and incubated for 1 hour at 4 °C. The magnetic beads were

599 washed 6 times with RIPA buffer (50 mM HEPES, pH 8.0, 500 mM LiCl, 1% Igepal ca-630,

600 0.7% sodium deoxycholate, and 1 mM EDTA) and washed once in TE buffer. ChIP DNA was

601 eluted at 65 °C in elution buffer (10 mM Tris, pH 8.0, 1% SDS and 1 mM EDTA) and incubated

602 at 65 °C overnight to reverse crosslinks.

603 p300 bioChIP of fetal EC cells. E11.5 embryos were isolated from pregnant Rosa26mTmG

604 females crossed to Ep300fb/fb;Rosa26fsBirA/BirA;Tie2Cre males. Tie2Cre positive embryos were

605 picked under a fluorescence microscope for further experiments. Crosslinking and sonication

606 were performed as described above. We used 15 Tie2Cre positive embryos for each bioChIP

607 replicate. We used 750 µL Dynabeads M-280 Streptavidin beads for Ep300fb pull-down; this

608 amount was determined by empiric titration.

609 p300 bioChIP of fetal Myf5Cre labeled cells. E13.5 embryos were isolated from pregnant

610 Rosa26mTmG females crossed to Ep300fb/fb;Rosa26fsBirA/BirA;Myf5Cre males. Cre positive

611 embryos were picked under a fluorescence microscope for further experiments. 5 embryos

612 were used for each bioChIP replicate and incubated with 750 µl Dynabeads M-280

613 Streptavidin beads for Ep300fb pull-down.

614 p300 bioChIP of adult ECs. Ep300fb/fb;Rosa26fsBirA/mTmG;VEcad-CreERT2 pups were given

615 two consecutive intragastric injections of 50 µl tamoxifen (2 mg/ml in sunflower seed oil) on

616 postnatal days P1 and P2 to induce the activity of Cre. The lungs and heart apexes were

617 collected when the mice were 8 weeks old. Cross-linking and sonication were performed as

618 described above. 6 mice were used for each replicate. We used 600 µl (heart) or 1.5 mL (lung)

619 streptavidin beads for the bioChIP.

620 bioChIP-seq and ChIP-seq

621 Libraries were constructed using a ChIP-seq library preparation kit (KAPA Biosystems

622 KK8500). 50 ng of sonicated chromatin without pull-down was used as input.

623 Sequencing (50 nt single end) was performed on an Illumina HiSeq 2500. Reads were

624 aligned to mm9 using Bowtie2 (Langmead and Salzberg, 2012) with default parameters.

625 Peaks were called with MACS2 (Zhang et al., 2008). Murine blacklist regions were masked out

626 of peak lists. For embryo samples, peak calling was performed against input chromatin

627 background with a false discovery rate of less than 0.01. For adult samples, peak calling was

628 poor using input chromatin background and therefore was performed using the ChIP sample

629 only at a false discovery rate of less than 0.05. Aggregation plots, tag heat maps, and global

630 correlation analyses were performed using deepTools 2.0 (Ramírez et al., 2014). bioChIP-seq

631 signal was visualized in the Integrated Genome Viewer (Thorvaldsdóttir et al., 2013).
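
The blacklist masking step mentioned above amounts to an interval-overlap filter. A minimal Python sketch follows, assuming BED-style half-open coordinates; the function names and the toy intervals are hypothetical, and this is not the code used in the study.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals [start, end) overlap if each begins before the other ends."""
    return a_start < b_end and b_start < a_end


def mask_blacklist(peaks, blacklist):
    """Drop peaks that overlap any blacklist interval on the same chromosome.

    peaks, blacklist: lists of (chrom, start, end) tuples.
    """
    kept = []
    for chrom, start, end in peaks:
        hit = any(
            chrom == b_chrom and overlaps(start, end, b_start, b_end)
            for b_chrom, b_start, b_end in blacklist
        )
        if not hit:
            kept.append((chrom, start, end))
    return kept


# Made-up coordinates for illustration only.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 300)]
print(mask_blacklist(peaks, blacklist))
```

For genome-scale peak lists an interval tree or a sorted sweep would be faster; the quadratic scan is kept here for clarity.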

632 To associate genomic regions to genes, we used Homer’s AnnotatePeaks (Heinz et al.,

633 2010) to select the gene with the closest transcriptional start site.
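
The nearest-TSS rule can be sketched in Python as follows. This is an illustration, not Homer's implementation: it assumes distance is measured from the region center, and the function name, the optional distance cap, and the toy TSS table are hypothetical.

```python
def nearest_gene(chrom, start, end, tss_table, max_distance=None):
    """Return the gene whose TSS lies closest to the center of a region.

    tss_table: list of (gene, chrom, tss_position) tuples.
    max_distance: optional cap in bp; None means no distance limit.
    Returns None if no gene on the chromosome satisfies the cap.
    """
    center = (start + end) // 2
    best = None
    for gene, g_chrom, tss in tss_table:
        if g_chrom != chrom:
            continue
        dist = abs(tss - center)
        if best is None or dist < best[1]:
            best = (gene, dist)
    if best is None:
        return None
    if max_distance is not None and best[1] > max_distance:
        return None
    return best[0]


# Made-up TSS positions for illustration only.
tss_table = [("GeneA", "chr1", 1000), ("GeneB", "chr1", 5000), ("GeneC", "chr2", 1200)]
print(nearest_gene("chr1", 1400, 1600, tss_table))
```

Passing a value for max_distance mimics an analysis in which region-to-gene assignments beyond a chosen threshold are discarded.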

634 ATAC-seq

635 E12.5 heart ventricles were dissociated into single-cell suspensions using the Neonatal

636 Heart Dissociation Kit (Miltenyi Biotec #130-098-373) with minor modifications from

637 manufacturer’s protocol. Embryonic tissue samples were incubated twice with enzyme

638 dissociation mixes at 37°C for 15 minutes with gentle agitation by tube inversions between

639 incubations. Cell mixtures were gently filtered through a 70 μm cell strainer and centrifuged at

640 300 x g, 5 minutes, 4°C. Red blood cells were lysed with 10X Red Blood Cell Lysis Solution

641 (Miltenyi #130-094-183) and myocytes were isolated using the Neonatal Cardiomyocyte

642 Isolation Kit (Miltenyi Biotec #130-100-825). 75,000 isolated cardiomyocytes were used for

643 each ATAC-Seq experiment. Libraries were prepped as previously described (Buenrostro et

644 al., 2015).

645 TRAP

646 Translating ribosome affinity purification (TRAP) and RNA-seq were performed as described

647 (Zhou et al., 2013). The expression of the gene with TSS closest to the Ep300-bound region

648 was used to define "neighboring gene". In Figure 5A, no maximal distance limit was used; in

649 Fig. 5 -- figure supplement 1, a range of maximal distance thresholds was tested.

650 Gene ontology analysis

651 Gene Ontology analysis was performed using GREAT (McLean et al., 2010). Results were

652 ranked by the raw binomial P-value. To determine the fraction of terms relevant to a cell type

653 or process, the top twenty biological process terms were manually inspected.

654 Motif analysis

655 Homer (Heinz et al., 2010) was used for motif scanning and for de novo motif analysis.

656 Regions analyzed were 100 bp regions centered on the summits called by MACS2. The motif

657 database used for motif scanning was the default Homer motif vertebrate database plus the

658 heterodimer motifs described by Jolma et al. (Jolma et al., 2015). To select motifs for display,

659 motifs from samples under consideration with negative ln P-value > 15 in any one sample were

660 clustered using STAMP (Mahony et al., 2007). Non-redundant motifs were then manually

661 selected.
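
Two of the steps above, building 100 bp windows around peak summits and keeping motifs whose negative ln P-value exceeds 15 in any sample, can be sketched in Python. The function names, data layout, and toy P-values are assumptions for illustration; the clustering with STAMP and the manual selection are not reproduced.

```python
import math


def summit_windows(summits, width=100):
    """Return (chrom, start, end) windows of the given width centered on summits.

    summits: list of (chrom, position) tuples; starts are clipped at 0.
    """
    half = width // 2
    return [(chrom, max(0, pos - half), pos + half) for chrom, pos in summits]


def motifs_for_display(pvalues, threshold=15.0):
    """Keep motifs with -ln(P) above threshold in at least one sample.

    pvalues: {motif: {sample: enrichment P-value}} (assumed layout).
    """
    return sorted(
        motif
        for motif, by_sample in pvalues.items()
        if any(-math.log(p) > threshold for p in by_sample.values())
    )


# Made-up P-values for illustration only.
pvalues = {
    "ETS:FOX2": {"Tie2Cre": 1e-10, "Myf5Cre": 1e-2},
    "Ebox": {"Tie2Cre": 1e-5, "Myf5Cre": 1e-12},
    "Weak": {"Tie2Cre": 1e-5, "Myf5Cre": 1e-3},
}
print(summit_windows([("chr1", 1000)]))
print(motifs_for_display(pvalues))
```

A negative ln P-value of 15 corresponds to a P-value of roughly 3 x 10^-7, so only strongly enriched motifs pass this filter.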

662 Gene expression analysis

663 Translating ribosome affinity purification (TRAP) was performed as described (Zhou et al.,

664 2013). E10.5 Rosa26fsTrap/+;Tie2Cre embryos were isolated from Swiss Webster strain

665 pregnant females crossed to Rosa26fsTrap/Trap;Tie2Cre males. TRAP RNA from 50 embryos

666 was pooled for RNA-seq. The polyadenylated RNA was purified by binding to oligo (dT)

667 magnetic beads (ThermoFisher Scientific, 61005). RNA-seq libraries were prepared with

668 ScriptSeq v2 kit (Illumina, SSV 21106 ) according to the manufacturer’s instructions. RNA-seq

669 reads were aligned with TopHat (Trapnell et al., 2009) and expression levels were determined

670 with htseq-count (Anders et al., 2015). Adult organ EC expression values were obtained from

671 (Nolan et al., 2013) (GEO GSE47067).

672 Comparison of enhancer prediction using different chromatin features

673 Enhancer regions in the VISTA database were used as the gold standard. Each chromatin

674 feature that was analyzed was from E12.5 heart. Data sources are listed below under “Data

675 Sources”. For each chromatin factor, average read intensity in each VISTA region with or

676 without heart enhancer activity was calculated and used as the starting point for machine

677 learning. We used the weighted KNN method as the classifier. The parameters used were: 10

678 nearest neighbors, Euclidean distance, and squared inverse weight. The classifier was trained

679 and tested using 10-fold cross validation. ROC (receiver operating characteristic) curves were

680 used to evaluate the predictive accuracy of each factor, as measured by the area under the

681 ROC curve (AUC).
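
The classifier described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the software used in the study: it takes one feature per region (average read intensity), uses 10 neighbors, Euclidean distance, and squared inverse distance weights, splits the data into 10 interleaved folds, and computes AUC from the rank-sum identity. All function names and the toy data are hypothetical.

```python
import random


def knn_score(train, query, k=10):
    """Weighted vote for the positive class among the k nearest neighbors.

    train: list of (feature_vector, label) with label 0 or 1.
    Each neighbor is weighted by 1 / distance^2 (squared inverse weight).
    """
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)) ** 0.5, y) for x, y in train
    )[:k]
    # The small constant is an assumption that guards against zero distance.
    weights = [(1.0 / (d * d + 1e-12), y) for d, y in nearest]
    total = sum(w for w, _ in weights)
    return sum(w for w, y in weights if y == 1) / total


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(data, folds=10, k=10, seed=0):
    """Score every sample with a model trained on the other folds, then pool."""
    data = data[:]
    random.Random(seed).shuffle(data)
    scores, labels = [], []
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        for x, y in test:
            scores.append(knn_score(train, x, k))
            labels.append(y)
    return auc(scores, labels)


# Toy data: inactive regions with low signal, active regions with high signal.
toy = [([0.01 * i], 0) for i in range(50)] + [([5.0 + 0.01 * i], 1) for i in range(50)]
print(cross_validated_auc(toy))
```

Because the two toy classes are well separated, the AUC here is at its maximum; real chromatin features overlap between active and inactive regions and give lower values.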

682 Conservation analysis

683 The phastCons score for the multiple alignments of 30 vertebrates to the mouse mm9 genome

684 was used as the conservation score (Siepel et al., 2005). Sequences within Ep300 regions

685 matching selected motif(s) were identified using FIMO (Grant et al., 2011) with default

686 settings, and the average conservation score across the width of the sequence was used. To

687 generate the random conservation background, 100 random motifs 12 bp wide (the average

688 width of the motifs being analyzed) were used to scan the same regions.
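
The per-match conservation score is an average of per-base phastCons values over the matched sequence. A minimal Python sketch follows; the function name and the toy score track are assumptions, and reading real phastCons files is not shown.

```python
def mean_conservation(scores, start, end):
    """Average per-base conservation score over the half-open span [start, end).

    scores: list of per-base phastCons values for one region (assumed layout).
    """
    window = scores[start:end]
    return sum(window) / len(window)


# Toy track: a fully conserved 12 bp block flanked by unconserved sequence.
track = [0.0] * 10 + [1.0] * 12 + [0.0] * 10
print(mean_conservation(track, 10, 22))
print(mean_conservation(track, 4, 16))
```

Collecting this average for every motif match, and for the 12 bp background matches, gives the two score distributions that are compared.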

689 Literature Searches

690 Heart and endothelial cell enhancers were identified by searching PubMed for "heart

691 enhancer" or "endothelial cell enhancer", respectively, and then manually curating references

692 for mouse or human enhancers with appropriate activity in murine transient transgenic assays.

693 Data Sources

694 Sequencing data generated for this study are as follows: (1) ES_p300_bioChIP from

695 Ep300fb/fb;Rosa26BirA/BirA ESCs; (2) FB_p300_bioChIP from E12.5 Ep300fb/+;Rosa26BirA/+

696 forebrains; (3) FB_H3K27ac_ChIP from E12.5 Ep300fb/+;Rosa26BirA/+ forebrains; (4)

697 He_p300_bioChIP from E12.5 Ep300fb/+;Rosa26BirA/+ heart apex; (5) He_H3K27ac_ChIP from

698 E12.5 Ep300fb/+;Rosa26BirA/+ heart apex; (6) EIIaCre_p300_bioChIP from E11.5

699 Ep300fb/+;Rosa26BirA/+ whole embryos; (7) Myf5Cre_p300_bioChIP from E13.5

700 Ep300fb/+;Rosa26fsBirA/+;Myf5Cre whole embryos; (8) Tie2Cre_p300_bioChIP from E11.5

701 Ep300fb/+;Rosa26fsBirA/+;Tie2Cre+ whole embryos; (9) Tie2Cre-TRAP and Tie2Cre-Input from E10.5

702 Rosa26fsTrap/+;Tie2Cre+ whole embryos; and (10) ATAC-seq from wild-type E12.5 ventricular

703 cardiomyocytes. These data are available via the Gene Expression Omnibus (accession

704 number to be provided upon acceptance) or the Cardiovascular Development Consortium

705 server (https://b2b.hci.utah.edu/gnomex/; login as guest).

706 The following public data sources were used for this study: ES_P300_ChIP, GSE36027;

707 ES_P300_input, GSE36027; E12.5_Histone_input, GSE82850; E12.5_H3K27ac, GSE82449;

708 E12.5_H3K27me3, GSE82448; E12.5_H3K36me3, GSE82970; E12.5_H3K4me1, GSE82697;

709 E12.5_H3K4me2, GSE82667; E12.5_H3K4me3, GSE82882; E12.5_H3K9ac, GSE83056;

710 E12.5_H3K9me3, GSE82787. Antibody Ep300 ChIP-seq data on forebrain (Visel et al., 2009b)

711 and heart (Blow et al., 2010) are from GSE13845 and GSE22549, respectively. Adult organ

712 EC gene expression data are from GEO GSE47067 (Nolan et al., 2013).

714 Accession Numbers

715 Sequencing data generated for this study are available via the Gene Expression Omnibus

716 (upload in progress) or the Cardiovascular Development Consortium server

717 (https://b2b.hci.utah.edu/gnomex/; login as guest; instructions for reviewer access are provided

718 in a supplementary file).

719 Acknowledgements

720 WTP was supported by funding from the National Heart Lung and Blood Institute

721 (U01HL098166 and HL095712), by an Established Investigator Award from the American

722 Heart Association, and by charitable donations from Dr. and Mrs. Edwin A. Boger. The content

723 is solely the responsibility of the authors and does not necessarily represent the official views

724 of the funding agencies.

726 Competing Financial Interests

727 The authors declare no competing financial interests.

730 References

731 Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq--a Python framework to work with

732 high-throughput sequencing data. Bioinformatics 31, 166-69.

733 Becker, P.W., Sacilotto, N., Nornes, S., Neal, A., Thomas, M.O., Liu, K., Preece, C.,

734 Ratnayaka, I., Davies, B., et al. (2016). An Intronic Flk1 Enhancer Directs Arterial-Specific

735 Expression via RBPJ-Mediated Venous Repression. Arterioscler Thromb Vasc Biol 36,

736 1209-219.

737 Blow, M.J., McCulley, D.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry,

738 M., Wright, C., et al. (2010). ChIP-Seq identification of weakly conserved heart

739 enhancers. Nat Genet 42, 806-810.

740 de Boer, E., Rodriguez, P., Bonte, E., Krijgsveld, J., Katsantoni, E., Heck, A., Grosveld, F., and

741 Strouboulis, J. (2003). Efficient biotinylation and single-step purification of tagged

742 transcription factors in mammalian cells and transgenic mice. Proc Natl Acad Sci U S A

743 100, 7480-85.

744 Bonn, S., Zinzen, R.P., Girardot, C., Gustafson, E.H., Perez-Gonzalez, A., Delhomme, N.,

745 Ghavi-Helm, Y., Wilczynski, B., Riddell, A., and Furlong, E.E. (2012). Tissue-specific

746 analysis of chromatin state identifies temporal signatures of enhancer activity during

747 embryonic development. Nat Genet 44, 148-156.

748 Bresnick, E.H., Katsumura, K.R., Lee, H.Y., Johnson, K.D., and Perkins, A.S. (2012). Master

749 regulatory GATA transcription factors: mechanistic principles and emerging links to

750 hematologic malignancies. Nucleic Acids Res 40, 5819-831.

751 Bryja, V., Bonilla, S., and Arenas, E. (2006). Derivation of mouse embryonic stem cells. Nat

752 Protoc 1, 2082-87.

753 Buckingham, M., and Rigby, P.J. (2014). Gene Regulatory Networks and Transcriptional

754 Mechanisms that Control Myogenesis. Dev Cell 28, 225-238.

755 Buenrostro, J.D., Wu, B., Chang, H.Y., and Greenleaf, W.J. (2015). ATAC-seq: A Method for

756 Assaying Chromatin Accessibility Genome-Wide. Curr Protoc Mol Biol 109, 21.29.1-.9.

757 Bulger, M., and Groudine, M. (2011). Functional and mechanistic diversity of distal

758 transcription enhancers. Cell 144, 327-339.

759 Cao, Z., Chen, C., He, B., Tan, K., and Lu, C. (2015). A microfluidic device for epigenomic

760 profiling using 100 cells. Nat Methods 12, 959-962.

761 Chen, C.Y., Chi, Y.H., Mutalif, R.A., Starost, M.F., Myers, T.G., Anderson, S.A., Stewart, C.L.,

762 and Jeang, K.T. (2012). Accumulation of the inner nuclear envelope protein Sun1 is

763 pathogenic in progeric and dystrophic laminopathies. Cell 149, 565-577.

764 Coppiello, G., Collantes, M., Sirerol-Piquer, M.S., Vandenwijngaert, S., Schoors, S., Swinnen,

765 M., Vandersmissen, I., Herijgers, P., Topal, B., and van Loon, J. (2015). Meox2/Tcf15

766 heterodimers program the heart capillary endothelium for cardiac fatty acid uptake.

767 Circulation 131, 815-826.

768 Crawford, G.E., Holt, I.E., Whittle, J., Webb, B.D., Tai, D., Davis, S., Margulies, E.H., Chen, Y.,

769 Bernat, J.A., et al. (2006). Genome-wide mapping of DNase hypersensitive sites using

770 massively parallel signature sequencing (MPSS). Genome Res 16, 123-131.

771 Creyghton, M.P., Cheng, A.W., Welstead, G.G., Kooistra, T., Carey, B.W., Steine, E.J., Hanna,

772 J., Lodato, M.A., Frampton, G.M., et al. (2010). Histone H3K27ac separates active from

773 poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A 107,

774 21931-36.

775 Deal, R.B., and Henikoff, S. (2010). A simple method for gene expression and chromatin

776 profiling of individual cell types within a tissue. Dev Cell 18, 1030-040.

Dickel, D.E., Barozzi, I., Zhu, Y., Fukuda-Yuzawa, Y., Osterwalder, M., Mannion, B.J., May, D., Spurrell, C.H., Plajzer-Frick, I., et al. (2016). Genome-wide compendium and functional assessment of in vivo heart enhancers. Nat Commun 7, 12923.

Dogan, N., Wu, W., Morrissey, C.S., Chen, K.B., Stonestrom, A., Long, M., Keller, C.A., Cheng, Y., Jain, D., et al. (2015). Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility. Epigenetics Chromatin 8, 16.

Driegen, S., Ferreira, R., van Zon, A., Strouboulis, J., Jaegle, M., Grosveld, F., Philipsen, S., and Meijer, D. (2005). A generic tool for biotinylation of tagged proteins in transgenic mice. Transgenic Res 14, 477-482.

Fulton, D.L., Sundararajan, S., Badis, G., Hughes, T.R., Wasserman, W.W., Roach, J.C., and Sladek, R. (2009). TFCat: the curated catalog of mouse and human transcription factors. Genome Biol 10, R29.

Gasper, W.C., Marinov, G.K., Pauli-Behn, F., Scott, M.T., Newberry, K., DeSalvo, G., Ou, S., Myers, R.M., Vielmetter, J., and Wold, B.J. (2014). Fully automated high-throughput chromatin immunoprecipitation for ChIP-seq: identifying ChIP-quality p300 monoclonal antibodies. Sci Rep 4, 5152.

Goldhamer, D.J., Faerman, A., Shani, M., and Emerson, C.P. (1992). Regulatory elements that control the lineage-specific expression of myoD. Science 256, 538-542.

Grant, C.E., Bailey, T.L., and Noble, W.S. (2011). FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017-18.

He, A., and Pu, W.T. (2010). Genome-wide location analysis by pull down of in vivo biotinylated transcription factors. Curr Protoc Mol Biol Chapter 21, Unit 21.20.

He, A., Gu, F., Hu, Y., Ma, Q., Yi Ye, L., Akiyama, J.A., Visel, A., Pennacchio, L.A., and Pu, W.T. (2014). Dynamic GATA4 enhancers shape the chromatin landscape central to heart development and disease. Nat Commun 5, 4907.

He, A., Kong, S.W., Ma, Q., and Pu, W.T. (2011). Co-occupancy by multiple cardiac transcription factors identifies transcriptional enhancers active in heart. Proc Natl Acad Sci U S A 108, 5632-37.

He, A., Shen, X., Ma, Q., Cao, J., von Gise, A., Zhou, P., Wang, G., Marquez, V.E., Orkin, S.H., and Pu, W.T. (2012). PRC2 directly methylates GATA4 and represses its transcriptional activity. Genes Dev 26, 37-42.

Heiman, M., Schaefer, A., Gong, S., Peterson, J.D., Day, M., Ramsey, K.E., Suarez-Farinas, M., Schwarz, C., Stephan, D.A., et al. (2008). A translational profiling approach for the molecular characterization of CNS cell types. Cell 135, 738-748.

Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van Calcar, S., Qu, C., et al. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 39, 311-18.

Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C., Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 576-589.

Jolma, A., Yin, Y., Nitta, K.R., Dave, K., Popov, A., Taipale, M., Enge, M., Kivioja, T., Morgunova, E., and Taipale, J. (2015). DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527, 384-88.

Kanamori, M., Konno, H., Osato, N., Kawai, J., Hayashizaki, Y., and Suzuki, H. (2004). A genome-wide and nonredundant mouse transcription factor database. Biochem Biophys Res Commun 322, 787-793.

Kisanuki, Y.Y., Hammer, R.E., Miyazaki, J., Williams, S.C., Richardson, J.A., and Yanagisawa, M. (2001). Tie2-Cre Transgenic Mice: A New Model for Endothelial Cell-Lineage Analysis in Vivo. Dev Biol 230, 230-242.

Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat Methods 9, 357-59.

Ma, Q., Zhou, B., and Pu, W.T. (2008). Reassessment of Isl1 and Nkx2-5 cardiac fate maps using a Gata4-based reporter of Cre activity. Dev Biol 323, 98-104.

Mahony, S., Auron, P.E., and Benos, P.V. (2007). DNA familial binding profiles made easy: comparison of various motif alignment and clustering strategies. PLoS Comput Biol 3, e61.

McLean, C.Y., Bristor, D., Hiller, M., Clarke, S.L., Schaar, B.T., Lowe, C.B., Wenger, A.M., and Bejerano, G. (2010). GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28, 495-501.

Mehta, D., Ravindran, K., and Kuebler, W.M. (2014). Novel regulators of endothelial barrier function. Am J Physiol Lung Cell Mol Physiol 307, L924-935.

Mo, A., Mukamel, E.A., Davis, F.P., Luo, C., Henry, G.L., Picard, S., Urich, M.A., Nery, J.R., Sejnowski, T.J., et al. (2015). Epigenomic Signatures of Neuronal Diversity in the Mammalian Brain. Neuron 86, 1369-384.

Muzumdar, M.D., Tasic, B., Miyamichi, K., Li, L., and Luo, L. (2007). A global double-fluorescent Cre reporter mouse. Genesis 45, 593-605.

Nolan, D.J., Ginsberg, M., Israely, E., Palikuqi, B., Poulos, M.G., James, D., Ding, B.S., Schachterle, W., Liu, Y., et al. (2013). Molecular signatures of tissue-specific microvascular endothelial cell heterogeneity in organ maintenance and regeneration. Dev Cell 26, 204-219.

Nord, A.S., Blow, M.J., Attanasio, C., Akiyama, J.A., Holt, A., Hosseini, R., Phouanenavong, S., Plajzer-Frick, I., Shoukry, M., et al. (2013). Rapid and Pervasive Changes in Genome-wide Enhancer Usage during Mammalian Development. Cell 155, 1521-531.

Pimanda, J.E., Chan, W.Y., Donaldson, I.J., Bowen, M., Green, A.R., and Göttgens, B. (2006). Endoglin expression in the endothelium is regulated by Fli-1, Erg, and Elf-1 acting on the promoter and a -8-kb enhancer. Blood 107, 4737-745.

Ramírez, F., Dündar, F., Diehl, S., Grüning, B.A., and Manke, T. (2014). deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res 42, W187-191.

Robinson, A.S., Materna, S.C., Barnes, R.M., De Val, S., Xu, S.M., and Black, B.L. (2014). An arterial-specific enhancer of the human endothelin converting enzyme 1 (ECE1) gene is synergistically activated by Sox17, FoxC2, and Etv2. Dev Biol 395, 379-389.

Sacilotto, N., Monteiro, R., Fritzsche, M., Becker, P.W., Sanchez-Del-Campo, L., Liu, K., Pinheiro, P., Ratnayaka, I., Davies, B., et al. (2013). Analysis of Dll4 regulation reveals a combinatorial role for Sox and Notch in arterial development. Proc Natl Acad Sci U S A 110, 11893-98.

Shikama, N., Lutz, W., Kretzschmar, R., Sauter, N., Roth, J.F., Marino, S., Wittwer, J., Scheidweiler, A., and Eckner, R. (2003). Essential function of p300 acetyltransferase activity in heart, lung and small intestine formation. EMBO J 22, 5175-185.

Siepel, A., Bejerano, G., Pedersen, J.S., Hinrichs, A.S., Hou, M., Rosenbloom, K., Clawson, H., Spieth, J., Hillier, L.W., et al. (2005). Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15, 1034-050.

Sorensen, I., Adams, R.H., and Gossler, A. (2009). DLL1-mediated Notch activation regulates endothelial identity in mouse fetal arteries. Blood 113, 5680-88.

Tallquist, M.D., Weismann, K.E., Hellström, M., and Soriano, P. (2000). Early myotome specification regulates PDGFA expression and axial skeleton development. Development 127, 5059-070.

Thorvaldsdóttir, H., Robinson, J.T., and Mesirov, J.P. (2013). Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14, 178-192.

Thurman, R.E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M.T., Haugen, E., Sheffield, N.C., Stergachis, A.B., Wang, H., et al. (2012). The accessible chromatin landscape of the human genome. Nature 489, 75-82.

Trapnell, C., Pachter, L., and Salzberg, S.L. (2009). TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105-111.

De Val, S., Chi, N.C., Meadows, S.M., Minovitsky, S., Anderson, J.P., Harris, I.S., Ehlers, M.L., Agarwal, P., Visel, A., et al. (2008). Combinatorial regulation of endothelial gene expression by ets and forkhead transcription factors. Cell 135, 1053-064.

Visel, A., Rubin, E.M., and Pennacchio, L.A. (2009a). Genomic views of distant-acting enhancers. Nature 461, 199-205.

Visel, A., Blow, M.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., et al. (2009b). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854-58.

Visel, A., Minovitsky, S., Dubchak, I., and Pennacchio, L.A. (2007). VISTA Enhancer Browser--a database of tissue-specific human enhancers. Nucleic Acids Res 35, D88-D92.

Waldron, L., Steimle, J.D., Greco, T.M., Gomez, N.C., Dorr, K.M., Kweon, J., Temple, B., Yang, X.H., Wilczewski, C.M., et al. (2016). The Cardiac TBX5 Interactome Reveals a Chromatin Remodeling Network Essential for Cardiac Septation. Dev Cell 36, 262-275.

Wei, G., Srinivasan, R., Cantemir-Stone, C.Z., Sharma, S.M., Santhanam, R., Weinstein, M., Muthusamy, N., Man, A.K., Oshima, R.G., et al. (2009). Ets1 and Ets2 are required for endothelial cell survival during embryonic angiogenesis. Blood 114, 1123-130.

Wei, J.Q., Shehadeh, L.A., Mitrani, J.M., Pessanha, M., Slepak, T.I., Webster, K.A., and Bishopric, N.H. (2008). Quantitative control of adaptive cardiac hypertrophy by acetyltransferase p300. Circulation 118, 934-946.

Williams-Simons, L., and Westphal, H. (1999). EIIaCre -- utility of a general deleter strain. Transgenic Res 8, 53-54.

Wythe, J.D., Dang, L.T., Devine, W.P., Boudreau, E., Artap, S.T., He, D., Schachterle, W., Stainier, D.Y., Oettgen, P., et al. (2013). ETS factors regulate Vegf-dependent arterial specification. Dev Cell 26, 45-58.

Yang, W.Y., Petit, T., Thijs, L., Zhang, Z.Y., Jacobs, L., Hara, A., Wei, F.F., Salvi, E., Citterio, L., et al. (2015). Coronary risk in relation to genetic variation in MEOX2 and TCF15 in a Flemish population. BMC Genet 16, 116.

Zhang, B., Day, D.S., Ho, J.W., Song, L., Cao, J., Christodoulou, D., Seidman, J.G., Crawford, G.E., Park, P.J., and Pu, W.T. (2013). A dynamic H3K27ac signature identifies VEGFA-stimulated endothelial enhancers and requires EP300 activity. Genome Res 23, 917-927.

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nussbaum, C., Myers, R.M., Brown, M., et al. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137.

Zhou, P., Zhang, Y., Ma, Q., Gu, F., Day, D.S., He, A., Zhou, B., Li, J., Stevens, S.M., et al. (2013). Interrogating translational efficiency and lineage-specific transcriptomes using ribosome affinity purification. Proc Natl Acad Sci U S A 110, 15395-5400.

Table 1
mm9 genome coordinates of regions with EC activity as determined by transient transgenic assay. Vista_XXX indicates that the region was obtained from the VISTA enhancer database. Liftover indicates that the region was inferred by liftover from the human genome. For enhancers obtained from the literature, PubMed was searched for "endothelial cell enhancer". The resulting references were manually curated for transient transgenic testing of candidate endothelial cell enhancer regions.

chr    start      end        Note
chr9   37206631   37209631   Robo4;PMID17495228;bloodVessels
chr4   94412022   94413640   Tek;PMID9096345;bloodVessels
chr8   106625634  106626053  Cdh5;PMID15746076;bloodVessels
chr11  49445663   49446520   Flt4;posEC;liftoverFromPMID19070576;FoxETS
chr6   99338223   99339713   FoxP1;posEC;liftoverFromPMID19070576;FoxETS
chr8   130910720  130911740  Nrp1;posEC;liftoverFromPMID19070576;FoxETS
chr18  61219017   61219690   Pdgfrb;posEC;liftoverFromPMID19070576;FoxETS
chr4   137475841  137476446  Ece1;posEC;liftoverFromPMID19070576;FoxETS;artery
chr4   114743698  114748936  Tal1;PMID14966269;endocardium;bloodVessels
chr2   119152861  119153661  Dll4;PMID23830865;arterial
chr13  83721919   83721962   Mef2c;PMID19070576;FoxETS
chr13  83711086   83711527   Mef2c;PMID15501228;panEC
chr2   32493213   32493467   Eng;liftoverFromPMID16484587;bloodVessels
chr2   32517844   32518197   Eng;liftoverFromPMID18805961;bloodVessels;blood
chr6   88152598   88153791   Gata2;PMID17395646;PMID17347142;bloodVessels
chr9   32337302   32337549   Fli1;PMID15649946;bloodVessels
chr5   76370571   76372841   KDR;PMID10361126;bloodVessels
chr6   125502138  125502981  VWF;PMID20980682;smallbloodVessels
chr5   148537311  148538294  Flt1;liftoverFromPMID19822898
chr17  34701455   34702277   Notch4;liftoverFromPMID15684396
chr2   155568649  155569132  Procr;liftoverfromPMID16627757;bloodVessels[7/17]
chr9   63916890   63917347   Smad6;liftoverfromPMID17213321;bloodVessels
chr19  37510161   37510394   Hhex;liftoverfromPMID15649946;bloodVessels;blood
chr5   76357892   76358715   Flk1;PMID27079877;bloodVessels;artery
chr11  32145270   32146411   vista_101;heart[9/12];bloodVessels[7/12]
chr13  28809515   28811310   vista_265;neural tube[7/8];bloodVessels[3/8]
chr17  12982583   12985936   vista_89;heart[6/10];bloodVessels[8/10]
chr4   131631431  131635142  vista_80;bloodVessels[10/10]
chr4   57858433   57860639   vista_261;bloodVessels[5/8]
chr5   93426293   93427320   vista_397;limb[12/13];bloodVessels[12/13]
chr8   28216799   28218903   vista_136;heart[7/10];other[6/10];bloodVessels[4/10]
chr3   87786601   87790798   liftoverFromvista_1891;somite[7/7];midbrain;(mesencephalon)[6/7];limb[7/7];branchial;arch[7/7];eye[7/7];heart[7/7];ear[7/7];bloodVessels[6/7]
chr6   116309006  116309768  liftoverFromvista_2065;bloodVessels[9/9]
chr19  37566512   37571839   liftoverFromvista_1866;bloodVessels[5/5]
chr7   116256647  116260817  liftoverFromvista_1859;neuraltube[8/8];hindbrain;(rhombencephalon)[5/8];midbrain;(mesencephalon)[8/8];forebrain[8/8];heart[7/8];bloodVessels[5/8];liver[4/8]
chr18  14006285   14007917   liftoverFromvista_1653;bloodVessels[5/8]
chr2   152613921  152616703  liftoverFromvista_2050;bloodVessels[5/5]
chr14  32045520   32047900   liftoverFromvista_2179;bloodVessels[5/7]
chr15  73570041   73577576   liftoverFromvista_1882;bloodVessels[8/8]
chr8   28216817   28218896   liftoverFromvista_1665;heart[5/7];bloodVessels[7/7]

Table 2
Summary of transient transgenic validation of candidate EC enhancers
The columns #PCR pos, #LacZ pos, and #EC or blood pos give whole mount embryo counts. The columns Endo, Art, Vein, and Blood cells give activity in sections.

Neighboring gene  Region (mm9)              Size (bp)  Location w/r gene  Distance to TSS  #PCR pos  #LacZ pos  #EC or blood pos  Endo  Art  Vein  Blood cells  Ref. (PMID)
Apln              chrX:45358891-45359918    1028       3'_Distal          28,624           10        3          3                 +     +    +     -            -
Dab2              chr15:6009504-6010497     994        5'_Distal          -239,788         24        10         9                 +     ++   +     +++          -
Egfl7_enh1        chr2:26427040-26428029    990        5'_Distal          -9,041           9         3          3                 +++   +++  +++   +            -
Eng               chr2:32493216-32494019    804        5'_Distal          -8,497           12        4          3                 ++    ++   ++    -            16484587
Ephb4             chr5:137789649-137790412  764        5'_Proximal        -1,306           7         5          4                 ++    -    +++   -            -
Lmo2              chr2:103733621-103734378  758        5'_Distal          -64,152          29        14         10                -     -    -     +++          -
Mef2c             chr13:83721522-83722451   930        Intragenic         78,954           10        6          5                 +     -    +++   +            19070576
Notch1_enh1       chr2:26330255-26331184    930        Intragenic         -28,622          22        3          3                 +++   ++   ++    -            -
Sema6d            chr2:124380522-124381285  764        5'_Distal          -55,128          17        10         7                 ++    +++  -     +            -

Egfl7_enh3        chr2:26433680-26434642    963        5'_Distal          -2,415           12        2          2                 +     ++   +++   -            -
Sox7              chr14:64576382-64577118   737        3'_Distal          14,207           7         4          2                 +     ++   -     -            -
Aplnr             chr2:85003436-85004412    977        3'_Distal          27,407           12        3          0                 NA    NA   NA    NA           -
Egfl7_enh2        chr2:26431273-26431995    723        5'_Proximal        -4,942           9         3          1                 +     -    -     -            -
Emcn              chr3:136984933-136986103  1171       5'_Distal          -18,524          15        1          1                 -     -    -     +            -
Ets1              chr9:32481485-32482133    649        Intragenic         -21,818          11        1          0                 NA    NA   NA    NA           -
Foxc1             chr13:31921976-31922827   852        3'_Distal          23,887           13        3          1*                NA    NA   NA    NA           -
Gata2             chr6:88101907-88102696    790        5'_Distal          -46,356          8         0          0                 NA    NA   NA    NA           -
Lyve1             chr7:118020264-118021043  780        5'_Distal          -3,128,028       7         1          1                 +     +    -     -            -
Notch1_enh2       chr2:26345973-26347118    1146       Intragenic         12,796           9         0          0                 NA    NA   NA    NA           -
Sox18             chr2:181397552-181398335  784        3'_Proximal        8,401            13        2          1                 -     ++   -     -            -

*EC/blood pattern on whole mount not validated in histological sections

Table 3. Oligonucleotides used in this study.

Genotyping primers
Name               Sequence (5'->3')            Comments
p300fb-f           AATGCTTTCACAGCTCGC           0.28 kb for wild-type, 0.43 kb for Ep300fb knockin
p300fb-r           AAACCATAAATGGCTACTGC
Forward common     CTCTGCTGCCTCCTGGCTTCT        Rosa26-fs-BirA, 0.33 kb for wild-type, 0.25 kb for knockin
Wild type reverse  CGAGGCGGATCACAAGCAATA
CAG reverse        TCAATGGGCGGGGGTCGTT
LacZ-f             CAATGCTGTCAGGTGCTCTCACTACC   0.42 kb, genotyping of transient transgenic
LacZ-r             GCCACTTCTTGATGCTCCACTTGG

Primers to amplify Ep300 peak regions for transient transgenic assay. The four nucleotides CACC were added to all the forward primers for TOPO cloning.
Name           Sequence (5'->3')
Apln_f         CACCGGAGGCTGAGCAATGAATAG
Apln_r         TTGGCTGGGGAAGAGTAAGC
Aplnr_f        CACCTCTCTCTCTGGCTTCG
Aplnr_r        CCTCAGAATGTTTTCATGG
Dab2_f         CACCGTGGAAATCATAGCAC
Dab2_r         GGTTGGAATAAAAGAGC
Egfl7_Enh1_f   CACCGCCTACCCAGTGCTGTTCC
Egfl7_Enh1_r   CTGGAGTGGAGTGTCACG
Egfl7_Enh2_f   CACCGCTAGGGGCTTCTAGTTC
Egfl7_Enh2_r   AGGTCTCTTCTGTGTCG
Egfl7_Enh3_f   CACCTGTTAGTGGTGCTCCC
Egfl7_Enh3_r   TCCAAGGTCACAAAGC
Emcn_f         CACCAGCACACCTCGTAAAATGG
Emcn_r         GAGTGAAGTAAGACATCGTCC
Eng_f          CACCAAACTAATTAAAAAACAAAGCAGGT
Eng_r          CATATGTACATTAGAACCATCCA
Ephb4_f        CACCTGGGTCTCATCAACCGAAC
Ephb4_r        CCTATCTACATCAGGGCACTG
Ets1_f         CACCTTCGTCAGAAATGATCTTGCCA
Ets1_r         TAGCAAGAGAGCCTGGTCAG
Foxc1_f        CACCTCTCTGCTTCAAGGCACCTT
Foxc1_r        TGGATAGCATGCAGAGGACA
Gata2_f        CACCTTCTCTTGGGCCACACAGA
Gata2_r        ATCTGCTCCACTCTCCGTCA
Lmo2_f         CACCTGGTTTTGCTTGCTAC
Lmo2_r         CATTTCTAAGTCTCCAC
Lyve1_f        CACCTACTGCCATGGAGGACTG
Lyve1_r        AGACACCTGGCTGCCTGATA
Mef2c_f        CACCGGAGGATTAAAAATTCCCC
Mef2c_r        CCTCTTAAATGTACGTG
Notch1_Enh1_f  CACCTCCCAAATGCTCCACGATG
Notch1_Enh1_r  GAGGAATGGCGAGAAATAGAC
Notch1_Enh2_f  CACCGAAGGCAGGCAGGAATAAC
Notch1_Enh2_r  TGGACAGGTGCTTTGTTG
Sema6d_f       CACCTCTTAACCACTATCTCC
Sema6d_r       ACTTCCTACACAGTTC
Sox18_f        CACCTTGGGGGGAAAGAGTG
Sox18_r        GACTTCATCCCATCTC
Sox7_f         CACCACAGAGCCCCTGCATATGT
Sox7_r         GCATGGTTTCTGAAGCCCAAAT

3x repeated enhancer regions for luciferase assay. Core motifs of interest are highlighted in red.
Name                     Sequence (5'->3')
ETS-FOX2 (Mef2c)         CAGGAAGCACATTTGTCTACGCTTTCCTGTCATAACAGGAAGAGCAGGAAGCACATTTGTCTACGCTTTCCTGTCATAACAGGAAGAGCAGGAAGCACATTTGTCTACGCTTTCCTGTCATAACAGGAAGAG
Mef2c-mut                CAAGAAGCACATTTGTCTACGCTTTCCTGTCATATCTAGAAGAGCAAGAAGCACATTTGTCTACGCTTTCCTGTCATATCTAGAAGAGCAAGAAGCACATTTGTCTACGCTTTCCTGTCATATCTAGAAGAG
ETS-FOX3 (Bcl2l1)        CAGTTATTTCAGGAAAGATCAGTTATTTCAGGAAAGATCAGTTATTTCAGGAAAGAT
ETS-HOMEO2 (Egfl7_Enh1)  GACAGACAGGAAGGCGGGACAGACAGGAAGGCGGGACAGACAGGAAGGCGG
ETS-HOMEO2 (Egfl7_Enh3)  ACACACTTCCTGTTTCCTGACACACTTCCTGTTTCCTGACACACTTCCTGTTTCCTG
ETS-HOMEO2 (Flt4)        ACAGTCACTTCCTGTTTTACAGTCACTTCCTGTTTTACAGTCACTTCCTGTTTT
ETS-HOMEO2 (Kdr)         CAACAACAGGAAGTGGACAACAACAGGAAGTGGACAACAACAGGAAGTGGA
Sox (Apln)               CAGTTCCCCATTGTTCTCGCAGTTCCCCATTGTTCTCGCAGTTCCCCATTGTTCTCG
Sox (Robo4)              GCCAGAACAATGAAGAACAAAGCCTGCACGGCCAGAACAATGAAGAACAAAGCCTGCACGGCCAGAACAATGAAGAACAAAGCCTGCACG
Sox (Sema3g)             CGAATGGAAAGGGCATTGTTCAGGGGAGAACGAATGGAAAGGGCATTGTTCAGGGGAGAACGAATGGAAAGGGCATTGTTCAGGGGAGAA
ETS-Ebox (Apln)          AGGCGGAAGCAGCTGGGATAGGCGGAAGCAGCTGGGATAGGCGGAAGCAGCTGGGAT
ETS-Ebox (Zfp521)        TTATCCACAGGAAACAGATGAGGATCGTTATCCACAGGAAACAGATGAGGATCGTTATCCACAGGAAACAGATGAGGATCG

3x repeated motifs within a similar DNA context. Motifs are indicated in red.
Name          Sequence (5'->3')
Neg. control  TGTCATATCTAGAAGAGTGTCATATCTAGAAGAGTGTCATATCTAGAAGAG
ETS_alone     TGTCATATCTGGAAGAGTGTCATATCTGGAAGAGTGTCATATCTGGAAGAG
ETS-FOX1      TGTCCGGATGTTGAGTGTCCGGATGTTGAGTGTCCGGATGTTGAG
ETS-FOX2      TGTGTAAACAGGAAGTGAGTGTGTAAACAGGAAGTGAGTGTGTAAACAGGAAGTGAG
ETS-FOX3      TGTTGTTTACGGAAGTGAGTGTTGTTTACGGAAGTGAGTGTTGTTTACGGAAGTGAG
ETS_Ebox      TGTAGGAAACAGCTGGAGTGTAGGAAACAGCTGGAGTGTAGGAAACAGCTGGAG
ETS-HOMEO2    TGTAACCGGAAGTGAGTGTAACCGGAAGTGAGTGTAACCGGAAGTGAG
Tbx-ETS       TGTCACACCGGAAGGAGTGTCACACCGGAAGGAGTGTCACACCGGAAGGAG

Figure Legends

Fig. 1. Generation and characterization of Ep300flbio allele.
A. Experimental strategy for high affinity Ep300 pull down. A flag-bio epitope was knocked onto the C-terminus of the endogenous Ep300 gene. The bio peptide sequence is biotinylated by BirA, widely expressed from the Rosa26 locus. B. Targeting strategy to knock the flag-bio epitope into the C-terminus of Ep300. A targeting vector containing homology arms, flag-bio epitope, and Frt-neo-Frt cassette was used to insert the epitope tag into embryonic stem cells by homologous recombination. Chimeric mice were mated with Act:Flpe mice to excise the Frt-neo-Frt cassette in the germline, yielding the Ep300fb allele. C. Biotinylated Ep300 is quantitatively pulled down by streptavidin beads. Protein extract from Ep300flbio/flbio; Rosa26BirA embryos was incubated with immobilized streptavidin. Input, bound, and unbound fractions were analyzed for Ep300 by immunoblotting. GAPDH was used as an internal control. D. Ep300flbio/flbio mice from heterozygous intercrosses survived normally to weaning.

Fig. 2. Comparison of Ep300 bioChIP-seq to antibody ChIP-seq for mapping Ep300 chromatin occupancy.
A-B. Comparison of biological duplicate antibody Ep300 ChIP-seq (ENCODE) and Ep300 bioChIP-seq (flbio). The Ep300 bioChIP-seq data had greater overlap between replicates and greater intragroup correlation. Most antibody peaks were covered by the bioChIP-seq data. Ep300 bioChIP-seq detected 3.3 times more Ep300 regions than antibody ChIP-seq and recovered 89.6% of the regions detected by antibody ChIP-seq. Panel B shows Spearman correlation between samples over the peak regions. C. Tag heatmap shows input-subtracted Ep300 antibody or bioChIP signal in the union of the Ep300-bound regions detected by each method. D. Correlation plots show greater Ep300 bioChIP-seq signal compared to antibody ChIP-seq.

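The recovery percentage in panels A-B is an interval-overlap calculation: the share of antibody ChIP-seq regions that coincide with a bioChIP-seq region. The sketch below illustrates that calculation only. The function name, the toy coordinates, and the rule that a single shared base counts as overlap are assumptions made for illustration and do not describe the analysis pipeline used in this study.

```python
from collections import defaultdict


def fraction_recovered(query, reference):
    """Fraction of query peaks sharing at least one base with a reference peak.

    Peaks are (chrom, start, end) tuples with inclusive coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in query:
        # Two inclusive intervals overlap when each starts before the other ends.
        if any(s <= end and start <= e for s, e in by_chrom[chrom]):
            hits += 1
    return hits / len(query) if query else 0.0


# Hypothetical peak sets, not data from this study.
antibody_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
biochip_peaks = [("chr1", 150, 250), ("chr1", 700, 900),
                 ("chr2", 10, 60), ("chr3", 1, 5)]

recovered = fraction_recovered(antibody_peaks, biochip_peaks)
print(f"{recovered:.1%} of antibody peaks recovered")
```

In this toy input two of the three antibody peaks overlap a bioChIP peak. A linear scan per query peak is adequate for small examples; genome-scale peak sets would normally be sorted or indexed first.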

Fig. 3. Tissue-specific Ep300 bioChIP-seq.
A. Tag heatmap showing Ep300fb signal in heart and forebrain. Each row represents a region that was bound by Ep300 in heart, forebrain, or both. A minority of Ep300-enriched regions were shared between tissues. B. Ep300fb pull down from E12.5 heart or forebrain. Ep300-bound regions identified by bioChIP-seq of Ep300fb/fb; Rosa26BirA tissues in biological duplicates are compared to regions identified by antibody-mediated Ep300 ChIP-seq (Visel et al., 2009b). C. GO biological process terms enriched for genes that neighbor the tissue-specific heart or forebrain peaks. Bars indicate statistical significance.

Fig. 4. Cre-directed, lineage-selective Ep300 bioChIP-seq.
A. Experimental strategy. Lineage-specific Cre recombinase activates expression of BirA (HA-tagged) from Rosa26-flox-stop-BirA (Rosa26fsBirA). This results in Ep300 biotinylation in the progeny of Cre-expressing cells. B. Cre-dependent Ep300 biotinylation using Rosa26fsBirA. Protein extracts were prepared from E11.5 embryos with the indicated genotypes. R26BirA/+ (ca) and R26fsBirA/+ indicate the constitutively active and Cre-activated alleles, respectively. C. Immunostaining demonstrating selective Tie2Cre-mediated expression of HA-tagged BirA in ECs of R26fsBirA; Tie2Cre embryos. Arrows and arrowheads indicate endothelial and hematopoietic lineages, respectively. D. Tissue-selective Ep300 bioChIP-seq. Tie2Cre (T; endothelial and hematopoietic lineages), Myf5Cre (M; skeletal muscle lineages), and EIIaCre (E; germline activation) were used to drive tissue-selective Ep300 bioChIP-seq. Correlation between ChIP-seq signals within peak regions is shown for triplicate biological repeats. Samples within groups were the most closely correlated. E. Ep300fb bioChIP-seq signal at Gata2 (EC/blood specific) and Myod (muscle specific). Enhancers validated by transient transgenic assay are indicated along with the citation’s PubMed identifier (PMID). F. Biological process GO terms illustrate distinct functional groups of genes that neighbor Ep300 regions identified by bioChIP-seq driven by lineage-specific Cre alleles. The heatmap contains the top 10 terms enriched for genes neighboring each of the three sets of lineage-selective Ep300 regions.

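Panel D compares samples by the correlation of their signal across peak regions. A minimal sketch of such a comparison follows, assuming the Spearman rank correlation named in the legend of Fig. 2B; the legend for panel D does not state which coefficient was used. The function names and the per-region signal values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical signal in five peak regions for three samples.
replicate_1 = [5.0, 12.0, 3.0, 40.0, 8.0]
replicate_2 = [6.5, 10.0, 2.0, 55.0, 9.0]
other_lineage = [30.0, 2.0, 25.0, 1.0, 4.0]

within_group = spearman(replicate_1, replicate_2)
between_groups = spearman(replicate_1, other_lineage)
print(within_group, between_groups)
```

Because the coefficient depends only on rank order, two replicates that order the regions identically score 1 even when their absolute signal differs.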

Fig. 5. Functional validation of enhancer activity of Ep300-T-fb-bound regions.
A. Expression of genes neighboring regions bound by Ep300fb in different Cre-marked lineages. Translating ribosome affinity purification (TRAP) was used to enrich for RNAs from the Tie2Cre lineage. Input or Tie2Cre-enriched RNAs were profiled by RNA sequencing. The expression of the nearest gene neighboring Ep300 regions in ECs (p300-T-fb regions), but not skeletal muscle (p300-M-fb regions), was higher than that of genes neighboring regions bound by Ep300 across the whole embryo (EIIaCre). Box and whiskers show quartiles and 1.5 times the interquartile range. Groups were compared to EIIaCre using the Mann-Whitney U-test. B. Transient transgenesis assay to measure in vivo activity of Ep300-T-fb regions. Test regions were positioned upstream of an hsp68 minimal promoter and lacZ. Embryos were assayed at E11.5. C. Summary of transient transgenic assay results. Out of 20 regions tested, 9 showed activity in ECs or blood in three or more embryos, and 2 more showed activity in 2 embryos. See also Table 2. D. Representative whole mount Xgal-stained embryos. Enhancers that directed LacZ expression in an EC or blood pattern in two or more embryos are shown. Numbers indicate embryos with LacZ distribution similar to the image shown, compared to the total number of PCR positive embryos. E. Sections of Xgal-stained embryos showing examples of enhancers active in arteries, veins, and endocardium, or selectively active in arteries or veins. AS: aortic sac; CV: cardinal vein; DA: dorsal aorta; EC: endocardial cushion; HV: head vein; LA: left atrium; LV: left ventricle; RV: right ventricle. Scale bars, 100 µm. See also figure supplements 1 and 2 and Table 2.

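The summary in panel C can be checked against Table 2 by tallying the whole mount counts. The sketch below transcribes the number of embryos with EC or blood activity for each tested region from Table 2; the variable and function names are illustrative only.

```python
# Embryos with EC or blood LacZ activity per tested region,
# transcribed from the whole mount counts in Table 2.
ec_or_blood_positive = {
    "Apln": 3, "Dab2": 9, "Egfl7_enh1": 3, "Eng": 3, "Ephb4": 4,
    "Lmo2": 10, "Mef2c": 5, "Notch1_enh1": 3, "Sema6d": 7,
    "Egfl7_enh3": 2, "Sox7": 2, "Aplnr": 0, "Egfl7_enh2": 1, "Emcn": 1,
    "Ets1": 0, "Foxc1": 1, "Gata2": 0, "Lyve1": 1, "Notch1_enh2": 0,
    "Sox18": 1,
}


def summarize(counts):
    """Split regions into those active in >= 3 embryos and in exactly 2."""
    three_or_more = sorted(g for g, n in counts.items() if n >= 3)
    exactly_two = sorted(g for g, n in counts.items() if n == 2)
    return three_or_more, exactly_two


three_or_more, exactly_two = summarize(ec_or_blood_positive)
print(len(ec_or_blood_positive), len(three_or_more), len(exactly_two))
# prints: 20 9 2
```

The tally reproduces the figures given in panel C: 20 regions tested, 9 active in three or more embryos, and 2 more (Egfl7_enh3 and Sox7) active in two embryos.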

Fig. 6. Motifs enriched in Ep300-T-fb and Ep300-M-fb regions.
A. Motifs enriched in Ep300-T-fb or Ep300-M-fb regions. A total of 1445 motifs were tested for enrichment in Ep300-bound regions compared to randomly permuted control regions. Significantly enriched motifs (neg ln P-value > 15) were clustered and the displayed non-redundant motifs were manually selected. Heatmaps show statistical enrichment (left), fraction of regions that contain the motif (center), and GO terms associated with genes neighboring motif-containing, Ep300-bound regions (fraction of the top 20 GO biological process terms). Grey indicates that the motif was not significantly enriched (neg ln P-value ≤ 15). B. Conservation of sequences matching Tie2Cre- or Myf5Cre-enriched motifs within 100 bp of the summit of Ep300-T-fb or Ep300-M-fb regions, compared to randomly selected 12 bp sequences from the same regions. PhastCons conservation scores across 30 vertebrate species were used. ****, P<0.0001, Kolmogorov–Smirnov test. C. Luciferase assay of activity of enhancers containing the indicated motifs. Three repeats of 20-30 bp regions from Ep300-bound enhancers linked to the indicated gene and centered on the indicated motif were cloned upstream of a minimal promoter and luciferase. The constructs were transfected into human umbilical vein endothelial cells. Luciferase activity was expressed as fold activation above that driven by the enhancerless, minimal promoter-luciferase construct. *, P<0.05 compared to Mef2c-enhancer with mutated ETS:FOX2 motif (Mef2c mut). n=3. D. Luciferase assay of the indicated motifs repeated three times within a consistent DNA context. Assay was performed as in C. *, P<0.05 compared to the negative control sequence lacking the predicted motif. n=3. Error bars in C and D indicate standard error of the mean.

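The luciferase inserts in panels C and D are three tandem copies of a short fragment (sequences in Table 3). The sketch below shows how such a construct can be assembled from its unit, how the unit can be recovered from a listed construct, and where the ETS_alone unit differs from the negative control. The sequences are taken from Table 3; the helper names are hypothetical and the code is an illustration, not the cloning design procedure used in this study.

```python
def tandem_repeat(unit, copies=3):
    """Concatenate head-to-tail copies of a short enhancer fragment."""
    return unit * copies


def repeat_unit(construct, copies=3):
    """Return the repeated unit if construct is exactly `copies` tandem
    copies of one fragment, otherwise None."""
    if len(construct) % copies:
        return None
    unit = construct[: len(construct) // copies]
    return unit if unit * copies == construct else None


# Sequences from Table 3 (3x repeated motifs within a similar DNA context).
ets_fox1 = "TGTCCGGATGTTGAGTGTCCGGATGTTGAGTGTCCGGATGTTGAG"
neg_control_unit = "TGTCATATCTAGAAGAG"
ets_alone_unit = "TGTCATATCTGGAAGAG"

ets_fox1_unit = repeat_unit(ets_fox1)

# 0-based positions where the ETS_alone unit differs from the negative control.
differences = [
    i for i, (a, b) in enumerate(zip(neg_control_unit, ets_alone_unit)) if a != b
]
print(ets_fox1_unit, differences)
```

In this pair the two units differ at a single position (A in the negative control, G in ETS_alone), so the surrounding DNA context is held constant while the motif is changed.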

Fig. 7. Enhancers in adult heart and lung ECs.
VEcad-CreERT2 and a neonatal tamoxifen pulse were used to drive BirA expression in ECs. Ep300 regions in adult heart or lung ECs were identified by bioChIP-seq in biological triplicate. Ep300 regions were ranked into deciles by the ratio of the Ep300-VE-fb signal in heart to lung. A. Tag heatmap shows that the top and bottom deciles have selective Ep300 occupancy in heart and lung ECs, respectively. B. Expression of genes in heart ECs (red) or lung ECs (blue) neighboring Ep300 regions, divided into deciles by Ep300-VE-fb signal ratio in heart and lung. ***, P<0.0001, Wilcoxon test. Expression values were obtained from Nolan et al. (2013). Box plots indicate quartiles, and whiskers indicate 1.5 times the interquartile range. C. Genes with selective expression in heart or lung ECs were identified by K-means clustering (see Methods and figure supplement 1). The number of Ep300-VE-fb regions in heart (red) or lung (blue) neighboring these genes was determined, stratified by decile. ***, P<0.0001, Fisher's exact test. D. Enrichment of selected motifs in Ep300-VE-fb regions from heart (decile 1) compared to lung (decile 10) or vice versa. Grey indicates no significant enrichment. Displayed non-redundant motifs were selected from all significant motifs by manual curation of clustered motifs.

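The decile ranking described above can be sketched as follows, with decile 1 holding the most heart-biased regions and decile 10 the most lung-biased, matching panel D. The pseudocount that guards against division by zero, the function name, and the toy signal values are assumptions added for illustration; the legend does not specify how zero signal was handled.

```python
def assign_deciles(regions, pseudocount=1.0):
    """Rank regions by heart/lung signal ratio and split them into deciles.

    regions: list of (name, heart_signal, lung_signal).
    Returns {name: decile}, where decile 1 is the most heart-biased.
    The pseudocount is an assumption, not a parameter from this study.
    """
    ranked = sorted(
        regions,
        key=lambda r: (r[1] + pseudocount) / (r[2] + pseudocount),
        reverse=True,
    )
    n = len(ranked)
    return {name: (i * 10) // n + 1 for i, (name, _, _) in enumerate(ranked)}


# Hypothetical regions whose heart signal falls as lung signal rises.
regions = [(f"region_{k}", 40.0 - 2 * k, 2.0 + 2 * k) for k in range(20)]
deciles = assign_deciles(regions)
print(deciles["region_0"], deciles["region_19"])
```

With 20 input regions each decile receives two regions. When the number of regions is not a multiple of ten, the integer division spreads the remainder across deciles, so bin sizes differ by at most one.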

Supplementary Information

Suppl. File 1. Tissue-specific Ep300-bound regions identified in this study.
Each tab of the Excel spreadsheet contains the Ep300-bound regions in one of the following samples:

H1.narrowPeak: E12.5 heart, replicate 1.
H2.narrowPeak: E12.5 heart, replicate 2.
FB1.narrowPeak: E12.5 forebrain, replicate 1.
FB2.narrowPeak: E12.5 forebrain, replicate 2.
p300-T-fb: Tie2Cre-enriched regions. Average Ep300 signal in Tie2Cre (T2), Myf5Cre (M5), and EIIaCre (E) is shown in reads per million. T2/E and M5/E show the ratio of signals. Tie2Cre-enriched regions were defined as peak regions with T2/E ratio > 1.5.
p300-M-fb: Myf5Cre-enriched regions. Average Ep300 signal in Tie2Cre (T2), Myf5Cre (M5), and EIIaCre (E) is shown in reads per million. T2/E and M5/E show the ratio of signals. Myf5Cre-enriched regions were defined as peak regions with M5/E ratio > 1.5.
p300-E-fb: Ep300-bound peaks called from whole embryo (ubiquitous BirA expression).
p300-VE-fb: Merged VEcad-CreERT2-driven Ep300 peaks in adult heart and lung. The peaks were ranked by the ratio of Ep300 signal in heart ECs compared to lung ECs and then grouped into deciles.

Suppl. File 2. Transcription factors expressed in embryonic ECs.
E10.5 embryo T2-TRAP and input RNA-seq data were analyzed to identify DNA-binding transcriptional regulators with detectable expression level in ECs (T2-TRAP log2(fpkm+1) > 0.8) and preferential EC expression (T2-TRAP/input > 1). Genes with DNA binding domains belonging to the indicated families were annotated based on literature searches and previously published catalogs of transcriptional regulators (Fulton et al., 2009; Kanamori et al., 2004).

1089
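
The two expression criteria stated for Suppl. File 2 can be written as a simple filter. The sketch below is illustrative only: the function name `select_ec_expressed` and the input dictionaries are hypothetical, and it assumes that the T2-TRAP/input ratio is taken on FPKM values, which the legend does not specify.

```python
import math
from typing import Dict, List


def select_ec_expressed(trap_fpkm: Dict[str, float],
                        input_fpkm: Dict[str, float],
                        min_log2: float = 0.8,
                        min_ratio: float = 1.0) -> List[str]:
    """Return genes passing both thresholds given in the legend.

    Criterion 1: log2(T2-TRAP FPKM + 1) > 0.8 (detectable in ECs).
    Criterion 2: T2-TRAP / input > 1 (preferential EC expression).
    """
    selected = []
    for gene, trap in trap_fpkm.items():
        inp = input_fpkm.get(gene, 0.0)
        detectable = math.log2(trap + 1.0) > min_log2
        # Assumption of this sketch: a gene with zero input signal is
        # treated as EC-enriched rather than raising a division error.
        enriched = inp == 0.0 or trap / inp > min_ratio
        if detectable and enriched:
            selected.append(gene)
    return sorted(selected)
```

The resulting list would then be intersected with a catalog of DNA-binding domain families, a step that depends on external annotation and is not shown.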

Figure 1 -- figure supplement 1. Characterization of Ep300fb allele.
A. Southern blot genotyping of Ep300fb allele. Probes are marked in Fig. 1B. B. The expression level of tagged Ep300 protein is similar to that of wild-type Ep300 in neonatal mouse hearts. C, D. The heart weight to body weight ratio of 10-week-old Ep300fb/fb mice was not significantly different from that of littermates. E. Echocardiograms of 10-week-old Ep300fb/fb mice show that their ventricular function was not significantly different from that of littermate controls. Sample sizes are shown in bars in D and E. Bars represent mean ± standard deviation. n.s., not significantly different (t-test).

Figure 3 -- figure supplement 1. Heart and forebrain Ep300 bioChIP-seq.
A. Ep300 bioChIP-seq signals within called Ep300 peaks were highly correlated between biological replicates and discordant between heart and forebrain. B. Genome browser views showing Ep300 bioChIP-seq signal in heart and brain at Srf (preferential cardiac expression) and Nes (preferential brain expression). C. Comparison of Ep300 bioChIP-seq and H3K27ac ChIP-seq signals in heart. Numbers indicate correlation coefficients. There was excellent correlation between biological replicates of Ep300 or H3K27ac. Ep300 and H3K27ac also showed highly significant genome-wide correlation. Similar results were observed in forebrain (data not shown). D. Tag heatmap of H3K27ac-bound regions showing overlap between heart and brain H3K27ac-enriched regions from each tissue.

Figure 3 -- figure supplement 2. Comparison of heart Ep300fb regions to a compendium of heart enhancers assembled by Dickel et al. (2016).
The Dickel heart enhancer compendium (Dickel et al., 2016) is based on H3K27ac ChIP-seq and Ep300 antibody ChIP-seq. Most compendium regions had low enhancer scores and had low validation frequency in transient transgenic assays. Compendium regions were compared to heart Ep300fb regions using the LiftOver tool to map human genome coordinates to mouse. A. Venn diagram showing that 13% of prenatal heart compendium regions overlap with heart Ep300 regions. On the other hand, 47% of Ep300fb heart regions overlap with heart compendium regions, and 53% do not. B. Analysis of the fraction of compendium regions that overlap the Ep300fb heart regions, as a function of enhancer score. Regions with higher enhancer score (and higher likelihood of in vivo validation) corresponded well with heart Ep300 regions.
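
The two percentages in panel A differ because overlap fractions depend on which region set serves as the denominator. The following sketch illustrates this asymmetry. It is not the authors' pipeline: the function name `fraction_overlapping`, the half-open coordinate convention, and the any-base-overlap rule are assumptions, and coordinates are taken to be already lifted to a common genome assembly.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open


def fraction_overlapping(query: List[Region],
                         reference: List[Region]) -> float:
    """Fraction of query regions that overlap at least one reference region."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in reference:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the
        # other ends. A linear scan is enough for an illustration.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(query) if query else 0.0
```

Calling the function twice with the arguments swapped gives the two directions of comparison reported in the Venn diagram.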

Figure 3 -- figure supplement 3. Comparison of enhancer prediction based on indicated chromatin features of E12.5 heart.
Data are from ENCODE except for ATAC-seq (Pu lab, unpublished) and Ep300fb (this manuscript). The KNN classifier was trained on a subset of the VISTA enhancer database and then validated on the remaining data, using a 10-fold cross-validation design. Ep300 was the most accurate single predictor, based on the area under the receiver operating characteristic curve (AUC, indicated in parentheses). All of the features together (All), or just Ep300 plus ATAC-seq together (not shown), gave the best prediction.
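
For readers unfamiliar with the AUC metric used to compare predictors, the sketch below computes it directly from classifier scores. It does not reproduce the KNN classifier or the cross-validation design of the study. The function name `roc_auc` is hypothetical, and the calculation uses the standard equivalence between AUC and the probability that a randomly chosen positive outscores a randomly chosen negative.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve from scores and binary labels (1 = enhancer)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to an uninformative test and 1.0 to an ideal test, which are the two reference curves typically drawn on ROC plots.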

Figure 4 -- figure supplement 1. Cre-dependent Ep300 bioChIP-seq.
A. Cre-dependent expression of BirA. Heart with or without cardiac-specific TNTCre was immunoblotted for expression of BirA from the Rosa26fsBirA allele. B. Distribution of Ep300fb bioChIP-seq signal when driven by Tie2Cre or Myf5Cre. Regions with Ep300 peaks called in samples from any of the three Cre drivers were scored for their EC- or muscle-enriched signal by calculating the ratio of the region's signal in Tie2Cre or Myf5Cre bioChIP-seq to its signal with ubiquitous Ep300 bioChIP-seq (EIIaCre). The Tie2Cre-enrichment score showed a bimodal distribution with a small peak at values greater than 1.5. A value of greater than 1.5 similarly was used to define Myf5Cre-enriched regions. C. Relationship of Tie2Cre-enriched (p300-T-fb) and Myf5Cre-enriched (p300-M-fb) regions to peaks called from Tie2Cre or Myf5Cre samples alone.
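
The enrichment score in panel B amounts to a per-region ratio followed by a threshold. A minimal sketch follows, under stated assumptions: the function name `enriched_regions` and the dictionary inputs are hypothetical, signals are taken to be in reads per million, and regions with no ubiquitous (EIIaCre) signal are skipped, a choice the legend does not address.

```python
from typing import Dict, List


def enriched_regions(lineage_rpm: Dict[str, float],
                     embryo_rpm: Dict[str, float],
                     threshold: float = 1.5) -> List[str]:
    """Return regions whose lineage/whole-embryo signal ratio exceeds threshold."""
    enriched = []
    for region, signal in lineage_rpm.items():
        background = embryo_rpm.get(region, 0.0)
        if background <= 0.0:
            # Assumption of this sketch: the ratio is undefined without
            # whole-embryo signal, so such regions are left out.
            continue
        if signal / background > threshold:
            enriched.append(region)
    return enriched
```

The same function would be applied once with Tie2Cre signal and once with Myf5Cre signal to obtain the two lineage-enriched sets.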

Fig. 5 -- figure supplement 1. Relationship of EC gene expression to Ep300 regions in ECs, skeletal muscle, and whole embryo.
A. EC enrichment of actively translating genes that neighbored Ep300 regions in skeletal muscle (p300-M-fb), whole embryo (p300-E-fb), and endothelial cell/blood (p300-T-fb), using several different thresholds for the maximum distance allowed to associate genes to regions. The relationship between Ep300-T-fb regions and gene expression was not dependent upon the threshold used. The rightmost panel (no limit to maximum distance) is the same as in Figure 5A. Mann-Whitney U-test. B. EC expression of actively translating genes that neighbored Ep300-T-fb regions, compared to those that did not, using several different thresholds for the maximum distance used to associate genes to regions. Ep300-T-fb-associated genes were significantly more highly expressed regardless of the distance threshold used (Mann-Whitney U-test). C. Fraction of genes expressed or not expressed, for Ep300-T-fb-associated genes compared to all genes. Ep300-T-fb-associated genes were more frequently expressed than non-associated genes. However, not all genes associated with Ep300-T-fb regions were expressed, and some genes without Ep300-T-fb regions were expressed. This may reflect additional mechanisms of gene regulation as well as limitations of the gene-to-region association rule.
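
The effect of the distance threshold on gene-to-region association can be illustrated with a nearest-TSS rule. The sketch below is an assumption, since this legend does not spell out the association rule: each region is assigned to the gene whose TSS is closest to the region center on the same chromosome, and the assignment is kept only if that distance is below the chosen maximum. The name `associate_genes` and the input layout are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def associate_genes(region_centers: List[Tuple[str, int]],
                    tss: Dict[str, Tuple[str, int]],
                    max_distance: Optional[int] = None) -> Set[str]:
    """Genes associated with at least one region under a nearest-TSS rule.

    max_distance=None corresponds to the "no limit" panel.
    """
    associated: Set[str] = set()
    for chrom, center in region_centers:
        candidates = [(abs(pos - center), gene)
                      for gene, (g_chrom, pos) in tss.items()
                      if g_chrom == chrom]
        if not candidates:
            continue  # no annotated gene on this chromosome
        distance, gene = min(candidates)
        if max_distance is None or distance < max_distance:
            associated.add(gene)
    return associated
```

Running the function with limits of 2,000, 10,000, and 100,000 bp and with no limit yields nested gene sets analogous to the four panels.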

Fig. 5 -- figure supplement 2. Transient transgenic assays to measure in vivo activity of candidate endothelial cell/blood enhancers.
For each tested enhancer with positive endothelial cell or blood activity, we show the candidate enhancer's Ep300 signal in skeletal muscle (p300-M-fb), whole embryo (p300-E-fb), and endothelial cell/blood (p300-T-fb), with respect to the gene body. The whole mount images show each of the positive Xgal-stained embryos.

Fig. 5 -- figure supplement 3. Histological sections of transient transgenic embryos.
Histological sections of embryos containing the indicated enhancer-driven lacZ transgene. AS: aortic sac; CV: cardinal vein; DA: dorsal aorta; EC: endocardial cushion; HV: head vein; PA: pulmonary artery; RA: right atrium; RV: right ventricle. Bar, 100 µm.

Fig. 6 -- figure supplement 1. Motifs with enriched Ep300 occupancy in Tie2Cre or Myf5Cre embryo samples.
A. De novo motif discovery. The top 10 discovered motifs are shown, with the significance P-value, percent of regions containing the motif, and the best matching known motif class. B. Conservation of Tie2Cre-enriched motifs (dashed lines) or Myf5Cre-enriched motifs (solid lines) within 100 bp of Ep300-T-fb or Ep300-M-fb peak summits. The distribution of conservation scores for each indicated motif is shown as a cumulative frequency plot. Random indicates randomly sampled 12 nt regions from within the Ep300-T-fb or Ep300-M-fb regions.
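
A cumulative frequency plot such as that in panel B reports, for each score value, the fraction of motif instances whose conservation score is at or below that value. The sketch below shows the calculation only. The function name `cumulative_frequency` is hypothetical, and how the per-motif conservation scores were derived is not addressed here.

```python
from bisect import bisect_right
from typing import List, Sequence


def cumulative_frequency(scores: Sequence[float],
                         thresholds: Sequence[float]) -> List[float]:
    """Fraction of scores less than or equal to each threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    n = len(ordered)
    # bisect_right counts how many sorted values are <= the threshold.
    return [bisect_right(ordered, t) / n for t in thresholds]
```

A curve shifted to the right of the random 12 nt control indicates that the motif instances tend to be more conserved than background sequence in the same regions.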

Figure 7 -- figure supplement 1. Organ-specific EC gene expression.
Genes with at least 4-fold differential mean expression between highest and lowest EC samples were identified from microarray data (Nolan et al., 2013). These genes were grouped by k-means clustering. Genes in clusters containing selective heart or lung expression were analyzed further. Expression of these heart or lung EC genes is displayed in panel B using hierarchical clustering.
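
The initial filter for variably expressed genes can be sketched as below. This is an illustration under assumptions rather than the procedure used in the study: the function name `variable_genes` and the nested input layout are hypothetical, and expression values are assumed to be positive and on a linear (not log) scale so that a fold change is a simple quotient.

```python
from statistics import mean
from typing import Dict, List


def variable_genes(expr: Dict[str, Dict[str, List[float]]],
                   min_fold: float = 4.0) -> List[str]:
    """Genes whose highest tissue mean is at least min_fold times the lowest.

    expr maps gene -> tissue -> replicate expression values.
    """
    kept = []
    for gene, by_tissue in expr.items():
        means = [mean(values) for values in by_tissue.values()]
        lowest, highest = min(means), max(means)
        # Guard against zero or negative means, which make folds undefined.
        if lowest > 0 and highest / lowest >= min_fold:
            kept.append(gene)
    return kept
```

The genes that pass would then be clustered (k-means in the legend), which is not reproduced in this sketch.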

Fig. 7 -- figure supplement 2. VEcad-CreERT2 activation of Ep300 biotinylation.
A. VEcad-CreERT2 activated Ep300 biotinylation by Rosa26fsBirA, permitting its pulldown on streptavidin (SA) beads. B. EC-selective activation of the Rosa26mTmG Cre reporter allele by VEcad-CreERT2. Tamoxifen was administered at P1 and P2, and tissues were analyzed at 6 weeks of age.

Figure 7 -- figure supplement 3. Ep300fb bioChIP-seq signal in whole organ or ECs.
Ep300fb bioChIP-seq signal in whole organ ("Con"; EIIaCre) or ECs ("EC"; VEcad-CreERT2) at genes with EC or non-EC expression in lung or heart.

Figure 7 -- figure supplement 4. Gene functional terms enriched in decile 1 or decile 10 of Ep300-VE-fb regions.
GO biological process (top) or disease ontology (bottom) terms enriched in decile 1 or decile 10 of Ep300-VE-fb regions. Grey indicates no significant enrichment. The top five biological process terms in deciles 1 and 10 and selected relevant disease ontology terms are displayed.

- 55 - bio 12 kb A B 6.4 kb C Ep300fb/fb D X S 3’ UTR S X Exon 31 BirA 5’ probe 3’ probe Rosa26 survival to weaning Ep300fb Ep300 genomic (n=57) Ep300 flbio PGK-Neo S X S S biotin X S Input Non-bound Bound targeting construct BirA FRT FRT fb/fb wt Homologous recombination Ep300 High affinity 4.5 kb 4.3 kb S X S S BirA Streptavidin X S Rosa26BirA Pull-down Ep300fb-neo FLP recombinase GAPDH fb/+

Ep300fb

Fig. 1 A B P0 heart lysate

fb-neo/+ fb-neo/+ +/+ fb-neo/+ fb-neo/+ +/+ Ep300 WT fb/+ fb/fb Full-length Ep300 12 kb 6.4 kb IB: Ep300 300 kd 4.3 4.5 Loading ctrl Coomassie Blue XbaI, 5’ probe SacI, 3’ probe

10 week old

C Ep300+/+ Ep300 fb/fb D 6 n.s. E 60 n.s. 5 50 4 40 3 30

5 5 FS (%) 2 20 5 9

HW/BW (mg/g) 1 10 0 0 fb/fb fb/fb

Ep300 wt Ep300 Ep300 wt Ep300 Figure 1 -- figure supplement 1. Characterization of Ep300fb allele. A. Southern blot genotyping of Ep300fb allele. Probes are marked in Fig. 1B. B. The expression level of tagged Ep300 protein is similar to that of wild-type Ep300 in neo- natal mouse hearts. C, D. The heart weight to body weight ratio of 10 week old Ep300fb/fb heart was not signifi- cantly different from littermates. E. Echocardiograms of 10 week old Ep300fb/fb mice show that their ventricular function was not significantly different from littermate controls. 3 4

Sample sizes are shown in bars in D,E. Bars represent mean ± standard deviation. n.s., not significantly different (t-test). B A 0 Fig. 2 1.00 0.96 0.68 0.68 rep1 (25855) Ep300_Encode rep1 (77906) Ep300 Ep300_fb_rep1 shared (48963) Ep300 fb 0.70 0.96 1.00 0.71

Ep300_fb_rep2 fb 48963 (93.8%) 0.5 0.68 1.00 0.91 0.70 Ep300_Encode_rep1 rep2 (19638) Ep300_Encode 15281 (77.8%) rep2 (52174) Ep300 0.68 1.00 0.71 0.91 Ep300_Encode_rep2 shared (15281) Ep300_EncodeE 13698 (89.6%) 1

Ep300_fb_rep1Ep300_fb_rep2Ep300_Encode_rep1Ep300_Encode_rep2 fb

Encode specific Peaks (1583) C Union Ep300Peaks minus input Shared Peaks (13698) p300flbio-specific Peaks (35265) ChIP -2k 0 RP10M 0 2k 5 D ES_P300_Encode (RP10M) ES_P300_Encode (RP10M) ES_P300_Encode (RP10M) 10 15 20 10 15 20 10 15 20 0 5 0 5 0 5 p300 0 0 0 Encode specificPeaks ES_Ep300 ES_Ep300 ES_Ep300 Shared Peaks fb 5 5 -specific Peaks 5 10 10 10 fb fb fb (RP10M) (RP10M) (RP10M) 15 15 15 20 20 20 A Forebrain Heart B H-rep1 H-rep2 rep 1 rep 2 input rep 1 rep 2 input 35306 28944 (82%) Brain specific 80 (6923) 70 Shared (3147) heart 9255 19689 4250 60 apex

50

40 1813 78

signal (RP10M) Heart specific fb 194 H-Visel 30 (32190) 1466 3597 Ep300 20 E12.5 Ep300flbio/+ 10 Rosa26-BirA bioChIP-seq 0 -1kb +1kb -1kb +1kb -1kb +1kb -1kb +1kb -1kb +1kb -1kb +1kb

-log10(P-value) FB-rep2 0 5 10 15 20 25 30 35 40 45 50 10735 C cell junction organization FB-rep1 cell junction assembly 5823 (86%) cell substrate adhesion Heart ventricular cardiac muscle tissue dev. cell substrate junction assembly specific cardiac muscle tissue growth peaks reg. of gen. of precursor metabolites & energy adherens junction organization 318 3757 4238 muscle cell proliferation forebrain cardiac muscle cell proliferation

CNS neuron differentiation 12 1251 telencephalon dev. Forebrain glial cell diff. pallium dev. specific oligodendrocyte diff. 502 gliogenesis 966 peaks stem cell maintenance hindbrain dev. FB-Visel somatic stem cell maint. 2759 cerebellum dev.

Fig. 3 A B Srf: Preferential Cardiac Expression FB1 FB2 Heart1 Heart2 46,690 kb 46,700 kb Ep300-fb Heart [0 - 30] 0.04 0.01 0.93 1.00 Heart2 Ep300-fb Forebrain [0 - 30] Ep300-Visel Heart [0 - 15] Ep300-Visel Forebrain [0 - 15] 0.05 0.02 1.00 0.93 Heart1 known enhancers PMID: 21415370 Cul9 Srf Ptk7

0.96 1.00 0.02 0.01 FB2 Nes: Brain Expression 87,770 kb 87,780 kb Ep300-fb Heart [0 - 15] Ep300-fb Forebrain [0 - 15] 1.00 0.96 0.05 0.04 FB1 Ep300-Visel Heart [0 - 15] Ep300-Visel Forebrain [0 - 15] known enhancers spearman correlation PMID: 9917366 0 1 Crabp2 Nes C D Forebrain H3K27ac Heart H3K27ac Rep 1 Rep 2 Rep 1 Rep 2

160

Heart Forebrain (13018)

Ht_Ep300_1

120 Shared (14602)

0.94 Ht_Ep300_2

80

0.62 0.62 Ht_K27ac_1

Heart Reads per 10 Million (40735) 40 0.63 0.63 0.98 Ht_K27ac_2

Reads per 10 Million Reads 0 -2 +2 -2 +2 -2 +2 -2 +2 kb from peak center

Figure 3 -- figure supplement 1. Heart and forebrain Ep300 bioChiP-seq. A. Ep300 bioChIP-seq signals within called Ep300 peaks were highly correlated between biological replicates and discordant between heart and forebrain. B. Genome browser views showing Ep300 bioChIP-seq signal in heart and brain at Srf (preferential cardiac expression) and Nes (preferential Brain expression). C. Comparison of Ep300 bioChIP-seq and H3K27ac ChIP-seq signals in heart. Numbers indicate correlation coefficients. There was excellent correlation between biological replicates of Ep300 or H3K27ac. Ep300 and H3K27ac also showed highly significant genome-wide correlation. Similar results were observed in forebrain (data not shown). D. Tag heatmap of H3K27ac-bound regions showing overlap between heart and brain H3K27ac enriched regions from each tissue. Compendium A genic assays.A.Venn diagramshowingthat47%ofEp300 regions hadlowenhancerscoresandarenotexpectedtovalidateintransienttrans was basedonH3K27acChIP-seqandEp300antibodyChIP-seq.Mostcompendium compendium pendium ofheartenhancersassembledbyDickeletal.(2016).This Figure 3--figuresupplement2.ComparisonofheartEp300 that overlaptheEp300 with heartcompendiumregions. 63070 63070 fb heartregions,asafunctionofenhancerscore. 9438 9438 Ep300 10752 10752 B. Analysis ofthefractioncompendiumregions fb Heart

% of Compendium Peaks B 0.00 0.25 0.50 0.75 1.00

Enhancer Score 0 0.2 Compendium Ep300 fb

heartregionsoverlap 0.4 fb fb

0.6 Heart regionstoacom 0.8 1.0 0 5 10 15 20 25 30 35 40 45 50 55 60

Number of Regions (x 1000) - - prediction. ses). All ofthe featurestogether(All),orjustEp300plus ATAC-seq together (notshown),gavethebest predictor, based ontheareaunderreceiveroperating characteristiccurve(AUC,indicatedinparenthe validated ontheremainingdata,usinga10x cross-validationdesign.Ep300wasthemostaccuratesingle (this manuscript). The KNNclassifierwastrainedonasubsetoftheVISTA enhancerdatabaseandthe features ofE12.5heart.DataarefromENCODEexceptfor ATAC-seq (Pulab,unpublished) andEp300fb Figure 3--figuresupplement3.Comparison ofenhancerpredictionbasedonindicatedchromatin 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 Ep300 bioChip-seq(AUC=0.805) 0 0 0 1-specificity sensitivity 0 H3K4me1 (AUC=0.589) H3K9me3 (AUC=0.512) 0.2 0.4 0.6 ROC curvecalculatedusing10xcross-validation Machine learningwithweightedKNNmethod 0.8 1.0 0 ATAC-seq (AUC=0.749) H3K4me2 (AUC=0.646) H3K27ac (AUC=0.747) 0.2 0.4 0.6 0.8 1.0 P300 + ATAC (AUC=0.866) H3K27me3 (AUC=0.608) H3K27me3 0 H3K4me3 (AUC=0.637) 0.2 0.4 0.6 ideal test 0.8 1.0 uninformative test H3K36me3 (AUC=0.579) H3K36me3 0 H3K9ac (AUC=0.540) 0.2 All (AUC=0.862) 0.4 0.6 0.8 1.0 - A B Ep300flbio/+ E11.5 C CAG + + – R26fsBirA/+ stop BirA BirA/+ Rosa26fsBirA – – + R26 (ca) Ep300fb – + – Tie2Cre Cre IB: Ep300, P300 SA IP

CAG biotin IB: Ep300, BirA BirA input (5%) PECAM

fb/+ Ep300 o P-val (-log10) D fsBirA F Rosa26 5Cr e 0 26 mbr y Tie2Cr e My f E Tie2Cre EIIa-Cre Myf5Cre GO Term E11.5 E11.5 E13.5 skeletal system dev. cell proliferation neural tube development bioChIP-seq neg. reg. of glial cell prolif. stem cell differentiation ventricular septum morph. T3 T1 T2 E3 E1 E2 M3 M1 M2 regulation of DNA binding heart morphogenesis 0.23 0.15 0.15 0.66 0.66 0.65 0.78 0.89 1.00 M2 somatic stem cell maint. regulation of binding

M1 0.17 0.09 0.09 0.65 0.65 0.64 0.77 1.00 0.89 muscle cell differentiation (Tie2Cre) HA-BirA skeletal muscle tissue dev. M3 0.20 0.13 0.12 0.65 0.66 0.65 1.00 0.77 0.78 skeletal muscle organ dev. muscle organ development 0.30 0.21 0.20 0.88 0.92 1.00 0.65 0.64 0.65 muscle cell development E2 myotube differentiation satellite cell differentiation E1 0.30 0.21 0.21 0.89 1.00 0.92 0.66 0.65 0.66 regulation of myotube diff. neg. reg. of phosphorylation E3 0.30 0.21 0.21 1.00 0.89 0.88 0.65 0.65 0.66 neg. reg. of protein phosph. angiogenesis T2 0.86 0.93 1.00 0.21 0.21 0.20 0.12 0.09 0.15 actin cytoskeleton org. myeloid leukocyte diff. myeloid cell diff. T1 0.86 1.00 0.93 0.21 0.21 0.21 0.13 0.09 0.15 reg. of vasc. dev. reg. of erythrocyte diff. T3 1.00 0.86 0.86 0.30 0.30 0.30 0.20 0.17 0.23 reg. of angiogenesis erythrocyte homeostasis DAPI PECAM HA-BirA Spearman correlation megakaryocyte dev. 0 1.0 erythrocyte diff.

E Gata2: EC and Blood Lineages 16 kb 88,146 kb 88,148 kb 88,150 kb 88,152 kb 88,154 kb 88,156 kb 88,158 kb 88,160 kb

Ep300-M-fb [0 - 100] Ep300-E-fb [0 - 100] Ep300-T-fb [0 - 100] Ep300-M-fb peak p300-T-fb peak known enhancer PMID17395646 Gata2

MyoD: Skeletal Muscle Lineage 29 kb 53,610 kb 53,620 kb 53,630 kb

Ep300-M-fb [0 - 60] Ep300-E-fb [0 - 60] Ep300-T-fb [0 - 60] Ep300-M-fb peak Ep300-T-fb peak

known enhancer PMID1315077 Myod1

Fig. 4 A Rosa26-fsBirA/+ P0 heart TNT-Cre - + 37 IB: BirA 2xHA-BirA 25 Ponceau S

B Histogram of Tie2Cre-enriched regions Histogram of Myf5Cre-enriched regions 6000

6000 >1.5 = Ep300-T-fb >1.5 = Ep300-M-fb 4000 # of Regions Frequency 2000 2000 0 0

0 1 2 3 4 5 0 1 2 3 4 5 Ratio of ChIP signal in Tie2Cre vs Embryo Ratio of ChIP signal in Myf5Cre vs Embryo C Tie2Cre Myf5Cre p300-T-fb p300-M-fb Peaks Peaks

489 2407 7 5548 1151 141

Figure 4 -- figure supplement 1. Cre-dependent Ep300 bioChIP-seq. A. Cre-dependent expression of BirA. Heart with or without cardiac-specific TNTCre was immunoblotted for expres- sion of BirA from the Rosa26fsBirA allele. B. Distribution of Ep300fb bioChIP-seq signal when driven by Tie2Cre or Myf5Cre. Regions with p300 peaks called in samples from any of the three Cre drivers were scored for their EC- or muscle- enriched signal by calculating ratio of the region's signal in Tie2Cre or Myf5Cre bioChIP-seq to its signal with ubiquitous Ep300 bioChIP-seq (EIIaCre). The Tie2Cre-enrichment score showed a bimodal distribution with a small peak at values greater than 1.5. A value of greater than 1.5 similarly was used to define Myf5Cre- enriched regions. C. Relationship of Tie2Cre-enriched (Ep300-T-fb) and Myf5Cre-enriched (Ep300-M-fb) regions to peaks called from Tie2Cre or Myf5Cre samples alone. D A EC ± Blood NS P<10-38 4 Blood 3 Only

2 Egfl7_E1 Eng Mef2c Sema6d Egfl7_E3 3/9 3/12 5/10 7/17 2/12 1 RNA-seq T2-TRAP/Input Lmo2 0 10/29

EIIaCre Myf5Cre Tie2Cre

p300 bioChIP-seq Ephb4 Notch1_E1 Apln Dab2 Sox7 4/7 3/22 3/10 9/24 2/7 B E Endocardium and endocardial cushion Artery and vein Egfl7 Notch1 Egfl7 Notch1 Ins hsp68 Ins AS DA candidate lacZ EC LV DA region CV

CV EC transient RV transgenic LV LA assay (E11.5) RV

C 20 Ep300(T-fb) Artery selective Vein selective regions tested Sema6d Sox7 Ephb4 Mef2c EC ± Bld, CV DA DA 3+ embryos HV CV 8 EC ± neg CV Bld 9 HV 10 AS LA DA DA EC ± Bld, Bld only 2 embryos 1 2

Fig. 5 fb A Max distance from TSS to Ep300 peak centerE < 2 kb < 10 kb < 100 kb No Limit NS P<10-38 NS P<10-38 NS P<10-38 NS P<10-38 4 4 4 4

3 3 3 3

2 2 2 2

1 1 1 1 RNA-seq T2-TRAP/Input 0 0 0 0

EIIaCre Myf5Cre Tie2Cre EIIaCre Myf5Cre Tie2Cre EIIaCre Myf5Cre Tie2Cre EIIaCre Myf5Cre Tie2Cre

B P<10-4 P<10-4 P<10-4 P<10-4 C 15

100

10

50 [FPKM+1]) 2 % of genes

(log 5

T2-TRAP Expression T2-TRAP 0 No All < 2 < 10 < 100 Limit Genes Max distance 0 TSS to Ep300-T-fb peak center < 2 kb < 10 kb < 100 kb No Limit Max distance FPKM >= 1 threshold TSS to Ep300-T-fb peak center FPKM < 1 threshold Ep300fb-associated not Ep300fb-associated Fig. 5 -- figure supplement 1. Relationship of EC gene expression to Ep300 regions in ECs, skeletal muscle, and whole embryo. A. EC enrichment of actively translating genes that neighbored Ep300 regions in skeletal muscle (Ep300-M-fb), whole embryo (Ep300-E-fb), and endothelial cell/blood (Ep300-T-fb), using several different thresholds for the maximum distance allowed to associate genes to regions. The relation- ship between Ep300-T-fb regions and gene expression was not dependent upon the threshold used. The rightmost panel (no limit to maximum distance) is the same as in Figure 5A. Mann-Whitney U-test. B. EC expression of actively translating genes that neighbored Ep300-T-fb, compared to those that did not, using several different thresholds for the maximum distance used associate genes to regions. Ep300-T-fb-associated genes were more highly expressed regardless of the distance threshold used. Ep300-T-fb genes were significantly more highly expressed (Mann-Whitney U-test). C. Fraction of genes expressed or not expressed, for Ep300-T-fb-associated genes compared to all genes. Ep300-T-fb associated genes were more frequently expressed than non-associated genes. However, not all genes associated with Ep300-T-fb regions were expressed, and some genes with- out Ep300-T-fb regions were expressed. This may reflect additional mechanisms of gene regulation as well as limitations of the gene-to-region association rule additional rule. 
Apln Lmo2 30 kb 97 kb 45,360 kb 45,370 kb 45,380 kb 103,730 kb 103,740 kb 103,750 kb 103,760 kb 103,770 kb 103,780 kb 103,790 kb 103,800 kb 103,810 kb 103,820 kb [0 - 100] p300-M-fb [0 - 150] [0 - 100] p300-M-fb p300-E-fb [0 - 150] [0 - 100] p300-E-fb p300-T-fb [0 - 150] TestedTested regionregion p300-T-fb Apln Tested region Lmo2

Dab2 Mef2c 409 kb 167 kb 6,000 kb 6,100 kb 6,200 kb 6,300 kb 6,400 kb 83,640 kb 83,660 kb 83,680 kb 83,700 kb 83,720 kb 83,740 kb 83,760 kb 83,780 kb 83,800 kb

[0 - 50] [0 - 100] p300-M-fb p300-M-fb [0 - 50] [0 - 100] p300-E-fb p300-E-fb [0 - 100] [0 - 50] p300-T-fb p300-T-fb Tested region Tested region Dab2 Mef2c

Egfl7 Notch1 26,320 kb 26,330 kb 46 kb 26,340 kb 26,350 kb 26,426 kb 26,428 kb 26,430 kb 26,432 kb 26,434 kb 26,436 kb23 kb 26,438 kb 26,440 kb 26,442 kb 26,444 kb 26,446 kb 26,448 kb

[0 - 200] [0 - 100] p300-M-fb p300-M-fb [0 - 200] [0 - 100] p300-E-fb p300-E-fb [0 - 200] [0 - 100] p300-T-fb p300-T-fb Tested region Tested region Enh1 Enh2 Enh3 Egfl7

Egfl7_Enh1 Egfl7_Enh3 Notch1

Enh2 no consistent endothelial cell activity

Eng Sema6d 50 kb 32,500 kb 32,510 kb 32,520 kb 32,530 kb 32,540 kb 124,380 kb 124,400 kb 124,420 kb 117 kb124,440 kb 124,460 kb 124,480 kb

[0 - 150] [0 - 150] p300-M-fb [0 - 150] p300-M-fb [0 - 150] p300-E-fb p300-E-fb [0 - 150] [0 - 150] p300-T-fb p300-T-fb Tested region Tested region Eng Sema6d

Ephb4 Sox7 29 kb 16 kb 137,790 kb 137,800 kb 137,810 kb 64,564 kb 64,566 kb 64,568 kb 64,570 kb 64,572 kb 64,574 kb 64,576 kb 64,578 kb

[0 - 100] [0 - 100] p300-M-fb p300-M-fb [0 - 100] [0 - 100] p300-E-fb p300-E-fb [0 - 100] [0 - 100] p300-T-fb p300-T-fb Tested region Tested region Sox7 Ephb4

Fig. 5 -- figure supplement 2 Fig. 5 -- figure supplement 2. Transient transgenic assays to measure in vivo activity of candidate endothelial cell/blood enhancers. For each tested enhancer with positive endothelial cell or blood activity, we show the candidate enhancer’s Ep300 signal in skeletal muscle (Ep300-M-fb), whole embryo (Ep300-E-fb), and endothelial cell/blood (Ep300-T-fb), with respect to the gene both. The whole mount images show each of the positive Xgal-stained embryos. Apln-Enh Dab2-Enh Egfl7_Enh1 Egfl7_Enh3 Eng-Enh Ephb4-Enh HV CV CV HV DA HV DA DA DA CV DA PA

Lmo2-Enh Mef2c-Enh Notch1_Enh1 Sema6d-Enh Sox7-Enh

HV DA RA EC EC EC DA CV RV RV RV AS

Fig. 5 -- figure supplement 3. Histological sections of transient transgenic embryos. Histological sections of embryos containing the indicated enhancer-driven lacZ transgene. AS: aortic sac; CV: cardi- nal vein; DA: dorsal aorta; EC: endocardial cushion; HV: head vein; PA: pulmonary artery; RA: right atrium; RV: right ventricle. Bar, 100 µm. + A Enrichment % Motif GO Terms B Ep300-T-fb Regions Ep300-M-fb Regions Tie2Cre Myf5Cre 30 Tie2Cre-enriched motifs (****) Myf5Cre-enriched motifs (****) Random Random 25 Tie2Cre Myf5Cre Tie2Cre Myf5Cre Vasc. Blood Heme Sk. M. Vasc. Blood Heme Sk. M.

2 Motif Name

bits 1

A SIX 20 GA G GA C G TT 2 TA C CCTG GAAT C 0 A GTT

TC T C bits 1 2 Ebox:HOX T G GC A T G GA C A C T T AT G T A T A CTGA T CG CG C A C C A T CGA GG GC 0 A C

C 15 bits 1 A T T G Ebox:HOMEO TG A GT G G G GG A C AGC C C T G G C CT AT T T A C C TG C GACG T C C A AGTC TGA CA 0

G Density CA TG AT 2

bits 1 2

TBOX1 10

C CTTG T GCT G C TACC AGT 0 AG A 1 bits G T RBPJ:Ebox 2 G C A G A A G G G GAA GAA A CC G C CC C A AT CGT T T CTCGGCT 0 TT GAT A GG T 5 G A CA G bits 1 HOX 2 GAC A T T A T CGCT T C C GAC T T GT TG ACT 0 C A A CA

bits 1 A T AT HOMEO 0 T C T 2 C TC A AG G G G 0 ATT ATCTCT AA bits 1 C C PRDM9 0.0 0.2 0.4 0.6 0.8 1.0 2 0.0 0.2 0.4 0.6 0.8 1.0 AT G AT T GA G T G C T C C T G AT C ATAT A C GTATGA G C CA 0 AGTA T bits 1 C C RUNX Conservation score Conservation score G T 2 TCT C G T G CA GA C CGGATCA 0 G min promoter T T bits 1 TG G RBPJ C 3x enhancer fragment 2 T C TG C G A G C T ATCA 0 C CA

bits 1 TTC CA Luciferase 2 GATA C G

A AG A C GTC CAC 0 AT G

1 GATAA bits 2 ETS G CC T A C GG C CA TAG C TA GATG T CT AAT A 0 GAT Mef2c mut bits 1 G ETS:HOMEO2 2

A C T GG A C GA A T GG G C C GC AT TCAT TCCTCTA 0 CA AG

bits 1 ETS-Fox2 (Mef2c) GGA ETS:FOX2 * 2 G T GT G C CC C A GC GA A G C T AT C CC C T GTT GC AATAT T A 0 ACCA TGA bits 1 GGA ETS-Fox3 (Bcl2l1) * 2 ETS:FOX1 A TG TT C T T G G G AC A TGA T CG AG C T A GT CA CGA CAT 0 AAG GG bits 1 ETS-Homeo2 (Egfl7-Enh1) * 2 Tbx:ETS A A AT T G TC C CAG G GG TG G ATC C CCT CA AATT TA 0 G C bits 1 GGA ETS:HOMEO1 ETS-Homeo2 (Egfl7-Enh3) * 2 T C G G A CA A G CT AA C A G T AT G T A A CG T CG GC GC A AG C GG AT A C C T GATCACCT TCAT 0 A T GG bits 1 ETS-Homeo2 (Kdr) 2 SOX * C GT CT A TG CGG 0 CCTAT T

1 bits TTGT ETS-Homeo2 (Flt4) 2 ETS:FOX1’ * G CT TC T AG CG T CA G G A C C A G TA AG T CC T CC TC CGA T A A G T T 0 AAT GG G bits 1 ETS-Ebox (Apln) 2 AP1 * A G T GA A C CT C GA G T CT CTACACGA 0 GT C

1 bits T A T A ETS:FOX3 ETS-Ebox (Zfp521) * A 2 G C T TT GT CT GG G G C CCC T G G CT CT C A G C AA T CA AAA CGA AGAAT 0 T G CG bits 1 Sox (Apln) 2 T KLF A T

T A A A TT G TC ATCGA * A CA G G 0 G G C bits 1 G GG NR Sox (Robo4) 2 * T T GG CA G C TC A C G GT T G G A C AT AC T C TTT T ATA GTAA GAG CAT 0 GG C A G CA A G bits 1 ZNF Sox (Sema3g) 2 AC CA GT GT G TG C ACA * C 0 T

bits 1 G G CC MITF pGL3-SV40-enh 2 GCT AT G CA T G G GG TG TAG C C T CA * AA T 0 CGT AC C G bits 1 GATA:SCL C 2 GG C C T G A G A C G G T C A T T T G T T CGAA T T C CGA 1 2 3 4 5 6 7 8 9 A 0 T 10 11 12 13 14 15 16 17 18 19 20 C Ceqlogo 13.03.16 13:22 0 50 100 150 200 250 300 bits 1 TG GATA A ETS:Ebox CG 2 CA T CG GTC G CGAT 0 TA Fold Activation A AG TG

A bits 1 GGA C FOX 2 CT GA CT 0 CT GTACT 3x motif min promoter

bits 1 D TG T Ebox 2

T T G A T C CGC A TGT C A G A 0 GC CA TG bits 1 MYB Luciferase T C 2 C A T G T A G C A AAT 0 C

bits 1 AAC G STAT T A 2 A

G GT CA GAATCATTC CG 0 CC GG TT AA bits 1 2 T G A A TA C T GTC GACCGT AT 0 CC AATT GG Neg control 1 bits A A T ATF catatctagaa 2

A TCTGAC T GGA TC A CC C GG CGACATAAT GA TGG G C 0 GA TC

A bits 1 ETS only T SMAD catatctGGAA 2 C G

GC G CGAT CT 0 GA

bits 1 ETS-Fox1 GTCT TBOX2 CCGGATGTT * 2

AT AA C C C GT GGA T T T CCG GACAA 0 T C T

A bits 1 T C Nanog ETS-Fox2 GTAAACAGGAAGT * 2 A GGG T A A AT C T A AG CT GA T TGTA TC 0 CC AT AG bits 1 GC C G NFAT ETS-Fox3 * A T 2 TGTTTACGGAAGT GA CA CC C TGTC CGTGA 0 T A bits 1 GGAA NF1 ETS-Homeo2 * 2 T A T A CT C G T AACCGGAAGT A A T T GA A ACGGA TGAC GCT CTCG 0 T C G A bits 1 GG CC ETS-Ebox CT GC TEAD 2 CA C * AT 0 TA T AGGAAACAGCTG GGAAT bits 1 MEIS1 T C ETS-Tbx G GG C C A T A C G TC A 0 A G * TGACA CACACCGGAAG Significance % Motif pos. % Top 20 pGL3-SV40-enh * (neg ln P-val) regions GO BP Terms 0 5 10 15 20 25 30 Fig. 6. 5 100 1164 0 67% 0 65% Fold Activation A Ep300-T-fb Ep300-M-fb

-ln P % of Best -ln P % of Best -value Targets Match -value Targets Match CGCA G CG GAGT A A A GAC G TTC TT C T C CG AT C G T A CT T A C G CACCTA T T A G TA 879 48.7 GATA CAGCTG 1049 68.0 Ebox A CA G C G G T GG TC T CT G AAGGAATA T 855 57.0 ETS ATGGAATCC 111 32.9 TEAD A GC A G CA GCG TGT AT A A A G A G T A C A GA TG CA CG C C T T C T CT A G ACAA T 199 19.0 Sox A ATT AT 58 31.5 PBX:Homeo A G C GCG GA T T C G GT A AC AA A A T C G A T AT G GTC TTACCCA CTGA 149 14.7 ETS:Ebox GCCA ATG 53 21.6 Ebox

C AG G T A CG A T T C G T TT A A T T AGA A CCT C A T A G C A T ATGATTCC G 105 16.9 Fox C CCTG CC 51 5.7 Ebox A C G GAT C T T C G G C C C AG CAT CA TGGG GTA 84 15.9 KLF AAATGTCA 50 28.2 Nuc. Recept. GA C G G CC T C A G A C T T T T GAAG CG A A GTATCT C A T GT A A 83 11.8 AP1 G TTA 41 27.7 Homeo

A AT A CAA C CTC TG CGA T C CG A GT C C A C A A A G T A GTGTGG TG C A T CGTATTTATTAG 69 3.9 Mef2 35 3.2 Hox G G ATG A T C G G TA A T TT A T A C C C AT G G C T A AG C T C C T G GG A G CC G T A G CT TATGCATTAC 61 10.3 ATF CCCTA AAATA 32 3.4 Mef2

C G A C C T TG G T T T C A GAA ACATCCGC 46 16.5 Ebox CTGT AG 32 8.6 Meis1

Tie2Cre-enriched B Ep300-T-fb Regions Ep300-M-fb Regions motifs Gata 1.0 1.0 ETS ETS:HOMEO2 ETS:FOX2 0.9 0.9 ETS:FOX1 Tbx:ETS ETS:HOMEO1 SOX 0.8 0.8 ETS:FOX1’ AP1 ETS:FOX3 0.7 0.7 KLF NR ZNF MITF 0.6 0.6 Myf5Cre-enriched motifs 0.5 0.5 Six Ebox:HOX Cumulative frequency Ebox:HOMEO 0.4 0.4 Tbox1 RBPJ:Ebox HOX 0.3 0.3 HOMEO PRDM9 RUNX 0.2 0.2 RBPJ 0.0 0.5 1.0 0.0 0.5 1.0 Conservation score Conservation score Random Fig. 6 -- figure supplement 1. Motifs with enriched p300 occupancy in Tie2Cre or Myf5Cre embryo samples. A. De novo motif discovery. The top 10 discovered motifs are shown, with the signifi- cance P-value, percent of regions containing the motif, and the best matching known motif class. B. Conservation of Tie2Cre-enriched motifs (dashed lines) or Myf5Cre-enriched motifs (solid lines) within 100 bp of Ep300-T-fb or Ep300-M-fb peak summits. The distribution of conservation scores for each indicated motif is shown as a cumulative frequency plot. Random indicates randomly sampled 12 nt regions from within the Ep300-T-fb or Ep300-M-fb regions. A B C VEcad-CreERT2; p300fb/fb; Tissue Specific EC Expression Profiles Rosa26fsBirA (Nolan et al., Dev Cell, 2013) Ep300 signal (RP10M)

[Fig. 7, panel labels recovered from the figure:
Heat map: Ep300 signal (color scale 0 to 90) in Heart ECs and Lung ECs (Rep 1 to Rep 3 each), plotted by distance from Ep300 peak center (±1 kb). Ep300 regions are ranked by Ht/Lu signal ratio and grouped into deciles, from decile 1 (Heart) to decile 10 (Lung).
Per-decile plots: Relative Gene Expr (log2 Ht/Lu), axis -1.0 to 1.0 (***, P<0.0001, Wilcoxon); # Peaks assoc with tissue-specific genes, Heart and Lung, axis 0 to 400 (***, P<0.0001, Fisher Exact).
Panel D: Neg Log Bonf P-value for motif enrichment, with foreground D1 versus background D10 and foreground D10 versus background D1. Motif logos (in bits; sequences not recoverable) are shown for TEAD:SOX, ETS:FOX2, ETS, ETS:HOMEO1, ETS:TBOX2, ETS:FOX1, NFAT:AP1, FOX, TCF3, HOXB4, FOXH1, GATA, MEF2, LHX, EBOX, ETS:EBOX, EBOX:HOMEO, NKX6.1, PAX3-FKHR, EBOX2, TBOX, EBF, NR, and NF1.]

Fig. 7.

[Figure 7 -- figure supplement 1, panels A and B: heat maps of z-score across tissues. Tissue labels: Brain, Testis, Liver, Lung, Glom, Heart, Spleen, Muscle, BM. Panel A marks the clusters with heart and lung selective expression.]

Figure 7 -- figure supplement 1. Organ-specific EC gene expression. Genes with at least 4-fold differential mean expression between highest and lowest EC samples were identified from microarray data (Nolan 2013). These genes were grouped by k-means clustering. Genes in clusters containing selective heart or lung expression were analyzed further. Expression of these heart or lung EC genes is displayed in B using hierarchical clustering.

[Fig. 7 -- figure supplement 2, panel labels recovered from the figure:
A. SA pull-down, Ep300 IB; lanes: heart and lung, each without (–) and with (+) VEcad-Cre; band marked at 300 kd.
B. Rosa26mTmG; VECad-CreERT2. Channels: mGFP, Pecam1, m-Tomato. Heart and Lung sections, with Bronchus and Artery labeled in lung.]

Fig. 7 -- figure supplement 2. VEcad-CreERT2 activation of Ep300 biotinylation. A. VEcad-CreERT2 activated Ep300 biotinylation by Rosa26fsBirA, permitting its pull-down on streptavidin (SA) beads. B. EC-selective activation of the Rosa26mTmG Cre reporter allele by VECad-CreERT2. Tamoxifen was administered at P1 and P2, and tissues were analyzed at 6 weeks of age.

[Figure 7 -- figure supplement 3: genome browser views. Each view shows the tracks HtCon1, HtEC1, HtEC2, HtEC3, LuCon1, LuEC1, LuEC2, and LuEC3 (each scaled 0 - 100), followed by Ht_EC_peaks and Lu_EC_peaks.
Tbx3: Lung ECs. Window 297 kb (120,000 kb to 120,200 kb). Gene shown: Tbx3.
Tbx2 and Tbx4: Lung non-ECs. Window 139 kb (85,640 kb to 85,760 kb). Genes shown: Tbx2, Tbx4.
Meox2: Heart ECs. Window 236 kb (37,700 kb to 37,920 kb). Gene shown: Meox2.
Myh6: Heart non-ECs. Window 61 kb (55,550 kb to 55,600 kb). Genes shown: Efs, Il25, Cmtm5, Myh6, D830015G02Rik, Myh7.]

Figure 7 -- figure supplement 3. Ep300fb bioChIP-seq signal in whole organ ("Con"; EIIaCre) or ECs ("EC"; VEcad-CreERT2) at genes with EC or non-EC expression in lung or heart.

[Figure 7 -- figure supplement 4: heat map of Neg Log10 Bonferroni P-value for Dec 1 (Ht) and Dec 10 (Lu).
GO Biol. Process terms: vasculature development; blood vessel development; reg. of protein modification process; anatom. struct. form. involved in morph.; in utero embryonic development; cardiovascular system development; blood vessel morphogenesis; regulation of vasculature development.
Disease Ontology terms: cerebral arterial disease; essential hypertension; persistent fetal circulation syndrome; pulmonary hypertension; hypertension; congestive heart failure; myocardial infarction; arteriopathy; arteriosclerosis; arterial occlusive disease; coronary heart disease.]

Figure 7 -- figure supplement 4. GO biological process (top) or disease ontology (bottom) terms enriched in decile 1 or decile 10 of Ep300-VE-fb regions. Grey indicates no significant enrichment. The top five biological process terms in deciles 1 and 10 and selected relevant disease ontology terms are displayed.
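
The decile grouping referred to in these legends (Ep300 regions ranked by heart/lung signal ratio, with decile 1 most heart-biased and decile 10 most lung-biased) could be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the function name `assign_deciles`, the input layout, and the pseudocount of 1.0 are assumptions introduced here.

```python
import math


def assign_deciles(regions, pseudocount=1.0):
    """Rank regions by log2(heart/lung) signal ratio and split them into deciles.

    regions: list of (name, heart_signal, lung_signal) tuples (hypothetical layout).
    pseudocount: added to both signals to avoid division by zero (an assumption).
    Returns a dict mapping region name -> decile, where decile 1 holds the most
    heart-biased regions and decile 10 the most lung-biased regions.
    """
    scored = [
        (math.log2((heart + pseudocount) / (lung + pseudocount)), name)
        for name, heart, lung in regions
    ]
    # Highest heart/lung ratio first, so rank 0 falls in decile 1.
    scored.sort(key=lambda item: item[0], reverse=True)

    n = len(scored)
    deciles = {}
    for rank, (_, name) in enumerate(scored):
        # Integer arithmetic maps ranks 0..n-1 onto deciles 1..10.
        deciles[name] = rank * 10 // n + 1
    return deciles


if __name__ == "__main__":
    # Toy input: heart signal falls and lung signal rises across 20 regions.
    toy = [(f"r{i}", float(100 - i * 5), float(i * 5 + 1)) for i in range(20)]
    result = assign_deciles(toy)
    print(result["r0"], result["r19"])
```

With fewer than ten regions some deciles stay empty. A real analysis would take the signals from replicate-averaged bioChIP-seq read densities rather than toy values.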