1 2
3
4 5
6 7 Mapping cell type-specific transcriptional enhancers using high affinity, 8 lineage-specific Ep300 bioChIP-seq 9
10 Running title: Mapping cell type-specific enhancers
11
12 Pingzhu Zhou1*, Fei Gu1*, Lina Zhang2, Brynn N. Akerberg,1 Qing Ma1, Kai Li1, Aibin He1,3, 13 Zhiqiang Lin1, Sean M. Stevens1, Bin Zhou4, and William T. Pu1,5 14 *contributed equally to this study
15
16 1Department of Cardiology, Boston Children’s Hospital, 300 Longwood Ave, Boston, MA 17 02115, USA 18 2Department of Biochemistry, Institute of Basic Medicine, Shanghai University of Traditional 19 Chinese Medicine, Shanghai 201203, China 20 3Institute of Molecular Medicine, Peking-Tsinghua Center for Life Sciences, Peking University, 21 Beijing 100871, China 22 4State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Shanghai 23 Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China 24 5Harvard Stem Cell Institute, Harvard University, 1350 Massachusetts Avenue, Suite 727W, 25 Cambridge, MA 02138, USA 26 27
28
- 1 - 29 Abstract 30 Understanding the mechanisms that regulate cell type-specific transcriptional programs
31 requires developing a lexicon of their genomic regulatory elements. We developed a
32 lineage-selective method to map transcriptional enhancers, regulatory genomic regions that
33 activate transcription, in mice. Since most tissue-specific enhancers are bound by the
34 transcriptional co-activator Ep300, we used Cre-directed, lineage-specific Ep300 biotinylation
35 and pulldown on immobilized streptavidin followed by next generation sequencing of
36 co-precipitated DNA to indentify lineage-specific enhancers. By driving this system with
37 lineage-specific Cre transgenes, we mapped enhancers active in embryonic endothelial
38 cells/blood or skeletal muscle. Analysis of these enhancers identified new transcription factor
39 heterodimer motifs that likely regulate transcription in these lineages. Furthermore, we
40 identified candidate enhancers that regulate adult heart- or lung- specific endothelial cell
41 specialization. Our strategy for tissue-specific protein biotinylation opens new avenues for
42 studying lineage-specific protein-DNA and protein-protein interactions.
43
44 Keywords: transcriptional enhancer; endothelial cell; angiogenesis; Ep300
45
- 2 - 46 Introduction 47 The diverse cell types of a multicellular organism share the same genome but express
48 distinct gene expression programs. In mammals, precise cell-type specific regulation of gene
49 expression depends on transcriptional enhancers, non-coding regions of the genome required
50 to activate expression of their target genes (Visel et al., 2009a; Bulger and Groudine, 2011).
51 Enhancers are bound by transcription factors and transcriptional co-activators, which then
52 contact RNA polymerase 2 engaged at the promoter, stimulating gene transcription.
53 Enhancers are nodal points of transcriptional networks, integrating multiple upstream
54 signals to regulate gene expression. Because enhancers do not have defined sequence or
55 location with respect to their target genes, mapping enhancers is a major bottleneck for
56 delineating transcriptional networks. Recently chromatin immunoprecipitation of enhancer
57 features followed by sequencing (ChIP-seq) has been used to map potential enhancer. DNase
58 hypersensitivity (Crawford et al., 2006; Thurman et al., 2012), H3K27ac (histone H3 acetylated
59 on lysine 27) occupancy (Creyghton et al., 2010; Nord et al., 2013), or H3K4me1 (histone H3
60 mono-methylated on lysine 4) occupancy (Heintzman et al., 2007) are chromatin features that
61 have been used to identify cell-type specific enhancers. While most enhancers are DNase
62 hypersensitive, DNase hypersensitive regions are often not active enhancers (Crawford et al.,
63 2006; Thurman et al., 2012). H3K27ac is enriched on cell type-specific enhancers (Creyghton
64 et al., 2010; Nord et al., 2013), but may be a less accurate predictor of enhancers than other
65 transcriptional regulators (Dogan et al., 2015). Chromatin occupancy of Ep300, a
66 transcriptional co-activator that catalyzes H3K27ac deposition, has been found to accurately
67 predict active enhancers (Visel et al., 2009b). However, antibodies for Ep300 are marginal for
68 robust ChIP-seq, particularly from tissues, leading to low reproducibility, variation between
69 antibody lots, and inefficient enhancer identification (Gasper et al., 2014).
- 3 - 70 Mammalian tissues are composed of multiple cell types, each with their own
71 lineage-specific transcriptional enhancers. Thus defining lineage-specific enhancers from
72 mammalian tissues requires developing strategies that overcome the cellular heterogeneity of
73 mammalian tissues, particularly when the lineage of interest comprises a small fraction of the
74 cells in the tissue. Past efforts to surmount this challenge have taken the strategy of purifying
75 nuclei from the cell type of interest using a lineage-specific tag. For instance, nuclei labeled by
76 lineage-specific expression of a fluorescent protein have been purified by FACS (Bonn et al.,
77 2012). This method is limited by the need to dissociate tissues and recover intact nuclei, and
78 by the relatively slow rate of FACS and the need to collect millions of labeled nuclei. To
79 circumvent the FACS bottleneck, cell type-specific overexpression of tagged SUN1, a nuclear
80 envelope protein, has been used to permit affinity purification of nuclei (Deal and Henikoff,
81 2010; Mo et al., 2015). Although this mouse line was reported to be normal, SUN1
82 overexpression potentially could affect cell phenotype and gene regulation (Chen et al., 2012).
83 Chromatin from isolated nuclei are then subjected to ChIP-seq to identify histone signatures of
84 enhancer activity. However, as noted above histone signatures may less accurately predict
85 enhancer activity compared to occupancy by key transcriptional regulators (Dogan et al.,
86 2015).
87 Here, we report an approach to identify murine enhancers active in a specific lineage within
88 a tissue. We developed a knock-in allele of Ep300 in which the protein is labeled by the bio
89 peptide sequence (de Boer et al., 2003; He et al., 2011). Cre recombinase-directed, cell type
90 specific expression of BirA, an E. coli enzyme that biotinylates the bio epitope tag (de Boer et
91 al., 2003), allows selective Ep300 ChIP-seq, thereby identifying enhancers active in the cell
92 type of interest. Using this strategy, we identified thousands of endothelial cell (EC) and
93 skeletal muscle lineage enhancers active during embryonic development. Extending the
- 4 - 94 approach to adult organs, we defined adult EC enhancers, including enhancers associated
95 with distinct EC gene expression programs in heart compared to lung. Analysis of motifs
96 enriched in EC or skeletal muscle lineage enhancers predicted novel transcription factor motif
97 signatures that govern EC gene expression.
98 Results 99 Efficient identification of enhancers using Ep300fb bioChIP-seq 100 We developed an epitope-tagged Ep300 allele, Ep300fb, in which FLAG and bio epitopes
101 (de Boer et al., 2003; He et al., 2011) were knocked into the C-terminus of endogenous Ep300
102 (Fig. 1A-B and Figure 1 -- figure supplement 1A). Transgenically expressed BirA (Driegen et
103 al., 2005) biotinylates the bio epitope, permitting quantitative Ep300 pull down on streptavidin
104 beads (Fig. 1C). We have not noted abnormal phenotypes. Heart development and function
105 are sensitive to Ep300 gene dosage (Shikama et al., 2003; Wei et al., 2008), yet Ep300fb/fb
106 homozygous mice survived normally (Fig. 1D) and Ep300fb/fb hearts expressed normal levels of
107 Ep300 and had normal size and function (Figure 1 -- figure supplement 1B-E). These data
108 indicate that Ep300fb is not overtly hypomorphic.
109 To evaluate Ep300fb-based mapping of Ep300 chromatin occupancy, we isolated
110 embryonic stem cells (ESCs) from Ep300fb/fb; Rosa26BirA/BirA mice. We then performed Ep300fb
111 biotin-mediated chromatin precipitation followed by sequencing (bioChiP-seq), in which high
112 affinity biotin-streptavidin interaction is used to pull down Ep300 and its associated chromatin
113 (He et al., 2011). Biological duplicate sample signals and peak calls correlated well (93.6%
114 overlap; Spearman r=0.96; Fig. 2A-B). We compared the results to publicly available Ep300
115 antibody ChIP-seq data generated by ENCODE (overlap between duplicate peaks 77.8%;
116 r=0.91; Fig. 2A-B). Ep300 bioChiP-seq identified 48963 Ep300-bound regions ("p300 regions")
117 shared by both replicates, compared to 15281 for Ep300 antibody ChIP-seq (Fig. 2A,C). The
- 5 - 118 large majority (89.6%) of Ep300 regions detected by antibody were also found by Ep300
119 bioChiP-seq, and Ep300 signal was substantially stronger using bioChiP-seq (Fig. 2A,C,D).
120 These data indicate that Ep300fb bioChiP-seq has improved sensitivity compared to Ep300
121 antibody ChIP-seq for mapping Ep300 chromatin occupancy in cultured cells.
122 Identification of tissue-specific enhancers using Ep300fb bioChIP-seq 123 We used Ep300fb/+; Rosa26BirA/+ mice to analyze Ep300fb genome-wide occupancy in
124 embryonic heart and forebrain. We performed bioChiP-seq on heart and forebrain from
125 embryonic day 12.5 (E12.5) embryos in biological duplicate (Fig. 3A-B). There was high
126 reproducibility (83% and 93%, respectively) between biological duplicates (Fig. 3B and Figure
127 3 -- figure supplement 1A). In comparison, published Ep300 antibody ChIP-seq from E11.5
128 heart and forebrain (Visel et al., 2009b) had lower signal-to-noise and yielded few peaks when
129 analyzed using the same peak detection algorithm (MACS2 (Zhang et al., 2008)). Using the
130 originally published peaks, antibody-based Ep300 ChIP-seq yielded 9.5x or 3.0x less Ep300
131 regions in heart and forebrain, respectively (Fig. 3B). These regions overlapped 58.7% and
132 64.7% of the Ep300fb bioChIP-seq regions, suggesting that the epitope-tagged allele has
133 superior sensitivity and specificity for mapping Ep300-bound regions in tissues, as it does in
134 cultured cells.
135 We compared Ep300fb regions from forebrain and heart (Suppl. File 1). Only a minority of
136 Ep300fb regions (8.9% for heart and 31.3% for brain) were common between tissues (Fig. 1B).
137 Viewing Ep300fb bioChiP-seq signal at genes selectively expressed in heart or brain confirmed
138 robust tissue-specific differences that overlapped enhancers with known tissue-specific activity
139 (Figure 3 -- figure supplement 1B). Genes neighboring the Ep300fb occupied regions specific to
140 heart or forebrain were enriched for gene ontology (GO) functional terms relevant to the
141 respective tissue (Fig. 3C). These results reinforce the conclusion that Ep300 occupies
- 6 - 142 tissue-specific enhancers and indicate that this conclusion was not a consequence of
143 insensitive detection of Ep300-occupied regions in earlier studies (Visel et al., 2009b).
144 p300 is a histone acetyltransferase, and one of its enzymatic products is histone H3
145 acetylated on lysine 27 (H3K27ac). We compared genome-wide the signal of Ep300fb and
146 H3K27ac in E12.5 heart and forebrain (Fig. 3 -- figure supplement 1C and data not shown).
147 There was high correlation between biological replicates (r = 0.98). Ep300 was also well
148 correlated with H3K27ac (r = 0.64), independently validating the Ep300fb bioChIP-seq data.
149 The previously published Ep300 antibody ChIP-seq data (Visel et al., 2009b) was less well
150 correlated to H3K27ac (r = 0.37), although the correlation was highly statistically significant
151 (P<0.0001). Interestingly, 26.4% and 52.9% of heart and brain H3K27ac regions were shared
152 between tissues (Fig. 3 -- figure supplement 1D) compared to 8.9% and 31.3% for Ep300fb
153 heart and brain regions, respectively (Fig. 3A), suggesting that Ep300fb occupancy is more
154 tissue-specific.
155 We analyzed the prediction of active enhancers by our Ep300fb bioChiP-seq data. The
156 VISTA Enhancer database (Visel et al., 2007) contains thousands of genomic regions that
157 have been tested for tissue-specific enhancer activity using an in vivo transient transgenic
158 assay. 185 tested regions had heart activity and 130 (70%) of these overlapped Ep300fb
159 regions that were reproduced in both biological duplicates. In comparison, only 105 (57%) of
160 these regions overlapped the regions previously reported to be bound by Ep300 using
161 antibody ChIP-seq.
162 Recently human and mouse Ep300 and H3K27ac ChIP-seq data from fetal and adult heart
163 were combined to yield a “compendium” of heart enhancers, with the strength of ChIP-seq
164 signal used to provide an “enhancer score” ranging from 0 to 1 that correlated with the
165 likelihood of regions covered in the VISTA database to show heart activity (Dickel et al., 2016).
- 7 - 166 We compared our heart Ep300 regions to this compendium. Overall, 9438/72508 (13%)
167 regions in the prenatal compendium overlapped with the Ep300 heart regions (Fig. 3 -- figure
168 supplement 2A). However, the overlap frequency increased markedly for regions with higher
169 enhancer scores (Fig. 3 -- figure supplement 2B). For example, if one considers the 3571
170 compendium regions with an enhancer score of at least 0.4 (corresponding to a validation rate
171 in the VISTA database of ~25%), 2647 (74.1%) were contained within the heart Ep300 regions,
172 and 63/68 (92.6%) regions with a score of at least 0.8 (validation rate ~43%) overlapped a
173 heart Ep300 region. Thus, heart compendium regions that are more likely to have in vivo heart
174 activity are largely covered by heart Ep300 regions. On the other hand, 10752 (53%) heart
175 Ep300 regions were not covered by the compendium, suggesting that this database is
176 incomplete, potentially as a result of its use of incomplete antibody-based Ep300 ChIP-seq
177 data.
178 p300 antibody ChIP-seq was one of the criteria used to select some of the test regions in
179 the VISTA Enhancer database; as an independent test free of this potentially confounding
180 effect, we searched the literature for other heart enhancers that were confirmed using the
181 transient transgenic assay. We identified 40 additional heart enhancers. Of these, 24 (60%)
182 intersected the Ep300fb regions common to both replicates. In comparison, only 6/40 (15%)
183 intersected the regions detected previously by Ep300 antibody ChIP-seq. Few heart enhancers
184 were found in the regions unique to Ep300 antibody ChIP-seq (11/185 VISTA and 2/40
185 non-VISTA), compared to regions unique to Ep300fb bioChIP-seq (36/185 VISTA and 20/40
186 non-VISTA). We conclude that Ep300fb ChIP-seq predicts heart enhancers with sensitivity that
187 is superior to antibody-mediated Ep300 ChIP-seq.
188 Other chromatin features have been used to predict transcriptional enhancers. We
189 compared the accuracy of Ep300 bioChiP-seq to other chromatin features for heart enhancer
- 8 - 190 prediction. To map accessible chromatin, we performed ATAC-seq (assay for
191 transposable-accessible chromatin followed by sequencing) on E12.5 cardiomyocytes. E12.5
192 heart ChIP-seq data for modified histones (H3K27ac, H3K4me1, H3K4me2, H3K4me3,
193 H3K9ac, H3K27me3, H3K9me3, H3K36me3) were obtained from publicly available datasets
194 (see Supplemental Methods). Using a machine learning approach and the VISTA enhancer
195 database as the gold standard, we evaluated the accuracy of each of these chromatin
196 features, compared to Ep300 bioChiP-seq, for predictive heart enhancers (Figure 3 -- figure
197 supplement 3). This analysis showed that Ep300 bioChiP-seq was the single most predictive
198 chromatin feature (area under the receiver operating characteristic curve (AUC) = 0.805).
199 ATAC-seq and H3K27ac also performed well (AUC=0.749 and 0.747, respectively), whereas
200 H3K4me1 had was poorly predictive (AUC=0.589). Combining Ep300 bioChIP-seq with
201 ATAC-seq improved predictive accuracy (AUC=0.866), equivalent to the value obtained by
202 performing predictions with all of the chromatin features (AUC=0.862). These analyses indicate
203 that of the features tested Ep300 is the best single factor for enhancer prediction.
204 Cre-activated, lineage-specific Ep300fb bioChIP-seq 205 In vivo biotinylation of Ep300fb requires co-expression of the biotinylating enzyme BirA. We
206 reasoned that Ep300fb bioChIP-seq could be targeted to a Cre-labeled lineage by making BirA
207 expression Cre-dependent. Therefore we established Rosa26fsBirA, in which BirA expression is
208 contingent upon Cre excision of a floxed-stop (fs) cassette (Fig. 4A). In preliminary
209 experiments, we showed that Rosa26fsBirA expression of BirA was Cre-dependent (Fig. 4 --
210 figure supplement 1A), as was Ep300fb biotinylation (Fig. 4B). When activated by Cre driven
211 from Tek regulatory elements (Tg(Tek-cre)1Ywa/J; also known as Tie2Cre), BirA was
212 expressed in endothelial and blood lineages (Fig. 4C), consistent with this Cre transgene's
213 labeling pattern. Thus, Rosa26fsBirA expresses BirA in a Cre-dependent manner.
- 9 - 214 Next, we compared Ep300fb bioChIP-seq from embryos when driven by Tie2Cre
215 (endothelial and blood lineages) (Kisanuki et al., 2001), Myf5tm3(cre)Sor/J (referred to as
216 Myf5Cre; skeletal muscle lineage) (Tallquist et al., 2000), or Tg(EIIa-cre)C5379Lmgd/J (also
217 known as EIIaCre; ubiquitous) (Williams-Simons and Westphal, 1999). For the Tie2Cre and
218 EIIaCre samples, we used E11.5 embryos, a stage with robust angiogenesis. For Myf5Cre, we
219 used E13.5 embryos, when muscle lineage cells are in a range of stages in the muscle
220 differentiation program, spanning muscle progenitors to differentiated muscle fibers.
221 BioChiP-seq from biological triplicates showed high within-group correlation, and lower
222 between-group correlation, demonstrating the strong effect of different Cre transgenes in
223 directing Ep300fb bioChIP-seq (Fig. 4D). Viewing the bioChiP-seq signals in a genome browser
224 confirmed lineage-selective signal enrichment. For example, Tie2Cre drove high Ep300fb
225 bioChIP-seq signal at a Gata2 intronic enhancer with known activity in endothelial and blood
226 lineages (Fig. 4E, top panel). There was less signal at this region in Myf5Cre and EIIaCre
227 samples. At the skeletal muscle specific gene Myod, Myf5Cre drove strong Ep300fb
228 bioChiP-seq signal at a known distal enhancer (Goldhamer et al., 1992), as well as a second
229 Ep300 bound region about 12 kb upstream from the transcriptional start site.
230 To identify lineage-selective regions genome-wide, we filtered for regions with called peaks
231 in which the lineage-specific Cre (Tie2Cre or Myf5Cre) Ep300fb signal was at least 1.5 times
232 the ubiquitous Cre (EIIaCre) Ep300fb signal (Fig. 4 -- figure supplement 1B-C). This led to
233 identification of 2411 regions with enriched signal in Tie2Cre (p300-T-fb) and 1292 regions
234 with enriched signal in Myf5Cre (p300-M-fb), compared to 17382 regions with Ep300fb
235 occupancy detected with ubiquitous biotinylation (p300-E-fb; Suppl. File 1). We analyzed the
236 biological process gene ontology terms enriched for genes neighboring these three sets of
237 Ep300 regions (Fig. 4F). Ep300-T-fb regions were highly enriched for functional terms related
- 10 - 238 to angiogenesis and hematopoiesis, whereas the Ep300-M-fb regions were highly enriched for
239 functional terms related to skeletal muscle. Together, these results indicate that our strategy
240 for Cre-driven, lineage-specific, Ep300fb bioChiP-seq successfully identifies regulatory regions
241 that are associated with lineage relevant biological processes.
242 Functional validation of enhancer activity of lineage-specific Ep300fb regions. 243 We next set out to validate the in vivo enhancer activity of the regions with Cre-driven
244 lineage-enriched Ep300fb occupancy. If a substantial fraction of the Ep300-T-fb regions have
245 transcriptional enhancer activity, then genes neighboring these enhancers should be
246 expressed at higher levels in Tie2Cre lineage cells. To test this hypothesis, we used
247 Tie2Cre-activated translating ribosome affinity purification (Heiman et al., 2008; Zhou et al.,
248 2013) (T2-TRAP) to obtain the gene expression of Tie2Cre-marked cell lineages. Using this
249 lineage-specific expression profile, we then compared the expression of genes neighboring
250 Ep300-T-fb, Ep300-M-fb, and Ep300-E-fb regions. Ep300-T-fb neighboring genes were more
251 highly expressed compared to Ep300-E-fb neighboring genes (P < 10-38, Mann-Whitney U-test;
252 Fig. 5A). In contrast, there was no significant difference between Ep300-M-fb and Ep300-E-fb
253 neighboring genes (Fig. 5A). This result held regardless of the maximal distance threshold
254 used to find the gene nearest to a Ep300 region (Fig. 5 -- figure supplement 1A). We also
255 compared the expression of genes with and without an associated Ep300-T-fb region.
256 Ep300-T-fb-associated genes were more highly expressed than non-associated genes (Fig. 5
257 -- figure supplement 1B). Some Ep300-T-fb-associated genes were not detected within actively
258 translating transcripts (Fig. 5 -- figure supplement 1C). This suggests that in some cases
259 Ep300 enhancer binding is not sufficient to drive gene expression; other contributing factors
260 likely include imprecision of the enhancer-to-gene mapping rule, and regulation at the level of
261 ribosome binding to transcripts. Together, our data are consistent with Ep300-T-fb regions
- 11 - 262 being enriched for enhancers that are active in the Tie2Cre-labeled lineage.
263 To further evaluate the enhancer activity of Ep300-T-fb regions, we searched the literature
264 and the VISTA Enhancer database (Visel et al., 2007) for genomic regions with endothelial cell
265 activity validated by transient transgenesis (Table 1). Of 40 positive regions identified, 19
266 (47.5%) overlapped with Ep300-T-fb regions. Next, we used the transient transgenic assay to
267 test the lineage-selective enhancer activity of 20 additional Ep300-T-fb regions. We selected
268 regions that neighbored genes with EC-selective expression (T2-TRAP more than 10-fold
269 enriched over input RNA) and with known or potential relevance to angiogenesis. Of the 20
270 tested Ep300-T-fb regions, 8 drove reporter gene activity in at least 3 embryos in a vascular
271 pattern, in both whole mounts and histological sections, and 2 additional regions drove reporter
272 gene activity in a vascular pattern in 2 embryos (Fig. 5B-D; Table 2; Fig. 5 -- figure
273 supplements 2-3). In retrospect, two of the positive enhancers (Eng; Mef2c) had been
274 described previously (Table 2) (Pimanda et al., 2006; De Val et al., 2008). In some of these
275 cases, there was also activity in blood cells, consistent with Tie2Cre activity in both blood and
276 endothelial lineages. Additionally, one enhancer of Lmo2 was active blood cells but not ECs.
277 Thus a substantial fraction (9/20; 45%) of regions identified by lineage-selective, Cre-directed
278 bioChIP-seq have appropriate and reproducible in vivo activity.
279 Arterial and venous ECs have overlapping but distinct gene expression programs, yet only
280 three artery-specific and no vein-specific transcriptional enhancers have been described
281 (Wythe et al., 2013; Robinson et al., 2014; Becker et al., 2016; Sacilotto et al., 2013). We
282 examined histological sections of the transient transgenic embryos to determine if a subset are
283 selectively active in ECs of the dorsal aorta or cardinal vein. We identified an enhancer,
284 Sema6d-enh, with activity predominantly in ECs that line the dorsal aorta but not the cardinal
285 vein (Fig. 5E, Table 2). A second enhancer, Sox7-enh, also showed selective activity in the
- 12 - 286 dorsal aorta, although this was only reproduced in two embryos. Both were also active in the
287 endocardium and endocardial derivatives of the cardiac outflow tract (Fig. 5 -- figure
288 supplement 3). We also identified two enhancers with activity at E11.5 predominantly in ECs
289 that line the cardinal vein and not the dorsal aorta (Ephb4-enh and Mef2c-enh; Fig. 5E, Table
290 2, and Fig. 5 -- figure supplement 3). Interestingly, a core 44-bp region of Mef2c-enh had been
291 previously reported to drive pan-EC reporter expression at E8.5-E9.5 (De Val et al., 2008).
292 This enhancer’s activity pattern may be dynamically regulated at different developmental
293 stages, as has been described previously for two artery-specific enhancers (Robinson et al.,
294 2014; Becker et al., 2016). Further analysis of these enhancers will be required to confirm the
295 artery- and vein- selective activity patterns that we observed, and to better characterize their
296 temporospatial regulation.
297 Collectively, our data show that we have developed and validated a robust method for
298 identification of transcriptional enhancers in Cre-marked lineages. Using this strategy, we
299 discovered thousands of candidate skeletal muscle, EC, and blood cell enhancers. Based on
300 our validation studies, we expect that a majority of these candidate regions have in vivo,
301 cell-type specific transcriptional enhancer activity.
302 Transcription factor binding motifs enriched in skeletal muscle and EC/blood 303 enhancers. 304 p300 does not bind directly to DNA. Rather, transcription factors recognize sequence motifs
305 in DNA and subsequently recruit Ep300. The transcription factors and transcription factor
306 combinations that direct enhancer activity in skeletal muscle, blood, and endothelial lineages
307 are incompletely described. To gain more insights into this question, we searched for
308 transcription factor binding motifs that were over-represented in the candidate enhancer
309 regions bound by Ep300 in skeletal muscle or blood/EC lineages. Starting from 1445 motifs for
310 transcription factors or transcription factor heterodimers, we found 173 motifs over-represented
- 13 - 311 in Ep300-T-fb or Ep300-M-fb regions (false discovery rate < 0.01% and frequency in Ep300
312 regions greater than 5%). Clustering and selection of representative non-redundant motifs left
313 40 motifs that were enriched in either Ep300-T-fb or Ep300-M-fb, or both (Fig. 6A). Many
314 closely related motifs were independently detected by de novo motif discovery (Fig. 6 -- figure
315 supplement 1A). Analysis of our T2-TRAP RNA-seq data identified genes expressed in
316 embryonic ECs that potentially bind to these motifs (Suppl. File 2).
317 The GATA and ETS motifs were the most highly enriched motifs in the Tie2Cre-marked
318 blood and EC lineages (Fig. 6A). Interestingly, GO analysis of the subset of regions positive for
319 these motifs showed that GATA-containing regions are highly enriched for hematopoiesis and
320 heme synthesis, consistent with the critical roles of GATA1/2/3 in these processes (Bresnick et
321 al., 2012). On the other hand, ETS-containing regions were highly enriched for functional terms
322 linked to angiogenesis, also consistent with the key roles of ETS factors in angiogenesis (Wei
323 et al., 2009) and our prior finding that the ETS motif is enriched in dynamic VEGF-dependent
324 EC enhancers (Zhang et al., 2013). The Ebox motif, recognized by bHLH proteins, was the
325 most highly enriched motif in the skeletal muscle lineage, consistent with the important roles of
326 bHLH factors such as Myod and Myf5 in skeletal muscle development (Buckingham and
327 Rigby, 2014). Interestingly, the Ebox motif was also highly enriched in Tie2Cre-marked cells,
328 and genes neighboring these Ebox-containing regions were functionally related to both
329 angiogenesis and heme synthesis. bHLH-encoding genes such as Hey1/2, Scl, and Myc are
330 known to be important in blood and vascular development.
331 The database used for our motif search included 315 heterodimer motifs that were recently
332 discovered through high throughput sequencing of DNA concurrently bound by two different
333 transcription factors (Jolma et al., 2015). This allowed us to probe for enrichment of
334 heterodimer motifs that may contribute to enhancer activity in blood, EC, and skeletal muscle
- 14 - 335 lineages. One heterodimer motif that was highly enriched in Ep300-T-fb regions (and not
336 Ep300-M-fb regions; referred to as T2Cre-enriched motifs) was ETS:FOX2 (AAACAGGAA),
337 comprised of a tail-to-tail fusion of Fox (TGTTT) and ETS (GGAA) binding sites (Fig. 6A). GO
338 analysis showed that this motif was closely linked to vascular biological process terms (Fig.
339 6A). This motif was previously found to be sufficient to drive enhancer EC activity during
340 vasculogenesis and developmental angiogenesis (De Val et al., 2008), validating that our
341 approach is able to identify bona fide, functional heterodimer motifs. Interestingly, we
342 discovered two additional ETS-FOX heterodimer motifs, which were also highly enriched in
343 Ep300-T-fb regions and also linked to vascular biological process terms: ETS:FOX1
344 (GGATGTT), consisting of a head-to-tail fusion between ETS and FOX motifs, with the ETS
345 motif located 5' to the FOX motif (Fig. 6A, arrows over motif logo), and ETS:FOX3
346 (TGTTTACGGAA), a head-to-tail fusion with the FOX motif located 5' to the ETS motif. Other
347 heterodimer motifs that were enriched in Ep300-T-fb regions and to our knowledge previously
348 were unrecognized as regulatory elements in ECs were ETS:TBox, ETS:HOMEO, and
349 ETS:Ebox. Similar analysis of Ep300-M-fb regions identified highly enriched Ebox-containing
350 heterodimer motifs including Ebox:Hox, Ebox:HOMEO, and ETS:Ebox (“Myf5Cre-enriched
351 motifs”).
352 To assess functional significance of these motifs, we examined their evolutionary
353 conservation. Using PhastCons genome conservation scores for 30 vertebrate species (Siepel
354 et al., 2005), we measured the conservation of sequences matching Tie2Cre-enriched motifs
355 within the central 200 bp of Ep300-T-fb regions. Whereas randomly selected 12 bp sequences
356 from these regions exhibited a distribution of scores heavily weighted towards low
357 conservation values, sequences matching Tie2Cre-enriched motifs showed a bimodal
358 distribution consisting of highly conserved and poorly conserved sequences (Fig. 6B). The
- 15 - 359 conservation of individual heterodimer motifs such as ETS:FOX1-3 confirmed that they shared
360 this bimodal distribution that included deeply conserved sequences (Fig. 6 -- figure supplement
361 1B). These findings indicate that a subset of sequences matching Tie2Cre-enriched motifs,
362 including the novel heterodimer motifs, are under selective pressure. Myf5Cre-enriched motifs
363 within the center of Ep300-M-fb regions similarly adopted a bimodal distribution that includes a
364 subset of motif occurrences with high conservation (Fig. 6B and Fig. 6 -- figure supplement
365 1B). This analysis supports the biological function of Tie2Cre- and Myf5Cre-enriched motifs in
366 endothelial cell/blood or skeletal muscle enhancers, respectively.
367 To further functionally validate the transcriptional activity of these heterodimer motifs, we
368 measured their enhancer activity using luciferase reporter assays. Three repeats of enhancer
369 fragments containing motifs of interest were cloned upstream of a minimal promoter and
370 luciferase. The constructs were transfected into human umbilical vein endothelial cells
371 (HUVECs), and transcriptional activity was measured by luciferase assay, normalized to the
372 enhancerless promoter-luciferase construct (Fig. 6C). The well-studied ETS:FOX2 motif from
373 an endothelial Mef2c enhancer (De Val et al., 2008) robustly stimulated transcription to about
374 the same extent as the SV40 enhancer. As expected, mutation of the ETS:FOX2 motif
375 markedly blunted its activity, supporting the specificity of this assay. Interestingly, the
376 alternative ETS:FOX3 motif uncovered by our study was at least as potent in stimulating
377 transcription as the previously described ETS:FOX2 motif. The other novel heterodimer motifs
378 tested, ETS:HOMEO2 and ETS:Ebox, likewise supported strong transcriptional activity, as did
379 the SOX motif, whose enrichment and associated GO terms were highly EC-selective.
380 To further assess and compare the transcriptional activity of these heterodimer motifs, we
381 cloned upstream of luciferase 3x repeated motifs into a consistent DNA context that had
382 minimal endogenous enhancer activity (Fig. 6D). The ETS motif alone only weakly stimulated
- 16 - 383 luciferase expression, whereas the previously described ETS:FOX2 motif robustly activated
384 luciferase expression. Interestingly, both of the newly identified, alternative ETS:FOX motifs
385 (ETS:FOX1 and ETS:FOX3) were more potent activators that ETS:FOX2. The other
386 heterodimer motifs tested, ETS-Homeo2, ETS-Ebox, and ETS-Tbox, also demonstrated
387 significant enhancer activity.
388 Together, unbiased discovery of cell type specific enhancers coupled with motif analysis
389 identified novel transcription factor signatures that are likely important for gene expression
390 programs of blood, vasculature, and skeletal muscle.
391 Organ-specific EC enhancers 392 ECs in different adult organs have distinct gene expression programs that underlie
393 organ-specific EC functions (Nolan et al., 2013; Coppiello et al., 2015). For example, heart
394 ECs are adapted for transport of fatty acids essential for fueling oxidative phosphorylation in
395 the heart (Coppiello et al., 2015). On the other hand, lung ECs are adapted for efficient
396 transport of gas, but possess specialized tight junctions to minimize transit of water (Mehta et
397 al., 2014). Nolan et al. recently profiled gene expression in ECs freshly isolated from 9 different
398 adult mouse organs (Nolan et al., 2013). Clustering the 3104 genes with greater than 4-fold
399 difference in expression across this panel identified genes with selective expression in ECs
400 from a subset of organs (Fig. 7 -- figure supplement 1A). 240 genes were preferentially
401 expressed in ECs from heart (and skeletal muscle) compared to other organs, whereas 355
402 genes were preferentially expressed in ECs from lung (Fig. 7 -- figure supplement 1B). One
403 cluster contained genes co-enriched in lung and brain, including many tight junction genes
404 such as claudin 5.
405 We asked if our strategy of lineage-specific Ep300fb bioChIP-seq would allow us to identify
406 enhancers linked to organ-specific EC gene expression. For these experiments, we used
- 17 - 407 Tg(Cdh5-cre/ERT2)1Rha (also known as VECad-CreERT2) (Sorensen et al., 2009) to drive
408 Ep300fb biotinylation in ECs (Fig. 7 -- figure supplement 2); unlike Tie2Cre, this transgene does
409 not label hematopoietic cells when induced with tamoxifen in the neonatal period. We isolated
410 adult (8 wk old) heart and lung from Ep300fb/+; VEcad-CreERT2+; Rosa26fsBirA/+ mice and
411 performed bioChiP-seq. Triplicate repeats experiments were highly reproducible (Fig. 7A).
412 Inspection of Ep300fb ChIP-seq signals from lung and heart ECs suggested that
413 VEcad-CreERT2 successfully directed Ep300fb enrichment from ECs. For example, Tbx3 and
414 Meox2, transcription factors selectively expressed in lung and heart ECs, respectively, were
415 associated with Ep300fb-decorated regions in the matching tissues (Fig. 7 -- figure supplement
416 3). Interestingly, Meox2 has been implicated in directing expression of heart EC-specific genes
417 and in the pathogenesis of coronary artery disease (Coppiello et al., 2015; Yang et al., 2015).
418 On the other hand, Tbx2/4 and Myh6, genes expressed in non-ECs in lung and heart, were not
419 associated with regions of Ep300fb enrichment (Fig. 7 -- figure supplement 3).
420 To obtain a broader, unbiased view of the adult EC Ep300fb bioChIP-seq results, regions
421 bound by Ep300fb in either heart (14251) or lung (22174) were rank-ordered by the heart to
422 lung Ep300fb signal ratio (Suppl. File 1). Regions were grouped into deciles by this ratio, with
423 the most heart-enriched regions in decile 1 and the most lung-enriched regions in decile 10.
424 Next, for genes neighboring regions in each decile, we compared expression in heart
425 compared to lung. Genes neighboring decile 1 regions (greater Ep300fb signal in heart) had
426 higher median mRNA transcript levels than lung (Fig. 7B). In contrast, genes neighboring
427 decile 10 regions (greater Ep300fb signal in lung) had higher median mRNA transcript levels in
428 lung than heart. We then focused on genes with selective expression in heart or lung. More
429 decile 1 Ep300fb regions were associated with genes with heart-selective expression than
430 lung-selective expression, and more decile 10 Ep300fb regions were associated with genes
- 18 - 431 with lung-selective expression (in each case, P<0.0001, Fisher's exact test; Fig. 7C). These
432 results are consistent with greater heart- or lung- associated enhancer activity in Ep300fb
433 decile 1 or 10 regions, respectively.
434 We analyzed the GO terms that are over-represented for genes neighboring Ep300fb decile
435 1 or 10 regions. Both sets of regions were highly enriched for GO terms related to vasculature
436 and blood vessel development (Fig. 7 -- figure supplement 4). Analysis of disease ontology
437 terms showed that genes neighboring decile 1 regions (greater Ep300fb signal in heart) were
438 significantly associated with terms relevant to coronary artery disease, such as "coronary heart
439 disease", "arteriosclerosis", and "myocardial infarction", whereas genes neighboring decile 10
440 regions (greater Ep300fb signal in lung) were less enriched. Conversely, genes neighboring
441 decile 10 regions were selectively associated with hypertension and cerebral artery disease.
442 To gain insights into transcriptional regulators that preferentially drive heart or lung EC
443 enhancers, we searched for motifs with differential enrichment in decile 1 compared to decile
444 10. Using decile 1 regions as the foreground sequences and decile 10 regions as the
445 background sequences, we detected significant enrichment of the TCF motif in regions
446 preferentially occupied by Ep300 in heart ECs (Fig. 7D), consistent with prior work that found
447 that a Meox2:TCF15 complex promotes heart EC-specific gene expression (Coppiello et al.,
448 2015). Other motifs that were significantly enriched in decile 1 (heart) compared to decile 10
449 regions were FOX, ETS, ETS-FOX and ETS-HOMEO motifs. We performed the reciprocal
450 analysis to identify lung-enriched motifs. One motif enriched in decile 10 (lung) compared to
451 decile 1 regions was the TBOX motif. Interestingly, TBX3 is a lung EC-enriched transcription
452 factor (Nolan et al., 2013). Other motifs over-represented in decile 10 compared to decile 1
453 regions were EBox, ETS:Ebox, and EBOX:HOMEO motifs. Thus differences in transcription
454 factor expression or heterodimer formation in heart and lung ECs may contribute to differences
- 19 - 455 in Ep300 chromatin occupancy and enhancer activity between heart and lung.
456 Discussion 457 Here we show that Cre-mediated, tissue-specific activation of BirA permits high affinity
458 bioChiP-seq of factors with a bio epitope tag. Furthermore, we show that combining this
459 strategy with the Ep300fb knockin allele permits efficient identification of cell type-specific
460 enhancers. The bio epitope has been knocked into a number of different transcription factor
461 loci (He et al., 2012; Waldron et al., 2016) and Jackson Labs 025982 (Rbpj), 025980 (Ep300),
462 025983 (Mef2c), 025978 (Nkx2-5), 025979 (Srf), and 025977 (Zfpm2)), and these alleles can
463 be combined with Cre-activated BirA to permit lineage-specific mapping of transcription factor
464 binding sites. Cell type-specific protein biotinylation will also be useful for mapping
465 protein-protein interactions in specific cell lineages.
466 Using this technique, we identified thousands of candidate skeletal muscle and EC
467 enhancers and showed that many of these candidate enhancers are likely to be functional.
468 Furthermore, we showed that the technique can identify tissue-specific enhancers in postnatal
469 tissues, and identified novel candidate enhancers that regulate organ-specific endothelial gene
470 expression. These enhancer regions will be a valuable resource for future studies of
471 transcriptional regulation in these systems.
472 Large scale identification of tissue-specific enhancers will facilitate decoding the
473 mechanisms responsible for cell type-specific gene expression in development and disease.
474 Here . By analyzing these candidate regulatory regions, we revealed novel transcriptional
475 regulatory motifs that likely participate in skeletal muscle development, angiogenesis, and
476 organ specific EC gene expression. Our recovery of a previously described FOX-ETS
477 heterodimer binding site in EC enhancers (De Val et al., 2008) validates the ability of our
478 approach to detect sequence motifs important for angiogenesis, and suggests that these
- 20 - 479 motifs provide important clues to the transcriptional regulators that interact with them. Our
480 study identified significant enrichment of two new FOX-ETS heterodimer motifs in which the
481 FOX and ETS sites are in different positions and orientations. In their original study, Jolma and
482 colleagues already demonstrated that these alternative FOX-ETS motifs are indeed bound by
483 both FOX and ETS family proteins, implying that these proteins are able to collaboratively bind
484 DNA in diverse configurations, potentially with DNA itself playing an important role in stabilizing
485 the heterodimer (Jolma et al., 2015). Enrichment of these motifs in Ep300-T-fb, and their
486 over-representation neighboring genes related to angiogenesis, suggests that these alternative
487 FOX-ETS configurations are functional. We also recovered FOX-ETS motifs from Ep300
488 regions in adult ECs (and preferentially in heart ECs), suggesting that these motifs continue to
489 be important in maintenance of adult vasculature. However, direct support of these inferences
490 will require further experiments that identify the specific proteins involved and that dissect their
491 functional roles in vivo.
492 Additional novel heterodimer motifs, such as ETS:EBox, ETS:HOMEO, and ETS:Tbox,
493 were enriched in EC enhancers from both developing and adult ECs. This suggests that these
494 motifs, and potentially the protein heterodimers that were reported to bind them (Jolma et al.,
495 2015), are important for vessel growth and maintenance. These potential TF combinations, like
496 the FOX-ETS combination, may act as a transcriptional code to create regulatory specificity at
497 individual enhancers and their associated genes. Experimental validation of these hypotheses
498 will be a fruitful direction for future studies.
499 Limitations 500 A limitation of our current protocol is the need for several million cells to obtain robust
501 ChIP-seq signal. Optimization of chromatin pulldown and purification through application of
502 streamlined protocols or microfluidics (Cao et al., 2015), combined with use of improved library
- 21 - 503 preparation methods that work on smaller quantities of starting material, will likely overcome
504 this limitation. Another limitation of our strategy is that Ep300 decorates many, but not all,
505 active transcriptional enhancers (He et al., 2011), and therefore Ep300 bioChIP-seq will not
506 comprehensively detect all enhancers. Our strategy does not directly permit profiling of other
507 chromatin features in a Cre-targeted lineage. This limitation could be overcome by developing
508 additional proteins labeled with the bio-epitope. For instance, by bio-tagging histone H3,
509 non-tissue restricted ChIP for a feature of interest could be followed by sequential high affinity
510 histone H3 bioChIP. Finally, our technique's lineage specificity is dependent on the properties
511 of the Cre allele used (Ma et al., 2008), and users must be cognizant of the cell labeling
512 pattern of the Cre allele that they choose.
513 Materials and Methods 514 Mice 515 Animal experiments were performed under protocols approved by the Boston Children's
516 Hospital Animal Care and Use Committee (protocols 13-08-2460R and 13-12-2601). Ep300fb
517 mice were generated by homologous recombination in embryonic stem cells. Targeted ESCs
518 were used generate a mouse line, which was bred to homozygosity. This line has been
519 donated to Jackson labs (Jax 025980). The Rosa26fsBirA allele was derived from the previously
520 described Rosa26fsTRAP mouse (Zhou et al., 2013) (Jax 022367) by removal of the frt-TRAP-frt
521 cassette using germline Flp recombination. The Rosa26BirA (Driegen et al., 2005)
522 (constitutive; Jackson Labs 010920), Tie2Cre (Kisanuki et al., 2001) (Jackson Labs 008863),
523 Myf5Cre (Tallquist et al., 2000) (Jackson Labs 007893), EIIaCre (Williams-Simons and
524 Westphal, 1999) (Jackson Labs 003724), Rosa26mTmG (Muzumdar et al., 2007) (Jackson Labs
525 007576) and VEcad-CreERT2 (Sorensen et al., 2009) (Taconic 13073) lines were described
526 previously.
- 22 - 527 Transient transgenics 528 Candidate regions approximately 1 kb in length were PCR amplified using primers listed in
529 Table 3 and cloned into a gateway-hsp68-lacZ construct derived from pWhere (Invivogen) (He
530 et al., 2011). Constructs were linearized by PacI and injected into oocytes by Cyagen, Inc. At
531 least 5 PCR positive embryos were obtained per construct. Regions were scored as positive
532 for EC activity if they displayed an EC staining pattern in whole mount and validated in
533 sections, using previously described criteria (Visel et al., 2009). For scoring purposes, we
534 required that an EC pattern was observed for at least three different embryos, although we
535 also describe results for an additional two regions with activity observed in two different
536 embryos. Embryos were analyzed at E11.5 by whole mount LacZ staining. After whole mount
537 imaging, embryos were embedded in paraffin and sectioned for histological analysis.
538 Histology 539 Embryos were collected in ice-cold PBS and fixed in 4% paraformaldehyde over night. For
540 immunostaining, cryosections were stained with HA (Cell Signaling #3724, 1:100 dilution) and
541 PECAM1 (BD Pharmingen, 553371, 1:200 dilution) antibodies and imaged by confocal
542 microscopy (Olympus FV1000).
543 For immunostaining, cryosections were stained with HA (Cell Signaling #3724) and
544 PECAM1 (BD Pharmingen, 553371) antibodies and imaged by confocal microscopy (Olympus
545 FV1000).
546 Ep300fb/fb;Rosa26BirA/BirA ES cells derivation and bioChiP 547 ESCs were derived as described previously (Bryja et al., 2006). Five week old
548 Ep300fb/fb;Rosa26BirA/BirA female mice were hormonally primed and mated overnight with eight
549 weeks old Ep300fb/fb;Rosa26BirA/BirA male mice. Uteri were collected on 3.5 dpc and embryos
550 were flushed out under a microscope. After removal of zona pellucida by treating the embryos
551 with Tyrode’s solution (Sigma, T1788), the embryos were cultured in ESC medium (DMEM
- 23 - 552 with high glucose, 15% ES cell-qualified FBS, 1000 U/mL LIF, 100 µM non-essential amino
553 acids, 1 mM sodium pyruvate, 2 mM glutamine,100 µM β-mercaptoethanol, and
554 penicillin/streptomycin), supplemented with 50 µM PD98059 (Cell Signaling Technology,
555 #9900) for 7-10 days. The outgrowth was dissociated with trypsin, and the cells were cultured
556 in ESC medium in 24 well plates to obtain colonies, which were then clonally expanded. Five
557 male ESC lines were retained for further experiments. Pluripotency of these ESC lines was
558 confirmed by immunostaining for pluripotency markers Oct4, Sox2, and SSEA, and they were
559 negative for mycoplasma.
560 The five Ep300fb/fb;Rosa26BirA/BirA ESC lines were cultured in 150 mm dishes to 70-80%
561 confluence. Crosslinking was performed by adding formaldehyde to 1% and incubating at room
562 temperature for 15 min. Chromatin was fragmented using a microtip sonicator (QSONICA
563 Q700). The chromatin from 3 ESC lines was pooled for replicate 1 and from the other 2 ESC
564 lines for replicate 2. Ep300fb and bound chromatin were pulled down by incubation with
565 streptavidin beads (Life Technologies #11206D).
566 Cell culture and luciferase reporter assays 567 Candidate motif sequences were synthesized as oligonucleotides (Table 3) and cloned into
568 plasmid pGL3-promoter (Promega) between MluI and XhoI. HUVEC cells (Lonza) were
569 cultured to 50-60% confluent in 24-well plates and transfected in triplicate with 1 µg luciferase
570 construct and 0.5 µg pRL-TK internal control plasmid, using 5 µl jetPEI-HUVEC (Polyplus).
571 After 2 days, cells were analyzed using the dual luciferase assay (Promega). Luciferase
572 activity was measured using a 96-plate reading luminometer (Victor2, Perkin Elmer). Results
573 are representative of at least two independent experiments.
574 Western blotting 575 Immunoblotting was performed using standard protocols and the following primary
- 24 - 576 antibodies: GAPDH, Fitzgerald, 10R-G109A (1:10,000 dilution); Ep300, Millipore, 05257
577 (1:2,000 dilution); BirA, Abcam, ab14002 (1:1,000 dilution).
578 Tissue collection for ChIP and bioChiP 579 bioChIP-seq was performed as described previously (He et al., 2014; He and Pu, 2010),
580 with minor modifications. We used lower amplitude sonication to avoid fragmentation of Ep300
581 protein.
582 p300 and H3K27ac in E12.5 heart and forebrain. E12.5 embryonic forebrain and ventricle
583 apex tissues were isolated from Swiss Webster (Charles River) females crossed to
584 Ep300fb/fb;Rosa26BirA/BirA males. Cells were dissociated in a 2-mL glass dounce homogenizer
585 (large clearance pestle, Sigma P0485) and then cross-linked in 1% formaldehyde-containing
586 PBS for 15 min at room temperature. Glycine was add to final concentration of 125 mM to
587 quench formaldehyde. Chromatin isolation was performed as previously described (He and Pu,
588 2010) 30 forebrains or 60 heart apexes were used in each sonication. Conditions were titrated
589 to achieve sufficient fragmentation (mean fragment size 500 bp) while avoiding degradation of
590 Ep300 protein. We used a microtip sonicator (QSONICA Q700) at 30% amplitude and a cycle
591 of 5s on and 20s off for 96 cycles in total. Sheared chromation was precleared by incubation
592 with 100 µl Dynabeads Protein A (Life Technologies, 10002D) for 1 hour at 4C. For Ep300
593 bioChIP, 2/3 of the chromatin was then incubated with 250 µl (for forebrains) or 100 µl (for
594 heart apexes) Dynabeads M-280 Streptavidin (Life Technologies, 11206D) for 1 hour at 4C.
595 The streptavidin beads were washed and bound DNA eluted as described (He and Pu, 2010).
596 For H3K27ac ChIP, the remaining 1/3 chromatin was incubated with 10 µg (forebrain) or 5 µg
597 (heart apex) H3K27ac antibody (ActiveMotif #39133) overnight at 4C. Then 50 µl or 25 µl
598 Dynabeads Protein A were added and incubated for 1 hour at 4C. The magnetic beads were
599 washed 6 times with RIPA buffer (50 mM HEPES, pH 8.0, 500 mM LiCl, 1% Igepal ca-630,
- 25 - 600 0.7% sodium deoxycholate, and 1 mM EDTA) and washed once in TE buffer. ChIP DNA was
601 eluted at 65 °C in elution buffer (10 mM Tris, pH 8.0, 1% SDS and 1 mM EDTA) and incubated
602 at 65 °C overnight to reverse crosslinks.
603 p300 bioChIP of fetal EC cells. E11.5 embryos were isolated from pregnant Rosa26mTmG
604 females crossed to Ep300fb/fb;Rosa26fsBirA/BriA;Tie2Cre males. Tie2Cre positive embryos were
605 picked under a fluorescence microscope for further experiments. Crosslinking and sonication
606 were performed as described above. We used 15 Tie2Cre positive embryos for each bioChIP
607 replicate. We used 750 µL Dynabeads M-280 Streptavidin beads for Ep300fb pull-down; this
608 amount was determined by empiric titration.
609 p300 bioChIP of fetal Myf5Cre labeled cells. E13.5 embryos were isolated from pregnant
610 Rosa26mTmG females crossed to Ep300fb/fb;Rosa26fsBirA/BriA;Myf5Cre males. Cre positive
611 embryos were picked under a fluorescence microscope for further experiments. 5 embryos
612 were used for each bioChIP replicate and incubated with 750 µl Dynabeads M-280
613 Streptavidin beads for Ep300fb pull-down.
614 p300 bioChIP of adult ECs. Ep300fb/fb;Rosa26fsBirA/mTmG;VEcad-CreERT2 pups were given
615 two consecutive intragastic injections of 50 µl tamoxifen (2 mg/ml in sunflower seed oil) on
616 postnatal day P1 and P2 to induce the activity of Cre. The lungs and heart apexes were
617 collected when the mice were 8 week old. Cross-linking and sonication were performed as
618 described above. 6 mice were used for each replicate. We used 600 µl (heart) or 1.5 mL (lung)
619 streptavidin beads for the bioChIP.
620 bioChIP-seq and ChIP-seq 621 Libraries were constructed using a ChIP-seq library preparation kit (KAPA Biosystems
622 KK8500). 50 ng of sonicated chromatin without pull-down was used as input.
623 Sequencing (50 nt single end) was performed on an Illumina HiSeq 2500. Reads were
- 26 - 624 aligned to mm9 using Bowtie2 (Langmead and Salzberg, 2012) using default parameters.
625 Peaks were called with MACS2 (Zhang et al., 2008). Murine blacklist regions were masked out
626 of peak lists. For embryo samples, peak calling was performed against input chromatin
627 background with a false discovery rate of less than 0.01. For adult samples, peak calling was
628 poor using input chromatin background and therefore was performed using the ChIP sample
629 only at a false discovery rate of less than 0.05. Aggregation plots, tag heat maps, and global
630 correlation analyses were performed using deepTools 2.0 (Ramírez et al., 2014). bioChIP-seq
631 signal was visualized in the Integrated Genome Viewer (Thorvaldsdóttir et al., 2013).
632 To associate genomic regions to genes, we used Homer’s AnnotatePeaks (Heinz et al.,
633 2010) to select the gene with the closest transcriptional start site.
634 ATAC-seq 635 E12.5 heart ventricles were dissociated into single cell-suspensions using the Neonatal
636 Heart Dissociation Kit (Miltenyi Biotec #130-098-373) with minor modifications from
637 manufacturer’s protocol. Embryonic tissue samples were incubated twice with enzyme
638 dissociation mixes at 37°C for 15 minutes with gentle agitation by tube inversions between
639 incubations. Cell mixtures were gently filtered through a 70 μm cell strainer and centrifuged at
640 300 x g, 5 minutes, 4°C. Red blood cells were lysed with 10X Red Blood Cell Lysis Solution
641 (Miltenyi #130-094-183) and myocytes were isolated using the Neonatal Cardiomyocyte
642 Isolation Kit (Miltenyi Biotec #130-100-825). 75,000 isolated cardiomyocytes were used for
643 each ATAC-Seq experiment. Libraries were prepped as previously described (Buenrostro et
644 al., 2015).
645 TRAP 646 Translating ribosome RNA purification (TRAP) and RNA-seq were performed as described
647 (Zhou et al., 2013). The expression of the gene with TSS closest to the Ep300-bound region
- 27 - 648 was used to define "neighboring gene". In Figure 5A, no maximal distance limit was used; in
649 Fig. 5 -- figure supplement 1, a range of maximal distance thresholds were tested.
650 Gene ontology analysis 651 Gene Ontology analysis was performed using GREAT (McLean et al., 2010). Results were
652 ranked by the raw binomial P-value. To determine the fraction of terms relevant to a cell type
653 or process, the top twenty biological process terms were manually inspected.
654 Motif analysis 655 Homer (Heinz et al., 2010) was used for motif scanning and for de novo motif analysis.
656 Regions analyzed were 100 bp regions centered on the summits called by MACS2. The motif
657 database used for motif scanning was the default Homer motif vertebrate database plus the
658 heterodimer motifs described by Jolma et al. (Jolma et al., 2015). To select motifs for display,
659 motifs from samples under consideration with negative ln P-value > 15 in any one sample were
660 clustered using STAMP (Mahony et al., 2007). Non-redundant motifs were then manually
661 selected.
662 Gene expression analysis 663 Translating ribosome RNA purification (TRAP) was performed as described (Zhou et al.,
664 2013). E10.5 Rosa26fsTrap/+;Tie2Cre embryos were isolated from Swiss Webster strain
665 pregnant females crossed to Rosa26fsTrap/Trap;Tie2Cre males. TRAP RNA from 50 embryos
666 was pooled for RNA-seq. The polyadenylated RNA was purified by binding to oligo (dT)
667 magnetic beads (ThermoFisher Scientific, 61005). RNA-seq libraries were prepared with
668 ScriptSeq v2 kit (Illumina, SSV 21106 ) according to the manufacturer’s instructions. RNA-seq
669 reads were aligned with TopHat (Trapnell et al., 2009) and expression levels were determined
670 with htseq-count (Anders et al., 2015). Adult organ EC expression values were obtained from
671 (Nolan et al., 2013) (GEO GSE47067).
- 28 - 672 Comparison of enhancer prediction using different chromatin features 673 Enhancer regions in VISTA database was used as the golden standard. Each chromatin
674 feature that was analyzed was from E12.5 heart. Data sources are listed below under “Data
675 Sources”. For each chromatin factor, average read intensity in each VISTA regions with or
676 without heart enhancer activity was calculated and used as the starting point for machine
677 learning. We used the weighted KNN method as the classifier. The parameters used were: 10
678 nearest neighbors, Euclidean distance, and squared inverse weight. The classifier was trained
679 and tested using 10-fold cross validation. ROC (Receiver operating characteristic) curve were
680 used to evaluate the predictive accuracy of each factor, as measured by the area under the
681 ROC curve (AUC).
682 Conservation analysis 683 The phastCons score for the multiple alignments of 30 vertebrates to mouse mm9 genome
684 was used as the conservation score (Siepel et al., 2005). Sequences within Ep300 regions
685 matching selected motif (s) were identified using FIMO (Grant et al., 2011) with default
686 settings, and the average conservation score across the width of the sequence was used. To
687 generate the random conservation background, 100 random motifs 12 bp wide (the average
688 width of the motifs being analyzed) were used to the scan the same regions.
689 Literature Searches 690 Heart and endothelial cell enhancers were identified by searching PubMed for "heart
691 enhancer" or "endothelial cell enhancer", respectively, and then manually curating references
692 for mouse or human enhancers with appropriate activity in murine transient transgenic assays.
693 Data Sources 694 Sequencing data generated for this study are as follows: (1) ES_p300_bioChIP from
695 Ep300fb/fb;Rosa26BirA/BirA ESCs; (2) FB_p300_bioChIP from E12.5 Ep300fb/+;Rosa26BirA/+
696 forebrains; (3) FB_H3K27ac_ChIP from E12.5 Ep300fb/+;Rosa26BirA/+ forebrains; (4)
- 29 - 697 He_p300_bioChIP from E12.5 Ep300fb/+;Rosa26BirA/+ heart apex; (5) He_H3K27ac_ChIP from
698 E12.5 Ep300fb/+;Rosa26BirA/+ heart apex; (6) EIIaCre_p300_bioChIP from E11.5
699 Ep300fb/+;Rosa26BirA/+ whole embryos; Myf5Cre_p300_bioChIP from E13.5
700 Ep300fb/+;Rosa26fsBirA/+;Myf5Cre whole embryos; Tie2Cre_p300_bioChIP from E11.5
701 Ep300fb/+;Rosa26fsBirA/+;Tie2Cre+ whole embryos; Tie2Cre-TRAP and Tie2Cre-Input from E10.5
702 Rosa26fsTrap/+;Tie2Cre+ whole embryos; and ATAC-seq from wild-type E12.5 ventricular
703 cardiomyocytes. These data are available via the Gene Expression Omnibus (accession
704 number to be provided upon acceptance) or the Cardiovascular Development Consortium
705 server (https://b2b.hci.utah.edu/gnomex/; login as guest).
706 The following public data sources were used for this study: ES_P300_ChIP, GSE36027;
707 ES_P300_input, GSE36027; E12.5_Histone_input, GSE82850, E12.5_H3K27ac, GSE82449;
708 E12.5_H3K27me3, GSE82448; E12.5_H3K36me3, GSE82970; E12.5_H3K4me1, GSE82697;
709 E12.5_H3K4me2, GSE82667; E12.5_H3K4me3, GSE82882; E12.5_H3K9ac, GSE83056;
710 E12.5_H3K9me3, GSE82787. Antibody Ep300 ChIP-seq data on forebrain (Visel et al., 2009b)
711 and heart (Blow et al., 2010) are from GSE13845 and GSE22549, respectively. Adult organ
712 EC gene expression data are from GEO GSE47067 (Nolan et al., 2013).
713
714 Accession Numbers 715 Sequencing data generated for this study are available via the Gene Expression Omnibus
716 (upload in progress) or the Cardiovascular Development Consortium server
717 (https://b2b.hci.utah.edu/gnomex/; login as guest; instructions for reviewer access are provided
718 in an supplementary file).
719 Acknowledgements 720 WTP was supported by funding from the National Heart Lung and Blood Institute
- 30 - 721 (U01HL098166 and HL095712), by an Established Investigator Award from the American
722 Heart Association, and by charitable donations from Dr. and Mrs. Edwin A. Boger. The content
723 is solely the responsibility of the authors and does not necessarily represent the official views
724 of the funding agencies.
725
726 Competing Financial Interests 727 The authors declare no competing financial interests.
728
729
- 31 - 730 References 731 Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq--a Python framework to work with
732 high-throughput sequencing data. Bioinformatics 31, 166-69.
733 Becker, P.W., Sacilotto, N., Nornes, S., Neal, A., Thomas, M.O., Liu, K., Preece, C.,
734 Ratnayaka, I., Davies, B., et al. (2016). An Intronic Flk1 Enhancer Directs Arterial-Specific
735 Expression via RBPJ-Mediated Venous Repression. Arterioscler Thromb Vasc Biol 36,
736 1209-219.
737 Blow, M.J., McCulley, D.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry,
738 M., Wright, C., et al. (2010). ChIP-Seq identification of weakly conserved heart
739 enhancers. Nat Genet 42, 806-810.
740 de Boer, E., Rodriguez, P., Bonte, E., Krijgsveld, J., Katsantoni, E., Heck, A., Grosveld, F., and
741 Strouboulis, J. (2003). Efficient biotinylation and single-step purification of tagged
742 transcription factors in mammalian cells and transgenic mice. Proc Natl Acad Sci U S A
743 100, 7480-85.
744 Bonn, S., Zinzen, R.P., Girardot, C., Gustafson, E.H., Perez-Gonzalez, A., Delhomme, N.,
745 Ghavi-Helm, Y., Wilczynski, B., Riddell, A., and Furlong, E.E. (2012). Tissue-specific
746 analysis of chromatin state identifies temporal signatures of enhancer activity during
747 embryonic development. Nat Genet 44, 148-156.
748 Bresnick, E.H., Katsumura, K.R., Lee, H.Y., Johnson, K.D., and Perkins, A.S. (2012). Master
749 regulatory GATA transcription factors: mechanistic principles and emerging links to
750 hematologic malignancies. Nucleic Acids Res 40, 5819-831.
751 Bryja, V., Bonilla, S., and Arenas, E. (2006). Derivation of mouse embryonic stem cells. Nat
752 Protoc 1, 2082-87.
753 Buckingham, M., and Rigby, P.J. (2014). Gene Regulatory Networks and Transcriptional
- 32 - 754 Mechanisms that Control Myogenesis. Dev Cell 28, 225-238.
755 Buenrostro, J.D., Wu, B., Chang, H.Y., and Greenleaf, W.J. (2015). ATAC-seq: A Method for
756 Assaying Chromatin Accessibility Genome-Wide. Curr Protoc Mol Biol 109, 21.29.1-.9.
757 Bulger, M., and Groudine, M. (2011). Functional and mechanistic diversity of distal
758 transcription enhancers. Cell 144, 327-339.
759 Cao, Z., Chen, C., He, B., Tan, K., and Lu, C. (2015). A microfluidic device for epigenomic
760 profiling using 100 cells. Nat Methods 12, 959-962.
761 Chen, C.Y., Chi, Y.H., Mutalif, R.A., Starost, M.F., Myers, T.G., Anderson, S.A., Stewart, C.L.,
762 and Jeang, K.T. (2012). Accumulation of the inner nuclear envelope protein Sun1 is
763 pathogenic in progeric and dystrophic laminopathies. Cell 149, 565-577.
764 Coppiello, G., Collantes, M., Sirerol-Piquer, M.S., Vandenwijngaert, S., Schoors, S., Swinnen,
765 M., Vandersmissen, I., Herijgers, P., Topal, B., and van Loon, J. (2015). Meox2/Tcf15
766 heterodimers program the heart capillary endothelium for cardiac fatty acid uptake.
767 Circulation 131, 815-826.
768 Crawford, G.E., Holt, I.E., Whittle, J., Webb, B.D., Tai, D., Davis, S., Margulies, E.H., Chen, Y.,
769 Bernat, J.A., et al. (2006). Genome-wide mapping of DNase hypersensitive sites using
770 massively parallel signature sequencing (MPSS). Genome Res 16, 123-131.
771 Creyghton, M.P., Cheng, A.W., Welstead, G.G., Kooistra, T., Carey, B.W., Steine, E.J., Hanna,
772 J., Lodato, M.A., Frampton, G.M., et al. (2010). Histone H3K27ac separates active from
773 poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A 107,
774 21931-36.
775 Deal, R.B., and Henikoff, S. (2010). A simple method for gene expression and chromatin
776 profiling of individual cell types within a tissue. Dev Cell 18, 1030-040.
777 Dickel, D.E., Barozzi, I., Zhu, Y., Fukuda-Yuzawa, Y., Osterwalder, M., Mannion, B.J., May, D.,
- 33 - 778 Spurrell, C.H., Plajzer-Frick, I., et al. (2016). Genome-wide compendium and functional
779 assessment of in vivo heart enhancers. Nat Commun 7, 12923.
780 Dogan, N., Wu, W., Morrissey, C.S., Chen, K.B., Stonestrom, A., Long, M., Keller, C.A.,
781 Cheng, Y., Jain, D., et al. (2015). Occupancy by key transcription factors is a more
782 accurate predictor of enhancer activity than histone modifications or chromatin
783 accessibility. Epigenetics Chromatin 8, 16.
784 Driegen, S., Ferreira, R., van Zon, A., Strouboulis, J., Jaegle, M., Grosveld, F., Philipsen, S.,
785 and Meijer, D. (2005). A generic tool for biotinylation of tagged proteins in transgenic
786 mice. Transgenic Res 14, 477-482.
787 Fulton, D.L., Sundararajan, S., Badis, G., Hughes, T.R., Wasserman, W.W., Roach, J.C., and
788 Sladek, R. (2009). TFCat: the curated catalog of mouse and human transcription factors.
789 Genome Biol 10, R29.
790 Gasper, W.C., Marinov, G.K., Pauli-Behn, F., Scott, M.T., Newberry, K., DeSalvo, G., Ou, S.,
791 Myers, R.M., Vielmetter, J., and Wold, B.J. (2014). Fully automated high-throughput
792 chromatin immunoprecipitation for ChIP-seq: identifying ChIP-quality p300 monoclonal
793 antibodies. Sci Rep 4, 5152.
794 Goldhamer, D.J., Faerman, A., Shani, M., and Emerson, C.P. (1992). Regulatory elements that
795 control the lineage-specific expression of myoD. Science 256, 538-542.
796 Grant, C.E., Bailey, T.L., and Noble, W.S. (2011). FIMO: scanning for occurrences of a given
797 motif. Bioinformatics 27, 1017-18.
798 He, A., and Pu, W.T. (2010). Genome-wide location analysis by pull down of in vivo
799 biotinylated transcription factors. Curr Protoc Mol Biol Chapter 21, Unit 21.20.
800 He, A., Gu, F., Hu, Y., Ma, Q., Yi Ye, L., Akiyama, J.A., Visel, A., Pennacchio, L.A., and Pu,
801 W.T. (2014). Dynamic GATA4 enhancers shape the chromatin landscape central to heart
- 34 - 802 development and disease. Nat Commun 5, 4907.
803 He, A., Kong, S.W., Ma, Q., and Pu, W.T. (2011). Co-occupancy by multiple cardiac
804 transcription factors identifies transcriptional enhancers active in heart. Proc Natl Acad
805 Sci U S A 108, 5632-37.
806 He, A., Shen, X., Ma, Q., Cao, J., von Gise, A., Zhou, P., Wang, G., Marquez, V.E., Orkin,
807 S.H., and Pu, W.T. (2012). PRC2 directly methylates GATA4 and represses its
808 transcriptional activity. Genes Dev 26, 37-42.
809 Heiman, M., Schaefer, A., Gong, S., Peterson, J.D., Day, M., Ramsey, K.E., Suarez-Farinas,
810 M., Schwarz, C., Stephan, D.A., et al. (2008). A translational profiling approach for the
811 molecular characterization of CNS cell types. Cell 135, 738-748.
812 Heintzman, N.D., Stuart, R.K., Hon, G., Fu, Y., Ching, C.W., Hawkins, R.D., Barrera, L.O., Van
813 Calcar, S., Qu, C., et al. (2007). Distinct and predictive chromatin signatures of
814 transcriptional promoters and enhancers in the human genome. Nat Genet 39, 311-18.
815 Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C.,
816 Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining
817 transcription factors prime cis-regulatory elements required for macrophage and B cell
818 identities. Mol Cell 38, 576-589.
819 Jolma, A., Yin, Y., Nitta, K.R., Dave, K., Popov, A., Taipale, M., Enge, M., Kivioja, T.,
820 Morgunova, E., and Taipale, J. (2015). DNA-dependent formation of transcription factor
821 pairs alters their binding specificity. Nature 527, 384-88.
822 Kanamori, M., Konno, H., Osato, N., Kawai, J., Hayashizaki, Y., and Suzuki, H. (2004). A
823 genome-wide and nonredundant mouse transcription factor database. Biochem Biophys
824 Res Commun 322, 787-793.
825 Kisanuki, Y.Y., Hammer, R.E., Miyazaki, J., Williams, S.C., Richardson, J.A., and Yanagisawa,
- 35 - 826 M. (2001). Tie2-Cre Transgenic Mice: A New Model for Endothelial Cell-Lineage Analysis
827 in Vivo. Dev Biol 230, 230-242.
828 Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat
829 Methods 9, 357-59.
830 Ma, Q., Zhou, B., and Pu, W.T. (2008). Reassessment of Isl1 and Nkx2-5 cardiac fate maps
831 using a Gata4-based reporter of Cre activity. Dev Biol 323, 98-104.
832 Mahony, S., Auron, P.E., and Benos, P.V. (2007). DNA familial binding profiles made easy:
833 comparison of various motif alignment and clustering strategies. PLoS Comput Biol 3,
834 e61.
835 McLean, C.Y., Bristor, D., Hiller, M., Clarke, S.L., Schaar, B.T., Lowe, C.B., Wenger, A.M., and
836 Bejerano, G. (2010). GREAT improves functional interpretation of cis-regulatory regions.
837 Nat Biotechnol 28, 495-501.
838 Mehta, D., Ravindran, K., and Kuebler, W.M. (2014). Novel regulators of endothelial barrier
839 function. Am J Physiol Lung Cell Mol Physiol 307, L924-935.
840 Mo, A., Mukamel, E.A., Davis, F.P., Luo, C., Henry, G.L., Picard, S., Urich, M.A., Nery, J.R.,
841 Sejnowski, T.J., et al. (2015). Epigenomic Signatures of Neuronal Diversity in the
842 Mammalian Brain. Neuron 86, 1369-384.
843 Muzumdar, M.D., Tasic, B., Miyamichi, K., Li, L., and Luo, L. (2007). A global
844 double-fluorescent Cre reporter mouse. Genesis 45, 593-605.
845 Nolan, D.J., Ginsberg, M., Israely, E., Palikuqi, B., Poulos, M.G., James, D., Ding, B.S.,
846 Schachterle, W., Liu, Y., et al. (2013). Molecular signatures of tissue-specific
847 microvascular endothelial cell heterogeneity in organ maintenance and regeneration. Dev
848 Cell 26, 204-219.
849 Nord, A.S., Blow, M.J., Attanasio, C., Akiyama, J.A., Holt, A., Hosseini, R., Phouanenavong,
- 36 - 850 S., Plajzer-Frick, I., Shoukry, M., et al. (2013). Rapid and Pervasive Changes in
851 Genome-wide Enhancer Usage during Mammalian Development. Cell 155, 1521-531.
852 Pimanda, J.E., Chan, W.Y., Donaldson, I.J., Bowen, M., Green, A.R., and Göttgens, B. (2006).
853 Endoglin expression in the endothelium is regulated by Fli-1, Erg, and Elf-1 acting on the
854 promoter and a -8-kb enhancer. Blood 107, 4737-745.
855 Ramírez, F., Dündar, F., Diehl, S., Grüning, B.A., and Manke, T. (2014). deepTools: a flexible
856 platform for exploring deep-sequencing data. Nucleic Acids Res 42, W187-191.
857 Robinson, A.S., Materna, S.C., Barnes, R.M., De Val, S., Xu, S.M., and Black, B.L. (2014). An
858 arterial-specific enhancer of the human endothelin converting enzyme 1 (ECE1) gene is
859 synergistically activated by Sox17, FoxC2, and Etv2. Dev Biol 395, 379-389.
860 Sacilotto, N., Monteiro, R., Fritzsche, M., Becker, P.W., Sanchez-Del-Campo, L., Liu, K.,
861 Pinheiro, P., Ratnayaka, I., Davies, B., et al. (2013). Analysis of Dll4 regulation reveals a
862 combinatorial role for Sox and Notch in arterial development. Proc Natl Acad Sci U S A
863 110, 11893-98.
864 Shikama, N., Lutz, W., Kretzschmar, R., Sauter, N., Roth, J.F., Marino, S., Wittwer, J.,
865 Scheidweiler, A., and Eckner, R. (2003). Essential function of p300 acetyltransferase
866 activity in heart, lung and small intestine formation. EMBO J 22, 5175-185.
867 Siepel, A., Bejerano, G., Pedersen, J.S., Hinrichs, A.S., Hou, M., Rosenbloom, K., Clawson,
868 H., Spieth, J., Hillier, L.W., et al. (2005). Evolutionarily conserved elements in vertebrate,
869 insect, worm, and yeast genomes. Genome Res 15, 1034-050.
870 Sorensen, I., Adams, R.H., and Gossler, A. (2009). DLL1-mediated Notch activation regulates
871 endothelial identity in mouse fetal arteries. Blood 113, 5680-88.
872 Tallquist, M.D., Weismann, K.E., Hellström, M., and Soriano, P. (2000). Early myotome
873 specification regulates PDGFA expression and axial skeleton development. Development
- 37 - 874 127, 5059-070.
875 Thorvaldsdóttir, H., Robinson, J.T., and Mesirov, J.P. (2013). Integrative Genomics Viewer
876 (IGV): high-performance genomics data visualization and exploration. Brief Bioinform 14,
877 178-192.
878 Thurman, R.E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M.T., Haugen, E., Sheffield,
879 N.C., Stergachis, A.B., Wang, H., et al. (2012). The accessible chromatin landscape of
880 the human genome. Nature 489, 75-82.
881 Trapnell, C., Pachter, L., and Salzberg, S.L. (2009). TopHat: discovering splice junctions with
882 RNA-Seq. Bioinformatics 25, 1105-111.
883 De Val, S., Chi, N.C., Meadows, S.M., Minovitsky, S., Anderson, J.P., Harris, I.S., Ehlers, M.L.,
884 Agarwal, P., Visel, A., et al. (2008). Combinatorial regulation of endothelial gene
885 expression by ets and forkhead transcription factors. Cell 135, 1053-064.
886 Visel, A., Rubin, E.M., and Pennacchio, L.A. (2009a). Genomic views of distant-acting
887 enhancers. Nature 461, 199-205.
888 Visel, A., Blow, M.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M.,
889 Wright, C., et al. (2009b). ChIP-seq accurately predicts tissue-specific activity of
890 enhancers. Nature 457, 854-58.
891 Visel, A., Minovitsky, S., Dubchak, I., and Pennacchio, L.A. (2007). VISTA Enhancer
892 Browser--a database of tissue-specific human enhancers. Nucleic Acids Res 35,
893 D88-D92.
894 Waldron, L., Steimle, J.D., Greco, T.M., Gomez, N.C., Dorr, K.M., Kweon, J., Temple, B.,
895 Yang, X.H., Wilczewski, C.M., et al. (2016). The Cardiac TBX5 Interactome Reveals a
896 Chromatin Remodeling Network Essential for Cardiac Septation. Dev Cell 36, 262-275.
897 Wei, G., Srinivasan, R., Cantemir-Stone, C.Z., Sharma, S.M., Santhanam, R., Weinstein, M.,
- 38 - 898 Muthusamy, N., Man, A.K., Oshima, R.G., et al. (2009). Ets1 and Ets2 are required for
899 endothelial cell survival during embryonic angiogenesis. Blood 114, 1123-130.
900 Wei, J.Q., Shehadeh, L.A., Mitrani, J.M., Pessanha, M., Slepak, T.I., Webster, K.A., and
901 Bishopric, N.H. (2008). Quantitative control of adaptive cardiac hypertrophy by
902 acetyltransferase p300. Circulation 118, 934-946.
903 Williams-Simons, L., and Westphal, H. (1999). EIIaCre -- utility of a general deleter strain.
904 Transgenic Res 8, 53-54.
905 Wythe, J.D., Dang, L.T., Devine, W.P., Boudreau, E., Artap, S.T., He, D., Schachterle, W.,
906 Stainier, D.Y., Oettgen, P., et al. (2013). ETS factors regulate Vegf-dependent arterial
907 specification. Dev Cell 26, 45-58.
908 Yang, W.Y., Petit, T., Thijs, L., Zhang, Z.Y., Jacobs, L., Hara, A., Wei, F.F., Salvi, E., Citterio,
909 L., et al. (2015). Coronary risk in relation to genetic variation in MEOX2 and TCF15 in a
910 Flemish population. BMC Genet 16, 116.
911 Zhang, B., Day, D.S., Ho, J.W., Song, L., Cao, J., Christodoulou, D., Seidman, J.G., Crawford,
912 G.E., Park, P.J., and Pu, W.T. (2013). A dynamic H3K27ac signature identifies
913 VEGFA-stimulated endothelial enhancers and requires EP300 activity. Genome Res 23,
914 917-927.
915 Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nussbaum, C.,
916 Myers, R.M., Brown, M., et al. (2008). Model-based analysis of ChIP-Seq (MACS).
917 Genome Biol 9, R137.
918 Zhou, P., Zhang, Y., Ma, Q., Gu, F., Day, D.S., He, A., Zhou, B., Li, J., Stevens, S.M., et al.
919 (2013). Interrogating translational efficiency and lineage-specific transcriptomes using
920 ribosome affinity purification. Proc Natl Acad Sci U S A 110, 15395-5400.
921 922 - 39 - 923 Table 1 924 mm9 genome coordinates of regions with EC activity as determined by transient transgenic 925 assay. Vista_XXX indicates that the region was obtained from the VISTA enhancer database. 926 Lifeover indicates that the region was inferred by liftover from the human genome. For 927 enhancers obtained from the literature, Pubmed was searched for "endothelial cell enhancer". 928 The resulting references were manually curated for transient transgenic testing of candidate 929 endothelial cell enhancer regions. 930 chr start end Note chr9 37206631 37209631Robo4;PMID17495228;bloodVessels chr4 94412022 94413640Tek;PMID9096345;bloodVessels chr8 106625634 106626053Cdh5;PMID15746076;bloodVessels chr11 49445663 49446520Flt4;posEC;liftoverFromPMID19070576;FoxETS chr6 99338223 99339713FoxP1;posEC;liftoverFromPMID19070576;FoxETS chr8 130910720 130911740Nrp1;posEC;liftoverFromPMID19070576;FoxETS chr18 61219017 61219690Pdgfrb;posEC;liftoverFromPMID19070576;FoxETS chr4 137475841 137476446Ece1;posEC;liftoverFromPMID19070576;FoxETS;artery chr4 114743698 114748936Tal1;PMID14966269;endocardium;bloodVessels chr2 119152861 119153661Dll4;PMID23830865;arterial chr13 83721919 83721962Mef2c;PMID19070576;FoxETS chr13 83711086 83711527Mefec;PMID15501228;panEC chr2 32493213 32493467Eng;liftoverFromPMID16484587;bloodVessels chr2 32517844 32518197Eng;liftoverFromPMID18805961;bloodVessels;blood chr6 88152598 88153791Gata2;PMID17395646;PMID17347142;bloodVessels chr9 32337302 32337549Fli1;PMID15649946;bloodVessels chr5 76370571 76372841KDR;PMID10361126;bloodVessels chr6 125502138 125502981VWF;PMID20980682;smallbloodVessels chr5 148537311 148538294Flt1;liftoverFromPMID19822898 chr17 34701455 34702277Notch4;liftoverFromPMID15684396 chr2 155568649 155569132Procr;liftoverfromPMID16627757;bloodVessels[7/17] chr9 63916890 63917347Smad6;liftoverfromPMID17213321;bloodVessels chr19 37510161 37510394Hhex;liftoverfromPMID15649946;bloodVessels;blood chr5 76357892 76358715Flk1;PMID27079877;bloodVessesl;artery chr11 32145270 32146411vista_101;heart[9/12];bloodVessels[7/12] chr13 28809515 28811310vista_265;neural tube[7/8];bloodVessels[3/8] chr17 12982583 12985936vista_89;heart[6/10];bloodVessels[8/10] chr4 131631431 131635142vista_80;bloodVessels[10/10] chr4 57858433 57860639vista_261;bloodVessels[5/8] chr5 93426293 93427320vista_397;limb[12/13];bloodVessels[12/13] chr8 28216799 28218903vista_136;heart[7/10];other[6/10];bloodVessels[4/10] chr3 87786601 87790798liftoverFromvista_1891;somite[7/7];midbrain;(mesencephalon)[6/7 ];limb[7/7];branchial;arch[7/7];eye[7/7];heart[7/7];ear[7/7];bloodV essels[6/7] chr6 116309006 116309768liftoverFromvista_2065;bloodVessels[9/9] chr19 37566512 37571839liftoverFromvista_1866;bloodVessels[5/5] chr7 116256647 116260817liftoverFromvista_1859;neuraltube[8/8];hindbrain;(rhombencephal on)[5/8];midbrain;(mesencephalon)[8/8];forebrain[8/8];heart[7/8]; bloodVessels[5/8];liver[4/8] chr18 14006285 14007917liftoverFromvista_1653;bloodVessels[5/8] chr2 152613921 152616703liftoverFromvista_2050;bloodVessels[5/5] chr14 32045520 32047900liftoverFromvista_2179;bloodVessels[5/7] chr15 73570041 73577576liftoverFromvista_1882;bloodVessels[8/8]
- 40 - chr8 28216817 28218896liftoverFromvista_1665;heart[5/7];bloodVessels[7/7] 931 932 933
- 41 - 934 Table 2 935 Summary of transient transgenic validation of candidate EC enhancers 936 Neighboring region (mm9) size Location Distance Whole Mount Sections Ref. gene (bp) w/r gene to TSS (PMID) #PCR #LacZ # EC or Endo Art Vein Blood pos pos blood cells pos Apln chrX:45358891-45359918 1028 3'_Distal 28,624 10 3 3 + + + - - Dab2 chr15:6009504-6010497 994 5'_Distal -239,788 24 10 9 + ++ + +++ - Egfl7_enh1 chr2:26427040-26428029 990 5'_Distal -9,041 9 3 3 +++ +++ +++ + - Eng chr2:32493216-32494019 804 5'_Distal -8,497 12 4 3 ++ ++ ++ - 16484587 Ephb4 chr5:137789649-137790412 764 5'_Proxim -1,306 7 5 4 ++ - +++ - - al Lmo2 chr2:103733621-103734378 758 5'_Distal -64,152 29 14 10 - - - +++ - Mef2c chr13:83721522-83722451 930 Intragenic 78,954 10 6 5 + - +++ + 19070576 Notch1_enh1 chr2:26330255-26331184 930 Intragenic -28,622 22 3 3 +++ ++ ++ - - Sema6d chr2:124380522-124381285 764 5'_Distal -55,128 17 10 7 ++ +++ - + -
Egfl7_enh3 chr2:26433680-26434642 963 5'_Distal -2,415 12 2 2 + ++ +++ - - Sox7 chr14:64576382-64577118 737 3'_Distal 14,207 7 4 2 + ++ - - - Aplnr chr2:85003436-85004412 977 3'_Distal 27,407 12 3 0 NA NA NA NA - Egfl7_enh2 chr2:26431273-26431995 723 5'_Proxim -4,942 9 3 1 + - - - - al Emcn chr3:136984933-136986103 1171 5'_Distal -18,524 15 1 1 - - - + - Ets1 chr9:32481485-32482133 649 Intragenic -21,818 11 1 0 NA NA NA NA - Foxc1 chr13:31921976-31922827 852 3'_Distal 23,887 13 3 1* NA NA NA NA - Gata2 chr6:88101907-88102696 790 5'_Distal -46,356 8 0 0 NA NA NA NA - Lyve1 chr7:118020264-118021043 780 5'_Distal -3,128,02 7 1 1 + + - - - 8 Notch1_enh2 chr2:26345973-26347118 1146 Intragenic 12,796 9 0 0 NA NA NA NA - Sox18 chr2:181397552-181398335 784 3'_Proxim 8,401 13 2 1 - ++ - - - al *EC/blood pattern on whole mount not validated in histological sections 937 938 939
- 42 - 940 Table 3. Oligonucleotides used in this study. 941 Genotyping primers Name Sequence (5'->3') Comments p300fb-f AATGCTTTCACAGCTCGC 0.28kb for wild-type, 0.43kb for Ep300fb knockin p300fb-r AAACCATAAATGGCTACTGC Forward common CTCTGCTGCCTCCTGGCTTCT Rosa26-fs-BirA, 0.33 kb for wildtype, 0.25 kb for knockin Wild type reverse CGAGGCGGATCACAAGCAATA CAG reverse TCAATGGGCGGGGGTCGTT LacZ-f CAATGCTGTCAGGTGCTCTCACTACC 0.42 kb, genotyping of transient transgenic LacZ-r GCCACTTCTTGATGCTCCACTTGG
Primers to amplify Ep300 peak regions for transient transgenic assay. 4 nucleotides CACC have been added to all the forward primers for TOPO Cloning. Name Sequence (5'->3') Apln_f CACCGGAGGCTGAGCAATGAATAG Apln_r TTGGCTGGGGAAGAGTAAGC Aplnr_f CACCTCTCTCTCTGGCTTCG Aplnr_r CCTCAGAATGTTTTCATGG Dab2_f CACCGTGGAAATCATAGCAC Dab2_r GGTTGGAATAAAAGAGC Egfl7_Enh1_f CACCGCCTACCCAGTGCTGTTCC Egfl7_Enh1_r CTGGAGTGGAGTGTCACG Egfl7_Enh2_f CACCGCTAGGGGCTTCTAGTTC Egfl7_Enh2_r AGGTCTCTTCTGTGTCG Egfl7_Enh3_f CACCTGTTAGTGGTGCTCCC Egfl7_Enh3_r TCCAAGGTCACAAAGC Emcn_f CACCAGCACACCTCGTAAAATGG Emcn_r GAGTGAAGTAAGACATCGTCC Eng_f CACCAAACTAATTAAAAAACAAAGCAGGT Eng_r CATATGTACATTAGAACCATCCA Ephb4_f CACCTGGGTCTCATCAACCGAAC Ephb4_r CCTATCTACATCAGGGCACTG Ets1_f CACCTTCGTCAGAAATGATCTTGCCA Ets1_r TAGCAAGAGAGCCTGGTCAG Foxc1_f CACCTCTCTGCTTCAAGGCACCTT Foxc1_r TGGATAGCATGCAGAGGACA Gata2_f CACCTTCTCTTGGGCCACACAGA Gata2_r ATCTGCTCCACTCTCCGTCA Lmo2_f CACCTGGTTTTGCTTGCTAC Lmo2_r CATTTCTAAGTCTCCAC Lyve1_f CACCTACTGCCATGGAGGACTG Lyve1_r AGACACCTGGCTGCCTGATA Mef2c_f CACCGGAGGATTAAAAATTCCCC Mef2c_r CCTCTTAAATGTACGTG Notch1_Enh1_f CACCTCCCAAATGCTCCACGATG Notch1_Enh1_r GAGGAATGGCGAGAAATAGAC Notch1_Enh2_f CACCGAAGGCAGGCAGGAATAAC Notch1_Enh2_r TGGACAGGTGCTTTGTTG
- 43 - Sema6d_f CACCTCTTAACCACTATCTCC Sema6d_r ACTTCCTACACAGTTC Sox18_f CACCTTGGGGGGAAAGAGTG Sox18_r GACTTCATCCCATCTC Sox7_f CACCACAGAGCCCCTGCATATGT Sox7_r GCATGGTTTCTGAAGCCCAAAT
3x repeated enhancer regions for luciferase assay. Core motifs of interest are highlighted in red. Name Sequence (5'->3') ETS-FOX2 (Mef2c) CAGGAAGCACATTTGTCTACGCTTTCCTGTCATAACAGGAAGAGCAGGAAGCACATTTG TCTACGCTTTCCTGTCATAACAGGAAGAGCAGGAAGCACATTTGTCTACGCTTTCCTGT CATAACAGGAAGAG Mef2c-mut CAAGAAGCACATTTGTCTACGCTTTCCTGTCATATCTAGAAGAGCAAGAAGCACATTTG TCTACGCTTTCCTGTCATATCTAGAAGAGCAAGAAGCACATTTGTCTACGCTTTCCTGT CATATCTAGAAGAG ETS-FOX3 (Bcl2l1) CAGTTATTTCAGGAAAGATCAGTTATTTCAGGAAAGATCAGTTATTTCAGGAAAGAT
ETS-HOMEO2 GACAGACAGGAAGGCGGGACAGACAGGAAGGCGGGACAGACAGGAAGGCGG (Egfl7_Enh1) ETS-HOMEO2 ACACACTTCCTGTTTCCTGACACACTTCCTGTTTCCTGACACACTTCCTGTTTCCTG (Egfl7_Enh3) ETS-HOMEO2 ACAGTCACTTCCTGTTTTACAGTCACTTCCTGTTTTACAGTCACTTCCTGTTTT (Flt4) ETS-HOMEO2 CAACAACAGGAAGTGGACAACAACAGGAAGTGGACAACAACAGGAAGTGGA (Kdr) Sox (Apln) CAGTTCCCCATTGTTCTCGCAGTTCCCCATTGTTCTCGCAGTTCCCCATTGTTCTCG Sox (Robo4) GCCAGAACAATGAAGAACAAAGCCTGCACGGCCAGAACAATGAAGAACAAAGCCTGCAC GGCCAGAACAATGAAGAACAAAGCCTGCACG Sox (Sema3g) CGAATGGAAAGGGCATTGTTCAGGGGAGAACGAATGGAAAGGGCATTGTTCAGGGGAGA ACGAATGGAAAGGGCATTGTTCAGGGGAGAA ETS-Ebox (Apln) AGGCGGAAGCAGCTGGGATAGGCGGAAGCAGCTGGGATAGGCGGAAGCAGCTGGGAT ETS-Ebox TTATCCACAGGAAACAGATGAGGATCGTTATCCACAGGAAACAGATGAGGATCGTTATC (Zfp521) CACAGGAAACAGATGAGGATCG
3x repeated motifs within a similar DNA context. Motifs are indicated in red. Neg. control TGTCATATCTAGAAGAGTGTCATATCTAGAAGAGTGTCATATCTAGAAGAG ETS_alone TGTCATATCTGGAAGAGTGTCATATCTGGAAGAGTGTCATATCTGGAAGAG
ETS-FOX1 TGTCCGGATGTTGAGTGTCCGGATGTTGAGTGTCCGGATGTTGAG ETS-FOX2 TGTGTAAACAGGAAGTGAGTGTGTAAACAGGAAGTGAGTGTGTAAACAGGAAG TGAG ETS-FOX3 TGTTGTTTACGGAAGTGAGTGTTGTTTACGGAAGTGAGTGTTGTTTACGGAAG TGAG ETS_Ebox TGTAGGAAACAGCTGGAGTGTAGGAAACAGCTGGAGTGTAGGAAACAGCTGGA G ETS-HOMEO2 TGTAACCGGAAGTGAGTGTAACCGGAAGTGAGTGTAACCGGAAGTGAG Tbx-ETS TGTCACACCGGAAGGAGTGTCACACCGGAAGGAGTGTCACACCGGAAGGAG 942 943 944 945
- 44 - 946 Figure Legends 947 Fig. 1. Generation and characterization of Ep300flbio allele. 948 A. Experimental strategy for high affinity Ep300 pull down. A flag-bio epitope was knocked
949 onto C-terminus of the endogenous Ep300 gene. The bio peptide sequence is biotinylated by
950 BirA, widely expressed from the Rosa26 locus. B. Targeting strategy to knock flag-bio epitope
951 into the C-terminus of Ep300. A targeting vector containing homology arms, flag-bio epitope,
952 and Frt-neo-Frt cassette was used to insert the epitope tag into embryonic stem cells by
953 homologous recombination. Chimeric mice were mated with Act:Flpe mice to excise the
954 Frt-neo-Frt cassette in the germline, yielding the Ep300fb allele. C. Biotinylated Ep300 is
955 quantitatively pulled down by streptavidin beads. Protein extract from Ep300flbio/flbio; Rosa26BirA
956 embryos was incubated with immobilized streptavidin. Input, bound, and unbound fractions
957 were analyzed for Ep300 by immunoblotting. GAPDH was used as an internal control. D.
958 Ep300flbio/flbio mice from heterozygous intercrosses survived normally to weaning.
959
960 Fig. 2. Comparison of Ep300 bioChiP-seq to antibody ChIP-seq for mapping Ep300 961 chromatin occupancy. 962 A-B. Comparison of biological duplicate antibody Ep300 ChIP-seq (Encode) and Ep300
963 bioChIP-seq (flbio). The Ep300 bioChIP-seq data had greater overlap between replicates and
964 greater intragroup correlation. Most antibody peaks were covered by the bioChIP-seq data.
965 There were 3.3 times more Ep300 regions detected by Ep300 bioChiP-seq. 89.6% of Ep300
966 regions detected by antibody ChIP-seq were recovered by Ep300 bioChIP. Panel B shows
967 Spearman correlation between samples over the peak regions. C. Tag heatmap shows
968 input-subtracted Ep300 antibody or bioChIP signal in the union of the Ep300-bound regions
969 detected by each method. D. Correlation plots show greater Ep300 bioChIP-seq signal
970 compared to antibody bioChIP-seq.
- 45 - 971
972 Fig. 3. Tissue specific Ep300 bioChiP-seq. 973 A. Tag heatmap showing Ep300fb signal in heart and forebrain. Each row represents a
974 region that was bound by Ep300 in heart, forebrain, or both. A minority of Ep300-enriched
975 regions were shared between tissues. B. Ep300fb pull down from E12.5 heart or forebrain.
976 Ep300-bound regions identified by bioChIP-seq of Ep300fb/fb; Rosa26BirA tissues in biological
977 duplicates are compared to regions identified by antibody-mediated Ep300 ChIP-seq (Visel et
978 al., 2009b). C. GO biological process terms enriched for genes that neighbor the
979 tissue-specific heart or forebrain peaks. Bars indicate statistical significance.
980
981 Fig. 4. Cre-directed, lineage-selective Ep300 bioChiP-seq. 982 A. Experimental strategy. Lineage specific Cre recombinase activates expression of BirA
983 (HA-tagged) from Rosa26-flox-stop-BirA (Rosa26fsBirA). This results in Ep300 biotinylation in
984 the progeny of Cre-expressing cells. B. Cre-dependent Ep300 biotinylation using Rosa26fsBirA.
985 Protein extracts were prepared from E11.5 embryos with the indicated genotypes. R26BirA/+
986 (ca) and R26fsBirA/+ indicate the constitutively active and Cre-activated alleles, respectively. C.
987 Immunostaining demonstrating selective Tie2Cre-mediated expression of HA-tagged BirA in
988 ECs of R26fsBirA; Tie2Cre embryos. Arrows and arrowheads indicate endothelial and
989 hematopoietic lineages, respectively. D. Tissue-selective Ep300 bioChIP-seq. Tie2Cre (T;
990 endothelial and hematopoietic lineages), Myf5Cre (M; skeletal muscle lineages), and EIIaCre
991 (E; germline activation) were used to drive tissue-selective Ep300 bioChIP-seq. Correlation
992 between ChIP-seq signals within peak regions are shown for triplicate biological repeats.
993 Samples within groups were the most closely correlated. E. Ep300fb bioChIP-seq signal at
994 Gata2 (EC/blood specific) and Myod (muscle specific). Enhancers validated by transient
- 46 - 995 transgenic assay are indicated along with the citation’s Pubmed identifier (PMID). F. Biological
996 process GO terms illustrate distinct functional groups of genes that neighbor Ep300
997 bioChIP-seq driven by lineage-specific Cre alleles. The heatmap contains the top 10 terms
998 enriched for genes neighboring each of the three lineage-selective Ep300 regions.
999
1000 Fig. 5. Functional validation of enhancer activity of Ep300-T-fb-bound regions. 1001 A. Expression of genes neighboring regions bound by Ep300fb in different Cre-marked
1002 lineages. Translating ribosome affinity purification (TRAP) was used to enrich for RNAs from
1003 the Tie2Cre lineage. Input or Tie2Cre-enriched RNAs were profiled by RNA sequencing. The
1004 expression of the nearest gene neighboring Ep300 regions in ECs (p300-T-fb regions), but not
1005 skeletal muscle (p300-M-fb regions), was higher than that of genes neighboring regions bound
1006 by Ep300 across the whole embryo (EIIaCre). Box and whiskers show quartiles and 1.5 times
1007 the interquartile range. Groups were compared to EIIaCre using the Mann-Whitney U-test. B.
1008 Transient transgenesis assay to measure in vivo activity of Ep300-T-fb regions. Test regions
1009 were positioned upstream of an hsp68 minimal promoter and lacZ. Embryos were assayed at
1010 E11.5. C. Summary of transient transgenic assay results. Out of 20 regions tested, 9 showed
1011 activity in ECs or blood in three or more embryos, and 2 more showed activity in 2 embryos.
1012 See also Table 2. D. Representative whole mount Xgal-stained embryos. Enhancers that
1013 directed LacZ expression in an EC or blood pattern in two or more embryos are shown.
1014 Numbers indicate embryos with LacZ distribution similar to shown image, compared to the total
1015 number of PCR positive embryos. E. Sections of Xgal-stained embryos showing examples of
1016 enhancers active in arteries, veins, and endocardium, or selectively active in arteries or veins.
1017 AS: aortic sac; CV: cardinal vein; DA: dorsal aorta; EC: endocardial cushion; HV: head vein;
1018 LA: left atrium; LV: left ventricle; RV: right ventricle. Scale bars, 100 µm. See also figure
- 47 - 1019 supplements 1 and 2 and Table 2.
1020
1021 Fig. 6. Motifs enriched in Ep300-T-fb and Ep300-M-fb regions. 1022 A. Motifs enriched in Ep300-T-fb or Ep300-M-fb regions. 1445 motifs were tested for
1023 enrichment in Ep300 bound regions compared to randomly permuted control regions.
1024 Significantly enriched motifs (neg ln P-value > 15) were clustered and the displayed
1025 non-redundant motifs were manually selected. Heatmaps show statistical enrichment (left),
1026 fraction of regions that contain the motif (center), and GO terms associated with genes
1027 neighboring motif-containing, Ep300-bound regions (fraction of the top 20 GO biological
1028 process terms). Grey indicates that the motif was not significantly enriched (neg ln P-value ≤
1029 15). B. Conservation of sequences matching Tie2Cre- or Myf5Cre-enriched motifs within 100
1030 bp of the summit of Ep300-T-fb or Ep300-M-fb regions, compared to randomly selected 12 bp
1031 sequences from the same regions. PhastCons conservation scores across 30 vertebrate
1032 species were used. ****, P<0.0001, Kolmogorov–Smirnov test. C. Luciferase assay of activity
1033 of enhancers containing indicated motifs. Three repeats of 20-30 bp regions from
1034 Ep300-bound enhancers linked to the indicated gene and centered on the indicated motif were
1035 cloned upstream of a minimal promoter and luciferase. The constructs were transfected into
1036 human umbilical vein endothelial cells. Luciferase activity was expressed as fold activation
1037 above that driven by the enhancerless, minimal promoter-luciferase construct. *, P<0.05
1038 compared to Mef2c-enhancer with mutated ETS:FOX2 motif (Mef2c mut). n=3. D. Luciferase
1039 assay of indicated motifs repeated three times within a consistent DNA context. Assay was
1040 performed as in C. *, P<0.05 compared to negative control sequencing lacking predicted motif.
1041 n=3. Error bars in C and D indicate standard error of the mean.
1042
- 48 - 1043 Figure 7. Enhancers in adult heart and lung ECs. 1044 VEcad-CreERT2 and neonatal tamoxifen pulse was used to drive BirA expression in ECs.
1045 Ep300 regions in adult heart or lung ECs were identified by bioChIP-seq in biological triplicate.
1046 Ep300 regions were ranked into deciles by the ratio of the Ep300-VE-fb signal in heart to lung.
1047 A. Tag heatmap shows that the top and bottom deciles have selective Ep300 occupancy in
1048 heart and lung ECs, respectively. B. Expression of genes in heart ECs (red) or lung ECs (blue)
1049 neighboring Ep300 regions, divided into deciles by Ep300-VE-fb signal ratio in heart and lung.
1050 ***, P<0.0001, Wilcoxon test. Expression values were obtained from (Nolan et al., 2013). Box
1051 plots indicate quartiles, and whiskers indicate 1.5 times the interquartile range. C. Genes with
1052 selective expression in heart or lung ECs were identified by K-means clustering (see Methods
1053 and figure supplement 1). The number of Ep300-VE-fb regions in heart (red) or lung (blue)
1054 neighboring these genes was determined, stratified by decile. ***, P<0.0001, Fisher's exact
1055 test. D. Enrichment of selected motifs in Ep300-VE-fb regions from heart (decile 1) compared
1056 to lung (decile 10) or visa versa. Grey indicates no significant enrichment. Displayed
1057 non-redundant motifs were selected from all significant motifs by manual curation of clustered
1058 motifs.
1059
- 49 - 1060 Supplementary Information 1061 Suppl. File 1. Tissue-specific Ep300-bound regions identified in this study. 1062 Each tab of the excel spreadsheet contains the Ep300-bound regions in the following
1063 conditions: Each tab of this spreadsheet shows the Ep300-bound regions in the following
1064 samples:
1065 H1.narrowPeak: E12.5 heart, replicate 1. 1066 H2.narrowPeak: E12.5 heart, replicate 2. 1067 FB1.narrowPeak: E12.5 forebrain, replicate 1. 1068 FB2.narrowPeak: E12.5 forebrain, replicate 2. 1069 p300-T-fb: Tie2Cre-enriched regions. Average Ep300 signal in Tie2Cre (T2), Myf5Cre 1070 (M5), and EIIaCre (E) is shown in reads per million. T2/E and M5/E show the ratio of 1071 signals. Tie2Cre-enriched regions were defined as peak regions with T2/E ratio > 1072 1.5. 1073 p300-M-fb: Myf5Cre-enriched regions. Average Ep300 signal in Tie2Cre (T2), Myf5Cre 1074 (M5), and EIIaCre (E) is shown in reads per million. T2/E and M5/E show the ratio of 1075 signals. Myf5Cre-enriched regions were defined as peak regions with M5/E ratio > 1076 1.5. 1077 p300-E-fb: Ep300-bound peaks called from whole embryo (ubiquitous BirA expression). 1078 p300-VE-fb: Merged VEcad-CreERT2-driven Ep300 peaks in adult heart and lung. The 1079 peaks were ranked by the ratio of Ep300 signal in heart ECs compared to lung ECs 1080 and then grouped into deciles. 1081 1082 Suppl. File 2. Transcription factors expressed in embryonic ECs. 1083 E10.5 embryo T2-TRAP and input RNA-seq data were analyzed to identify DNA-binding
1084 transcriptional regulators with detectable expression level in ECs (T2-TRAP log2
1085 (fpkm+1)>0.8) and preferential EC expression (T2-TRAP/input > 1). Genes with DNA binding
1086 domains belonging to the indicated families were annotated based on literature searches and
1087 previously publshed catalogs of transcriptional regulators (Fulton et al., 2009; Kanamori et al.,
1088 2004).
1089
1090 Figure 1 -- figure supplement 1. Characterization of Ep300fb allele. 1091 A. Southern blot genotyping of Ep300fb allele. Probes are marked in Fig. 1B. B.The
1092 expression level of tagged Ep300 protein is similar to that of wild-type Ep300 in neonatal
- 50 - 1093 mouse hearts. C, D. The heart weight to body weight ratio of 10 week old Ep300fb/fb heart was
1094 not significantly different from littermates. E. Echocardiograms of 10 week old Ep300fb/fb mice
1095 show that their ventricular function was not significantly different from littermate controls.
1096 Sample sizes are shown in bars in D,E. Bars represent mean ± standard deviation. n.s., not
1097 significantly different (t-test).
1098
1099 Figure 3 -- figure supplement 1. Heart and forebrain Ep300 bioChiP-seq. 1100 A. Ep300 bioChIP-seq signals within called Ep300 peaks were highly correlated between
1101 biological replicates and discordant between heart and forebrain. B. Genome browser views
1102 showing Ep300 bioChIP-seq signal in heart and brain at Srf (preferential cardiac expression)
1103 and Nes (preferential Brain expression). C. Comparison of Ep300 bioChIP-seq and H3K27ac
1104 ChIP-seq signals in heart. Numbers indicate correlation coefficients. There was excellent
1105 correlation between biological replicates of Ep300 or H3K27ac. Ep300 and H3K27ac also
1106 showed highly significant genome-wide correlation. Similar results were observed in forebrain
1107 (data not shown). D. Tag heatmap of H3K27ac-bound regions showing overlap between heart
1108 and brain H3K27ac enriched regions from each tissue.
1109
1110 Figure 3 -- figure supplement 2. Comparison of heart Ep300fb regions to a compendium 1111 of heart enhancers assembled by Dickel et al. (2016). 1112 The Dickel heart enhancer compendium (Dickel et al., 2016) is based on H3K27ac
1113 ChIP-seq and Ep300 antibody ChIP-seq. Most compendium regions had low enhancer scores
1114 and have low validation frequency in transient transgenic assays. Compendium regions were
1115 compared to heart Ep300fb regions using the LiftOver tool to map human genome coordinates
1116 to mouse. A. Venn diagram showing that 13% of prenatal heart compendium regions overlap
1117 with heart Ep300 regions. On the other hand, 47% of Ep300fb heart regions overlap with heart
- 51 - 1118 compendium regions, and 53% do not. B. Analysis of the fraction of compendium regions that
1119 overlap the Ep300fb heart regions, as a function of enhancer score. Regions with higher
1120 enhancer score (and higher likelihood of in vivo validation) corresponded well with heart Ep300
1121 regions.
1122
1123 Figure 3 -- figure supplement 3. Comparison of enhancer prediction based on indicated 1124 chromatin features of E12.5 heart. 1125 Data are from ENCODE except for ATAC-seq (Pu lab, unpublished), and Ep300fb (this
1126 manuscript). The KNN classifier was trained on a subset of the VISTA enhancer database and
1127 the validated on the remaining data, using a 10x cross-validation design. Ep300 was the most
1128 accurate single predictor, based on the area under the receiver operating characteristic curve
1129 (AUC, indicated in parentheses). All of the features together (All), or just Ep300 plus ATAC-seq
1130 together (not shown), gave the best prediction.
1131
1132 Figure 4 -- figure supplement 1. Cre-dependent Ep300 bioChIP-seq. 1133 A. Cre-dependent expression of BirA. Heart with or without cardiac-specific TNTCre was
1134 immunoblotted for expression of BirA from the Rosa26fsBirA allele. B. Distribution of Ep300fb
1135 bioChIP-seq signal when driven by Tie2Cre or Myf5Cre. Regions with Ep300 peaks called in
1136 samples from any of the three Cre drivers were scored for their EC- or muscle- enriched signal
1137 by calculating ratio of the region's signal in Tie2Cre or Myf5Cre bioChIP-seq to its signal with
1138 ubiquitous Ep300 bioChIP-seq (EIIaCre). The Tie2Cre-enrichment score showed a bimodal
1139 distribution with a small peak at values greater than 1.5. A value of greater than 1.5 similarly
1140 was used to define Myf5Cre-enriched regions. C. Relationship of Tie2Cre-enriched (p300-T-fb)
1141 and Myf5Cre-enriched (p300-M-fb) regions to peaks called from Tie2Cre or Myf5Cre samples
1142 alone.
- 52 - 1143
1144 Fig. 5 -- figure supplement 1. Relationship of EC gene expression to Ep300 regions in 1145 ECs, skeletal muscle, and whole embryo. 1146 A. EC enrichment of actively translating genes that neighbored Ep300 regions in skeletal
1147 muscle (p300-M-fb), whole embryo (p300-E-fb), and endothelial cell/blood (p300-T-fb), using
1148 several different thresholds for the maximum distance allowed to associate genes to regions.
1149 The relationship between Ep300-T-fb regions and gene expression was not dependent upon
1150 the threshold used. The rightmost panel (no limit to maximum distance) is the same as in
1151 Figure 5A. Mann-Whitney U-test. B. EC expression of actively translating genes that
1152 neighbored Ep300-T-fb, compared to those that did not, using several different thresholds for
1153 the maximum distance used associate genes to regions. Ep300-T-fb-associated genes were
1154 more highly expressed regardless of the distance threshold used. Ep300-T-fb genes were
1155 significantly more highly expressed (Mann-Whitney U-test). C. Fraction of genes expressed or
1156 not expressed, for Ep300-T-fb-associated genes compared to all genes. Ep300-T-fb
1157 associated genes were more frequently expressed than non-associated genes. However, not
1158 all genes associated with Ep300-T-fb regions were expressed, and some genes without
1159 Ep300-T-fb regions were expressed. This may reflect additional mechanisms of gene
1160 regulation as well as limitations of the gene-to-region association rule additional rule.
1161
1162 Fig. 5 -- figure supplement 2. Transient transgenic assays to measure in vivo activity of 1163 candidate endothelial cell/blood enhancers. 1164 For each tested enhancer with positive endothelial cell or blood activity, we show the
1165 candidate enhancer’s Ep300 signal in skeletal muscle (p300-M-fb), whole embryo (p300-E-fb),
1166 and endothelial cell/blood (p300-T-fb), with respect to the gene both. The whole mount images
1167 show each of the positive Xgal-stained embryos.
- 53 - 1168
1169 Fig. 5 -- figure supplement 3. Histological sections of transient transgenic embryos. 1170 Histological sections of embryos containing the indicated enhancer-driven lacZ transgene.
1171 AS: aortic sac; CV: cardinal vein; DA: dorsal aorta; EC: endocardial cushion; HV: head vein;
1172 PA: pulmonary artery; RA: right atrium; RV: right ventricle. Bar, 100 µm.
1173
1174 Fig. 6 -- figure supplement 1. Motifs with enriched Ep300 occupancy in Tie2Cre or 1175 Myf5Cre embryo samples. 1176 A. De novo motif discovery. The top 10 discovered motifs are shown, with the significance
1177 P-value, percent of regions containing the motif, and the best matching known motif class. B.
1178 Conservation of Tie2Cre-enriched motifs (dashed lines) or Myf5Cre-enriched motifs (solid
1179 lines) within 100bp of Ep300-T-fb or Ep300-M-fb peak summits. The distribution of
1180 conservation scores for each indicated motif is shown as a cumulative frequency plot. Random
1181 indicates randomly sampled 12 nt regions from within the Ep300-T-fb or Ep300-M-fb regions.
1182
1183 Figure 7 -- figure supplement 1. Organ-specific EC gene expression. 1184 Genes with at least 4-fold differential mean expression between highest and lowest EC
1185 samples were identified from microarray data (Nolan 2013). These genes were grouped by
1186 k-means clustering. Genes in clusters containing selective heart or lung expression were
1187 analyzed further. Expression of these heart or lung EC genes is displayed in panel B using
1188 hierarchical clustering.
1189
1190 Fig. 7 -- figure supplement 2. VEcad-CreERT2 activation of Ep300 biotinylation. 1191 A. VEcad-CreERT2 activated Ep300 biotinylation by Rosa26fsBirA, permitting its pulldown
1192 on streptavidin (SA) beads. B. EC-selective activation of the Rosa26mTmG Cre reporter allele
- 54 - 1193 by VEcad-CreERT2. Tamoxifen was administered at P1 and P2, and tissues were analyzed at
1194 6 weeks of age.
1195
1196 Figure 7 -- figure supplement 3. Ep300fb bioChIP-seq signal in whole organ or ECs. 1197 p300fb bioChIP-seq signal in whole organ ("Con"; EIIaCre) or ECs ("EC"; VEcad-CreERT2)
1198 at genes with EC or non-EC expression in lung or heart.
1199
1200 Figure 7 -- figure supplement 4. Gene functional terms enriched in decile 1 or decile 10 1201 of Ep300-VE-fb regions. 1202 GO biological process (top) or disease ontology (bottom) terms enriched in decile 1 or
1203 decile 10 of Ep300-VE-fb regions. Grey indicates no significant enrichment. The top five
1204 biological process terms in deciles 1 and 10 and selected relevant disease ontology terms
1205 were are displayed.
- 55 - bio 12 kb A B 6.4 kb C Ep300fb/fb D X S 3’ UTR S X Exon 31 BirA 5’ probe 3’ probe Rosa26 survival to weaning Ep300fb Ep300 genomic (n=57) Ep300 flbio PGK-Neo S X S S biotin X S Input Non-bound Bound targeting construct BirA FRT FRT fb/fb wt Homologous recombination Ep300 High affinity 4.5 kb 4.3 kb S X S S BirA Streptavidin X S Rosa26BirA Pull-down Ep300fb-neo FLP recombinase GAPDH fb/+
Ep300fb
Fig. 1 A B P0 heart lysate
fb-neo/+ fb-neo/+ +/+ fb-neo/+ fb-neo/+ +/+ Ep300 WT fb/+ fb/fb Full-length Ep300 12 kb 6.4 kb IB: Ep300 300 kd 4.3 4.5 Loading ctrl Coomassie Blue XbaI, 5’ probe SacI, 3’ probe
10 week old
C Ep300+/+ Ep300 fb/fb D 6 n.s. E 60 n.s. 5 50 4 40 3 30
5 5 FS (%) 2 20 5 9
HW/BW (mg/g) 1 10 0 0 fb/fb fb/fb
Ep300 wt Ep300 Ep300 wt Ep300 Figure 1 -- figure supplement 1. Characterization of Ep300fb allele. A. Southern blot genotyping of Ep300fb allele. Probes are marked in Fig. 1B. B. The expression level of tagged Ep300 protein is similar to that of wild-type Ep300 in neo- natal mouse hearts. C, D. The heart weight to body weight ratio of 10 week old Ep300fb/fb heart was not signifi- cantly different from littermates. E. Echocardiograms of 10 week old Ep300fb/fb mice show that their ventricular function was not significantly different from littermate controls. 3 4
Sample sizes are shown in bars in D,E. Bars represent mean ± standard deviation. n.s., not significantly different (t-test). B A 0 Fig. 2 1.00 0.96 0.68 0.68 rep1 (25855) Ep300_Encode rep1 (77906) Ep300 Ep300_fb_rep1 shared (48963) Ep300 fb 0.70 0.96 1.00 0.71
Ep300_fb_rep2 fb 48963 (93.8%) 0.5 0.68 1.00 0.91 0.70 Ep300_Encode_rep1 rep2 (19638) Ep300_Encode 15281 (77.8%) rep2 (52174) Ep300 0.68 1.00 0.71 0.91 Ep300_Encode_rep2 shared (15281) Ep300_EncodeE 13698 (89.6%) 1
Ep300_fb_rep1Ep300_fb_rep2Ep300_Encode_rep1Ep300_Encode_rep2 fb
Encode specific Peaks (1583) C Union Ep300Peaks minus input Shared Peaks (13698) p300flbio-specific Peaks (35265) ChIP -2k 0 RP10M 0 2k 5 D ES_P300_Encode (RP10M) ES_P300_Encode (RP10M) ES_P300_Encode (RP10M) 10 15 20 10 15 20 10 15 20 0 5 0 5 0 5 p300 0 0 0 Encode specificPeaks ES_Ep300 ES_Ep300 ES_Ep300 Shared Peaks fb 5 5 -specific Peaks 5 10 10 10 fb fb fb (RP10M) (RP10M) (RP10M) 15 15 15 20 20 20 A Forebrain Heart B H-rep1 H-rep2 rep 1 rep 2 input rep 1 rep 2 input 35306 28944 (82%) Brain specific 80 (6923) 70 Shared (3147) heart 9255 19689 4250 60 apex
50
40 1813 78
signal (RP10M) Heart specific fb 194 H-Visel 30 (32190) 1466 3597 Ep300 20 E12.5 Ep300flbio/+ 10 Rosa26-BirA bioChIP-seq 0 -1kb +1kb -1kb +1kb -1kb +1kb -1kb +1kb -1kb +1kb -1kb +1kb
-log10(P-value) FB-rep2 0 5 10 15 20 25 30 35 40 45 50 10735 C cell junction organization FB-rep1 cell junction assembly 5823 (86%) cell substrate adhesion Heart ventricular cardiac muscle tissue dev. cell substrate junction assembly specific cardiac muscle tissue growth peaks reg. of gen. of precursor metabolites & energy adherens junction organization 318 3757 4238 muscle cell proliferation forebrain cardiac muscle cell proliferation
CNS neuron differentiation 12 1251 telencephalon dev. Forebrain glial cell diff. pallium dev. specific oligodendrocyte diff. 502 gliogenesis 966 peaks stem cell maintenance hindbrain dev. FB-Visel somatic stem cell maint. 2759 cerebellum dev.
Fig. 3 A B Srf: Preferential Cardiac Expression FB1 FB2 Heart1 Heart2 46,690 kb 46,700 kb Ep300-fb Heart [0 - 30] 0.04 0.01 0.93 1.00 Heart2 Ep300-fb Forebrain [0 - 30] Ep300-Visel Heart [0 - 15] Ep300-Visel Forebrain [0 - 15] 0.05 0.02 1.00 0.93 Heart1 known enhancers PMID: 21415370 Cul9 Srf Ptk7
0.96 1.00 0.02 0.01 FB2 Nes: Brain Expression 87,770 kb 87,780 kb Ep300-fb Heart [0 - 15] Ep300-fb Forebrain [0 - 15] 1.00 0.96 0.05 0.04 FB1 Ep300-Visel Heart [0 - 15] Ep300-Visel Forebrain [0 - 15] known enhancers spearman correlation PMID: 9917366 0 1 Crabp2 Nes C D Forebrain H3K27ac Heart H3K27ac Rep 1 Rep 2 Rep 1 Rep 2
160
Heart Forebrain (13018)
Ht_Ep300_1
120 Shared (14602)
0.94 Ht_Ep300_2
80
0.62 0.62 Ht_K27ac_1
Heart Reads per 10 Million (40735) 40 0.63 0.63 0.98 Ht_K27ac_2
Reads per 10 Million Reads 0 -2 +2 -2 +2 -2 +2 -2 +2 kb from peak center
Figure 3 -- figure supplement 1. Heart and forebrain Ep300 bioChiP-seq. A. Ep300 bioChIP-seq signals within called Ep300 peaks were highly correlated between biological replicates and discordant between heart and forebrain. B. Genome browser views showing Ep300 bioChIP-seq signal in heart and brain at Srf (preferential cardiac expression) and Nes (preferential Brain expression). C. Comparison of Ep300 bioChIP-seq and H3K27ac ChIP-seq signals in heart. Numbers indicate correlation coefficients. There was excellent correlation between biological replicates of Ep300 or H3K27ac. Ep300 and H3K27ac also showed highly significant genome-wide correlation. Similar results were observed in forebrain (data not shown). D. Tag heatmap of H3K27ac-bound regions showing overlap between heart and brain H3K27ac enriched regions from each tissue. Compendium A genic assays.A.Venn diagramshowingthat47%ofEp300 regions hadlowenhancerscoresandarenotexpectedtovalidateintransienttrans was basedonH3K27acChIP-seqandEp300antibodyChIP-seq.Mostcompendium compendium pendium ofheartenhancersassembledbyDickeletal.(2016).This Figure 3--figuresupplement2.ComparisonofheartEp300 that overlaptheEp300 with heartcompendiumregions. 63070 63070 fb heartregions,asafunctionofenhancerscore. 9438 9438 Ep300 10752 10752 B. Analysis ofthefractioncompendiumregions fb Heart
% of Compendium Peaks B 0.00 0.25 0.50 0.75 1.00
Enhancer Score 0 0.2 Compendium Ep300 fb
heartregionsoverlap 0.4 fb fb
0.6 Heart regionstoacom 0.8 1.0 0 5 10 15 20 25 30 35 40 45 50 55 60
Number of Regions (x 1000) - - prediction. ses). All ofthe featurestogether(All),orjustEp300plus ATAC-seq together (notshown),gavethebest predictor, based ontheareaunderreceiveroperating characteristiccurve(AUC,indicatedinparenthe validated ontheremainingdata,usinga10x cross-validationdesign.Ep300wasthemostaccuratesingle (this manuscript). The KNNclassifierwastrainedonasubsetoftheVISTA enhancerdatabaseandthe features ofE12.5heart.DataarefromENCODEexceptfor ATAC-seq (Pulab,unpublished) andEp300fb Figure 3--figuresupplement3.Comparison ofenhancerpredictionbasedonindicatedchromatin 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 Ep300 bioChip-seq(AUC=0.805) 0 0 0 1-specificity sensitivity 0 H3K4me1 (AUC=0.589) H3K9me3 (AUC=0.512) 0.2 0.4 0.6 ROC curvecalculatedusing10xcross-validation Machine learningwithweightedKNNmethod 0.8 1.0 0 ATAC-seq (AUC=0.749) H3K4me2 (AUC=0.646) H3K27ac (AUC=0.747) 0.2 0.4 0.6 0.8 1.0 P300 + ATAC (AUC=0.866) H3K27me3 (AUC=0.608) H3K27me3 0 H3K4me3 (AUC=0.637) 0.2 0.4 0.6 ideal test 0.8 1.0 uninformative test H3K36me3 (AUC=0.579) H3K36me3 0 H3K9ac (AUC=0.540) 0.2 All (AUC=0.862) 0.4 0.6 0.8 1.0 - A B Ep300flbio/+ E11.5 C CAG + + – R26fsBirA/+ stop BirA BirA/+ Rosa26fsBirA – – + R26 (ca) Ep300fb – + – Tie2Cre Cre IB: Ep300, P300 SA IP
CAG biotin IB: Ep300, BirA BirA input (5%) PECAM
fb/+ Ep300 o P-val (-log10) D fsBirA F Rosa26 5Cr e 0 26 mbr y Tie2Cr e My f E Tie2Cre EIIa-Cre Myf5Cre GO Term E11.5 E11.5 E13.5 skeletal system dev. cell proliferation neural tube development bioChIP-seq neg. reg. of glial cell prolif. stem cell differentiation ventricular septum morph. T3 T1 T2 E3 E1 E2 M3 M1 M2 regulation of DNA binding heart morphogenesis 0.23 0.15 0.15 0.66 0.66 0.65 0.78 0.89 1.00 M2 somatic stem cell maint. regulation of binding
M1 0.17 0.09 0.09 0.65 0.65 0.64 0.77 1.00 0.89 muscle cell differentiation (Tie2Cre) HA-BirA skeletal muscle tissue dev. M3 0.20 0.13 0.12 0.65 0.66 0.65 1.00 0.77 0.78 skeletal muscle organ dev. muscle organ development 0.30 0.21 0.20 0.88 0.92 1.00 0.65 0.64 0.65 muscle cell development E2 myotube differentiation satellite cell differentiation E1 0.30 0.21 0.21 0.89 1.00 0.92 0.66 0.65 0.66 regulation of myotube diff. neg. reg. of phosphorylation E3 0.30 0.21 0.21 1.00 0.89 0.88 0.65 0.65 0.66 neg. reg. of protein phosph. angiogenesis T2 0.86 0.93 1.00 0.21 0.21 0.20 0.12 0.09 0.15 actin cytoskeleton org. myeloid leukocyte diff. myeloid cell diff. T1 0.86 1.00 0.93 0.21 0.21 0.21 0.13 0.09 0.15 reg. of vasc. dev. reg. of erythrocyte diff. T3 1.00 0.86 0.86 0.30 0.30 0.30 0.20 0.17 0.23 reg. of angiogenesis erythrocyte homeostasis DAPI PECAM HA-BirA Spearman correlation megakaryocyte dev. 0 1.0 erythrocyte diff.
E Gata2: EC and Blood Lineages 16 kb 88,146 kb 88,148 kb 88,150 kb 88,152 kb 88,154 kb 88,156 kb 88,158 kb 88,160 kb
Ep300-M-fb [0 - 100] Ep300-E-fb [0 - 100] Ep300-T-fb [0 - 100] Ep300-M-fb peak p300-T-fb peak known enhancer PMID17395646 Gata2
MyoD: Skeletal Muscle Lineage 29 kb 53,610 kb 53,620 kb 53,630 kb
Ep300-M-fb [0 - 60] Ep300-E-fb [0 - 60] Ep300-T-fb [0 - 60] Ep300-M-fb peak Ep300-T-fb peak
known enhancer PMID1315077 Myod1
Fig. 4 A Rosa26-fsBirA/+ P0 heart TNT-Cre - + 37 IB: BirA 2xHA-BirA 25 Ponceau S
B Histogram of Tie2Cre-enriched regions Histogram of Myf5Cre-enriched regions 6000
6000 >1.5 = Ep300-T-fb >1.5 = Ep300-M-fb 4000 # of Regions Frequency 2000 2000 0 0
0 1 2 3 4 5 0 1 2 3 4 5 Ratio of ChIP signal in Tie2Cre vs Embryo Ratio of ChIP signal in Myf5Cre vs Embryo C Tie2Cre Myf5Cre p300-T-fb p300-M-fb Peaks Peaks
489 2407 7 5548 1151 141
Figure 4 -- figure supplement 1. Cre-dependent Ep300 bioChIP-seq. A. Cre-dependent expression of BirA. Heart with or without cardiac-specific TNTCre was immunoblotted for expres- sion of BirA from the Rosa26fsBirA allele. B. Distribution of Ep300fb bioChIP-seq signal when driven by Tie2Cre or Myf5Cre. Regions with p300 peaks called in samples from any of the three Cre drivers were scored for their EC- or muscle- enriched signal by calculating ratio of the region's signal in Tie2Cre or Myf5Cre bioChIP-seq to its signal with ubiquitous Ep300 bioChIP-seq (EIIaCre). The Tie2Cre-enrichment score showed a bimodal distribution with a small peak at values greater than 1.5. A value of greater than 1.5 similarly was used to define Myf5Cre- enriched regions. C. Relationship of Tie2Cre-enriched (Ep300-T-fb) and Myf5Cre-enriched (Ep300-M-fb) regions to peaks called from Tie2Cre or Myf5Cre samples alone. D A EC ± Blood NS P<10-38 4 Blood 3 Only
2 Egfl7_E1 Eng Mef2c Sema6d Egfl7_E3 3/9 3/12 5/10 7/17 2/12 1 RNA-seq T2-TRAP/Input Lmo2 0 10/29
EIIaCre Myf5Cre Tie2Cre
p300 bioChIP-seq Ephb4 Notch1_E1 Apln Dab2 Sox7 4/7 3/22 3/10 9/24 2/7 B E Endocardium and endocardial cushion Artery and vein Egfl7 Notch1 Egfl7 Notch1 Ins hsp68 Ins AS DA candidate lacZ EC LV DA region CV
CV EC transient RV transgenic LV LA assay (E11.5) RV
C 20 Ep300(T-fb) Artery selective Vein selective regions tested Sema6d Sox7 Ephb4 Mef2c EC ± Bld, CV DA DA 3+ embryos HV CV 8 EC ± neg CV Bld 9 HV 10 AS LA DA DA EC ± Bld, Bld only 2 embryos 1 2
Fig. 5 fb A Max distance from TSS to Ep300 peak centerE < 2 kb < 10 kb < 100 kb No Limit NS P<10-38 NS P<10-38 NS P<10-38 NS P<10-38 4 4 4 4
3 3 3 3
2 2 2 2
1 1 1 1 RNA-seq T2-TRAP/Input 0 0 0 0
EIIaCre Myf5Cre Tie2Cre EIIaCre Myf5Cre Tie2Cre EIIaCre Myf5Cre Tie2Cre EIIaCre Myf5Cre Tie2Cre
B P<10-4 P<10-4 P<10-4 P<10-4 C 15
100
10
50 [FPKM+1]) 2 % of genes
(log 5
T2-TRAP Expression T2-TRAP 0 No All < 2 < 10 < 100 Limit Genes Max distance 0 TSS to Ep300-T-fb peak center < 2 kb < 10 kb < 100 kb No Limit Max distance FPKM >= 1 threshold TSS to Ep300-T-fb peak center FPKM < 1 threshold Ep300fb-associated not Ep300fb-associated Fig. 5 -- figure supplement 1. Relationship of EC gene expression to Ep300 regions in ECs, skeletal muscle, and whole embryo. A. EC enrichment of actively translating genes that neighbored Ep300 regions in skeletal muscle (Ep300-M-fb), whole embryo (Ep300-E-fb), and endothelial cell/blood (Ep300-T-fb), using several different thresholds for the maximum distance allowed to associate genes to regions. The relation- ship between Ep300-T-fb regions and gene expression was not dependent upon the threshold used. The rightmost panel (no limit to maximum distance) is the same as in Figure 5A. Mann-Whitney U-test. B. EC expression of actively translating genes that neighbored Ep300-T-fb, compared to those that did not, using several different thresholds for the maximum distance used associate genes to regions. Ep300-T-fb-associated genes were more highly expressed regardless of the distance threshold used. Ep300-T-fb genes were significantly more highly expressed (Mann-Whitney U-test). C. Fraction of genes expressed or not expressed, for Ep300-T-fb-associated genes compared to all genes. Ep300-T-fb associated genes were more frequently expressed than non-associated genes. However, not all genes associated with Ep300-T-fb regions were expressed, and some genes with- out Ep300-T-fb regions were expressed. This may reflect additional mechanisms of gene regulation as well as limitations of the gene-to-region association rule additional rule. Apln Lmo2 30 kb 97 kb 45,360 kb 45,370 kb 45,380 kb 103,730 kb 103,740 kb 103,750 kb 103,760 kb 103,770 kb 103,780 kb 103,790 kb 103,800 kb 103,810 kb 103,820 kb [0 - 100] p300-M-fb [0 - 150] [0 - 100] p300-M-fb p300-E-fb [0 - 150] [0 - 100] p300-E-fb p300-T-fb [0 - 150] TestedTested regionregion p300-T-fb Apln Tested region Lmo2
Dab2 Mef2c 409 kb 167 kb 6,000 kb 6,100 kb 6,200 kb 6,300 kb 6,400 kb 83,640 kb 83,660 kb 83,680 kb 83,700 kb 83,720 kb 83,740 kb 83,760 kb 83,780 kb 83,800 kb
[0 - 50] [0 - 100] p300-M-fb p300-M-fb [0 - 50] [0 - 100] p300-E-fb p300-E-fb [0 - 100] [0 - 50] p300-T-fb p300-T-fb Tested region Tested region Dab2 Mef2c
Egfl7 Notch1 26,320 kb 26,330 kb 46 kb 26,340 kb 26,350 kb 26,426 kb 26,428 kb 26,430 kb 26,432 kb 26,434 kb 26,436 kb23 kb 26,438 kb 26,440 kb 26,442 kb 26,444 kb 26,446 kb 26,448 kb
[0 - 200] [0 - 100] p300-M-fb p300-M-fb [0 - 200] [0 - 100] p300-E-fb p300-E-fb [0 - 200] [0 - 100] p300-T-fb p300-T-fb Tested region Tested region Enh1 Enh2 Enh3 Egfl7
Egfl7_Enh1 Egfl7_Enh3 Notch1
Enh2 no consistent endothelial cell activity
Eng Sema6d 50 kb 32,500 kb 32,510 kb 32,520 kb 32,530 kb 32,540 kb 124,380 kb 124,400 kb 124,420 kb 117 kb124,440 kb 124,460 kb 124,480 kb
[0 - 150] [0 - 150] p300-M-fb [0 - 150] p300-M-fb [0 - 150] p300-E-fb p300-E-fb [0 - 150] [0 - 150] p300-T-fb p300-T-fb Tested region Tested region Eng Sema6d
Ephb4 Sox7 29 kb 16 kb 137,790 kb 137,800 kb 137,810 kb 64,564 kb 64,566 kb 64,568 kb 64,570 kb 64,572 kb 64,574 kb 64,576 kb 64,578 kb
[0 - 100] [0 - 100] p300-M-fb p300-M-fb [0 - 100] [0 - 100] p300-E-fb p300-E-fb [0 - 100] [0 - 100] p300-T-fb p300-T-fb Tested region Tested region Sox7 Ephb4
Fig. 5 -- figure supplement 2 Fig. 5 -- figure supplement 2. Transient transgenic assays to measure in vivo activity of candidate endothelial cell/blood enhancers. For each tested enhancer with positive endothelial cell or blood activity, we show the candidate enhancer’s Ep300 signal in skeletal muscle (Ep300-M-fb), whole embryo (Ep300-E-fb), and endothelial cell/blood (Ep300-T-fb), with respect to the gene both. The whole mount images show each of the positive Xgal-stained embryos. Apln-Enh Dab2-Enh Egfl7_Enh1 Egfl7_Enh3 Eng-Enh Ephb4-Enh HV CV CV HV DA HV DA DA DA CV DA PA
Lmo2-Enh Mef2c-Enh Notch1_Enh1 Sema6d-Enh Sox7-Enh
HV DA RA EC EC EC DA CV RV RV RV AS
Fig. 5 -- figure supplement 3. Histological sections of transient transgenic embryos. Histological sections of embryos containing the indicated enhancer-driven lacZ transgene. AS: aortic sac; CV: cardi- nal vein; DA: dorsal aorta; EC: endocardial cushion; HV: head vein; PA: pulmonary artery; RA: right atrium; RV: right ventricle. Bar, 100 µm. + A Enrichment % Motif GO Terms B Ep300-T-fb Regions Ep300-M-fb Regions Tie2Cre Myf5Cre 30 Tie2Cre-enriched motifs (****) Myf5Cre-enriched motifs (****) Random Random 25 Tie2Cre Myf5Cre Tie2Cre Myf5Cre Vasc. Blood Heme Sk. M. Vasc. Blood Heme Sk. M.
2 Motif Name
bits 1
A SIX 20 GA G GA C G TT 2 TA C CCTG GAAT C 0 A GTT
TC T C bits 1 2 Ebox:HOX T G GC A T G GA C A C T T AT G T A T A CTGA T CG CG C A C C A T CGA GG GC 0 A C
C 15 bits 1 A T T G Ebox:HOMEO TG A GT G G G GG A C AGC C C T G G C CT AT T T A C C TG C GACG T C C A AGTC TGA CA 0
G Density CA TG AT 2
bits 1 2
TBOX1 10
C CTTG T GCT G C TACC AGT 0 AG A 1 bits G T RBPJ:Ebox 2 G C A G A A G G G GAA GAA A CC G C CC C A AT CGT T T CTCGGCT 0 TT GAT A GG T 5 G A CA G bits 1 HOX 2 GAC A T T A T CGCT T C C GAC T T GT TG ACT 0 C A A CA
bits 1 A T AT HOMEO 0 T C T 2 C TC A AG G G G 0 ATT ATCTCT AA bits 1 C C PRDM9 0.0 0.2 0.4 0.6 0.8 1.0 2 0.0 0.2 0.4 0.6 0.8 1.0 AT G AT T GA G T G C T C C T G AT C ATAT A C GTATGA G C CA 0 AGTA T bits 1 C C RUNX Conservation score Conservation score G T 2 TCT C G T G CA GA C CGGATCA 0 G min promoter T T bits 1 TG G RBPJ C 3x enhancer fragment 2 T C TG C G A G C T ATCA 0 C CA
bits 1 TTC CA Luciferase 2 GATA C G
A AG A C GTC CAC 0 AT G
1 GATAA bits 2 ETS G CC T A C GG C CA TAG C TA GATG T CT AAT A 0 GAT Mef2c mut bits 1 G ETS:HOMEO2 2
A C T GG A C GA A T GG G C C GC AT TCAT TCCTCTA 0 CA AG
bits 1 ETS-Fox2 (Mef2c) GGA ETS:FOX2 * 2 G T GT G C CC C A GC GA A G C T AT C CC C T GTT GC AATAT T A 0 ACCA TGA bits 1 GGA ETS-Fox3 (Bcl2l1) * 2 ETS:FOX1 A TG TT C T T G G G AC A TGA T CG AG C T A GT CA CGA CAT 0 AAG GG bits 1 ETS-Homeo2 (Egfl7-Enh1) * 2 Tbx:ETS A A AT T G TC C CAG G GG TG G ATC C CCT CA AATT TA 0 G C bits 1 GGA ETS:HOMEO1 ETS-Homeo2 (Egfl7-Enh3) * 2 T C G G A CA A G CT AA C A G T AT G T A A CG T CG GC GC A AG C GG AT A C C T GATCACCT TCAT 0 A T GG bits 1 ETS-Homeo2 (Kdr) 2 SOX * C GT CT A TG CGG 0 CCTAT T
1 bits TTGT ETS-Homeo2 (Flt4) 2 ETS:FOX1’ * G CT TC T AG CG T CA G G A C C A G TA AG T CC T CC TC CGA T A A G T T 0 AAT GG G bits 1 ETS-Ebox (Apln) 2 AP1 * A G T GA A C CT C GA G T CT CTACACGA 0 GT C
1 bits T A T A ETS:FOX3 ETS-Ebox (Zfp521) * A 2 G C T TT GT CT GG G G C CCC T G G CT CT C A G C AA T CA AAA CGA AGAAT 0 T G CG bits 1 Sox (Apln) 2 T KLF A T
T A A A TT G TC ATCGA * A CA G G 0 G G C bits 1 G GG NR Sox (Robo4) 2 * T T GG CA G C TC A C G GT T G G A C AT AC T C TTT T ATA GTAA GAG CAT 0 GG C A G CA A G bits 1 ZNF Sox (Sema3g) 2 AC CA GT GT G TG C ACA * C 0 T
bits 1 G G CC MITF pGL3-SV40-enh 2 GCT AT G CA T G G GG TG TAG C C T CA * AA T 0 CGT AC C G bits 1 GATA:SCL C 2 GG C C T G A G A C G G T C A T T T G T T CGAA T T C CGA 1 2 3 4 5 6 7 8 9 A 0 T 10 11 12 13 14 15 16 17 18 19 20 C Ceqlogo 13.03.16 13:22 0 50 100 150 200 250 300 bits 1 TG GATA A ETS:Ebox CG 2 CA T CG GTC G CGAT 0 TA Fold Activation A AG TG
A bits 1 GGA C FOX 2 CT GA CT 0 CT GTACT 3x motif min promoter
bits 1 D TG T Ebox 2
T T G A T C CGC A TGT C A G A 0 GC CA TG bits 1 MYB Luciferase T C 2 C A T G T A G C A AAT 0 C
bits 1 AAC G STAT T A 2 A
G GT CA GAATCATTC CG 0 CC GG TT AA bits 1 2 T MEF2 G A A TA C T GTC GACCGT AT 0 CC AATT GG Neg control 1 bits A A T ATF catatctagaa 2
A TCTGAC T GGA TC A CC C GG CGACATAAT GA TGG G C 0 GA TC
A bits 1 ETS only T SMAD catatctGGAA 2 C G
GC G CGAT CT 0 GA
bits 1 ETS-Fox1 GTCT TBOX2 CCGGATGTT * 2
AT AA C C C GT GGA T T T CCG GACAA 0 T C T
A bits 1 T C Nanog ETS-Fox2 GTAAACAGGAAGT * 2 A GGG T A A AT C T A AG CT GA T TGTA TC 0 CC AT AG bits 1 GC C G NFAT ETS-Fox3 * A T 2 TGTTTACGGAAGT GA CA CC C TGTC CGTGA 0 T A bits 1 GGAA NF1 ETS-Homeo2 * 2 T A T A CT C G T AACCGGAAGT A A T T GA A ACGGA TGAC GCT CTCG 0 T C G A bits 1 GG CC ETS-Ebox CT GC TEAD 2 CA C * AT 0 TA T AGGAAACAGCTG GGAAT bits 1 MEIS1 T C ETS-Tbx G GG C C A T A C G TC A 0 A G * TGACA CACACCGGAAG Significance % Motif pos. % Top 20 pGL3-SV40-enh * (neg ln P-val) regions GO BP Terms 0 5 10 15 20 25 30 Fig. 6. 5 100 1164 0 67% 0 65% Fold Activation A Ep300-T-fb Ep300-M-fb
-ln P % of Best -ln P % of Best -value Targets Match -value Targets Match CGCA G CG GAGT A A A GAC G TTC TT C T C CG AT C G T A CT T A C G CACCTA T T A G TA 879 48.7 GATA CAGCTG 1049 68.0 Ebox A CA G C G G T GG TC T CT G AAGGAATA T 855 57.0 ETS ATGGAATCC 111 32.9 TEAD A GC A G CA GCG TGT AT A A A G A G T A C A GA TG CA CG C C T T C T CT A G ACAA T 199 19.0 Sox A ATT AT 58 31.5 PBX:Homeo A G C GCG GA T T C G GT A AC AA A A T C G A T AT G GTC TTACCCA CTGA 149 14.7 ETS:Ebox GCCA ATG 53 21.6 Ebox
C AG G T A CG A T T C G T TT A A T T AGA A CCT C A T A G C A T ATGATTCC G 105 16.9 Fox C CCTG CC 51 5.7 Ebox A C G GAT C T T C G G C C C AG CAT CA TGGG GTA 84 15.9 KLF AAATGTCA 50 28.2 Nuc. Recept. GA C G G CC T C A G A C T T T T GAAG CG A A GTATCT C A T GT A A 83 11.8 AP1 G TTA 41 27.7 Homeo
A AT A CAA C CTC TG CGA T C CG A GT C C A C A A A G T A GTGTGG TG C A T CGTATTTATTAG 69 3.9 Mef2 35 3.2 Hox G G ATG A T C G G TA A T TT A T A C C C AT G G C T A AG C T C C T G GG A G CC G T A G CT TATGCATTAC 61 10.3 ATF CCCTA AAATA 32 3.4 Mef2
C G A C C T TG G T T T C A GAA ACATCCGC 46 16.5 Ebox CTGT AG 32 8.6 Meis1
Tie2Cre-enriched B Ep300-T-fb Regions Ep300-M-fb Regions motifs Gata 1.0 1.0 ETS ETS:HOMEO2 ETS:FOX2 0.9 0.9 ETS:FOX1 Tbx:ETS ETS:HOMEO1 SOX 0.8 0.8 ETS:FOX1’ AP1 ETS:FOX3 0.7 0.7 KLF NR ZNF MITF 0.6 0.6 Myf5Cre-enriched motifs 0.5 0.5 Six Ebox:HOX Cumulative frequency Ebox:HOMEO 0.4 0.4 Tbox1 RBPJ:Ebox HOX 0.3 0.3 HOMEO PRDM9 RUNX 0.2 0.2 RBPJ 0.0 0.5 1.0 0.0 0.5 1.0 Conservation score Conservation score Random Fig. 6 -- figure supplement 1. Motifs with enriched p300 occupancy in Tie2Cre or Myf5Cre embryo samples. A. De novo motif discovery. The top 10 discovered motifs are shown, with the signifi- cance P-value, percent of regions containing the motif, and the best matching known motif class. B. Conservation of Tie2Cre-enriched motifs (dashed lines) or Myf5Cre-enriched motifs (solid lines) within 100 bp of Ep300-T-fb or Ep300-M-fb peak summits. The distribution of conservation scores for each indicated motif is shown as a cumulative frequency plot. Random indicates randomly sampled 12 nt regions from within the Ep300-T-fb or Ep300-M-fb regions. A B C VEcad-CreERT2; p300fb/fb; Tissue Specific EC Expression Profiles Rosa26fsBirA (Nolan et al., Dev Cell, 2013) Ep300 signal (RP10M)
0 90
Heart ECs Lung ECs Relative Gene Expr (log2 Ht/Lu) # Peaks assoc with tissue-specific genes Rep 1 Rep 2 Rep 3 Rep 1 Rep 2 Rep 3 Decile -1.0 0.0 1.0 Decile 0 200 400 *** *** Heart 1 Heart 1 Heart Lung Lung *** *** 2 2 *** 3 3 *** 4 4
Ep300 Regions, *** Ranked 5 5
by Ht/Lu *** 6 6 Signal Ratio *** *** 7 7 *** *** 8 8 *** *** 9 9 *** *** Lung 10 10
Distance from Ep300 peak center (±1 kb) ***, P<0.0001, Wilcoxon ***, P<0.0001, Fisher Exact
Neg Log Bonf P-value D 10 Foreground: D1 D10 16 449 60 Background: D10 D1
2
1 bits TEAD:SOX 2 G T G G A C A G GAGT A G T A G C C AT T C C A G C G C A C C TA T T T AT G A C TT GAT AT CT CT 1 0 2 3 4 5 6 7 8 9 G 10 11 12 13 14 15 16 17 18 C19 1 2 T bits ACAA G AT ETS:FOX2 G GC T T G C G AT C C C G C A T T CG A C T GAGTT TA GATGT C A 1 0 2 3 4 5 6 7 8 9 TA
1 10 11 12 13 14
bits A ETS 2 ACAG
A A GA A C AGT T C G C C G TC T A AT T C G CG G A 1 2 C3 4 5 6 7 8 9 0
10 1 2 C G ETS:HOMEO1 bits G G AA T C G G A C C A G T T G C G AG G TG C C GAC AC C A A T G TG CT ACGCT C C TGATG A TAAC CT A 1 0 2 3 4 5 6 7 8 9
1 A 10 11 12 13 14 15 16 17 18 bits 2 T ETS:TBOX2 G T A A C T T A G G CG CA C T G T TAT GT TA C T AG C AA TA G T G T G G G C GCG CA CAC CA 2 3 4 5 6 7 8 9 0 1 G 10 11 12 13 14 15 16 17 1 T bits 2 C AG G ETS:FOX1 C AG T GC TCT T AG G AT CC A G A AGA G G A 2 3 4 5 6 7 8 9 0 1 A T 1 A 10 11 12 13 bits 2 T NFAT:AP1 C GT A G A G AAT GA GA A A T G C C CTG G C G AT C G A A C CC T G G A CG T A G C C C C CG T G CT G T GT T C GTC G 2 3 4 5 6 7 8 9 0 1 A
10 11 12 13 14 15 16 T17 18 19 20 A 1 A G bits 2 AA T A FOX C C CT T T G A G G A 1 2 0 3 4 5 6 7 8 9 T
1 10 11 12 bits TCF3 2 C T C G G ATC A CGGG GCCG CA T T 1 0 2 3 4 T5 6 7 8 9 T
10 T TTTA 1 2 A bits A CA HOXB4 A G AG A G CC A T T C C C AT CTGTT GAC TTG T G AG A CG 1 0 2 3 T4 5 G6 7 8 9 T TG 10 11 12 1 bits FOXH1 2 A T
A GG CC C C A GT C GAC 4 5 6 7 8 9 0 1 2 3 TG
10 11 12 A T 1 2 G T bits T T GATA CA TCGGG CAT CCTAT C 1 G 2 3 4 5 6 7 8 A 0 1 bits A MEF2 2 A T G A AA T C T GTC GACCT AT T G G TT G 2 3 4 5 6 7 8 9 1 A 0
C A 10 1 C bits 2 AA AG LHX GA C C 2 3 4 5 6 7 8 T 1 0 C GG 1 bits EBOX 2 TA TA A C
A G C GT 1 0 2 3 4 5 C6 G7 8 9
10 AT 1 G bits 2 C ETS:EBOX A C A G G G G T T C C C C CAT G C CGT 1 C G 2 3 4 5 6 A7 8 9 0 CA T G A T 10 11 12 13 14 15 16 1 G bits EBOX:HOMEO 2 T TG A G GT G G G A C G A G GC C C T G G C T T C T T A A C C T GCT C GGAC C A AGTC TA C G G A G C 3 4 5 6 7 8 9 0 1 2 A A 10 11 12 13 14 15 16 17 18 19 20 1 G bits 2 AT CA T NKX6.1 A T G G TG G C ACA CT 1 0 2 3 4 5 6 7 8 1 bits G PAX3-FKHR 2 G C G C AT T C T C G A GA G GA TACTG AG 2 3 4 5 6 7 8 9 0 1 A
T 10 11 12 13 14 15 TAAT 1 GCA GA bits 2 T C A T EBOX2 GA T C AT C TCA T 2 3 4 5 6 7 8 A 0 1 1 bits TBOX 2 C CG G C CTTG T GCT C GA T TA C GT A C 1 0 2 3 4 5 6 7 A8 1 G bits 2 A GT EBF G TA GC A CG T AT G AT GT 2 3 4 5 6 7 8 9 0 1 C T A 1 10 11 12 bits NR 2 T C G A
T T GG CA G C A C GTTC T GG G A C A AC T C TTT TTATA GTAA GAG CAT C C G G 1 0 2 3 4 5 C6 7 8 9
10 11 12 C13 14 1 GG A AGG A NF1 bits T G C C C G A A 1 Fig. 7. 0 T2 G3 G4 C5 A6 G7 8 A B z-score across tissues z-score across tissues -4.5 6.9 −6 0 6 -0.7
Clusters
clusters with heart and lung selective expression t s i er M ain BM B lo m rai n estis Li v Lung B r es t Glom Live r Hea r Lun g T B G Hear t T Spleen Muscle Splee n Muscl e
Figure 7 -- figure supplement 1. Organ-specific EC gene expression. Genes with at least 4-fold differential mean expression between highest and lowest EC samples were identified from microarray data (Nolan 2013). These genes were grouped by k-means clustering. Genes in clus- ters containing selective heart or lung expression were analyzed further. Expression of these heart or lung EC genes is displayed in B using hierarchical clustering. A B Rosa26mTmG; VECad-CreERT2 mGFP Pecam1 m-Tomato Heart
heart lung VEcad-Cre – + – +
300 kd Bronchus
SA pull-down, Ep300 IB Artery
Bronchus Lung
Bronchus
Artery
Fig. 7 -- figure supplement 2. VEcad-CreERT2 activation of Ep300 biotinylation. A. VEcad-CreERT2 activated Ep300 biotinylation by Rosa26fsBirA, permitting its pull- down on streptavidin (SA) beads. B. EC-selective activation of the Rosa26mTmG Cre reporter allele by VECad-CreERT2. Tamoxifen was administered at P1 and P2, and tissues were analyzed at 6 weeks of age. Tbx3: Lung ECs
297 kb
120,000 kb 120,100 kb 120,200 kb
[0 - 100] HtCon1
[0 - 100] HtEC1
[0 - 100] HtEC2
[0 - 100] HtEC3
[0 - 100] LuCon1
[0 - 100] LuEC1
[0 - 100] LuEC2
[0 - 100] LuEC3 Ht_EC_peaks Lu_EC_peaks Tbx3 Tbx2 and Tbx4: Lung non-ECs 139 kb
85,640 kb 85,660 kb 85,680 kb 85,700 kb 85,720 kb 85,740 kb 85,760 kb
[0 - 100] HtCon1
[0 - 100] HtEC1
[0 - 100] HtEC2
[0 - 100] HtEC3
[0 - 100] LuCon1
[0 - 100] LuEC1
[0 - 100] LuEC2
[0 - 100] LuEC3 Ht_EC_peaks Lu_EC_peaks Tbx2 Tbx4 Meox2: Heart ECs
236 kb
37,700 kb 37,720 kb 37,740 kb 37,760 kb 37,780 kb 37,800 kb 37,820 kb 37,840 kb 37,860 kb 37,880 kb 37,900 kb 37,920 kb
[0 - 100] HtCon1
[0 - 100] HtEC1
[0 - 100] HtEC2
[0 - 100] HtEC3
[0 - 100] LuCon1
[0 - 100] LuEC1
[0 - 100] LuEC2
[0 - 100] LuEC3 Ht_EC_peaks Lu_EC_peaks Meox2 Myh6: Heart non-ECs
61 kb
55,550 kb 55,560 kb 55,570 kb 55,580 kb 55,590 kb 55,600 kb
[0 - 100] HtCon1
[0 - 100] HtEC1
[0 - 100] HtEC2
[0 - 100] HtEC3
[0 - 100] LuCon1
[0 - 100] LuEC1
[0 - 100] LuEC2
[0 - 100] LuEC3 Ht_EC_peaks Lu_EC_peaks
Efs Il25 Cmtm5 Myh6 D830015G02Rik Myh7
Figure 7 -- figure supplement 3. Ep300fb bioChIP-seq signal in whole organ ("Con"; EIIaCre) or ECs ("EC"; VEcad-CreERT2) at genes with EC or non-EC expression in lung or heart. Neg Log10 Bonferonni P-value
3.4 13.9 4.3 25.9 Dec 1 (Ht) Dec 10 (Lu) vasculature development blood vessel development reg. of protein modification process anatom. struct. form. involved in morph. in utero embryonic development cardiovascular system development blood vessel morphogenesis
GO Biol. Process regulation of vasculature development cerebral arterial disease essential hypertension persistent fetal circulation syndrome pulmonary hypertension hypertension congestive heart failure myocardial infarction arteriopathy arteriosclerosis Disease Ontology arterial occlusive disease coronary heart disease
Figure 7 -- figure supplement 4. GO biological process (top) or disease ontol- ogy (bottom) terms enriched in decile 1 or decile 10 of Ep300-VE-fb regions. Grey indicates no significant enrichment. The top five biological process terms in deciles 1 and 10 and selected relevant disease ontology terms were are displayed.