Mapping Cell Type-Specific Transcriptional Enhancers Using High Affinity, 8 Lineage-Specific Ep300 Biochip-Seq 9
Total Page:16
File Type:pdf, Size:1020Kb
1 2 3 4 5 6 7 Mapping cell type-specific transcriptional enhancers using high affinity, 8 lineage-specific Ep300 bioChIP-seq 9 10 Running title: Mapping cell type-specific enhancers 11 12 Pingzhu Zhou1*, Fei Gu1*, Lina Zhang2, Brynn N. Akerberg,1 Qing Ma1, Kai Li1, Aibin He1,3, 13 Zhiqiang Lin1, Sean M. Stevens1, Bin Zhou4, and William T. Pu1,5 14 *contributed equally to this study 15 16 1Department of Cardiology, Boston Children’s Hospital, 300 Longwood Ave, Boston, MA 17 02115, USA 18 2Department of Biochemistry, Institute of Basic Medicine, Shanghai University of Traditional 19 Chinese Medicine, Shanghai 201203, China 20 3Institute of Molecular Medicine, Peking-Tsinghua Center for Life Sciences, Peking University, 21 Beijing 100871, China 22 4State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Shanghai 23 Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China 24 5Harvard Stem Cell Institute, Harvard University, 1350 Massachusetts Avenue, Suite 727W, 25 Cambridge, MA 02138, USA 26 27 28 - 1 - 29 Abstract 30 Understanding the mechanisms that regulate cell type-specific transcriptional programs 31 requires developing a lexicon of their genomic regulatory elements. We developed a 32 lineage-selective method to map transcriptional enhancers, regulatory genomic regions that 33 activate transcription, in mice. Since most tissue-specific enhancers are bound by the 34 transcriptional co-activator Ep300, we used Cre-directed, lineage-specific Ep300 biotinylation 35 and pulldown on immobilized streptavidin followed by next generation sequencing of 36 co-precipitated DNA to indentify lineage-specific enhancers. By driving this system with 37 lineage-specific Cre transgenes, we mapped enhancers active in embryonic endothelial 38 cells/blood or skeletal muscle. Analysis of these enhancers identified new transcription factor 39 heterodimer motifs that likely regulate transcription in these lineages. Furthermore, we 40 identified candidate enhancers that regulate adult heart- or lung- specific endothelial cell 41 specialization. Our strategy for tissue-specific protein biotinylation opens new avenues for 42 studying lineage-specific protein-DNA and protein-protein interactions. 43 44 Keywords: transcriptional enhancer; endothelial cell; angiogenesis; Ep300 45 - 2 - 46 Introduction 47 The diverse cell types of a multicellular organism share the same genome but express 48 distinct gene expression programs. In mammals, precise cell-type specific regulation of gene 49 expression depends on transcriptional enhancers, non-coding regions of the genome required 50 to activate expression of their target genes (Visel et al., 2009a; Bulger and Groudine, 2011). 51 Enhancers are bound by transcription factors and transcriptional co-activators, which then 52 contact RNA polymerase 2 engaged at the promoter, stimulating gene transcription. 53 Enhancers are nodal points of transcriptional networks, integrating multiple upstream 54 signals to regulate gene expression. Because enhancers do not have defined sequence or 55 location with respect to their target genes, mapping enhancers is a major bottleneck for 56 delineating transcriptional networks. Recently chromatin immunoprecipitation of enhancer 57 features followed by sequencing (ChIP-seq) has been used to map potential enhancer. DNase 58 hypersensitivity (Crawford et al., 2006; Thurman et al., 2012), H3K27ac (histone H3 acetylated 59 on lysine 27) occupancy (Creyghton et al., 2010; Nord et al., 2013), or H3K4me1 (histone H3 60 mono-methylated on lysine 4) occupancy (Heintzman et al., 2007) are chromatin features that 61 have been used to identify cell-type specific enhancers. While most enhancers are DNase 62 hypersensitive, DNase hypersensitive regions are often not active enhancers (Crawford et al., 63 2006; Thurman et al., 2012). H3K27ac is enriched on cell type-specific enhancers (Creyghton 64 et al., 2010; Nord et al., 2013), but may be a less accurate predictor of enhancers than other 65 transcriptional regulators (Dogan et al., 2015). Chromatin occupancy of Ep300, a 66 transcriptional co-activator that catalyzes H3K27ac deposition, has been found to accurately 67 predict active enhancers (Visel et al., 2009b). However, antibodies for Ep300 are marginal for 68 robust ChIP-seq, particularly from tissues, leading to low reproducibility, variation between 69 antibody lots, and inefficient enhancer identification (Gasper et al., 2014). - 3 - 70 Mammalian tissues are composed of multiple cell types, each with their own 71 lineage-specific transcriptional enhancers. Thus defining lineage-specific enhancers from 72 mammalian tissues requires developing strategies that overcome the cellular heterogeneity of 73 mammalian tissues, particularly when the lineage of interest comprises a small fraction of the 74 cells in the tissue. Past efforts to surmount this challenge have taken the strategy of purifying 75 nuclei from the cell type of interest using a lineage-specific tag. For instance, nuclei labeled by 76 lineage-specific expression of a fluorescent protein have been purified by FACS (Bonn et al., 77 2012). This method is limited by the need to dissociate tissues and recover intact nuclei, and 78 by the relatively slow rate of FACS and the need to collect millions of labeled nuclei. To 79 circumvent the FACS bottleneck, cell type-specific overexpression of tagged SUN1, a nuclear 80 envelope protein, has been used to permit affinity purification of nuclei (Deal and Henikoff, 81 2010; Mo et al., 2015). Although this mouse line was reported to be normal, SUN1 82 overexpression potentially could affect cell phenotype and gene regulation (Chen et al., 2012). 83 Chromatin from isolated nuclei are then subjected to ChIP-seq to identify histone signatures of 84 enhancer activity. However, as noted above histone signatures may less accurately predict 85 enhancer activity compared to occupancy by key transcriptional regulators (Dogan et al., 86 2015). 87 Here, we report an approach to identify murine enhancers active in a specific lineage within 88 a tissue. We developed a knock-in allele of Ep300 in which the protein is labeled by the bio 89 peptide sequence (de Boer et al., 2003; He et al., 2011). Cre recombinase-directed, cell type 90 specific expression of BirA, an E. coli enzyme that biotinylates the bio epitope tag (de Boer et 91 al., 2003), allows selective Ep300 ChIP-seq, thereby identifying enhancers active in the cell 92 type of interest. Using this strategy, we identified thousands of endothelial cell (EC) and 93 skeletal muscle lineage enhancers active during embryonic development. Extending the - 4 - 94 approach to adult organs, we defined adult EC enhancers, including enhancers associated 95 with distinct EC gene expression programs in heart compared to lung. Analysis of motifs 96 enriched in EC or skeletal muscle lineage enhancers predicted novel transcription factor motif 97 signatures that govern EC gene expression. 98 Results 99 Efficient identification of enhancers using Ep300fb bioChIP-seq 100 We developed an epitope-tagged Ep300 allele, Ep300fb, in which FLAG and bio epitopes 101 (de Boer et al., 2003; He et al., 2011) were knocked into the C-terminus of endogenous Ep300 102 (Fig. 1A-B and Figure 1 -- figure supplement 1A). Transgenically expressed BirA (Driegen et 103 al., 2005) biotinylates the bio epitope, permitting quantitative Ep300 pull down on streptavidin 104 beads (Fig. 1C). We have not noted abnormal phenotypes. Heart development and function 105 are sensitive to Ep300 gene dosage (Shikama et al., 2003; Wei et al., 2008), yet Ep300fb/fb 106 homozygous mice survived normally (Fig. 1D) and Ep300fb/fb hearts expressed normal levels of 107 Ep300 and had normal size and function (Figure 1 -- figure supplement 1B-E). These data 108 indicate that Ep300fb is not overtly hypomorphic. 109 To evaluate Ep300fb-based mapping of Ep300 chromatin occupancy, we isolated 110 embryonic stem cells (ESCs) from Ep300fb/fb; Rosa26BirA/BirA mice. We then performed Ep300fb 111 biotin-mediated chromatin precipitation followed by sequencing (bioChiP-seq), in which high 112 affinity biotin-streptavidin interaction is used to pull down Ep300 and its associated chromatin 113 (He et al., 2011). Biological duplicate sample signals and peak calls correlated well (93.6% 114 overlap; Spearman r=0.96; Fig. 2A-B). We compared the results to publicly available Ep300 115 antibody ChIP-seq data generated by ENCODE (overlap between duplicate peaks 77.8%; 116 r=0.91; Fig. 2A-B). Ep300 bioChiP-seq identified 48963 Ep300-bound regions ("p300 regions") 117 shared by both replicates, compared to 15281 for Ep300 antibody ChIP-seq (Fig. 2A,C). The - 5 - 118 large majority (89.6%) of Ep300 regions detected by antibody were also found by Ep300 119 bioChiP-seq, and Ep300 signal was substantially stronger using bioChiP-seq (Fig. 2A,C,D). 120 These data indicate that Ep300fb bioChiP-seq has improved sensitivity compared to Ep300 121 antibody ChIP-seq for mapping Ep300 chromatin occupancy in cultured cells. 122 Identification of tissue-specific enhancers using Ep300fb bioChIP-seq 123 We used Ep300fb/+; Rosa26BirA/+ mice to analyze Ep300fb genome-wide occupancy in 124 embryonic heart and forebrain. We performed bioChiP-seq on heart and forebrain from 125 embryonic day 12.5 (E12.5) embryos in biological duplicate (Fig. 3A-B). There was high 126 reproducibility (83% and 93%, respectively) between biological duplicates (Fig. 3B and Figure 127 3 -- figure supplement