Supplementary text for “Tissue-specific Targeting of Cell Fate Regulatory Genes by E2f Factors”, by Lisa M. Julian et al.

Table of Contents

ChIP antibodies

ChIP-chip and microarray design

ChIP-chip data analysis

Association between E2f3 and Ctcf

Supplementary text references

ChIP antibodies

We performed ChIP-PCR and ChIP-chip using antibodies SC-878 and SC-866 against E2f3 and E2f4, respectively. These antibodies have been used successfully in previous ChIP-based experiments [1-5], and we have validated their specificity both here (Fig. S1, S2) and previously [2]. In NPCs, ChIP-PCR analyses show that both antibodies yield significant enrichment (relative to pre-immune IgG) at known E2f target promoters (Thymidine Kinase (TK1) and p107), but not at the Chrna1 promoter, which serves as a negative control [2,6] (Figure S1A, S1B). We have confirmed that NPCs deficient in one E2f3 isoform do not up-regulate the remaining isoform or other E2f factors [2], and that the C-terminal-specific antibody used in these experiments precipitates both E2f3 isoforms in our cultures with comparable efficiency (Fig. 1B and ref. 2).

ChIP-chip and microarray design

In a set of pilot ChIP-chip experiments surveying E2f binding sites along all non-repetitive sequences of mouse chromosome 7, we estimated that 89% of E2f3-bound and 85% of E2f4-bound sites are located between 3.5 kb upstream and 1.5 kb downstream of a transcriptional start site (TSS) (Figure S2A, S2B). Because the vast majority of E2f binding sites are close to TSSs, we designed tiling arrays containing DNA probes spanning 5 kb upstream to 3 kb downstream of the TSS of approximately 28,000 well-curated mouse transcripts, as defined by the RefSeq database. This strategy gave us extensive coverage of promoters and neighbouring sequences, and allowed us to locate E2f3 and E2f4 binding sites at the vast majority of known promoter regions in the mouse genome.
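The pilot estimate above amounts to asking what fraction of bound sites fall inside a strand-aware window around any TSS. The following is a minimal sketch of that calculation, not the procedure actually used for Figure S2; the function names and the toy coordinates are hypothetical, and each peak is reduced to a single position on one chromosome for simplicity.

```python
def signed_offset(peak_pos, tss_pos, strand):
    """Offset of a peak from a TSS in bp; negative values are upstream of the TSS."""
    delta = peak_pos - tss_pos
    return delta if strand == "+" else -delta


def fraction_near_tss(peaks, tss_sites, upstream=3500, downstream=1500):
    """Fraction of peak positions lying within [-upstream, +downstream] of any TSS."""
    if not peaks:
        return 0.0
    hits = 0
    for pos in peaks:
        # A peak counts once, even if it sits in the window of several TSSs.
        if any(-upstream <= signed_offset(pos, tss, strand) <= downstream
               for tss, strand in tss_sites):
            hits += 1
    return hits / len(peaks)


# Toy data (hypothetical): two TSSs on opposite strands and six peak positions.
tss_sites = [(10_000, "+"), (50_000, "-")]
peaks = [7_000, 11_000, 12_000, 52_000, 47_000, 30_000]
near_fraction = fraction_near_tss(peaks, tss_sites)
print(f"{near_fraction:.0%} of peaks lie within the TSS window")
```

For a minus-strand gene, "upstream" corresponds to higher genomic coordinates, which is why the offset is negated before the window test.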

Based on the results of the pilot experiments, we designed microarrays containing approximately one million DNA probes representing 5 kb upstream to 3 kb downstream of the TSS of all known mouse transcripts, based on the mm9 mouse genome assembly. A total of 24,654 unique regions were surveyed. The 60-mer probes were typically tiled at a density of 5 per kilobase pair of DNA sequence. UCSC gene annotations, extracted from the UCSC genome browser [7], were used to identify all known transcripts.
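As a quick consistency check on these design figures, the number of surveyed regions, the window size, and the tiling density can be multiplied out. The sketch below assumes every region spans the full 8 kb window at exactly 5 probes per kb with no overlap between regions, which is an idealization of "typically tiled"; the variable names are illustrative only.

```python
# Design figures taken from the text.
regions = 24_654          # unique regions surveyed
upstream_kb = 5           # window upstream of the TSS
downstream_kb = 3         # window downstream of the TSS
probes_per_kb = 5         # typical tiling density
probe_length_bp = 60      # 60-mer probes

# Idealized assumption: every region is a full, non-overlapping window.
window_kb = upstream_kb + downstream_kb
probes_per_region = window_kb * probes_per_kb
total_probes = regions * probes_per_region

# Average start-to-start spacing and the untiled gap between adjacent probes.
spacing_bp = 1000 // probes_per_kb
gap_bp = spacing_bp - probe_length_bp

print(total_probes)        # 986160, i.e. approximately one million probes
print(spacing_bp, gap_bp)  # 200 140
```

Under these assumptions the arithmetic gives roughly 986,000 probes, in line with the "approximately one million" stated above.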

ChIP-chip data analysis

ChIP-chip microarrays were scanned on an Agilent scanner at 2-micron resolution. The background-subtracted median fluorescence intensity measurements from three biological replicates (independent neurosphere cultures) were analyzed in CisGenome [8,9]. After quantile normalization, regions of significant enrichment in the immunoprecipitation samples ("peaks" of E2f3 binding) were identified using a moving average analysis of the IP over input ratio. Regions with a fold-enrichment of two or greater and a false-discovery rate (FDR) of 10% or less were deemed to represent bona fide E2f3 binding sites. This FDR cut-off has been used in a number of previous studies with a similar experimental protocol to successfully identify transcription factor binding sites for E2fs and other factors [10-12]. Enriched peaks located within a 1,000 bp region were merged into a single peak, as they were likely to result from a single binding event. For each enriched peak, the E2f target gene was defined as the gene whose TSS is closest to that peak.

To ensure that only high-stringency binding sites were reported, the results shown for each E2f protein are the average of three biologically independent replicate experiments, and a peak had to be enriched in at least two separate experiments to be considered a valid binding event. We decided to focus exclusively on physiologically relevant target genes by ensuring that all targets included in our isoform-dependent data sets are also bound by E2f3 in WT cells. We have therefore excluded any target genes that may be aberrantly bound by the remaining isoform in the absence of the other E2f3 isoform; we identified 537 and 93 such peaks in E2f3a-/- and E2f3b-/- cells, respectively (Figure 1C).
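The post-processing steps described above (merging nearby peaks, requiring support from at least two replicates, and assigning each peak to the gene with the nearest TSS) can be sketched as follows. This is an illustration under stated assumptions rather than the CisGenome workflow itself: peaks are (start, end) intervals on one chromosome, "within a 1,000 bp region" is read as a gap of at most 1,000 bp between intervals, replicate peaks are pooled before merging, and all function names, gene names, and coordinates are hypothetical.

```python
def merge_peaks(peaks, max_gap=1000):
    """Merge (start, end) intervals separated by at most max_gap bp."""
    merged = []
    for start, end in sorted(peaks):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous interval rather than opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def replicate_support(peak, replicate_peak_lists):
    """Number of replicates with at least one peak overlapping the interval."""
    start, end = peak
    return sum(any(s <= end and e >= start for s, e in rep)
               for rep in replicate_peak_lists)


def nearest_gene(peak, tss_by_gene):
    """Gene whose TSS is closest to the midpoint of the peak."""
    mid = (peak[0] + peak[1]) / 2
    return min(tss_by_gene, key=lambda gene: abs(tss_by_gene[gene] - mid))


# Toy data (hypothetical): enriched intervals called in three replicates.
replicates = [
    [(1000, 1500), (2000, 2400), (9000, 9300)],
    [(1200, 1600), (20000, 20500)],
    [(2100, 2300)],
]
tss_by_gene = {"GeneA": 1800, "GeneB": 15000}

pooled = [peak for rep in replicates for peak in rep]
merged = merge_peaks(pooled)
# Keep only merged peaks supported by at least two of the three replicates.
valid = [p for p in merged if replicate_support(p, replicates) >= 2]
targets = {p: nearest_gene(p, tss_by_gene) for p in valid}
print(targets)
```

In this toy example the four intervals around positions 1,000 to 2,400 collapse into one peak supported by all three replicates, while the two isolated intervals are each seen in a single replicate and are dropped.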

Association between E2f3 and Ctcf

Since high phylogenetic conservation of DNA sequences has been associated with genes involved in regulating development [13,14], we examined the average sequence conservation of gene promoters bound by E2f3 and/or Ctcf (in E14.5 brain) across mammalian species using the CisGenome program. We found that the mean conservation score of gene promoters is highest for those that are bound by both proteins, compared to promoters bound by either factor alone (Figure S6).
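The comparison above is a per-class mean: promoters are grouped by which factors bind them, and a score is averaged within each group. A minimal sketch of that grouping is given below; the conservation scores are invented toy values, the class labels and function names are hypothetical, and the actual scores in Figure S6 were computed in CisGenome.

```python
from collections import defaultdict
from statistics import mean


def binding_class(e2f3_bound, ctcf_bound):
    """Label a promoter by which of the two factors bind it."""
    if e2f3_bound and ctcf_bound:
        return "both"
    if e2f3_bound:
        return "E2f3 only"
    if ctcf_bound:
        return "Ctcf only"
    return "neither"


def mean_score_by_class(promoters):
    """Average a per-promoter score within each binding class.

    promoters: iterable of (score, e2f3_bound, ctcf_bound) tuples.
    """
    grouped = defaultdict(list)
    for score, e2f3_bound, ctcf_bound in promoters:
        grouped[binding_class(e2f3_bound, ctcf_bound)].append(score)
    return {label: mean(scores) for label, scores in grouped.items()}


# Toy promoters (hypothetical scores, not data from the study).
promoters = [
    (0.50, True, True), (0.75, True, True),
    (0.25, True, False), (0.50, True, False),
    (0.50, False, True),
    (0.125, False, False), (0.375, False, False),
]
class_means = mean_score_by_class(promoters)
for label, value in sorted(class_means.items()):
    print(label, value)
```

The same grouping applies to any per-promoter quantity, so a count per promoter could be substituted for the conservation score without changing the code.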

It has been proposed that one role of Ctcf in gene regulation is to facilitate long-range chromosomal interactions in cis, whereby distant enhancers are brought into close spatial proximity with the proximal promoters of the genes they regulate [15-17], forming so-called tissue-specific enhancer-promoter units (EPUs) [15]. In a given cell type, a gene promoter may establish contacts with multiple enhancers, providing a richness of regulatory signals [15]. Analysis of genome-wide predictions of E14.5 mouse brain EPUs generated from chromatin state maps [15] revealed that, while the average promoter engages in 1.3 EPUs in this tissue, promoters bound by Ctcf participate in 2.2 EPUs on average (Figure S7). Interestingly, gene promoters bound by E2f3 but not Ctcf are part of an average of 1.6 EPUs, and this average rises to 2.4 EPUs for genes jointly targeted by E2f3 and Ctcf. Because enhancer-promoter interactions are especially important in controlling the expression of developmental/cell fate genes [18], and because the more enhancers a promoter interacts with, the higher the likelihood that the corresponding gene is functionally associated with developmental processes (Table S7), we take this observation to suggest that E2f3 targets co-bound by Ctcf are more likely to be involved in developmental processes.

Supplementary Text References

1. Asp, P., Acosta-Alvear, D., Tsikitis, M., van Oevelen, C. & Dynlacht, B. D. E2f3b plays an essential role in myogenic differentiation through isoform-specific gene regulation. Genes Dev 23, 37–53 (2009).
2. Julian, L. M. et al. Opposing Regulation of Sox2 by Cell-Cycle Effectors E2f3a and E2f3b in Neural Stem Cells. Cell Stem Cell 12, 440–452 (2013).
3. McClellan, K. A. et al. Unique Requirement for Rb/E2F3 in Neuronal Migration: Evidence for Cell Cycle-Independent Functions. Mol Cell Biol 27, 4825–4843 (2007).
4. Xu, X. et al. A comprehensive ChIP-chip analysis of E2F1, E2F4, and E2F6 in normal and tumor cells reveals interchangeable roles of E2F family members. Genome Research 17, 1550–1561 (2007).
5. Ren, B. et al. E2F integrates cell cycle progression with DNA repair, replication, and G2/M checkpoints. Genes Dev 16, 245–256 (2002).
6. Blais, A. et al. An initial blueprint for myogenic differentiation. Genes Dev 19, 553–569 (2005).
7. Kent, W. J. et al. The human genome browser at UCSC. Genome Research 12, 996–1006 (2002).
8. Ji, H. et al. An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nat Biotechnol 26, 1293–1300 (2008).
9. Ji, H. & Wong, W. H. TileMap: create chromosomal map of tiling array hybridizations. Bioinformatics 21, 3629–3636 (2005).
10. Liu, Y., Chu, A., Chakroun, I., Islam, U. & Blais, A. Cooperation between myogenic regulatory factors and SIX family transcription factors is important for myoblast differentiation. Nucleic Acids Research 38, 6857–6871 (2010).
11. Jin, V. X., Rabinovich, A., Squazzo, S. L., Green, R. & Farnham, P. J. A computational genomics approach to identify cis-regulatory modules from chromatin immunoprecipitation microarray data: a case study using E2F1. Genome Research 16, 1585–1595 (2006).
12. Vokes, S. A., Ji, H., Wong, W. H. & McMahon, A. P. A genome-scale analysis of the cis-regulatory circuitry underlying sonic hedgehog-mediated patterning of the mammalian limb. Genes Dev 22, 2651–2663 (2008).
13. Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004).
14. Woolfe, A. et al. Highly Conserved Non-Coding Sequences Are Associated with Vertebrate Development. PLoS Biol 3, e7 (2005).
15. Shen, Y. et al. A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120 (2012).
16. Splinter, E. et al. CTCF mediates long-range chromatin looping and local histone modification in the beta-globin locus. Genes Dev 20, 2349–2354 (2006).
17. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
18. Whyte, P. et al. Association between an oncogene and an anti-oncogene: the adenovirus E1A proteins bind to the retinoblastoma gene product. Nature 334, 124–129 (1988).