SUPPLEMENTAL INFORMATION

SetDB1 Contributes to Repression of Genes Encoding Developmental Regulators and Maintenance of ES Cell State

Steve Bilodeau, Michael Kagey, Garrett M. Frampton, Peter B. Rahl, Richard A. Young

CONTENTS

Supplemental Tables

Supplemental Figures

Supplemental Data Files

Supplemental Experimental Procedures

Growth Conditions for Embryonic Stem Cells
High-Throughput shRNA Screening
Library Design and Lentiviral Production
Lentiviral Infections
Immunofluorescence
Image Acquisition and Analysis
Validation of shRNAs
Lentiviral Production and Infection
Immunofluorescence
Chromatin Immunoprecipitation
ChIP-Seq Sample Preparation and Analysis
Sample Preparation
Polony Generation and Sequencing
ChIP-Seq Data Analysis
ChIP-Seq Density Heatmaps
Heatmap display of similarity between genomic occupancy of multiple factors
Gene Ontology Analysis
Comparison to Previous Results
Comparison of H3K9me3 ChIP-Seq Datasets
H3K9me3 Antibody Specificity
SetDB1 Antibody Specificity
PCR primers (mm8)
RNA Extraction, cDNA, and TaqMan Expression Analysis
Microarray Expression Analysis

Supplemental Discussion

H3K9 Associated Validation
SetDB1 Validation
Criteria for Identifying Chromatin Regulators
Screening Results Comparison
Comments on Design and Saturation of Screen
Comments on Chromatin Regulators Identified in the Screen
H3K9 Methylation Associated
HDAC Associated
Polycomb Repressive Complex
Cohesin Complex Members
Uncharacterized Methyltransferases
H3K4 Methyltransferases
Histone Chaperones
SWI/SNF
Arginine Methylation
H3K36 Methylation Associated

Supplemental References

Supplemental Tables

Supplemental Table 1 – Results of chromatin regulator shRNA screen
Supplemental Table 2 – Z-scores of shRNAs used in the screen
Supplemental Table 3 – Genes bound by SetDB1, H3K9me3, H3K4me3, H3K27me3, H3K36me3, H3K79me2, Pol2, Oct4, Sox2, Nanog and Tcf3 in mES cells
Supplemental Table 4 – H3K9me3 at DNA repeats
Supplemental Table 5 – SetDB1 at DNA repeats
Supplemental Table 6 – Summary of ChIP-Seq data used
Supplemental Table 7 – Genomic regions bound by H3K9me3 (3 worksheets) in mES cells
Supplemental Table 8 – Genomic regions bound by SetDB1 in mES cells
Supplemental Table 9 – Summary of analysis results for Fig. 2B
Supplemental Table 10 – Complete ES cell data
Supplemental Table 11 – Gene expression data for Fig. 3G
Supplemental Table 12 – RSA screen analysis
Supplemental Table 13 – Comparison of the different published screens
Supplemental Table 14 – Normalized ChIP-Seq data for H3K9me3 in shRNA GFP and shRNA SetDB1 mES cells

Supplemental Figures

Supplemental Figure 1 – Validation of H3K9 Associated shRNAs. (A) The two best shRNAs targeting SetDB1, Suv39h2, Ehmt1 and Ube2i/Ubc9 identified in the screen result in efficient knockdown and also a decrease in Oct4 expression. Murine ES cells were split off of MEF feeder cells and infected with the indicated shRNA knockdown lentivirus or a shRNA GFP control lentivirus. Expression levels were determined by real-time qPCR. RNAi Consortium Oligo ID (TRC) numbers are shown for each shRNA. shRNA sequences are available from Open Biosystems.

Supplemental Figure 2 – Validation of SetDB1 shRNAs and loss of ES cell state. (A) SetDB1 knockdown with shRNA #1 and #2 results in morphological changes and decreased Oct4 staining intensity. Murine ES cells were split off of MEF feeder cells, infected with the indicated SetDB1 knockdown lentivirus or a shRNA GFP control lentivirus. Cells were crosslinked and stained with Hoechst and for Oct4. Z-scores from the screening results and RNAi Consortium Oligo ID (TRC) numbers are shown for each shRNA. shRNA sequences are available from Open Biosystems. (B) SetDB1 knockdown results in decreased expression of Oct4. Real-time qPCR indicates a loss of Oct4 expression following infection of mES cells with shRNAs #1 and #2 targeting SetDB1. (C) SetDB1 knockdown in 129/cast mES cells results in loss of Oct4.

Supplemental Figure 3 – Specificity of SetDB1 antibodies. Western blot of (A) SetDB1 SC-66884 (H-300, Santa Cruz) and (B) SetDB1 11231-AP (Proteintech Group) using nuclear extract of mES cells infected with a GFP or SetDB1 shRNA. (C) ChIP PCR using both SetDB1 antibodies at Polrmt, Nnat and Mybl2. (D) ChIP PCR with SetDB1 H-300 antibody in mES cells infected with a GFP or SetDB1 shRNA.

Supplemental Figure 4 – Reproducibility and specificity of ChIP-Seq experiments with H3K9me3 antibodies. (A) Venn diagram representation of genes occupied by histone H3K9me3 in mES cells profiled with two antibodies (Ab8898 and Up07-442) and in previously published data (Mikkelsen et al., 2007). (B, C, D) Examples of H3K9me3 histone modification profiles for the three datasets. (E, F, G) Peptide competition assays to validate antibody specificity. Each antibody binding was competed with unmodified H3, H3K9me3, H3K4me3 and H3K27me3. (H, I, J) Comparison of H3K9me3 and H3K4me3. (K, L, M) Comparison of H3K9me3 and H3K27me3.

Supplemental Figure 5 – SetDB1 and H3K9me3 co-occupy a common set of genes. Venn diagram representation of genes called ± 5kb from the TSS.

Supplemental Figure 6 – SetDB1 dependent H3K9me3. (A) Analysis of normalized H3K9me3 ChIP-Seq in mES cells infected with GFP or SetDB1 shRNAs. Genes that are bound by SetDB1 and H3K9me3 were classified based on their relative level of H3K9me3 following SetDB1 knockdown. The percentage indicates the fraction of SetDB1 and H3K9me3 occupied genes. (B, C) SetDB1 and H3K9me3 profiles showing a loss of H3K9me3 or (D) no change.

Supplemental Figure 7 – Supplemental gene tracks. SetDB1 and histone H3K4me3, H3K27me3 and H3K9me3 occupancy profiles for (A) Olig2, (B) Sumo3, (C) Jmjd1a and (D) Pramel1.

Supplemental Figure 8 – Occupancy of H3K4me3, H3K27me3 and H3K9me3 at nucleosomes adjacent to the transcription start site. (A to C) H3K4me3, H3K27me3 and H3K9me3 occupancy profiles for Pax7, Lmx1b, Dlx1. Boxes represent nucleosomes with evidence of all three marks.

Supplemental Data Files

The following files contain data formatted (.WIG) for upload into the UCSC genome browser (Kent et al., 2002). To upload the files, first copy the files onto a computer with internet access. Then use a web browser to go to http://genome.ucsc.edu/cgi-bin/hgCustom?hgsid=105256378 for mouse. In the “Paste URLs or Data” section, select “Browse…” on the right of the screen. Use the pop-up window to select the copied files, then select “Submit”. The upload process may take some time.

MM8_mES_H3K9me3_AB8898.WIG.gz
MM8_mES_H3K9me3_UP7442.WIG.gz
MM8_mES_SetDB1_H300.WIG.gz
MM8_mES_WT_shRNAGFP_shRNASetDB1_normalized_H3K9me3.WIG.gz

These files present ChIP-Seq data for H3K9me3 obtained with two different antibodies and for SetDB1 in mES cells. They also include normalized ChIP-Seq data for H3K9me3 in cells infected with GFP and SetDB1 shRNAs. The first track for each data set contains the ChIP-Seq density across the genome in 25bp bins. The minimum ChIP-Seq density shown in these files is 1.5 reads per million total reads. Subsequent tracks mark the genomic regions identified as enriched.

Supplemental Experimental Procedures

Growth Conditions for Embryonic Stem Cells

V6.5 and 129/Cast hybrid murine embryonic stem (mES) cells were grown on irradiated murine embryonic fibroblasts (MEFs) unless otherwise stated. Cells were grown under standard mES cell conditions as described previously (Boyer et al., 2005). Briefly, cells were grown on 0.2% gelatinized (Sigma, G1890) tissue culture plates in ESC media; DMEM-KO (Invitrogen, 10829-018) supplemented with 15% fetal bovine serum (Hyclone, characterized SH3007103), 1000 U/mL LIF (ESGRO, ESG1106), 100 μM nonessential amino acids (Invitrogen, 11140-050), 2 mM L-glutamine (Invitrogen, 25030-081), 100 U/mL penicillin, 100 μg/mL streptomycin (Invitrogen, 15140-122), and 8 nL/mL of 2-mercaptoethanol (Sigma, M7522).

High-Throughput shRNA Screening

Library Design and Lentiviral Production

Small hairpins targeting 197 chromatin regulators were designed and cloned into pLKO.1 lentiviral vectors as previously described (Moffat et al., 2006). On average 5 different shRNAs targeting each chromatin regulator were used. Lentiviral supernatants were arrayed in 384-well plates with negative control lentivirus (shRNAs targeting GFP, RFP and LacZ) (Moffat et al., 2006).

Lentiviral Infections

Murine ES cells were split off MEFs and placed in a tissue culture dish for 45 minutes to selectively remove the MEFs. Murine ES cells were counted with a Coulter Counter (Beckman) and seeded using a μFill (Bioteck) at a density of 1500 cells/well in 384-well plates (Costar 3712) treated with 0.2% gelatin (Sigma, G1890). An initial cell plating density of 1500 cells/well was established so that an adequate number of cells would survive puromycin selection for analysis. However, the initial cell plating density was kept low enough to avoid wells reaching confluency during the timeframe of the assay. One day following cell plating the media was removed and replaced with ESC media containing 8 μg/ml of polybrene (Sigma, H9268-10G), and cells were infected with 2 µl of shRNA lentiviral supernatant from the chromatin regulator set. Infections were performed in quadruplicate on separate plates. Control wells on each plate were mock infected and designated as “Empty”. Positive control wells on each plate were infected with 3 µl of validated control shRNA lentiviral supernatant targeting Oct4 (TRCN0000009613), Tcf3 (TRCN0000095454) and Stat3 (TRCN0000071454) that was generated independently of the chromatin regulator set. Plates were spun for 30 minutes at 2150 rpm following infection. Twenty-four hours post infection cells were treated with 3.5 μg/ml of puromycin (Sigma, P8833) in ESC media to select for stable integration of the shRNA construct. ESC media with puromycin was changed daily. Five days post infection cells were crosslinked for 15 minutes with 4% paraformaldehyde (EMS Diasum, 15710).

Immunofluorescence

Following crosslinking, the cells were washed once with PBS and twice with blocking buffer (PBS with 0.25% BSA; Sigma, A3059-10G) and then permeabilized for 15 minutes with 0.2% Triton X-100 (Sigma, T8797-100ml). After two washes with blocking buffer cells were stained overnight at 4°C for Oct4 (Santa Cruz Biotechnology, sc-5279; 1:100 dilution) and washed twice with blocking buffer. Cells were incubated for 4 hours at room temperature with goat anti-mouse-conjugated Alexa Fluor 488 (Invitrogen; 1:200 dilution) and Hoechst 33342 (Invitrogen; 1:1000 dilution). Finally, cells were washed twice with blocking buffer and twice with PBS before imaging.

Image Acquisition and Analysis

Image acquisition and data analysis were performed essentially as previously described (Moffat et al., 2006). Stained cells were imaged on an Arrayscan HCS Reader (Cellomics) using the standard acquisition camera mode (10X objective, 9 fields). Hoechst was used as the focus channel. Objects selected for analysis were identified based on the Hoechst staining intensity using the Target Activation Protocol and the Fixed Threshold Method. Parameters were established requiring that individual objects pass an intensity and size threshold. The Object Segmentation Assay Parameter was adjusted for maximal resolution between individual cells. Following object selection, the average Oct4 pixel staining intensity was determined per object and then a mean value for each well was calculated. Image acquisition for a well continued until at least 2500 objects were identified, the entire well (9 fields) was imaged, or fewer than 20 objects were identified in each of three consecutive fields. To account for viability defects or low titer lentivirus, a shRNA was excluded from subsequent analysis if fewer than 250 objects were identified for any one of the 4 replicates. The threshold of 250 identified objects was determined based on the average number of identified objects for the “Empty” (no virus) wells (mean: 53.4, standard deviation: 49.3). To normalize for plate effects, a Z-score based on the Oct4 staining intensity was calculated for each well using the following negative control infections on each plate: five different shRNAs targeting GFP, two different shRNAs targeting RFP and two different shRNAs targeting LacZ. There were a total of either 18 or 22 wells infected with negative control shRNAs on each 384-well plate. The average Oct4 staining intensity for all the negative control infected wells on a plate was calculated along with its standard deviation to estimate the signal variability, and these two values were used to calculate a Z-score for every well on the plate. The Z-scores for the four replicate infections were averaged to give a final Z-score for every shRNA. Representative control 384-well plate images (shRNAs targeting Oct4, Stat3, Tcf3 and GFP) were exported (Cellomics Software), converted from DIBs to TIFs (CellProfiler, http://www.cellprofiler.org), and manipulated with Photoshop CS3 Extended.
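The plate normalization and replicate averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary layout of well intensities, and the use of the sample standard deviation are hypothetical choices, since the text does not specify the software or the estimator that was used.

```python
from statistics import mean, stdev

def plate_z_scores(well_intensity, negative_control_wells):
    """Z-score every well against the negative-control wells of the same plate.

    well_intensity: dict mapping well name -> mean Oct4 staining intensity.
    negative_control_wells: names of wells infected with GFP/RFP/LacZ shRNAs.
    """
    controls = [well_intensity[w] for w in negative_control_wells]
    mu = mean(controls)
    # Assumption: sample standard deviation; the source does not name the estimator.
    sigma = stdev(controls)
    return {well: (value - mu) / sigma for well, value in well_intensity.items()}

def shrna_z_score(replicate_z_scores, object_counts, min_objects=250):
    """Average the replicate Z-scores of one shRNA.

    Returns None when any replicate has fewer than `min_objects` identified
    objects, mirroring the exclusion rule stated in the text.
    """
    if any(n < min_objects for n in object_counts):
        return None
    return mean(replicate_z_scores)
```

For example, a well whose intensity lies three control standard deviations below the control mean receives a Z-score of -3, and an shRNA with one replicate of 249 objects is dropped from the analysis.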

Validation of shRNAs

Lentiviral Production and Infection

Lentivirus was produced according to the Open Biosystems Trans-lentiviral shRNA Packaging System (TLP4614). The shRNA constructs and sequences targeting murine Oct4 (TRCN0000009613), Tcf3 (TRCN0000095454), Stat3 (TRCN0000071454), Suv39h2 (TRCN0000092814 and TRCN0000092815), Ehmt1 (TRCN0000086071 and TRCN0000086068), Ube2i/Ubc9 (TRCN0000040839 and TRCN0000040841), Cbx3/HP1 (TRCN0000071038 and TRCN0000071038) and SetDB1 (#1, TRCN0000092975; #2, TRCN0000092973; #3, TRCN0000092974; #4, TRCN0000092977; #5, TRCN0000092976) are available from Open Biosystems. The shRNA targeting GFP (TRCN0000072201, hairpin sequence: gtcgagctggacggcgacgta) was one of the negative controls included on all plates for the screen.

For validation of the SetDB1 shRNAs, mES cells were split off MEFs and placed in a tissue culture dish for 45 minutes to selectively remove the MEFs. Murine ES cells were counted (Coulter Counter, Beckman) and plated in 6-well plates (10^6 cells/well). The following day cells were infected in ESC media containing 8 µg/ml polybrene (Sigma, H9268-10G). After 24 hours the media was removed and replaced with ESC media containing 3.5 µg/mL puromycin (Sigma, P8833). Two days post infection cells were split, counted and plated at equal densities. Six days post infection cells were split again, counted and plated at equal densities. On day eight cells were crosslinked for immunofluorescence or treated with TRIzol (Invitrogen, 15596-026) for RNA extraction.

In order to determine the effect of SetDB1 knockdown on the expression of differentiation markers, mES cells were infected with SetDB1 shRNA (#1, TRCN0000092975) lentivirus essentially as described for the SetDB1 shRNA validations, except that cells were plated at a density of 150,000 cells / well and treated with TRIzol 6 days post infection for RNA extraction.

Validation of the shRNAs targeting Suv39h2, Ehmt1, Ube2i/Ubc9 and Cbx3/HP1 was carried out essentially as described for the SetDB1 shRNA validations, except that cells were plated at a density of 150,000 cells/well and treated with TRIzol 6 days post infection for RNA extraction.

Immunofluorescence

Cells were crosslinked, permeabilized and stained as described for high-throughput screening. Images were acquired on a Nikon Inverted TE300 with a Hamamatsu Orca camera. Openlab (http://www.improvision.com/products/openlab/) was used for image acquisition and manipulation.

Chromatin Immunoprecipitation

For H3K9me3, we performed independent ChIP-Seq experiments using two different antibodies: Abcam Ab8898 and Upstate 07-442. Both antibodies were raised in rabbit but against different epitopes: a synthetic peptide conjugated to KLH derived from within residues 1-100 of human histone H3K9me3 for Ab8898, and a synthetic 2X-branched peptide containing the sequence AR[Kme3]ST, which corresponds to H3K9me3 of human histone H3, for Up07-442. We used data obtained from Ab8898 for our analysis.

For SetDB1, we performed ChIP-Seq experiments using the Santa Cruz ESET (H-300, sc-66884) antibody. The antibody was raised in rabbit against an epitope corresponding to amino acids 1-300 mapping at the N-terminus of ESET of human origin.

Chromatin immunoprecipitation materials and methods have been previously described (Lee et al., 2006b). Embryonic stem cells were grown to a final count of 5-10 x 10^7 cells for each ChIP experiment. Cells were chemically crosslinked by the addition of one-tenth volume of fresh 11% formaldehyde solution for 15 minutes at room temperature. Cells were rinsed twice with 1X PBS, harvested using a silicon scraper and flash frozen in liquid nitrogen. Cells were stored at –80°C prior to use. Cells were resuspended, lysed in lysis buffers and sonicated to solubilize and shear crosslinked DNA. Sonication conditions vary depending on cells, culture conditions, crosslinking and equipment. For H3K9me3, the sonication buffer was 20mM Tris-HCl pH8, 150mM NaCl, 2mM EDTA, 0.1% SDS, 1% Triton X-100. We used a Misonix Sonicator 3000 and sonicated at approximately 24 watts for 10 x 30 second pulses (60 second pause between pulses). Samples were kept on ice at all times. The resulting whole cell extract was incubated overnight at 4°C with 100 μl of Dynal G magnetic beads that had been pre-incubated with approximately 10 μg of the appropriate antibody. Beads were washed 4 times: once with the sonication buffer, once with 20mM Tris-HCl pH8, 500mM NaCl, 2mM EDTA, 0.1% SDS, 1% Triton X-100, once with 10mM Tris-HCl pH8, 250mM LiCl, 2mM EDTA, 1% NP40, and once with TE containing 50 mM NaCl. Bound complexes were eluted from the beads by heating at 65°C for 1 hour with occasional vortexing, and crosslinking was reversed by overnight incubation at 65°C. Whole cell extract DNA reserved from the sonication step was also treated for crosslink reversal.

ChIP-Seq Sample Preparation and Analysis

All protocols for Illumina/Solexa sequence preparation, sequencing and quality control are provided by Illumina (http://www.illumina.com/pages.ilmn?ID=203). A brief summary of the technique and minor protocol modifications are described below.

Sample Preparation

DNA was prepared for sequencing according to a modified version of the Illumina/Solexa Genomic DNA protocol. Fragmented DNA was prepared for ligation of Solexa linkers by repairing the ends and adding a single adenine nucleotide overhang to allow for directional ligation. A 1:100 dilution of the Adaptor Oligo Mix (Illumina) was used in the ligation step. A subsequent PCR step with limited (18) amplification cycles added additional linker sequence to the fragments to prepare them for annealing to the Genome Analyzer flow-cell. After amplification, a narrow range of fragment sizes was selected by separation on a 2% agarose gel and excision of a band between 150-300 bp (representing shear fragments between 50 and 200nt in length and ~100bp of primer sequence). The DNA was purified from the agarose and diluted to 10 nM for loading on the flow cell.

Polony Generation and Sequencing

The DNA library (2-4 pM) was applied to the flow-cell (8 samples per flow-cell) using the Cluster Station device from Illumina. The concentration of library applied to the flow-cell was calibrated such that polonies generated in the bridge amplification step originate from single strands of DNA. Multiple rounds of amplification reagents were flowed across the cell in the bridge amplification step to generate polonies of approximately 1,000 strands in 1μm diameter spots. Double stranded polonies were visually checked for density and morphology by staining with a 1:5000 dilution of SYBR Green I (Invitrogen) and visualizing with a microscope under fluorescent illumination. Validated flow-cells were stored at 4°C until sequencing.

Flow-cells were removed from storage and subjected to linearization and annealing of sequencing primer on the Cluster Station. Primed flow-cells were loaded into the Illumina Genome Analyzer 1G. After the first base was incorporated in the Sequencing-by-Synthesis reaction the process was paused for a key quality control checkpoint. A small section of each lane was imaged and the average intensity value for all four bases was compared to minimum thresholds. Flow-cells with low first base intensities were re-primed and if signal was not recovered the flow-cell was aborted. Flow-cells with signal intensities meeting the minimum thresholds were resumed and sequenced for 26 or 32 cycles.

ChIP-Seq Data Analysis

Images acquired from the Illumina/Solexa sequencer were processed through the bundled Solexa image extraction pipeline, which identified polony positions, performed base-calling and generated QC statistics. Sequences were aligned using ELAND software to NCBI Build 36 (UCSC mm8) of the mouse genome. Only sequences that mapped uniquely to the genome with zero or one mismatch were used for further analysis. When multiple reads mapped to the same genomic position, a maximum of two reads mapping to the same position were used. A summary of the total number of ChIP-Seq reads that were used in each experiment is provided (Supplemental Table S6). ChIP-Seq datasets profiling the genomic occupancy of H3K9me3 (Mikkelsen et al., 2007), H3K27me3 (Mikkelsen et al., 2007), H3K4me3 (Marson et al., 2008), H3K36me3 (Marson et al., 2008), H3K79me2 (Marson et al., 2008), Oct4 (Marson et al., 2008), Sox2 (Marson et al., 2008), Nanog (Marson et al., 2008), Tcf3 (Marson et al., 2008) and RNA polymerase II (Seila et al., 2008) in mES cells were obtained from previous publications and reanalyzed using the methods described below.
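The read filtering rules stated above (unique alignment, at most one mismatch, at most two reads per position) might look roughly like the following sketch. The record layout with `chrom`, `pos`, `strand`, `mismatches` and `unique` keys is hypothetical, and treating the strand as part of the "genomic position" is an assumption that the text does not settle.

```python
from collections import defaultdict

def filter_reads(reads, max_mismatches=1, max_per_position=2):
    """Keep uniquely mapped reads with few mismatches, capping duplicates.

    reads: iterable of dicts with the hypothetical keys
           'chrom', 'pos', 'strand', 'mismatches', 'unique'.
    """
    seen = defaultdict(int)  # number of reads already kept at each position
    kept = []
    for read in reads:
        if not read["unique"] or read["mismatches"] > max_mismatches:
            continue
        # Assumption: strand is included in the definition of "same position".
        key = (read["chrom"], read["pos"], read["strand"])
        if seen[key] < max_per_position:
            seen[key] += 1
            kept.append(read)
    return kept
```

Capping duplicates in this way limits the influence of PCR amplification artifacts while still allowing two independent fragments to start at the same base.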

Analysis methods were derived from previously published methods (Johnson et al., 2007; Mikkelsen et al., 2007; Marson et al., 2008; Guenther et al., 2008). Sequence reads from multiple flow cells for each IP target were combined. Each read was extended 100bp, towards the interior of the sequenced fragment, based on the strand of the alignment. Across the genome, in 25 bp bins, the number of ChIP-Seq reads within a 1kb window surrounding each bin (+/- 500bp) was tabulated. The 25bp genomic bins that contained statistically significant ChIP-Seq enrichment were identified by comparison to a Poissonian background model. Assuming background reads are spread randomly throughout the genome, the probability of observing a given number of reads in a 1kb window can be modeled as a Poisson process in which the expectation can be estimated as the number of mapped reads multiplied by the number of bins (40) into which each read maps, divided by the total number of bins available (we estimated 70% of all genomic bins). Enriched bins within 1kb of one another were combined into regions. The complete set of RefSeq genes was downloaded from the UCSC table browser (http://genome.ucsc.edu/cgi-bin/hgTables?command=start) on December 20, 2008. Genes with enriched regions within 5kb of their transcription start site were called bound.
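As a rough illustration of the background model described above, the sketch below computes the Poisson expectation for a 1kb window and the smallest read count that would pass a significance threshold. The bin size, the 40 bins per read and the 70% fraction come from the text; the genome size, the read count and the p-value threshold in the usage note are hypothetical, because no threshold is stated in this section.

```python
import math

BIN_SIZE_BP = 25         # bin size, from the text
BINS_PER_READ = 40       # bins (1kb window) to which each read contributes, from the text
AVAILABLE_FRACTION = 0.7 # fraction of genomic bins assumed available, from the text

def expected_reads_per_window(n_mapped_reads, genome_size_bp):
    """Poisson expectation of background reads counted for one bin's 1kb window."""
    total_bins = genome_size_bp / BIN_SIZE_BP
    return n_mapped_reads * BINS_PER_READ / (AVAILABLE_FRACTION * total_bins)

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed directly to stay accurate for tiny tails."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))  # P(X = k)
    total = 0.0
    i = k
    while term > 0.0 and (total == 0.0 or term > total * 1e-17):
        total += term
        i += 1
        term *= lam / i  # P(X = i) from P(X = i - 1)
    return min(total, 1.0)

def min_reads_for_enrichment(lam, p_threshold):
    """Smallest read count whose upper-tail probability is at or below the threshold."""
    k = 0
    while poisson_upper_tail(k, lam) > p_threshold:
        k += 1
    return k
```

With a hypothetical 7 million mapped reads and a 2.5 Gb genome, `expected_reads_per_window(7_000_000, 2.5e9)` gives an expectation of about 4 reads per window, and `min_reads_for_enrichment` then returns the read count a window must reach for any chosen threshold.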

The Poissonian background model assumes a random distribution of background reads; however, we have observed significant deviations from this expectation. Some of these non-random events can be detected as sites of apparent enrichment in negative control DNA samples and can create many false positives in ChIP-Seq experiments. To remove these regions, we compared genomic bins and regions that meet the statistical threshold for enrichment to a set of reads obtained from Solexa sequencing of DNA from whole cell extract (WCE) in matched cell samples. We required that enriched bins and enriched regions have five-fold greater ChIP-Seq density in the specific IP sample, compared with the control sample, normalized to the total number of reads in each dataset. This served to filter out genomic regions that are biased to having a greater than expected background density of ChIP-Seq reads. A summary of the bound regions and genes for each antibody is provided (Supplemental Tables S7-S8).
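The five-fold control filter described above amounts to comparing read densities after scaling each dataset by its total read count. The sketch below shows one way to express it; the function name is hypothetical, and the handling of regions with no control reads is an assumption, since the text does not say how that case was treated.

```python
def passes_control_filter(ip_reads, ip_total, wce_reads, wce_total, min_fold=5.0):
    """Return True if the IP density is at least `min_fold` times the WCE density.

    Densities are expressed as reads per million total reads in each dataset.
    """
    ip_density = ip_reads / ip_total * 1e6
    wce_density = wce_reads / wce_total * 1e6
    if wce_density == 0:
        # Assumption: a region with no control reads passes if it has any IP signal.
        return ip_density > 0
    return ip_density / wce_density >= min_fold
```

A region with 50 reads out of 5 million in the IP and 4 reads out of 4 million in the control has a ten-fold enrichment and is retained, whereas the same region with 12 control reads falls below five-fold and is discarded.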

ChIP-Seq Density Heatmaps

Selected genes were aligned with each other according to the position and direction of their transcription start site. For each experiment, the ChIP-Seq density profiles were normalized to the density per million total reads. Genes were sorted as indicated. Heatmaps were generated using Java Treeview (http://jtreeview.sourceforge.net/) with color saturation as indicated.

Heatmap display of similarity between genomic occupancy of multiple factors

We performed all pair-wise comparisons of the genomic occupancy of multiple factors using a similarity metric based on a correlation coefficient. This similarity metric generates a score between zero and one describing the similarity in the genomic regions occupied by two factors. The value of this score is one if the regions occupied by one of the factors are entirely contained within the regions occupied by the other factor. The value of this score is zero if the amount of overlap of the two factors is less than or equal to what would be predicted from random association. A matrix of all of the pair-wise similarity metrics was generated and subjected to hierarchical clustering along the horizontal and vertical axes with an average linkage similarity metric using the software Cluster 3.0 (http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm). The clustered matrix of similarity scores was visualized using Java TreeView (http://jtreeview.sourceforge.net/).

Gene Ontology Analysis

Gene ontology analysis was performed using the online tool GOstat (http://gostat.wehi.edu.au/cgi-bin/goStat.pl). The complete set of all RefSeq genes was used as a background. Complete gene ontology analysis results are provided (Supplemental Table S9).

Comparison to Previous Results

The H3K9me3 ChIP-Seq experimental data produced here have greater signal intensity than previously described H3K9me3 ChIP-Seq data (Mikkelsen et al., 2007). Our analysis identifies 2282 genes occupied by H3K9me3 using Ab8898, 1478 genes using Up07-442, and 100 genes based on the previously published data, which was also generated with the Ab8898 antibody (Mikkelsen et al., 2007) (Supplemental Fig. S4A). The majority (80/100) of the target genes that had previously been identified were also identified among our target genes, and the pattern of ChIP-Seq density at these sites was similar (Supplemental Fig. S4B, C, D). However, direct comparison of peak amplitude indicates that there was a much stronger H3K9me3 ChIP signal in the experimental data produced here (Supplemental Fig. S4B, C).

Comparison of H3K9me3 ChIP-Seq Datasets

In order to facilitate comparison of the three H3K9me3 ChIP-Seq datasets (no shRNA, GFP-shRNA, SetDB1-shRNA), a quantile normalization method was used. Across all datasets the genomic bin with the greatest ChIP-Seq density was identified. The average of these values was calculated and the highest signal bin in each dataset was assigned this average value. This was repeated for all genomic bins from the greatest signal to the least, assigning each the average ChIP-Seq signal for all bins of that rank across all datasets.
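The rank-based averaging described above is a standard quantile normalization and can be sketched as follows. This is a minimal pure-Python illustration, not the code that was used: the function name and list-of-lists input are hypothetical, all datasets are assumed to cover the same genomic bins, and tied values are ranked by their order in the input.

```python
def quantile_normalize(datasets):
    """Quantile-normalize several datasets of per-bin ChIP-Seq densities.

    datasets: list of equal-length lists, one list of bin densities per dataset.
    Returns new lists in which the bin of a given rank in every dataset carries
    the average density of all bins of that rank across datasets.
    """
    n_bins = len(datasets[0])
    if any(len(d) != n_bins for d in datasets):
        raise ValueError("all datasets must cover the same genomic bins")
    # For each dataset, bin indices ordered from greatest to least density.
    orders = [sorted(range(n_bins), key=lambda i, d=d: -d[i]) for d in datasets]
    normalized = [[0.0] * n_bins for _ in datasets]
    for rank in range(n_bins):
        rank_mean = sum(d[o[rank]] for d, o in zip(datasets, orders)) / len(datasets)
        for out, o in zip(normalized, orders):
            out[o[rank]] = rank_mean
    return normalized
```

For two toy datasets `[5, 2, 3]` and `[4, 1, 6]`, the top-ranked bins (5 and 6) both become 5.5, the middle bins 3.5 and the lowest bins 1.5, so the two distributions become identical while each bin keeps its rank within its own dataset.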

H3K9me3 ChIP enriched regions were identified in cells without shRNA treatment, as described in the section “ChIP-Seq Data Analysis”. The total ChIP-Seq density in these regions was tabulated in cells without shRNA, cells treated with control (GFP) shRNA, and cells treated with SetDB1 shRNA (Supplemental Fig. S6 and Table S14).

H3K9me3 Antibody Specificity

The genome-wide occupancy of nucleosomes with histone H3K9me3 modification overlaps regions that are occupied by nucleosomes with H3K4me3 and H3K27me3. For this reason, we considered the possibility that some of the signal we observed for H3K9me3 was a result of antibody cross-reactivity with H3K4me3 and/or H3K27me3. However, several lines of evidence indicate that both H3K9me3 antibodies (Ab8898 and Up07-442) that were used in our experiments have high specificity for the histone H3K9me3 modification relative to the H3K4me3 and H3K27me3 modifications. First, there is evidence for histone H3K9me3 specificity from the manufacturers of these antibodies. Abcam provided experimental evidence for their antibody (Ab8898 lot #484088) showing that H3K4me1, H3K4me2, H3K4me3, H3K27me1, H3K27me2 and H3K27me3 synthetic peptides were not able to effectively compete for H3K9me3 binding. A similar assay using H3K27me1, H3K27me2 and H3K27me3 peptides was performed by Millipore, the manufacturer of the second antibody (Up07-442 lot #DAM1411287), with the same results (http://www.millipore.com/coa.nsf/a73664f9f981af8c852569b9005b4eee/e6c2c7ab60c78cb18825743600563fff/$FILE/07-442_DAM1411287.pdf). We reproduced these results in ChIP conditions and showed that the H3K9me3 peptide efficiently competes for Ab8898 antibody binding whereas the other peptides do not (Supplemental Fig. S4E). We also validated the specificity of our H3K4me3 and H3K27me3 antibodies in the same assay (Supplemental Fig. S4F, G). Second, the genome-wide profiles obtained for H3K9me3 using the two antibodies were very similar (Supplemental Fig. S4B, C, D). Third, strong signals for histone H3K9me3 appear at sites lacking signal for H3K4me3 (Supplemental Fig. S4H, black arrow) or histone H3K27me3 (Supplemental Fig. S4K, black arrow). Similarly, signals for H3K4me3 (Supplemental Fig. S4H, I, J, red arrows) and H3K27me3 (Supplemental Fig. S4L, M) occur in the absence of H3K9me3. Based on this evidence, we believe that our ChIP results with the two H3K9me3 antibodies are specific for this modification and that there is limited cross-reactivity with H3K4me3 and/or H3K27me3.

SetDB1 Antibody Specificity

We used the SetDB1 H-300 (Santa Cruz SC-66884 lot #A2809) antibody in our ChIP-Seq experiments because of its high specificity in western blot analysis and gene specific ChIP experiments. We initially tested 10 different commercially available SetDB1/ESET antibodies and determined that two, the SC-66884 (H-300) antibody from Santa Cruz and the 11231-AP antibody from Proteintech Group, Inc., were both highly specific by western blot analysis (Supplemental Fig. S3A, B). Both antibodies recognize a strong band at approximately 180 kDa (predicted SetDB1 size) with very minimal background bands. These results are in agreement with those of the manufacturer (Santa Cruz) of the H-300 antibody, who tested the specific lot we used by western blot and also observed the expected molecular weight (180 kDa) of SetDB1/ESET in an F9 cell line (mouse embryonal carcinoma). Importantly, the targeted knockdown of SetDB1 results in a substantial decrease in the intensity of the band observed at 180 kDa for both antibodies (Supplemental Fig. S3A, B).

We then tested the Santa Cruz (H-300) and the Proteintech 11231-AP antibodies in gene specific ChIP experiments at three H3K9 methylated genes (Polrmt, Nnat, and Mybl2). Enrichment of SetDB1 binding was observed at all three genes (Supplemental Fig. S3C). We used the Santa Cruz H-300 antibody for subsequent ChIP-Seq experiments because of its higher enrichment over background when compared to the Proteintech 11231-AP antibody. In order to further demonstrate the high specificity of the SetDB1 (H-300) antibody that was used for all ChIP-Seq experiments, we conducted a SetDB1 ChIP analysis in mES cells that were infected with a shRNA targeting SetDB1. These results indicate that, relative to a shRNA GFP control infection, there is a substantial decrease in SetDB1 binding at the promoters of Polrmt, Nnat and Mybl2 (Supplemental Fig. S3D). Based on this evidence, we believe that our SetDB1 antibody is highly specific.

PCR primers (mm8)

Polrmt
5’-TCAGCAAACTCCAATAGCGCAC-3’
5’-TTGCCGCACAACATGGACTT-3’

Nnat
5’-TGCTGCTGCAGGTGAGTATGTA-3’
5’-TTGCGGCAATTGGGATAGGA-3’

Mybl2
5’-AAGTGTGCCTACTTCCTGTGGT-3’
5’-TGTTGTGCACAGTCCCTGAA-3’

Polrmt (upstream negative control)
5’-TGGGTGCCGTATGCCACATTAT-3’
5’-TTTCTGGCCATCCGCACCTTAT-3’

Agrn
5’-AAAGATGTGCTCCTGGTTGGCA-3’
5’-ATGGCACATGTGTGGCAGTGAT-3’

Nef3
5’-TCTTTGCGCTCTACCGTGATGT-3’
5’-TTTCCTGCGGAGCAATCACGAA-3’

Cdx2
5’-ATGCTCACGTCCTTGTCCAGAA-3’
5’-TCTGGCAGCCTTCAACGTTTGT-3’

RNA Extraction, cDNA, and TaqMan Expression Analysis

RNA utilized for real-time qPCR was extracted with TRIzol (Invitrogen, 15596-026). Purified RNA was reverse transcribed using Superscript III (Invitrogen) with oligo-dT primed first-strand synthesis following the manufacturer's protocol.

Real-time qPCR reactions were carried out on the 7000 ABI Detection System using the following TaqMan probes according to the manufacturer's protocol (Applied Biosystems).

SetDB1 Mm00450791_m1
Oct4 Mm00658129_gH
Gapdh Mm99999915_g1
Suv39h2 Mm00469689_m1
Ehmt1 Mm00553220_m1
Ube2i/Ubc9 Mm00495850_m1
Cbx3/HP1 Mm00850539_g1
Brachyury T Mm00436877_m1
Cdx2 Mm00432449_m1
Gata4 Mm00484689_m1
Hoxa1 Mm00439359_m1
MyoD1 Mm00440387_m1
Nr2f2 Mm00772789_m1
Pax3 Mm00435493_m1

Expression levels were normalized to Gapdh levels and are expressed relative to a control shRNA targeting GFP.
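The normalization described above can be illustrated with a minimal sketch. It assumes the comparative Ct (2^-ΔΔCt) method with ideal amplification efficiency; the calculation actually used is not specified here, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_kd, ct_gapdh_kd, ct_target_ctrl, ct_gapdh_ctrl):
    """Relative target expression in a knockdown sample versus the GFP shRNA control.

    Assumption: comparative Ct method with 100% amplification efficiency.
    """
    # Normalize the target gene to Gapdh within each sample (delta Ct).
    delta_kd = ct_target_kd - ct_gapdh_kd
    delta_ctrl = ct_target_ctrl - ct_gapdh_ctrl
    # Compare the knockdown to the control (delta delta Ct) and convert to a ratio.
    return 2.0 ** -(delta_kd - delta_ctrl)


# Hypothetical Ct values: the target amplifies two cycles later after knockdown.
ratio = relative_expression(26.0, 18.0, 24.0, 18.0)
print(f"Expression relative to GFP control: {ratio * 100:.0f}%")
```

Under these assumptions, a two-cycle delay after Gapdh normalization corresponds to 25% of control expression.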

Microarray Expression Analysis

For SetDB1 knockdown expression analysis, mES cells were split off MEFs, placed in a tissue culture dish for 45 minutes to selectively remove the MEFs and plated in 6-well plates. The following day cells were infected with lentiviral shRNAs targeting GFP (TRCN0000072201) or SetDB1 shRNA #1 (TRCN0000092975) in ESC media containing 8 µg/ml polybrene (Sigma, H9268-10G). After 24 hours the media was removed and replaced with ESC media containing 3.5 µg/ml puromycin (Sigma, P8833). Six days post infection RNA was isolated with TRIzol (Invitrogen, 15596-026), further purified with RNeasy columns (Qiagen, 74104) and DNase treated on column (Qiagen, 79254) following the manufacturer’s protocols. RNA samples from two biological replicates were used for duplicate microarray expression analysis.

For microarray analysis, Cy3 and Cy5 labeled cRNA samples were prepared using Agilent’s QuickAmp sample labeling kit starting with 1 µg total RNA. Briefly, double-stranded cDNA was generated using MMLV-RT enzyme and an oligo-dT based primer. In vitro transcription was performed using T7 RNA polymerase and either Cy3-CTP or Cy5-CTP, directly incorporating dye into the cRNA.

Agilent mouse 4x44k expression arrays were hybridized according to our laboratory’s standard method, which differs slightly from the standard protocol provided by Agilent. The hybridization cocktail consisted of 825 ng cy-dye labeled cRNA for each sample, Agilent hybridization blocking components, and fragmentation buffer. The hybridization cocktails were fragmented at 60°C for 30 minutes, and then Agilent 2X hybridization buffer was added to the cocktail prior to application to the array. The arrays were hybridized for 16 hours at 60°C in an Agilent rotor oven set to maximum speed. The arrays were treated with Wash Buffer #1 (6X SSPE / 0.005% n-laurylsarcosine) on a shaking platform at room temperature for 2 minutes, and then Wash Buffer #2 (0.06X SSPE) for 2 minutes at room temperature. The arrays were then dipped briefly in acetonitrile before a final 30 second wash in Agilent Wash 3 Stabilization and Drying Solution, using a stir plate and stir bar at room temperature.

Arrays were scanned using an Agilent DNA microarray scanner. Array images were quantified and statistical significance of differential expression for each hybridization was calculated using Agilent’s Feature Extraction Image Analysis software with the default two-color gene expression protocol. Probes/genes were called differentially expressed if the average p-value from multiple hybridizations was less than 10^-6. Genes were sorted by fold change of geometric mean signal from multiple experiments. Heatmaps were generated using Java Treeview (http://jtreeview.sourceforge.net) with color saturation as indicated. Complete expression data for Fig. 3G are provided (Supplemental Tables S10 and S11). The statistical significance of the overlap between differentially expressed and H3K9me3/H3K4me3/H3K27me3 bound genes was calculated using a standard Chi-square test.
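A Chi-square test of overlap of this kind can be sketched on a 2x2 contingency table (bound or unbound versus differentially expressed or not). The sketch below is an illustration only: the counts are hypothetical, no continuity correction is applied, and the function name is an assumption rather than part of the analysis pipeline used here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (1 degree of freedom) on a 2x2 table.

    a: bound and differentially expressed      b: bound, not differentially expressed
    c: unbound and differentially expressed    d: unbound, not differentially expressed
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    # Expected counts under independence of binding and differential expression.
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p = chi_square_2x2(20, 10, 10, 20)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```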

Supplemental Discussion

H3K9 Associated Validation

The shRNA screen identified multiple factors associated with H3K9 methylation: H3K9 methyltransferases (SetDB1, Ehmt1, Suv39h2), the sumoylation E2 conjugating enzyme (Ube2i/Ubc9) and an H3K9 binding protein (Cbx3/HP1). We independently generated lentivirus for the two best shRNAs identified in the screen for each factor and determined their knockdown efficiency and effects on Oct4 expression levels by real-time qPCR (Supplemental Fig. S1). All tested shRNAs, with the exception of the Cbx3/HP1 shRNAs, substantially reduced target gene and Oct4 expression levels. Since neither Cbx3/HP1 shRNA had a significant effect on Oct4 expression levels, Cbx3/HP1 was not included in Supplemental Table S1.

SetDB1 Validation

We verified that the two best shRNAs for SetDB1 identified in the screen led to reduced levels of Oct4 protein and mRNA in independent experiments (Supplemental Fig. S2). Knockdown of SetDB1 with the two shRNAs had similar effects on ES cell colony morphology, Oct4 protein staining, and Oct4 RNA expression levels, although shRNA #1 was somewhat more effective. The similar effectiveness of two separate SetDB1 shRNAs suggests that the observed phenotypic changes were not the result of off target effects.

Criteria for Identifying Chromatin Regulators

We used multiple Z-score level thresholds to select chromatin regulators that had significant effects on Oct4 levels for inclusion in Supplemental Table S1. First, the chromatin regulator had to have at least one shRNA with a Z-score greater than 2.8 or less than –1.75. The 2.8 cutoff was utilized because the positive control shRNA targeting Tcf3 gave a Z-score of 2.8. The -1.75 cutoff was chosen because it was within close proximity to the Z-score of the Stat3 control (-2.4). Second, a chromatin factor was also included if at least two separate shRNAs scored above 2.0 or below –1.5 and it was possible to classify the gene based on the literature. Third, in one case we included a gene that encodes the SWI/SNF protein Smarcd1 (single hairpin hit, Z-score 2.2) because two other SWI/SNF complex members (Arid1a and Smarcb1) were identified with robust Z-scores (3.0 and 3.2 respectively). Single hairpin hits should be treated with more caution since there is the risk of the phenotype being induced by off target effects. However, SetDB1, which had a strong single hairpin hit (Z-score of –2.6, third best in the screen), was validated with a second shRNA (Supplemental Fig. S2). Furthermore, for SetDB1 there is a very good correlation between the effectiveness of the shRNA and the strength of the phenotype (Supplemental Fig. S2).
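The first two criteria can be expressed as a short sketch. The function name and the `classifiable` flag (standing in for the literature-based classification) are hypothetical, and requiring the two shRNAs of the second criterion to score in the same direction is an assumption. The third, case-by-case criterion is not modeled.

```python
def passes_screen_criteria(z_scores, classifiable=True):
    """Apply the first two Z-score criteria to the shRNAs targeting one gene.

    z_scores: Z-scores of all shRNAs targeting the gene.
    classifiable: whether the gene can be classified from the literature
                  (hypothetical flag; required only for the second criterion).
    """
    # Criterion 1: at least one shRNA beyond the control-derived cutoffs.
    strong_single_hit = any(z > 2.8 or z < -1.75 for z in z_scores)
    # Criterion 2: at least two shRNAs beyond the relaxed cutoffs
    # (assumed here to score in the same direction).
    n_increase = sum(1 for z in z_scores if z > 2.0)
    n_decrease = sum(1 for z in z_scores if z < -1.5)
    multiple_hits = classifiable and (n_increase >= 2 or n_decrease >= 2)
    return strong_single_hit or multiple_hits


# SetDB1 shRNA Z-scores as listed in Supplemental Fig. S2.
print(passes_screen_criteria([-2.6, -0.7, -0.1, 0.0, 1.3]))
```

With the SetDB1 Z-scores from Supplemental Fig. S2, the gene passes through the first criterion alone, consistent with its description as a strong single hairpin hit.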

We also utilized a redundant siRNA activity (RSA) analysis (Konig et al., 2007) as an additional metric to identify chromatin regulators that result in a decrease in Oct4 levels when knocked down (Supplemental Table S12). This analysis is useful for identifying multiple shRNA hits since it takes into account the effects of all the shRNAs targeting a gene. The RSA analysis identified 38 genes (p-value cutoff of 0.05) whose knockdown resulted in a loss of Oct4 expression. The majority of hits resulting in a loss of Oct4 expression presented in Supplemental Table 1 (Results of chromatin regulator shRNA screen), including SetDB1, were identified by the RSA analysis, thereby supporting our hit selection methodology (Supplemental Table S13). While the RSA analysis is useful for identifying multiple shRNA hits, it fails to identify some strong single shRNA hits. Given the varying efficiencies of the 4 to 6 shRNAs targeting a gene, it is not surprising that only one or two will produce sufficient knockdown to induce the desired phenotype. For this reason we decided not to use a purely statistical metric that would exclude single shRNA hits from Supplemental Table 1. These single shRNA hits were included in Supplemental Table 1 if they could be classified with additional hits (H3K9 methylation associated, cohesin associated, etc.).
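The rank-based idea behind RSA can be illustrated with a simplified sketch: all shRNAs in the screen are ranked by score, and for each gene the hypergeometric probability of finding its shRNAs at or above each of their ranks is computed, keeping the smallest value. This is not the published implementation (which includes additional cutoff handling not reproduced here); the function names and the example ranks are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, total, n_gene, r):
    """P(X >= k), where X is the number of a gene's n_gene shRNAs
    found among the top r of `total` ranked shRNAs."""
    numerator = sum(
        comb(n_gene, i) * comb(total - n_gene, r - i)
        for i in range(k, min(n_gene, r) + 1)
    )
    return numerator / comb(total, r)


def rsa_like_p_value(ranks, total):
    """Smallest hypergeometric p-value over the gene's own ranks.

    ranks: 1-based ranks of the shRNAs targeting one gene
           (rank 1 = strongest loss of Oct4 staining).
    total: number of shRNAs ranked in the whole screen.
    """
    ordered = sorted(ranks)
    n_gene = len(ordered)
    return min(
        hypergeom_upper_tail(k, total, n_gene, r)
        for k, r in enumerate(ordered, start=1)
    )


# Hypothetical example: a gene whose two shRNAs rank 1st and 2nd of 10.
print(rsa_like_p_value([1, 2], 10))
```

Because every shRNA of a gene contributes to the probability, a gene with several moderately ranked shRNAs can score better than a gene with one top-ranked shRNA, which is the behavior described above.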

Screening Results Comparison

Of the 23 genes that we identified that result in a loss of Oct4 expression when knocked down in ES cells (Supplemental Table 1), 32% were identified in at least one of the other recently conducted mES cell screens (Ding et al., 2009; Hu et al., 2009; Fazzio et al., 2008). These results are summarized in Supplemental Table S13. We also identified novel genes that were not discovered in the other screens. This was expected because of the following substantial differences between our screen methodology and the others. First, our screen employed a stable and selectable lentiviral knockdown strategy, whereas the other screens all utilized a transient, non-selectable siRNA approach. We chose our approach to be confident that all cells received the knockdown virus, avoiding the problem of a subset of cells in a well not receiving a knockdown construct, potentially outcompeting the ones that did, and thereby masking a differentiation phenotype. Second, there are substantial differences in the ES cell lines used for the different screens. We utilized wild type ES cells and monitored endogenous Oct4 expression to measure Oct4 levels directly. Other groups utilized an engineered ES cell line in which they instead measured the expression levels of GFP under the control of Oct4 regulatory regions (Hu et al., 2009; Ding et al., 2009). Third, while the other groups relied on morphology (Fazzio et al., 2008) or a whole well FACS measurement (Ding et al., 2009; Hu et al., 2009) for scoring hits, we utilized an image-based system in which Oct4 staining intensity measurements were made for individual cells in each well. We believe that this sensitive quantitative approach has allowed for detection of subtler phenotypes.

Comments on Design and Saturation of Screen

Screening conditions were optimized to identify both positive and negative effects on ES cell state as measured by Oct4 levels. We were limited to a five day screening assay to prevent negative control wells from reaching confluency and overgrowing. Other groups have assayed mES cells for differentiation at time points greater than five days following targeted lentiviral knockdown, suggesting that in some cases longer than five days is required for a loss of pluripotency to be detected (Ivanova et al., 2006). Our SetDB1 validation results support this notion: shRNA #2 produced a modest Z-score (-0.7) in the screen but had a much more pronounced effect when the assay duration was increased to eight days (Supplemental Fig. S2 and Supplemental Table S2). Furthermore, the effectiveness of the shRNAs used in these screens is imperfect. For these reasons, it is likely that some chromatin regulators that contribute to ES cell state were not identified in this screen and thus, like all genetic screens, this screen for chromatin regulators is not saturated.

Comments on Chromatin Regulators Identified in the Screen

H3K9 Methylation Associated

One of the largest classes of chromatin regulators that induced a decrease in Oct4 staining intensity when knocked down was associated with H3K9 methylation (Supplemental Table S1). This class includes three H3K9 methylases (SetDB1, Suv39h2 and Ehmt1) and the SUMO E2 conjugating enzyme that is involved in SetDB1 recruitment to chromatin and stimulation of SetDB1 H3K9 methylase activity (Ube2i/Ubc9) (O'Carroll et al., 2000; Tachibana et al., 2005; Ivanov et al., 2007; Schultz et al., 2002; Ayyanathan et al., 2003; Wang et al., 2003). Of these, SetDB1 had the best Z-score (-2.6), and of all the screened genes, only Oct4 and Smc1a had lower Z-scores (-3.3 and -2.9, respectively) (Supplemental Table S2). SetDB1 is important for development since homozygous SetDB1 null murine embryos have severe developmental defects resulting in lethality between 3.5 and 5.5 dpc (Dodge et al., 2004).

HDAC Associated

Knockdown of numerous HDACs (Hdac1, Hdac3, Hdac11, and Sirt6) and components of the Sin3/HDAC complex (Sin3a and Sap18) had both positive and negative effects on Oct4 staining intensity (Supplemental Table S1) (Ahringer, 2000; Michishita et al., 2008). Recently it has been reported that Hdac1 associates with the key pluripotency regulators Nanog and Oct4 in mES cells and Sin3a is in a complex with Nanog (Liang et al., 2008).

Polycomb Repressive Complex

Polycomb proteins are important for regulating ES cell state, in part through the repression of developmental regulator genes (Supplemental Table S1) (Boyer et al., 2006; Lee et al., 2006a). Consistent with this, multiple polycomb proteins were identified in the screen (Cbx7, Cbx8/Pc3, Ezh2, Epc2, Cbx6) (Brock and Fisher, 2005). There is a general increase in expression of polycomb bound genes in mES cells that are induced to differentiate (Boyer et al., 2006). Decreased expression of Ezh2, an H3K27 methyltransferase, and of the chromodomain containing proteins Cbx7 and Cbx8/Pc3 all resulted in a loss of Oct4 staining. Ezh2 is required for murine development and attempts to establish Ezh2 null mES cells have been unsuccessful (O'Carroll et al., 2001).

Cohesin Complex Members

Many members of the cohesin complex (Smc1a, Smc3, Stag2, Nipbl, Smc1b and Stag3) were identified, with multiple hairpin hits for Smc1a, Smc3 and Nipbl (Supplemental Table S1) (Hirano, 2002; Tonkin et al., 2004). The cohesin complex has a well documented role in chromosomal segregation by maintaining sister chromatid cohesion and recent evidence suggests it is also involved in gene regulation (Hirano, 2002; Peric-Hupkes and van Steensel, 2008; Hadjur et al., 2009). In mammals the cohesin complex has been demonstrated to be involved in gene repression by interacting with and enabling the transcriptional insulator CTCF to prevent distal enhancers from activating promoters (Wendt et al., 2008). Furthermore, members of this complex have been implicated in Cornelia de Lange syndrome, a developmental disease, potentially resulting from aberrant gene expression (Liu and Krantz, 2008). All members of this family, with the exception of Smc1b and Stag3, resulted in a decrease in Oct4 staining after targeted knockdown. Smc1b and Stag3 are meiosis-specific cohesins suggesting that they may function independently from the other cohesins, possibly explaining the differential effects on Oct4 staining intensities (Pezzi et al., 2000; Revenkova et al., 2001).

Uncharacterized Methyltransferases

Multiple uncharacterized methyltransferases (Wbscr22, 6430573F11Rik, Wbscr27 and Mettl7a1) were identified in the screen (Supplemental Table S1). Both Wbscr22 and Wbscr27 map to a region of the genome that is deleted in individuals afflicted with the neurodevelopmental disorder Williams-Beuren syndrome (Merla et al., 2002; Tassabehji, 2003). Targeted knockdown of Wbscr22 resulted in a loss of Oct4 staining, and four separate shRNAs induced the phenotype.

H3K4 Methyltransferases

Knockdown of the H3K4 methyltransferases Setd7, Mll1, Setd1b and Ash1l had varying effects on Oct4 staining intensity (Supplemental Table S1). Decreased expression of Setd7 caused a reduction in Oct4 staining, whereas decreased expression of Mll1, Setd1b and Ash1l resulted in an increase (Nishioka et al., 2002; Wang et al., 2001; Lee et al., 2007; Nakamura et al., 2002; Gregory et al., 2007). Both Mll1 and Ash1l have been implicated in positive regulation of the developmental Hox genes, and loss of Mll1 results in aberrant Hox gene expression (Gregory et al., 2007; Yu et al., 1995).

Histone Chaperones

Four genes involved in chaperoning histones and chromatin assembly were identified (Chaf1a, Chaf1b, Hira and Asf1a) (Supplemental Table S1) (Eitoku et al., 2008). Decreased expression of two, Chaf1a and Chaf1b, resulted in a reduction in Oct4 staining. In contrast, loss of Hira and Asf1a caused an increase in Oct4 staining. Chaf1a and Chaf1b are members of the three subunit CAF-1 complex, which is responsible for depositing histones H3 and H4 onto DNA during replication (Eitoku et al., 2008; Kaufman et al., 1995). The Chaf1a subunit is required for embryonic development and its depletion in murine ES cells results in heterochromatic defects (Houlard et al., 2006). Hira is also an essential gene whose function is to deposit the histone H3.3 variant onto DNA in a replication-independent manner (Roberts et al., 2002; Tagami et al., 2004). Asf1a interacts with Hira but also copurifies with the CAF-1 complex (Tagami et al., 2004; Tang et al., 2006).

SWI/SNF

Three components (Smarcb1, Arid1a and Smarcd1) of the SWI/SNF chromatin remodeling complex were identified in the screen (Supplemental Table S1) (Roberts and Orkin, 2004; Hurlstone et al., 2002). Smarcb1 is a core subunit of the SWI/SNF complex and is required for mouse development (Roberts and Orkin, 2004; Klochendler-Yeivin et al., 2000). Decreased expression of all three components resulted in an increase in Oct4 staining intensity.

Arginine Methylation

Knockdown of two arginine methylases, Prmt1 and Prmt7, resulted in a decrease in Oct4 staining (Supplemental Table S1). Both methylate histone H4 arginine 3 (H4R3) and additional non-histone substrates (Jelinic et al., 2006; Pahlich et al., 2006; Lee et al., 2005). H4R3 methylation stimulated by Prmt1 is generally associated with gene activation, and Prmt7 has also been linked to gene activation by preventing the silencing of the paternal allele of the imprinted gene Igf2 (Jelinic et al., 2006; Rezai-Zadeh et al., 2003).

H3K36 Methylation Associated

An H3K36 demethylase (Fbxl10) and a chromodomain containing protein that selectively binds to methylated H3K36 (Morf4l1) were identified in the screen; decreased expression of either results in ES cells that have increased Oct4 staining intensities (Supplemental Table S1) (He et al., 2008; Zhang et al., 2006). Morf4l1 is essential for murine development and the targeted knockout results in embryonic lethality that can in part be attributed to cell proliferation defects (Tominaga et al., 2005).


Supplemental References

Ahringer,J. (2000). NuRD and SIN3 histone deacetylase complexes in development. Trends Genet. 16, 351-356.

Ayyanathan,K., Lechner,M.S., Bell,P., Maul,G.G., Schultz,D.C., Yamada,Y., Tanaka,K., Torigoe,K., and Rauscher,F.J., III (2003). Regulated recruitment of HP1 to a euchromatic gene induces mitotically heritable, epigenetic gene silencing: a mammalian cell culture model of gene variegation. Genes Dev. 17, 1855-1869.

Boyer,L.A., Lee,T.I., Cole,M.F., Johnstone,S.E., Levine,S.S., Zucker,J.P., Guenther,M.G., Kumar,R.M., Murray,H.L., Jenner,R.G., Gifford,D.K., Melton,D.A., Jaenisch,R., and Young,R.A. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947-956.

Boyer,L.A., Plath,K., Zeitlinger,J., Brambrink,T., Medeiros,L.A., Lee,T.I., Levine,S.S., Wernig,M., Tajonar,A., Ray,M.K., Bell,G.W., Otte,A.P., Vidal,M., Gifford,D.K., Young,R.A., and Jaenisch,R. (2006). Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature 441, 349-353.

Ding,L., Paszkowski-Rogacz,M., Nitzsche,A., Slabicki,M.M., Heninger,A.K., de,V., I, Kittler,R., Junqueira,M., Shevchenko,A., Schulz,H., Hubner,N., Doss,M.X., Sachinidis,A., Hescheler,J., Iacone,R., Anastassiadis,K., Stewart,A.F., Pisabarro,M.T., Caldarelli,A., Poser,I., Theis,M., and Buchholz,F. (2009). A genome-scale RNAi screen for Oct4 modulators defines a role of the Paf1 complex for embryonic stem cell identity. Cell Stem Cell 4, 403-415.

Dodge,J.E., Kang,Y.K., Beppu,H., Lei,H., and Li,E. (2004). Histone H3-K9 methyltransferase ESET is essential for early development. Mol. Cell Biol. 24, 2478-2486.

Eitoku,M., Sato,L., Senda,T., and Horikoshi,M. (2008). Histone chaperones: 30 years from isolation to elucidation of the mechanisms of nucleosome assembly and disassembly. Cell Mol. Life Sci. 65, 414-444.

Fazzio,T.G., Huff,J.T., and Panning,B. (2008). An RNAi screen of chromatin proteins identifies Tip60-p400 as a regulator of embryonic stem cell identity. Cell 134, 162-174.

Gregory,G.D., Vakoc,C.R., Rozovskaia,T., Zheng,X., Patel,S., Nakamura,T., Canaani,E., and Blobel,G.A. (2007). Mammalian ASH1L is a histone methyltransferase that occupies the transcribed region of active genes. Mol. Cell Biol. 27, 8466-8479.

Guenther,M.G., Lawton,L.N., Rozovskaia,T., Frampton,G.M., Levine,S.S., Volkert,T.L., Croce,C.M., Nakamura,T., Canaani,E., and Young,R.A. (2008). Aberrant chromatin at genes encoding stem cell regulators in human mixed-lineage leukemia. Genes Dev. 22, 3403-3408.

Hadjur,S., Williams,L.M., Ryan,N.K., Cobb,B.S., Sexton,T., Fraser,P., Fisher,A.G., and Merkenschlager,M. (2009). Cohesins form chromosomal cis-interactions at the developmentally regulated IFNG locus. Nature.

He,J., Kallin,E.M., Tsukada,Y., and Zhang,Y. (2008). The H3K36 demethylase Jhdm1b/Kdm2b regulates cell proliferation and senescence through p15(Ink4b). Nat. Struct. Mol. Biol. 15, 1169-1175.

Hirano,T. (2002). The ABCs of SMC proteins: two-armed ATPases for condensation, cohesion, and repair. Genes Dev. 16, 399-414.

Houlard,M., Berlivet,S., Probst,A.V., Quivy,J.P., Hery,P., Almouzni,G., and Gerard,M. (2006). CAF-1 is essential for heterochromatin organization in pluripotent embryonic cells. PLoS. Genet. 2, e181.

Hu,G., Kim,J., Xu,Q., Leng,Y., Orkin,S.H., and Elledge,S.J. (2009). A genome-wide RNAi screen identifies a new transcriptional module required for self-renewal. Genes Dev. 23, 837-848.

Hurlstone,A.F., Olave,I.A., Barker,N., van Noort,M., and Clevers,H. (2002). Cloning and characterization of hELD/OSA1, a novel BRG1 interacting protein. Biochem. J. 364, 255-264.

Ivanov,A.V., Peng,H., Yurchenko,V., Yap,K.L., Negorev,D.G., Schultz,D.C., Psulkowski,E., Fredericks,W.J., White,D.E., Maul,G.G., Sadofsky,M.J., Zhou,M.M., and Rauscher,F.J., III (2007). PHD domain-mediated E3 ligase activity directs intramolecular sumoylation of an adjacent bromodomain required for gene silencing. Mol. Cell 28, 823-837.

Ivanova,N., Dobrin,R., Lu,R., Kotenko,I., Levorse,J., DeCoste,C., Schafer,X., Lun,Y., and Lemischka,I.R. (2006). Dissecting self-renewal in stem cells with RNA interference. Nature 442, 533-538.

Jelinic,P., Stehle,J.C., and Shaw,P. (2006). The testis-specific factor CTCFL cooperates with the protein methyltransferase PRMT7 in H19 imprinting control region methylation. PLoS. Biol. 4, e355.

Johnson,D.S., Mortazavi,A., Myers,R.M., and Wold,B. (2007). Genome-wide mapping of in vivo protein-DNA interactions. Science 316, 1497-1502.

Kaufman,P.D., Kobayashi,R., Kessler,N., and Stillman,B. (1995). The p150 and p60 subunits of chromatin assembly factor I: a molecular link between newly synthesized histones and DNA replication. Cell 81, 1105-1114.

Kent,W.J., Sugnet,C.W., Furey,T.S., Roskin,K.M., Pringle,T.H., Zahler,A.M., and Haussler,D. (2002). The human genome browser at UCSC. Genome Res. 12, 996-1006.

Klochendler-Yeivin,A., Fiette,L., Barra,J., Muchardt,C., Babinet,C., and Yaniv,M. (2000). The murine SNF5/INI1 chromatin remodeling factor is essential for embryonic development and tumor suppression. EMBO Rep. 1, 500-506.

Konig,R., Chiang,C.Y., Tu,B.P., Yan,S.F., DeJesus,P.D., Romero,A., Bergauer,T., Orth,A., Krueger,U., Zhou,Y., and Chanda,S.K. (2007). A probability-based approach for the analysis of large-scale RNAi screens. Nat. Methods 4, 847-849.

Lee,J.H., Cook,J.R., Yang,Z.H., Mirochnitchenko,O., Gunderson,S.I., Felix,A.M., Herth,N., Hoffmann,R., and Pestka,S. (2005). PRMT7, a new protein arginine methyltransferase that synthesizes symmetric dimethylarginine. J. Biol. Chem. 280, 3656-3664.

Lee,J.H., Tate,C.M., You,J.S., and Skalnik,D.G. (2007). Identification and characterization of the human Set1B histone H3-Lys4 methyltransferase complex. J. Biol. Chem. 282, 13419-13428.

Lee,T.I., Jenner,R.G., Boyer,L.A., Guenther,M.G., Levine,S.S., Kumar,R.M., Chevalier,B., Johnstone,S.E., Cole,M.F., Isono,K., Koseki,H., Fuchikami,T., Abe,K., Murray,H.L., Zucker,J.P., Yuan,B., Bell,G.W., Herbolsheimer,E., Hannett,N.M., Sun,K., Odom,D.T., Otte,A.P., Volkert,T.L., Bartel,D.P., Melton,D.A., Gifford,D.K., Jaenisch,R., and Young,R.A. (2006a). Control of developmental regulators by Polycomb in human embryonic stem cells. Cell 125, 301-313.

Lee,T.I., Johnstone,S.E., and Young,R.A. (2006b). Chromatin immunoprecipitation and microarray-based analysis of protein location. Nat. Protoc. 1, 729-748.

Liang,J., Wan,M., Zhang,Y., Gu,P., Xin,H., Jung,S.Y., Qin,J., Wong,J., Cooney,A.J., Liu,D., and Songyang,Z. (2008). Nanog and Oct4 associate with unique transcriptional repression complexes in embryonic stem cells. Nat. Cell Biol. 10, 731-739.

Liu,J. and Krantz,I.D. (2008). Cohesin and human disease. Annu. Rev. Genomics Hum. Genet. 9, 303-320.

Marson,A., Levine,S.S., Cole,M.F., Frampton,G.M., Brambrink,T., Johnstone,S., Guenther,M.G., Johnston,W.K., Wernig,M., Newman,J., Calabrese,J.M., Dennis,L.M., Volkert,T.L., Gupta,S., Love,J., Hannett,N., Sharp,P.A., Bartel,D.P., Jaenisch,R., and Young,R.A. (2008). Connecting microRNA genes to the core transcriptional regulatory circuitry of embryonic stem cells. Cell 134, 521-533.

Merla,G., Ucla,C., Guipponi,M., and Reymond,A. (2002). Identification of additional transcripts in the Williams-Beuren syndrome critical region. Hum. Genet. 110, 429-438.

Michishita,E., McCord,R.A., Berber,E., Kioi,M., Padilla-Nash,H., Damian,M., Cheung,P., Kusumoto,R., Kawahara,T.L., Barrett,J.C., Chang,H.Y., Bohr,V.A., Ried,T., Gozani,O., and Chua,K.F. (2008). SIRT6 is a histone H3 lysine 9 deacetylase that modulates telomeric chromatin. Nature 452, 492-496.

Mikkelsen,T.S., Ku,M., Jaffe,D.B., Issac,B., Lieberman,E., Giannoukos,G., Alvarez,P., Brockman,W., Kim,T.K., Koche,R.P., Lee,W., Mendenhall,E., O'Donovan,A., Presser,A., Russ,C., Xie,X., Meissner,A., Wernig,M., Jaenisch,R., Nusbaum,C., Lander,E.S., and Bernstein,B.E. (2007). Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553-560.

Moffat,J., Grueneberg,D.A., Yang,X., Kim,S.Y., Kloepfer,A.M., Hinkle,G., Piqani,B., Eisenhaure,T.M., Luo,B., Grenier,J.K., Carpenter,A.E., Foo,S.Y., Stewart,S.A., Stockwell,B.R., Hacohen,N., Hahn,W.C., Lander,E.S., Sabatini,D.M., and Root,D.E. (2006). A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell 124, 1283-1298.

Nakamura,T., Mori,T., Tada,S., Krajewski,W., Rozovskaia,T., Wassell,R., Dubois,G., Mazo,A., Croce,C.M., and Canaani,E. (2002). ALL-1 is a histone methyltransferase that assembles a supercomplex of proteins involved in transcriptional regulation. Mol. Cell 10, 1119-1128.

Nishioka,K., Chuikov,S., Sarma,K., Erdjument-Bromage,H., Allis,C.D., Tempst,P., and Reinberg,D. (2002). Set9, a novel histone H3 methyltransferase that facilitates transcription by precluding histone tail modifications required for heterochromatin formation. Genes Dev. 16, 479-489.

O'Carroll,D., Erhardt,S., Pagani,M., Barton,S.C., Surani,M.A., and Jenuwein,T. (2001). The polycomb-group gene Ezh2 is required for early mouse development. Mol. Cell Biol. 21, 4330-4336.

O'Carroll,D., Scherthan,H., Peters,A.H., Opravil,S., Haynes,A.R., Laible,G., Rea,S., Schmid,M., Lebersorger,A., Jerratsch,M., Sattler,L., Mattei,M.G., Denny,P., Brown,S.D., Schweizer,D., and Jenuwein,T. (2000). Isolation and characterization of Suv39h2, a second histone H3 methyltransferase gene that displays testis-specific expression. Mol. Cell Biol. 20, 9423-9433.

Pahlich,S., Zakaryan,R.P., and Gehring,H. (2006). Protein arginine methylation: Cellular functions and methods of analysis. Biochim. Biophys. Acta 1764, 1890-1903.

Peric-Hupkes,D. and van Steensel,B. (2008). Linking cohesin to gene regulation. Cell 132, 925-928.

Pezzi,N., Prieto,I., Kremer,L., Perez Jurado,L.A., Valero,C., Del Mazo,J., Martinez,A., and Barbero,J.L. (2000). STAG3, a novel gene encoding a protein involved in meiotic chromosome pairing and location of STAG3-related genes flanking the Williams-Beuren syndrome deletion. FASEB J. 14, 581-592.

Revenkova,E., Eijpe,M., Heyting,C., Gross,B., and Jessberger,R. (2001). Novel meiosis-specific isoform of mammalian SMC1. Mol. Cell Biol. 21, 6984-6998.

Rezai-Zadeh,N., Zhang,X., Namour,F., Fejer,G., Wen,Y.D., Yao,Y.L., Gyory,I., Wright,K., and Seto,E. (2003). Targeted recruitment of a histone H4-specific methyltransferase by the transcription factor YY1. Genes Dev. 17, 1019-1029.

Roberts,C., Sutherland,H.F., Farmer,H., Kimber,W., Halford,S., Carey,A., Brickman,J.M., Wynshaw-Boris,A., and Scambler,P.J. (2002). Targeted mutagenesis of the Hira gene results in gastrulation defects and patterning abnormalities of mesoendodermal derivatives prior to early embryonic lethality. Mol. Cell Biol. 22, 2318-2328.

Roberts,C.W. and Orkin,S.H. (2004). The SWI/SNF complex--chromatin and cancer. Nat. Rev. Cancer 4, 133-142.

Schultz,D.C., Ayyanathan,K., Negorev,D., Maul,G.G., and Rauscher,F.J., III (2002). SETDB1: a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins. Genes Dev. 16, 919-932.

Seila,A.C., Calabrese,J.M., Levine,S.S., Yeo,G.W., Rahl,P.B., Flynn,R.A., Young,R.A., and Sharp,P.A. (2008). Divergent transcription from active promoters. Science 322, 1849-1851.

Tachibana,M., Ueda,J., Fukuda,M., Takeda,N., Ohta,T., Iwanari,H., Sakihama,T., Kodama,T., Hamakubo,T., and Shinkai,Y. (2005). Histone methyltransferases G9a and GLP form heteromeric complexes and are both crucial for methylation of euchromatin at H3-K9. Genes Dev. 19, 815-826.

Tagami,H., Ray-Gallet,D., Almouzni,G., and Nakatani,Y. (2004). Histone H3.1 and H3.3 complexes mediate nucleosome assembly pathways dependent or independent of DNA synthesis. Cell 116, 51-61.

Tang,Y., Poustovoitov,M.V., Zhao,K., Garfinkel,M., Canutescu,A., Dunbrack,R., Adams,P.D., and Marmorstein,R. (2006). Structure of a human ASF1a-HIRA complex and insights into specificity of histone chaperone complex assembly. Nat. Struct. Mol. Biol. 13, 921-929.

Tassabehji,M. (2003). Williams-Beuren syndrome: a challenge for genotype-phenotype correlations. Hum. Mol. Genet. 12 Spec No 2, R229-R237.

Tominaga,K., Kirtane,B., Jackson,J.G., Ikeno,Y., Ikeda,T., Hawks,C., Smith,J.R., Matzuk,M.M., and Pereira-Smith,O.M. (2005). MRG15 regulates embryonic development and cell proliferation. Mol. Cell Biol. 25, 2924-2937.

Tonkin,E.T., Wang,T.J., Lisgo,S., Bamshad,M.J., and Strachan,T. (2004). NIPBL, encoding a homolog of fungal Scc2-type sister chromatid cohesion proteins and fly Nipped-B, is mutated in Cornelia de Lange syndrome. Nat. Genet. 36, 636-641.

Wang,H., An,W., Cao,R., Xia,L., Erdjument-Bromage,H., Chatton,B., Tempst,P., Roeder,R.G., and Zhang,Y. (2003). mAM facilitates conversion by ESET of dimethyl to trimethyl lysine 9 of histone H3 to cause transcriptional repression. Mol. Cell 12, 475-487.

Wang,H., Cao,R., Xia,L., Erdjument-Bromage,H., Borchers,C., Tempst,P., and Zhang,Y. (2001). Purification and functional characterization of a histone H3-lysine 4-specific methyltransferase. Mol. Cell 8, 1207-1217.

Wendt,K.S., Yoshida,K., Itoh,T., Bando,M., Koch,B., Schirghuber,E., Tsutsumi,S., Nagae,G., Ishihara,K., Mishiro,T., Yahata,K., Imamoto,F., Aburatani,H., Nakao,M., Imamoto,N., Maeshima,K., Shirahige,K., and Peters,J.M. (2008). Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature 451, 796-801.

Yu,B.D., Hess,J.L., Horning,S.E., Brown,G.A., and Korsmeyer,S.J. (1995). Altered Hox expression and segmental identity in Mll-mutant mice. Nature 378, 505-508.

Zhang,P., Du,J., Sun,B., Dong,X., Xu,G., Zhou,J., Huang,Q., Liu,Q., Hao,Q., and Ding,J. (2006). Structure of human MRG15 chromo domain and its binding to Lys36-methylated histone H3. Nucleic Acids Res. 34, 6621-6628.

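Supplemental Table 1: Results of chromatin regulator shRNA screen

| Category | Gene Symbol | Function | Knockdown Phenotype (Oct4 Staining) | shRNAs | Z-score* |
|---|---|---|---|---|---|
| Pluripotency Controls | Oct4 | Master Regulator | Decreased | | -3.3 |
| | Stat3 | Lif Signaling | Decreased | | -2.4 |
| | Tcf3 | Wnt/B-Catenin Signaling | Increased | | 2.8 |
| Negative Controls | GFP | | - | | 0.0 |
| | RFP | | - | | 0.3 |
| H3K9 Associated | SetDB1 | H3K9 Methyltransferase | Decreased | 1 | -2.6 |
| | Ube2i/Ubc9 | SetDB1 Recruitment | Decreased | 2 | -1.9 |
| | Ehmt1 | H3K9 Methyltransferase | Decreased | 1 | -1.9 |
| | Suv39h2 | H3K9 Methyltransferase | Decreased | 1 | -1.8 |
| HDAC Associated | Sap18 | Sin3/HDAC Complex | Decreased | 2 | -2.2 |
| | Hdac3 | Histone Deacetylase | Decreased | 1 | -2.2 |
| | Sin3a | Sin3/HDAC Complex | Decreased | 2 | -1.5 |
| | Hdac1 | Histone Deacetylase | Increased | 1 | 2.8 |
| | Sirt6 | Histone Deacetylase | Increased | 1 | 2.9 |
| | Hdac11 | Histone Deacetylase | Increased | 2 | 4.3 |
| Polycomb | Cbx7 | | Decreased | 1 | -2.5 |
| | Cbx8/Pc3 | Prc1 Component | Decreased | 1 | -2.2 |
| | Ezh2 | Prc2 Component | Decreased | 1 | -2.0 |
| | Cbx6 | | Increased | 1 | 3.0 |
| | Epc2 | | Increased | 3 | 4.2 |
| Cohesin Complex Members | Smc1a | Core Subunit/Mitotic | Decreased | 5 | -2.9 |
| | Smc3 | Core Subunit/Mitotic | Decreased | 3 | -2.5 |
| | Nipbl | Loading Factor | Decreased | 3 | -1.9 |
| | Stag2 | Core Subunit/Mitotic | Decreased | 1 | -1.8 |
| | Smc1b | Core Subunit/Meiotic | Increased | 1 | 3.2 |
| | Stag3 | Core Subunit/Meiotic | Increased | 1 | 3.3 |
| Uncharacterized Methyltransferase | Wbscr22 | Deleted in Williams-Beuren Syndrome | Decreased | 4 | -2.6 |
| | 6430573F11Rik | | Decreased | 1 | -2.3 |
| | Wbscr27 | Deleted in Williams-Beuren Syndrome | Increased | 1 | 3.1 |
| | Mettl7a1 | | Increased | 1 | 3.7 |
| H3K4 Methyltransferase | Setd7 | | Decreased | 1 | -1.8 |
| | Ash1l | | Increased | 2 | 2.7 |
| | Setd1b | | Increased | 1 | 3.2 |
| | Mll1 | | Increased | 1 | 5.1 |
| Histone Chaperones | Chaf1a | CAF-1 Complex | Decreased | 2 | -2.4 |
| | Chaf1b | CAF-1 Complex | Decreased | 1 | -2.0 |
| | Hira | | Increased | 2 | 2.8 |
| | Asf1a | | Increased | 2 | 3.3 |
| SWI/SNF | Smarcd1 | | Increased | 1 | 2.2 |
| | Arid1a | | Increased | 2 | 3.0 |
| | Smarcb1 | Core Subunit | Increased | 1 | 3.2 |
| Arginine Methylation | Prmt1 | H4R3 Methyltransferase | Decreased | 1 | -2.2 |
| | Prmt7 | Imprinting Control | Decreased | 1 | -2.0 |
| H3K36 Associated | Morf4l1 | Binds Methylated H3K36 | Increased | 2 | 2.9 |
| | Fbxl10 | H3K36 Demethylase | Increased | 1 | 3.0 |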

*Z-score for best shRNA is shown for multiple hairpin hits.

Supplemental Figure 1 (Bilodeau_FigS1)

[Figure. Panel A: Target Expression (%) and Oct4 Expression (%), each on a 0 to 100 axis, for the shRNAs listed below.]

shRNA target | TRC Numbers
--- | ---
GFP | TRCN0000072201
SetDB1 | TRCN0000092975, TRCN0000092973
Suv39h2 | TRCN0000092814, TRCN0000092815
Ehmt1 | TRCN0000086071, TRCN0000086068
Ube2i/Ubc9 | TRCN0000040839, TRCN0000040841
Cbx3/HP1 | TRCN0000071038, TRCN0000071041

Supplemental Figure 2 (Bilodeau_FigS2)

[Figure. Panel A: Phase, Hoechst, and Oct4 images for each shRNA listed below.]

shRNA | Z-score | TRC Number
--- | --- | ---
GFP | 0.2 | TRCN0000072201
SetDB1 #1 | -2.6 | TRCN0000092975
SetDB1 #2 | -0.7 | TRCN0000092973
SetDB1 #3 | -0.1 | TRCN0000092974
SetDB1 #4 | 0.0 | TRCN0000092977
SetDB1 #5 | 1.3 | TRCN0000092976

[Panel B: SetDB1 Expression (%) and Oct4 Expression (%), each on a 0 to 100 axis, for shRNA GFP and SetDB1 #1 to #5. Panel C, labeled 129/Cast hybrid: SetDB1 Expression (%) and Oct4 Expression (%), each on a 0 to 100 axis, for shRNA GFP and SetDB1 #1.]

Supplemental Figure 3 (Bilodeau_FigS3)

[Figure. Panels A and B: Western blots (WB: SetDB1) with antibodies H-300 and 11231-AP on shRNA GFP and shRNA SetDB1 samples; size markers at 200, 150, 100, and 75; loading control WB: Gapdh. Panel C: SetDB1 ChIP, Enrichment Fold (Log2), comparing antibodies H-300 and 11231-AP at Polrmt, Nnat, Mybl2, and neg. Panel D: SetDB1 H-300 ChIP, Enrichment Fold (Log2), comparing shRNA-GFP (ctrl) and shRNA-SetDB1 at Polrmt, Nnat, Mybl2, and neg.]

Supplemental Figure 4 (Bilodeau_FigS4)

[Figure. Panel A: comparison of H3K9me3 (Ab8898), H3K9me3 (UP07-442), and H3K9me3 (Mikkelsen et al. 2007). Panels B to D, titled Polrmt, Agrn, and Gpa33: H3K9me3 tracks for Mikkelsen et al. 2007, Ab8898, and Up07-442, in Reads/million (scale 26, 5kb scale bar); gene labels Polrmt, Agrn, Nnat, Blcap. Panels E to G: H3K4me3 ChIP, H3K27me3 ChIP, and H3K9me3 ChIP, Enrichment Fold (axis maxima 30, 70, and 15); QPCR targets Nef3, Cdx2, Polrmt, Agrn; conditions no peptide, H3 unmod., K9me3, K4me3, K27me3. Panels H to J, titled Trim24, Nfatc3, and Akt1: H3K9me3 (scale 15) and H3K4me3 (scale 56) tracks in Reads/million. Panels K to M, titled Hes5, Osr1, and Phox2b: H3K9me3 (scale 26) and H3K27me3 (scale 18) tracks in Reads/million; gene labels Hes5, Osr1, Phox2b, Pank4.]

Supplemental Figure 5 (Bilodeau_FigS5)

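[Figure: overlap between SetDB1 and H3K9me3, with the values 1675, 557, and 1959 listed between the SetDB1 and H3K9me3 labels.]

Supplemental Figure 6 (Bilodeau_FigS6)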

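[Figure. Panel A: H3K9me3 variation following SetDB1 KD.]

H3K9me3 variation | # genes | Percentage
--- | --- | ---
loss 50% or more | 163 | 33.3%
loss 25-50% | 85 | 17.4%
no change | 181 | 37.0%
gain 25-50% | 25 | 5.1%
gain 50% or more | 35 | 7.2%

[Panels B to D, titled Sox30, Nnat, and Vax2: SetDB1 track (scale 7) and H3K9me3 tracks for WT, shpGFP, and shpSetDB1 (scale 100 each), in normalized reads; 5kb scale bar.]

Supplemental Figure 7 (Bilodeau_FigS7)

[Figure. Panels A to D, titled Olig2, Sumo3, Jmjd1a, and Pramel1: H3K9me3, H3K27me3, H3K4me3, and SetDB1 tracks in Reads/million; track scales listed as 10, 22, 60, and 7; 5kb scale bar.]

Supplemental Figure 8 (Bilodeau_FigS8)

[Figure. Panels A to C, titled Pax7, Lmx1b, and Dlx1: H3K9me3, H3K27me3, and H3K4me3 tracks; track scales listed as 31, 34, and 77; axis label Reads.]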