Supplementary Information for

Acetylated histone H3K56 interacts with Oct4 to promote mouse embryonic stem cell pluripotency

Table of contents

Supplementary Figures 1-4 and Figure Legends

Supplementary Methods

Cell culture

Plasmid construction and transfection

ChIP-Sequencing

ChIP-Seq data analysis

K-means clustering

Co-immunoprecipitation assay

In vivo peptide pull-down assay

Flag-immunoprecipitation assay

In vitro peptide pull-down assay

Mononucleosome immunoprecipitation

Western blot

Quantitative PCR

Gel mobility shift assay

Supplementary Tables 1-8

Supplementary References

Supplementary Figures and Legends

Supplementary Figure 1. The distribution of ChIP-Seq signals for NSO and H3K56ac at Cluster 1 regions.

(A-D) Enrichment patterns of Nanog, Sox2 and Oct4 (NSO) and H3K56ac at the Oct4 (also known as Pou5f1) (A), Klf4 (B), Nanog (C), and Nodal (D) loci are shown in the University of California, Santa Cruz (UCSC) genome browser.

Supplementary Figure 2. H3K56ac positively correlates with the enrichment of Oct4 in mouse ESCs.

(A-C) Signal distributions of H3K56ac (A), Oct4 (B) and Oct4 binding motifs (MA0142 in JASPAR (1)) (C) relative to the center of each cluster of Oct4 regions in Fig. 2A were calculated by CEAS-sitepro (2, 3). The x-axis represents the distance to the center of Oct4 peaks in mouse ESCs. The y-axis represents ChIP-Seq tag count signals normalized by the number of regions in each cluster.

Supplementary Figure 3. The functional categories for Group I/II/III regions of Cluster 1.

(A-B) Functional annotation of Group I (A) and Group II/III (B) Oct4 ChIP regions in the mouse genome was performed using GREAT (4). The Mouse Genome Informatics category (MGI expression detected), which records tissue- and developmental-stage-specific expression in mouse, was employed as an ontological category. Group I shows functional enrichment of regions associated with early embryogenesis and pluripotency. The x-axis values (in logarithmic scale) correspond to the binomial raw p-values.
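
As a rough illustration of how a binomial raw p-value and its logarithmic score relate, the sketch below computes an upper-tail binomial probability and its -log10 value. It assumes the usual region-based reading of the test (n input regions, k of them falling in the regulatory domains of a term, and a genome fraction p covered by those domains). The function name and the toy numbers are hypothetical and are not taken from the GREAT output.

```python
import math


def binomial_upper_tail(n_regions, n_hits, annotated_fraction):
    """Return P(X >= n_hits) for X ~ Binomial(n_regions, annotated_fraction)."""
    q = 1.0 - annotated_fraction
    return sum(
        math.comb(n_regions, k) * annotated_fraction ** k * q ** (n_regions - k)
        for k in range(n_hits, n_regions + 1)
    )


# Toy input (hypothetical): 10 regions, 3 hits, 10% of the genome annotated.
p_value = binomial_upper_tail(10, 3, 0.1)
# A bar plotted on a logarithmic axis would have this length.
log_score = -math.log10(p_value)
print(p_value, log_score)
```

Smaller p-values give longer bars, so a term with p = 1E-60 extends far beyond one with p = 1E-10 on such an axis.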

Supplementary Figure 4. Nanog, Sox2, Oct4 and H3K56ac are enriched at p300 ChIP regions in the mouse genome.

(A-D) Average enrichment profiles of Nanog (A), Sox2 (B), Oct4 (C) and H3K56ac (D) relative to p300 peak centers were evaluated by CEAS-sitepro (2, 3). The x-axis represents the distance to the center of selected ChIP regions in mouse ESCs. The y-axis represents average ChIP-Seq signals.

Supplementary Methods

Cell culture

Mouse embryonic stem cell line E14Tg2a (CRL-1820) was obtained from ATCC and cultured in Knockout™ Dulbecco’s Modified Eagle’s Medium (Invitrogen, Cat # 10829-018) supplemented with 15% ES-qualified FBS (Omega Sci, Cat # FB-05, Lot # 104100), 2 mM GlutaMAX™-I Supplement (Invitrogen, Cat # 35050-061), 0.1 mM MEM Non-Essential Amino Acids Solution (Invitrogen, Cat # 11140-050), 55 nM 2-mercaptoethanol (Invitrogen, Cat # 21985-023) and 1,000 units/ml LIF (Millipore, ESG1107). Mouse ESCs were maintained at 37 °C and 5% carbon dioxide, fed with fresh medium daily, and passaged onto new plates at an average ratio of 1:5 after trypsin dissociation. Prior to conducting research with ESCs at UCLA, approval was granted by the UCLA Embryonic Stem Cell Research Oversight (ESCRO) Committee.

Plasmid construction and transfection

cDNA of H3.1 was amplified by RT-PCR from murine ES cell RNA, subcloned into the Topo XL vector (Invitrogen) with a Flag tag at the C-terminus, and sequence verified. Subsequently, the H3.1-Flag cassette was transferred into the pcDNA3 vector (Invitrogen) for expression in mouse ESCs. H3.1K56R-Flag, H3.1K56Q-Flag, H3.1K56A-Flag and H3.1K9A-Flag plasmids were constructed using the QuikChange Site-Directed Mutagenesis Kit (Stratagene) following the manufacturer’s instructions and confirmed by sequencing. Plasmids were transfected into E14Tg2a cells using Xfect™ transfection reagent (Clontech, Cat # 631320) in accordance with the manufacturer’s instructions. shGFP plasmids were obtained from Gerald Crabtree’s lab. shAsf1a plasmids were purchased from Open Biosystems. For prolonged depletion of Asf1a, ESCs were selected with puromycin from day 4 to day 9 after transfection.

ChIP-Sequencing

Chromatin immunoprecipitation (ChIP) was performed using about 2×10⁷ cells following a protocol described earlier (5). Briefly, E14Tg2a cells were cross-linked with formaldehyde for 10 minutes, lysed in 10 mM Tris-EDTA pH 8.0 with 1% SDS, and sonicated (Fisher Scientific #550 Sonic Dismembrator) on ice: 4 pulses of 15 seconds separated by 45-second pauses at power 4, followed by 2 pulses of 20 seconds separated by 40-second pauses at power 2. Clarified sheared chromatin was immunoprecipitated with antibodies to histone H3K56ac (6) overnight at 4°C.

Immunoprecipitated DNA was collected with protein A Dynabeads for 3 hours, washed twice for 5 minutes at 4°C with wash buffers and eluted with elution buffer containing 1% SDS. Eluates were heated at 65°C overnight to reverse crosslinks and treated with RNase and proteinase K, and DNA was purified with the Qiagen PCR purification kit. 10 ng of DNA was amplified using the ChIP-Seq DNA Sample Prep Kit (Illumina, P/N # 1003473) and sequenced with an Illumina Genome Analyzer HiSeq 2000 at the UCLA Broad Stem Cell Research Center high-throughput sequencing facility.

ChIP-Seq data analysis

Raw data for Nanog, Sox2, Oct4, CTCF, pPolII and Smad1 ChIP-Seq in mouse E14Tg2a cells were downloaded from NCBI GEO (GSE11431) (7). H3K56ac ChIP DNA was prepared in our lab and sequenced in the UCLA Broad Stem Cell Research Center High Throughput Sequencing core facility. The raw ChIP-Seq reads were mapped uniquely to the mouse genome NCBI Build 37 (UCSC, mm9) with Bowtie (8). We then employed the algorithm described by Ferrari et al. (9) to identify peaks where these reads accumulated significantly in the genome. The distribution of these peaks around the peak centers of other proteins was determined by sitepro (2, 3). Pearson’s correlation was calculated with Cistrome (10). Functional annotation of ChIP regions was obtained with GREAT (4), using the basal plus extension association rule and the mouse genome (mm9) as background.
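
The averaged profiles around peak centers can be pictured with the following minimal sketch. It is not the CEAS-sitepro implementation: it assumes a single chromosome, treats each mapped read as one point position, and uses a hypothetical function name, window and bin size chosen only for illustration.

```python
from bisect import bisect_left


def average_profile(tag_positions, centers, half_window=2000, bin_size=500):
    """Mean tag count per bin in a window around each center (one chromosome)."""
    if not centers:
        raise ValueError("at least one center is required")
    if (2 * half_window) % bin_size != 0:
        raise ValueError("window width must be a multiple of bin_size")
    tags = sorted(tag_positions)
    n_bins = (2 * half_window) // bin_size
    totals = [0] * n_bins
    for center in centers:
        start = center - half_window
        # Select tags inside the half-open window [start, center + half_window).
        lo = bisect_left(tags, start)
        hi = bisect_left(tags, center + half_window)
        for pos in tags[lo:hi]:
            totals[(pos - start) // bin_size] += 1
    # Normalize by the number of regions so clusters of different size compare.
    return [total / len(centers) for total in totals]


# Toy data (hypothetical): seven read positions and two peak centers.
profile = average_profile(
    [100, 950, 1000, 1040, 1900, 5000, 5010],
    [1000, 5000],
    half_window=1000,
    bin_size=500,
)
print(profile)
```

Dividing by the number of regions corresponds to the normalization described in the legend of Supplementary Figure 2.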

K-means clustering

K-means clustering was performed using GENE CLUSTER 3.0 (11, 12) with Euclidean distance as the similarity metric, and the results were visualized with Java Treeview (13).
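
For readers unfamiliar with the method, the sketch below shows k-means clustering with Euclidean distance in its plainest form (Lloyd's iteration). It is an illustration under simplifying assumptions, not the GENE CLUSTER 3.0 code: the starting centroids are supplied by the caller rather than chosen at random, and the toy rows are hypothetical signal vectors.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Assign each point to its nearest centroid and update until stable."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy rows (hypothetical): each row is one region, each column one signal.
rows = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(rows, [(0, 0), (10, 10)])
print(labels, centroids)
```

In practice the result depends on the starting centroids, so clustering tools typically repeat the procedure from several random starts.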

Co-immunoprecipitation (Co-IP) assay

The co-IP assay was performed as described in the Universal Magnetic Co-IP Kit (Active Motif). For each co-IP experiment, 1×10⁸ E14Tg2a cells were scraped and washed with PBS. Nuclear extracts were digested with Enzymatic Shearing Cocktail containing MNase, cleared by centrifugation and nutated overnight with 2 µg α-IgG (Millipore, Cat # 12-370), α-Nanog (Santa Cruz, Cat # sc-134218), α-Sox2 (Santa Cruz, Cat # sc-17320) or α-Oct4 (Santa Cruz, Cat # sc-8628) antibody. Protein G beads were added for 3 hrs before washing 5 times in co-IP washing buffer. Protein complexes were eluted for western blot assays. IgG (Millipore, Cat # 12-370) was used as a control in Fig. 1B.

In vivo peptide pull-down assay

H3K56 unmodified (47-65) and H3K56ac (47-65) biotinylated peptides were synthesized at the Proteomics Resource Center (Rockefeller University) and conjugated to streptavidin beads. H3K9 unmodified (1-21) and H3K9ac (1-21) biotinylated peptides were purchased from Millipore and conjugated to streptavidin beads. Peptide pull-down experiments were performed according to Wysocka’s protocol (14). E14Tg2a cell nuclear extracts were prepared using Buffer A (14) and incubated with peptide-conjugated beads for 5 hrs at 4°C. Beads were then washed 8 times in Buffer D/300 mM KCl/0.1% Triton X-100 and eluted for western blot assays with α-Oct4 (1:250, Santa Cruz, Cat # sc-8628) in Fig. 1C.

Flag-immunoprecipitation assay

Forty-eight hours after transfection, cells were trypsinized and washed with PBS supplemented with deacetylase inhibitors, phosphatase inhibitors (Active Motif Universal Magnetic Kit, Cat # 54002), and 1 mM PMSF. About 1×10⁸ cells were resuspended in IP Buffer (50 mM Tris pH 7.5, 0.5% NP40, 150 mM NaCl, protease inhibitor cocktail (Active Motif Universal Magnetic Kit, Cat # 54002), deacetylase inhibitors, phosphatase inhibitors, and 1 mM PMSF). Cell lysate was precleared with protein A beads and nutated with 40 µl of α-Flag M2 agarose beads (Sigma, Cat # A2220) overnight at 4°C. The eluted samples were separated by 15% SDS-PAGE and immunoblotted with mouse monoclonal α-Flag (1:10000, Sigma, Cat # F1804) and goat polyclonal α-Oct4 (1:250, Santa Cruz, Cat # sc-8628) in Fig. 1D.

In vitro peptide pull-down assay

Recombinant Oct4 was purchased from Origene (TP311998). 0.1 µg of Oct4 protein was mixed with bead-conjugated biotinylated peptides (H3K9 unmodified (1-21), H3K9ac (1-21), H3K56 unmodified (47-65), H3K56ac (47-65)) and incubated in Buffer G5 (50 mM Tris, 150 mM NaCl, 0.5% Triton-X, protease inhibitor cocktail, deacetylase inhibitors and phosphatase inhibitors, 1 mM PMSF) at 4°C for 5 hrs. Beads were washed 5 times in Buffer G5W (50 mM Tris, 500 mM NaCl, 0.5% Triton-X, protease inhibitor cocktail, deacetylase inhibitors and phosphatase inhibitors, 1 mM PMSF), and then eluted for western blot assays with α-Oct4 (1:250, Santa Cruz, Cat # sc-8628) in Fig. 1E.

Mononucleosome immunoprecipitation

Nucleosomes containing unmodified histone H3.1, H3.1K14ac or H3.1K56ac were prepared according to previous protocols (15, 16). Histone octamers were reconstituted by salt dialysis and fractionated on Superdex 200 (GE Healthcare) as described (17, 18). Mononucleosomes were assembled by salt dilution using histone octamers and biotin-TEG-20_601_0 DNA species made by PCR (16, 19). 2 µg of each of these nucleosomes was incubated with 100 ng recombinant Oct4 (Origene, TP311998) at 4°C in IP Buffer (50 mM Tris pH 7.5, 0.5% NP40, 200 mM NaCl, protease inhibitor cocktail, deacetylase inhibitors and phosphatase inhibitors, 1 mM PMSF). After 4 hours of incubation, 30 µl of streptavidin resin slurry (M-280 Dynabead, Invitrogen) was added to the mixture, which was incubated at room temperature for another 2 hours. To remove non-specifically adhered nucleosomes, the resin was washed 6 times with 1 ml IP Washing Buffer (50 mM Tris pH 7.5, 0.5% NP40, 300 mM NaCl, protease inhibitor cocktail, deacetylase inhibitors and phosphatase inhibitors, 1 mM PMSF). The retained protein was detected by western blotting with the indicated antibodies in Fig. 1F.

Western blot

The core histones were purified with the histone purification mini kit from Active Motif (Cat # 40026). Whole cell extracts were prepared using RIPA buffer (150 mM NaCl, 0.1% sodium deoxycholate, 0.1% SDS, 50 mM Tris, pH 7.4, 1 mM EDTA, phosphatase inhibitors, deacetylase inhibitors and 0.5 mM PMSF) supplemented with protease inhibitor cocktail. Equal amounts of protein were separated by 15% SDS-PAGE, transferred to PVDF membrane and immunoblotted with the indicated antibodies.

Quantitative PCR

Quantitative PCR was performed with SYBR qPCR Master Mix (Fermentas) on an ABI 7500. Fold enrichment was calculated using the protocol described previously (20). All primers are shown in Table S8 (20).
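
The cited protocol (20) is not reproduced in this supplement. Purely as an illustration, the sketch below uses the widely used 2^-ΔΔCt convention, in which the IP signal at a target region is normalized to input and then compared with a control region. This convention is an assumption here and may differ from the calculation in (20). The function name and Ct values are hypothetical.

```python
def fold_enrichment(ct_ip_target, ct_input_target, ct_ip_control, ct_input_control):
    """Fold enrichment by the 2^-ddCt convention (assumed, not from ref. 20)."""
    # Normalize each IP Ct to its matching input Ct.
    delta_target = ct_ip_target - ct_input_target
    delta_control = ct_ip_control - ct_input_control
    # Compare the target region with the control region.
    delta_delta = delta_target - delta_control
    return 2.0 ** (-delta_delta)


# Toy Ct values (hypothetical): the target amplifies 4 cycles earlier than
# the control after input normalization.
enrichment = fold_enrichment(24.0, 20.0, 28.0, 20.0)
print(enrichment)
```

The calculation assumes roughly 100% amplification efficiency, so that one cycle corresponds to a two-fold difference in template.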

Gel mobility shift assays

DNA template was amplified with oligonucleotide primers that contain the Oct4 recognition DNA sequence (ATTTGAAAGGCAAAT) (21) at the 3’ end of the 601 positioning sequence with a 16 bp spacer sequence. Nucleosomes were then assembled by salt dilution essentially as described above (see Mononucleosome immunoprecipitation). About 25 ng of wild-type or H3K56-acetylated nucleosomes was titrated with increasing concentrations of recombinant Oct4 protein (Origene, TP311998) and nutated at room temperature for 2 hours. Samples were then loaded onto 4.5% (w/v) polyacrylamide native gels and stained with ethidium bromide (EB) (Fig. 4B).

Supplementary Tables

Supplementary Table 1. Pearson’s correlation values between the ChIP-Seq datasets, calculated with Cistrome (10).

Pearson’s correlation   Nanog   Sox2    Oct4    H3K56ac  Smad3   CTCF    pPolII
Nanog                   1       0.05    0.05    -0.14    0.61    0.39    0.35
Sox2                    0.05    1       0.98    0.76     0.08    0.14    0.54
Oct4                    0.05    0.98    1       0.79     0.07    0.14    0.52
H3K56ac                 -0.14   0.76    0.79    1        -0.13   0.03    0.17
Smad3                   0.61    0.08    0.07    -0.13    1       0.45    0.37
CTCF                    0.39    0.14    0.14    0.03     0.45    1       0.29
pPolII                  0.35    0.54    0.52    0.17     0.37    0.29    1
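
A pairwise correlation matrix like the one above can be sketched as follows. This is a minimal illustration and not the Cistrome implementation: it assumes each dataset has already been summarized as a signal value per genomic bin, and the track names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(tracks):
    """Nested dict of pairwise correlations between named signal tracks."""
    names = list(tracks)
    return {a: {b: pearson(tracks[a], tracks[b]) for b in names} for a in names}


# Toy binned signals (hypothetical): one value per genomic bin.
toy_tracks = {
    "track_a": [1, 2, 3, 4],
    "track_b": [2, 4, 6, 8],
    "track_c": [4, 3, 2, 1],
}
matrix = correlation_matrix(toy_tracks)
print(matrix["track_a"]["track_b"], matrix["track_a"]["track_c"])
```

The matrix is symmetric with ones on the diagonal, which matches the layout of Supplementary Table 1.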

Supplementary Table 2. GREAT analysis to show the functional enrichment of H3K56ac/Oct4/Sox2/Nanog co-occupied Cluster 1 regions in Fig. 2B.

# GREAT version 2.0.1 Species assembly: mm9
# Ontology | Term Name | Binom Raw P-Value
GO Molecular Function | chromatin binding | 3.15E-18
GO Molecular Function | RNA polymerase II factor binding | 2.64E-07
GO Biological Process | negative regulation of protein metabolic process | 2.40E-19
GO Biological Process | negative regulation of cellular protein metabolic process | 9.51E-18
GO Biological Process | chromatin organization | 2.16E-17
GO Biological Process | stem cell development | 2.91E-17
GO Biological Process | stem cell differentiation | 5.70E-17
GO Biological Process | chromatin modification | 2.77E-16
GO Biological Process | blastocyst development | 3.19E-16
GO Biological Process | stem cell maintenance | 2.11E-15
GO Biological Process | covalent chromatin modification | 1.39E-13
GO Biological Process | negative regulation of calcium ion-dependent exocytosis | 2.13E-13
GO Biological Process | histone modification | 3.90E-13
GO Biological Process | blastocyst formation | 7.21E-13
GO Biological Process | negative regulation of intracellular protein kinase cascade | 9.29E-13
GO Biological Process | trophectodermal cell differentiation | 1.69E-11
GO Biological Process | regulation of nucleocytoplasmic transport | 7.61E-11
GO Biological Process | regulation of protein import into nucleus | 9.50E-11
GO Biological Process | negative regulation of protein kinase activity | 1.14E-09
GO Biological Process | negative regulation of protein serine/threonine kinase activity | 2.54E-09
GO Biological Process | positive regulation of intracellular transport | 2.59E-09
GO Biological Process | regulation of intracellular transport | 3.28E-09
GO Cellular Component | histone methyltransferase complex | 2.50E-14
GO Cellular Component | chromatin | 2.81E-13
GO Cellular Component | PcG protein complex | 4.01E-13
GO Cellular Component | ESC/E(Z) complex | 2.06E-08
GO Cellular Component | cell cortex | 4.67E-08
GO Cellular Component | heterochromatin | 1.59E-07
GO Cellular Component | PRC1 complex | 2.90E-07
GO Cellular Component | transcriptional repressor complex | 1.08312E-06
GO Cellular Component | oncostatin-M receptor complex | 4.58652E-06
GO Cellular Component | actomyosin | 3.97026E-05
GO Cellular Component | filopodium | 7.40988E-05
GO Cellular Component | aggresome | 0.000118575
Mouse Phenotype | complete prenatal lethality | 2.77E-22
Mouse Phenotype | complete embryonic lethality before somite formation | 1.10E-16
Mouse Phenotype | embryonic lethality before somite formation | 1.50E-16
Mouse Phenotype | abnormal cell proliferation | 1.00E-15
Mouse Phenotype | decreased granulocyte number | 3.36E-15
Mouse Phenotype | decreased cell proliferation | 6.86E-15
Mouse Phenotype | abnormal | 7.77E-15
Mouse Phenotype | kinked tail | 5.16E-14
Mouse Phenotype | exencephaly | 6.46E-13
Mouse Phenotype | abnormal notochord morphology | 1.32E-12
Mouse Phenotype | liver hypoplasia | 2.30E-12
Mouse Phenotype | abnormal placenta labyrinth morphology | 3.81E-12
Mouse Phenotype | abnormal megakaryocyte progenitor cell morphology | 6.55E-12
Mouse Phenotype | microcephaly | 7.95E-12
Mouse Phenotype | abnormal eosinophil morphology | 9.50E-12
Mouse Phenotype | abnormal trophoblast layer morphology | 1.98E-11
Mouse Phenotype | abnormal blastocyst morphology | 2.36E-11
Mouse Phenotype | abnormal eosinophil cell number | 3.79E-11
Mouse Phenotype | retina hypoplasia | 1.13E-10
Mouse Phenotype | abnormal placenta vasculature | 2.35E-10
Human Phenotype | Duodenal atresia | 2.90E-10
Human Phenotype | Abnormality of the duodenum | 6.36E-10
Human Phenotype | Holoprosencephaly | 1.31E-08
Human Phenotype | Incomplete penetrance | 3.94E-08
Human Phenotype | Ventricular septal defect | 5.24E-08
Human Phenotype | Abnormality of the ventricular septum | 5.34E-08
Human Phenotype | Triphalangeal thumb | 1.22E-07
Human Phenotype | Esophageal atresia | 3.13E-07
Human Phenotype | Abnormality of the nasal tip | 5.20E-07
Human Phenotype | Cleft lip | 8.82E-07
Human Phenotype | Growth failure | 1.43608E-06
Human Phenotype | Single median maxillary incisor | 1.55341E-06
Human Phenotype | Aplasia/Hypoplasia of the corpus callosum | 2.17596E-06
Human Phenotype | Abnormality of the cerebral white matter | 2.27208E-06
Human Phenotype | Agenesis of corpus callosum | 2.27388E-06
Human Phenotype | Genitourinary tract neoplasm | 2.36218E-06
Human Phenotype | Coloboma | 3.27039E-06
Human Phenotype | Vertebral fusion | 1.08228E-05
Human Phenotype | Abnormality of the corpus callosum | 1.16959E-05
Human Phenotype | Polydactyly | 1.39643E-05
Disease Ontology | endometrial carcinoma | 2.87E-10
Disease Ontology | ovary adenocarcinoma | 4.57E-10
Disease Ontology | pancreatic ductal adenocarcinoma | 2.98E-09
Disease Ontology | carcinosarcoma | 3.93E-09
Disease Ontology | endometrioid carcinoma | 4.13E-09
Disease Ontology | glomerulonephritis | 6.02E-09
Disease Ontology | embryonal carcinoma | 1.00E-08
Disease Ontology | intraepithelial neoplasm | 1.02E-08
Disease Ontology | seminoma | 1.62E-08
Disease Ontology | endometrial disease | 1.77E-08
Disease Ontology | malignant mixed cancer | 1.77E-08
Disease Ontology | endometrial neoplasm | 2.30E-08
Disease Ontology | gallbladder carcinoma | 2.69E-08
Disease Ontology | endometrioid ovary carcinoma | 2.94E-08
Disease Ontology | female reproductive endometrioid cancer | 5.05E-08
Disease Ontology | germinoma | 5.36E-08
Disease Ontology | uterine disease | 5.51E-08
Disease Ontology | purpura | 1.19E-07
Disease Ontology | bacterial meningitis | 1.84E-07
Disease Ontology | mucoepidermoid carcinoma | 1.90E-07
Pathway Commons | network | 3.12E-07
Pathway Commons | Signaling mediated by p38-gamma and p38-delta | 1.25877E-05
Pathway Commons | ATF-2 transcription factor network | 3.47786E-05
Pathway Commons | IL6-mediated signaling events | 0.000307418
Pathway Commons | PDGFR-alpha signaling pathway | 0.001329285
MSigDB Pathway | Thyroid cancer | 5.19E-10
MSigDB Pathway | Prostate cancer | 2.82E-07
MSigDB Pathway | Endometrial cancer | 4.76E-07
MSigDB Pathway | TGF beta signaling pathway | 0.000128851
MSigDB Pathway | Glioma | 0.000170135
MSigDB Pathway | WNT Signaling Pathway | 0.001451094
MGI Expression: Detected | TS4_compacted morula | 2.26E-60
MGI Expression: Detected | TS4_inner cell mass | 5.08E-59
MGI Expression: Detected | Theiler_stage_4 | 7.52E-59
MGI Expression: Detected | TS4_embryo | 8.58E-59
MGI Expression: Detected | TS4_second polar body | 4.63E-56
MGI Expression: Detected | TS4_zona pellucida | 4.63E-56
MGI Expression: Detected | Theiler_stage_5 | 2.88E-55
MGI Expression: Detected | TS5_inner cell mass | 2.50E-54
MGI Expression: Detected | TS4_extraembryonic component | 2.73E-54
MGI Expression: Detected | TS5_embryo | 5.26E-54
MGI Expression: Detected | TS3_zona pellucida | 4.87E-51
MGI Expression: Detected | Theiler_stage_3 | 1.87E-50
MGI Expression: Detected | TS3_second polar body | 3.11E-50
MGI Expression: Detected | TS3_4-8 cell stage | 4.31E-50
MGI Expression: Detected | TS3_4-cell stage | 7.89E-48
MGI Expression: Detected | TS3_8-cell stage | 2.37E-46
MGI Expression: Detected | TS5_trophectoderm | 1.73E-30
MGI Expression: Detected | TS5_extraembryonic component | 6.77E-27
MGI Expression: Detected | TS12_embryo; ectoderm | 5.53E-25
MGI Expression: Detected | TS19_primordial germ cells | 4.54E-23
MSigDB Perturbation | whose expression pattern in adult male germ cell tumors (GCT) correlates with POU5F1 [Gene ID=5460]. | 9.30E-28
MSigDB Perturbation | Set 'ES exp1': genes overexpressed in human embryonic stem cells according to 5 or more out of 20 profiling studies. | 1.26E-18
MSigDB Perturbation | Genes constituting the PluriNet protein-protein network shared by the pluripotent cells (embryonic stem cells, embryonical carcinomas and induced pluripotent cells). | 1.57E-18
MSigDB Perturbation | Set 'ES exp2': genes overexpressed in human embryonic stem cells according to a meta-analysis of 8 profiling studies. | 6.87E-17
MSigDB Perturbation | Supplementary Table 2. Genelist comparing microarray expression profiles of spermatogonial cells, haGSCs and hES (H1) cells. Examples of expression rates of different hES cell enriched and germ cell specific genes, surface markers for germ cell selection and signal transduction in all three cell types (spermatogonial cells = SC). | 4.94E-16
MSigDB Perturbation | Genes down-regulated in hES cells (human embryonic stem cells) after treatment with the ALK [Gene ID=238] inhibitor SB-431542 [PubChem=4521392]. | 6.14E-16
MSigDB Perturbation | Set 'NOS targets': genes upregulated and identified by ChIP on chip as targets of the transcription factors NANOG [Gene ID=79923], OCT4 [Gene ID=5460], and Sox2 [Gene ID=6657] (NOS) in human embryonic stem cells. | 7.32E-15
MSigDB Perturbation | Genes up-regulated in HDMEC cells (microvascular endothelium): proliferating vs quiescent cells. | 3.25E-12
MSigDB Perturbation | Genes down-regulated in HeLa cells (cervical carcinoma) 32 h after infection with adenovirus Ad12. | 2.58E-11
MSigDB Perturbation | Genes down-regulated in HCT8/S11 cells (colon cancer) engineered to stably express NTN1 [Gene ID=1630] off a plasmid vector. | 2.72E-11
MSigDB Perturbation | Cluster 2: late ATM [Gene ID=472] dependent genes induced by ionizing radiation treatment. | 5.17E-11
MSigDB Perturbation | Genes down-regulated in HeLa cells (cervical carcinoma) 24 h after infection with adenovirus Ad12. | 1.23E-10
MSigDB Perturbation | Genes from the black module which are up-regulated in HAEC cells (primary aortic endothelium) after exposure to the oxidized 1-palmitoyl-2-arachidonyl-sn-3-glycerophosphorylcholine (oxPAPC). | 1.52E-10
MSigDB Perturbation | Candidate genes in significant regions of chromosomal copy number gains in a panel of melanoma samples. | 1.94E-10
MSigDB Perturbation | Genes up-regulated in MCF7 cells (breast cancer) after stimulation with NRG1 [Gene ID=3084]. | 3.07E-10
MSigDB Perturbation | Genes down-regulated in HeLa cells (cervical carcinoma) 12 h after infection with adenovirus Ad12. | 4.74E-10
MSigDB Perturbation | Genes down-regulated in HeLa cells (cervical carcinoma) 48 h after infection with adenovirus Ad12. | 7.08E-10
MSigDB Perturbation | Genes down-regulated in MEF cells (embryonic fibroblasts) with TERT [Gene ID=7015] knockout, after expression of the gene off a retroviral vector. | 8.54E-10
MSigDB Perturbation | Genes predicting the embryonic carcinoma (EC) subtype of nonseminomatous male germ cell tumors (NSGCT). | 1.48E-09
MSigDB Perturbation | Genes changed in INA-6 cells (multiple myeloma, MM) by re-addition of IL6 [Gene ID=3569] after its initial withdrawal for 12h. | 4.29E-09
MSigDB Predicted Promoter Motifs | Motif SGCGSSAAA matches E2F1: E2F transcription factor 1; TFDP1: transcription factor Dp-1; RB1: retinoblastoma 1 (including osteosarcoma) | 8.74E-13
MSigDB Predicted Promoter Motifs | Motif SNNNCCNCAGGCN matches GTF3A: general transcription factor IIIA | 3.09E-10
MSigDB Predicted Promoter Motifs | Motif GCCAYGYGSN matches MYC: v-myc myelocytomatosis viral oncogene homolog (avian); MAX: MYC associated factor X | 1.02E-08
MSigDB Predicted Promoter Motifs | Motif CGGCCATCT (no known TF) | 3.72E-08
MSigDB Predicted Promoter Motifs | Motif AACYNNNNTTCCS (no known TF) | 2.85E-07
MSigDB Predicted Promoter Motifs | Motif TTCNRGNNNNTTC (no known TF) | 3.57E-07
MSigDB Predicted Promoter Motifs | Motif KCCGNSWTTT (no known TF) | 1.48323E-06
MSigDB Predicted Promoter Motifs | Motif WYAAANNRNNNGCG (no known TF) | 8.8699E-06
MSigDB Predicted Promoter Motifs | Motif GGCNNMSMYNTTG (no known TF) | 2.07958E-05
MSigDB Predicted Promoter Motifs | Motif GGCNKCCATNK (no known TF) | 6.4982E-05
MSigDB Predicted Promoter Motifs | Motif GCGSCMNTTT (no known TF) | 0.000362355
MSigDB Predicted Promoter Motifs | Motif TGCNHNCWYCCYCATTAKTNNDCNMNHYCN matches HOXA5: homeobox A5 | 0.000644763
MSigDB Predicted Promoter Motifs | Motif ATCMNTCCGY (no known TF) | 0.001718802
MSigDB miRNA Motifs | Targets of MicroRNA AGGGCAG,MIR-18A | 4.76059E-06
MSigDB miRNA Motifs | Targets of MicroRNA GCTGAGT,MIR-512-5P | 1.87598E-05
MSigDB miRNA Motifs | Targets of MicroRNA CCAGGGG,MIR-331 | 0.000113938
MSigDB miRNA Motifs | Targets of MicroRNA GCGCCTT,MIR-525,MIR-524 | 0.000221652
MSigDB miRNA Motifs | Targets of MicroRNA AGGAGTG,MIR-483 | 0.000272303
MSigDB miRNA Motifs | Targets of MicroRNA GTCAACC,MIR-380-5P | 0.000556865

Supplementary Table 3. GREAT gene ontology analysis to show the functional enrichment of H3K56ac/Oct4/Sox2 co-occupied Cluster 2 regions in Fig. 2C. # GREAT version 2.0.1 Species assembly: mm9 # Ontology Term Name Binom Raw P-Value

GO Biological Process microtubule cytoskeleton organization 1.01E-09 GO Biological Process translational elongation 3.83E-09 GO Biological Process lens morphogenesis in camera-type 8.02E-08 eye

20 GO Biological Process protein kinase B signaling cascade 1.03E-07 GO Biological Process epithelial cell development 3.35E-07 GO Biological Process mesoderm morphogenesis 3.39419E-05 GO Biological Process cell-cell signaling involved in cell fate 6.26251E-05 commitment GO Biological Process positive regulation of cholesterol 8.21307E-05 biosynthetic process GO Biological Process cellular response to indole-3-methanol 0.000170087 GO Biological Process metanephric nephron development 0.000200856 GO Biological Process negative regulation of intracellular 0.000264134 protein kinase cascade GO Biological Process olfactory nerve development 0.00031727 GO Biological Process cranial nerve development 0.00062236 GO Biological Process neuromuscular synaptic transmission 0.000756695 GO Biological Process positive regulation of axon extension 0.000764383 GO Biological Process metanephric nephron morphogenesis 0.000844075 GO Cellular Component cytoplasmic mRNA processing body 3.33E-07 GO Cellular Component , centromeric region 8.91599E-06 GO Cellular Component autophagic vacuole 1.71367E-05 GO Cellular Component euchromatin 0.000308055 GO Cellular Component basal lamina 0.00035226 GO Cellular Component laminin-10 complex 0.00067866 GO Cellular Component nuclear heterochromatin 0.003116938 Mouse Phenotype small orbits 9.12E-08 Mouse Phenotype camptodactyly 1.02995E-06 Mouse Phenotype pale yolk sac 2.27219E-06 Mouse Phenotype abnormal lens vesicle development 9.9396E-06 Mouse Phenotype abnormal parietal yolk sac 2.55553E-05 morphology Mouse Phenotype abnormal embryonic erythropoiesis 6.5802E-05 Mouse Phenotype aniridia 8.54791E-05 Mouse Phenotype abnormal lens induction 0.000107248 Mouse Phenotype renal hypoplasia 0.000125852 Mouse Phenotype absent oculomotor nerve 0.000152263 Mouse Phenotype absent Meckel's cartilage 0.000174241 Mouse Phenotype absent trophoblast giant cells 0.000177729 Mouse Phenotype increased monocyte cell number 0.0001926 Mouse Phenotype small lens 
0.000719417 Mouse Phenotype abnormal metacarpal bone 0.000753939 morphology Pathway Commons Wnt 4.94207E-05 Pathway Commons E-cadherin signaling in keratinocytes 0.00048478

21 Pathway Commons Hypoxic and oxygen homeostasis 0.000603825 regulation of HIF-1-alpha Pathway Commons HIF-1-alpha transcription factor 0.000718571 network Pathway Commons Alpha6 beta4 integrin-ligand 0.001370319 interactions Pathway Commons PLC-gamma1 signalling 0.00144396 Pathway Commons Nongenotropic Androgen signaling 0.003566946 MGI Expression: Detected TS5_embryo 4.18E-24 MGI Expression: Detected TS5_inner cell mass 2.13E-23 MGI Expression: Detected TS21_nasal cavity 1.22E-11 MGI Expression: Detected TS21_nasal cavity; epithelium 5.08E-11 MGI Expression: Detected TS28_olfactory epithelium 2.43E-10 MGI Expression: Detected TS28_nasal cavity epithelium 4.02E-10 MGI Expression: Detected TS12_embryo; ectoderm 2.09E-09 MGI Expression: Detected TS12_embryo; ectoderm; neural 3.94E-09 ectoderm MGI Expression: Detected TS28_trigeminal V ganglion 5.21E-09 MGI Expression: Detected TS17_latero-nasal process; 1.26E-08 mesenchyme MGI Expression: Detected TS14_unsegmented mesenchyme 1.91E-08 MGI Expression: Detected TS14_2nd arch 1.04E-07 MGI Expression: Detected TS23_urinary bladder neck serosa 2.20E-07 MGI Expression: Detected TS23_sublingual gland primordium 6.49E-07 MGI Expression: Detected TS19_tail; mesenchyme; paraxial 8.14E-07 mesenchyme MGI Expression: Detected TS13_branchial arch 8.85E-07 MGI Expression: Detected TS23_pelvic ganglion 2.27516E-06 MGI Expression: Detected TS15_2nd arch; mesenchyme 3.59445E-06 MGI Expression: Detected TS15_2nd arch; mesenchyme; 3.62994E-06 mesenchyme derived from neural crest MGI Expression: Detected TS23_lumbo-sacral plexus 4.1863E-06 MSigDB Perturbation Genes up-regulated in NHEK cells 8.08E-18 (normal epidermal keratinocytes) after UVB irradiation. MSigDB Perturbation Transcripts depleted in pseudopodia 6.66E-12 of NIH/3T3 cells (fibroblast) in response to the chemotactic migration stimulus by lysophosphatidic acid (LPA) [PubChem=3988]. 
MSigDB Perturbation Housekeeping genes identified as 1.41E-10 expressed across 19 normal tissues.

22 MSigDB Perturbation Genes within amplicon 20q12-q13 1.80E-10 identified in a copy number alterations study of 191 breast tumor samples. MSigDB Perturbation Genes up-regulated in MCF7 cells 1.01E-09 (breast cancer) after stimulation with NRG1 [Gene ID=3084]. MSigDB Perturbation Genes up-regulated in Caco-2 cells 2.15E-09 (intestinal epithelium) after coculture with the probiotic bacteria L. casei for 6h. MSigDB Perturbation Genes up-regulated in MCF7 cells 2.34E-09 (breast cancer) after stimulation with EGF [Gene ID=1950]. MSigDB Perturbation Genes down-regulated in LNCaP cells 8.29E-09 (prostate cancer) by overexpression of SOX4 [Gene ID=6659] and up- regulated by its RNAi knockdown. MSigDB Perturbation Top 50 genes up-regulated in A549 8.50E-09 cells (lung cancer) expressing STAT3 [Gene ID=6774] off an adenovirus vector. MSigDB Perturbation Genes changed in INA-6 cells 2.58E-08 (multiple myeloma, MM) by re- addition of IL6 [Gene ID=3569] after its initial withdrawal for 12h. MSigDB Perturbation Genes up-regulated in MDA-MB-231 4.09E-07 cells (breast cancer) after knockdown of PTHLH [Gene ID=5744] by RNAi. MSigDB Perturbation Genes down-regulated in reverted 7.28E-07 NIH3T3 cells (fibroblasts transformed by activated KRAS [Gene ID=3845] which then reverted to normal cells upon stable over-expression of a dominant negative form of CDC25 [Gene ID=5923]) vs normal fibroblasts. MSigDB Perturbation Genes whose expression peaked at 4.45869E-06 480 min after stimulation of HeLa cells with EGF [Gene ID=1950]. MSigDB Perturbation Genes up-regulated in NHEK cells 9.91518E-06 (normal keratinocytes) by UV-B irradiation.

MSigDB Perturbation | Genes up-regulated in the invasive ductal carcinoma (IDC) compared to the invasive lobular carcinoma (ILC), the two major pathological types of breast cancer. | 1.22361E-05
MSigDB Perturbation | Genes down-regulated in transformed NIH3T3 cells (fibroblasts transformed by activated KRAS [Gene ID=3845]) vs normal cells. | 1.59486E-05
MSigDB Perturbation | Gene changed by CD40 [Gene ID=958] signaling in Ramos cells (EBV negative Burkitt lymphoma). | 2.51141E-05
MSigDB Perturbation | Genes up-regulated by imatinib [PubChem ID=5291] during dendritic cell differentiation. | 3.01531E-05
MSigDB Perturbation | Genes associated with clinical prognosis of pediatric AML (acute myeloid leukemia): good prognosis=no relapse > 3 years; poor prognosis=relapse < 1 year or no response to therapy. | 4.47793E-05
MSigDB Perturbation | Genes down-regulated in fibroblasts expressing the XP/CS mutant form of ERCC3 [Gene ID=2071], after low dose UVC irradiation. | 4.64605E-05
MSigDB Predicted Promoter Motifs | Motif NNRYCACGTGRYNN (no known TF) | 2.06E-10
MSigDB Predicted Promoter Motifs | Motif NDDNNCACGTGNNNNN matches ARNT: aryl hydrocarbon receptor nuclear translocator | 2.96E-08
MSigDB Predicted Promoter Motifs | Motif NNNNNRTCACGTGAYNNNNN matches ARNT: aryl hydrocarbon receptor nuclear translocator | 6.55E-08
MSigDB Predicted Promoter Motifs | Motif WCTCNATGGY (no known TF) | 9.23E-08
MSigDB Predicted Promoter Motifs | Motif GCCAYGYGSN matches MYC: v-myc myelocytomatosis viral oncogene homolog (avian); MAX: MYC associated factor X | 1.59E-07
MSigDB Predicted Promoter Motifs | Motif GYCACGTGNC (no known TF) | 1.87E-07
MSigDB Predicted Promoter Motifs | Motif NNNNNNNCACGTGNNNNNNN matches MYC: v-myc myelocytomatosis viral oncogene homolog (avian); MAX: MYC associated factor X | 3.96E-07
MSigDB Predicted Promoter Motifs | Motif NNANCACGTGNTNN matches MAX: MYC associated factor X | 9.16E-07
MSigDB Predicted Promoter Motifs | Motif TTTSGCGC matches E2F1: E2F transcription factor 1; TFDP1: transcription factor Dp-1; RB1: retinoblastoma 1 (including osteosarcoma) | 1.12091E-06
MSigDB Predicted Promoter Motifs | Motif GGGGAGGG matches MAZ: MYC-associated zinc finger protein (purine-binding transcription factor) | 1.8279E-06
MSigDB Predicted Promoter Motifs | Motif NRCCACGTGASN (no known TF) | 1.91827E-06
MSigDB Predicted Promoter Motifs | Motif TTTSGCGS (no known TF) | 2.83808E-06
MSigDB Predicted Promoter Motifs | Motif CASGYG (no known TF) | 4.44193E-06
MSigDB Predicted Promoter Motifs | Motif CGCATGCGCR matches NRF1: nuclear respiratory factor 1 | 7.00957E-06
MSigDB Predicted Promoter Motifs | Motif NATCACGTGAY matches SREBF1: sterol regulatory element binding transcription factor 1 | 0.000010816
MSigDB Predicted Promoter Motifs | Motif GGGGCGGGGC matches SP1: Sp1 transcription factor | 2.62153E-05
MSigDB Predicted Promoter Motifs | Motif ATCMNTCCGY (no known TF) | 4.64323E-05
MSigDB Predicted Promoter Motifs | Motif SGCGSSAAA matches E2F1: E2F transcription factor 1; TFDP1: transcription factor Dp-1; RB1: retinoblastoma 1 (including osteosarcoma) | 6.78738E-05
MSigDB Predicted Promoter Motifs | Motif YRCCAKNNGNCGC (no known TF) | 0.000188636
MSigDB Predicted Promoter Motifs | Motif TWSGCGCGAAAAYKR (no known TF) | 0.000252024
MSigDB miRNA Motifs | Targets of MicroRNA CCAGGGG,MIR-331 | 1.6884E-05
MSigDB miRNA Motifs | Targets of MicroRNA AGCGCAG,MIR-191 | 7.20971E-05
MSigDB miRNA Motifs | Targets of MicroRNA AGCTCCT,MIR-28 | 0.00047898
MSigDB miRNA Motifs | Targets of MicroRNA GCTGAGT,MIR-512-5P | 0.001254237

Supplementary Table 4. GREAT gene ontology analysis to show the functional enrichment of H3K56ac/Oct4/Nanog co-occupied Cluster 2 regions in Fig. 2D.
# GREAT version 2.0.1 Species assembly: mm9
# Ontology | Term Name | Binom Raw P-Value

GO Biological Process | organelle transport along microtubule | 3.46605E-05
GO Biological Process | establishment of protein localization to organelle | 0.000296315
GO Biological Process | regulation of cell shape | 0.000324575
GO Cellular Component | perinuclear region of cytoplasm | 1.93121E-06
GO Cellular Component | ruffle membrane | 0.000953068
Mouse Phenotype | complete embryonic lethality | 2.67E-07
Mouse Phenotype | absent gametes | 1.83328E-05
Mouse Phenotype | abnormal renal glomerular capsule | 2.14847E-05
Mouse Phenotype | azoospermia | 4.10136E-05
Mouse Phenotype | absent germ cells | 7.01266E-05
Mouse Phenotype | increased pancreatic beta cell number | 8.82316E-05
Mouse Phenotype | enlarged ovary | 0.00012994
Mouse Phenotype | increased myocardial fiber number | 0.000202969
Disease Ontology | mucinous tumor | 3.83696E-06
Disease Ontology | cystic, mucinous, and serous neoplasm | 1.52386E-05
Disease Ontology | blastoma | 1.53869E-05
Disease Ontology | clear cell adenocarcinoma | 2.02323E-05
Disease Ontology | mucoepidermoid carcinoma | 2.87029E-05
Disease Ontology | uterine neoplasm | 3.23792E-05
Disease Ontology | ovary adenocarcinoma | 0.000101376
Disease Ontology | seminoma | 0.000227286
Disease Ontology | nephroblastoma | 0.000628488
Pathway Commons | Cell Cycle, Mitotic | 9.33E-07
Pathway Commons | Metabolism of RNA | 4.6295E-06
Pathway Commons | Metabolism of mRNA | 8.42562E-06
Pathway Commons | PI3K Cascade | 0.000263695
MGI Expression: Detected | TS4_inner cell mass | 1.94E-13
MGI Expression: Detected | TS3_4-cell stage | 3.62E-13
MGI Expression: Detected | TS4_second polar body | 4.76E-11
MGI Expression: Detected | TS4_zona pellucida | 4.76E-11
MGI Expression: Detected | TS4_compacted morula | 6.64E-11
MGI Expression: Detected | TS5_embryo | 1.23E-10
MGI Expression: Detected | TS25_smooth muscle | 2.39561E-06
MGI Expression: Detected | TS8_ectoplacental cone | 1.80323E-05
MGI Expression: Detected | TS20_midbrain; lateral wall | 2.79504E-05
MGI Expression: Detected | TS11_primitive streak | 3.46355E-05
MGI Expression: Detected | TS12_head mesenchyme; mesenchyme derived from neural crest | 4.70052E-05
MGI Expression: Detected | TS20_midbrain; lateral wall; ventricular layer | 6.90383E-05
MGI Expression: Detected | TS8_polar trophectoderm | 9.64114E-05
MGI Expression: Detected | TS18_diencephalon | 0.000141392
MGI Expression: Detected | TS24_oviduct | 0.000198077
MSigDB Perturbation | Cluster 1: genes up-regulated in B493-6 cells (B lymphocytes) upon serum stimulation but not affected by MYC [Gene ID=4609]. | 1.60E-07
MSigDB Perturbation | Transcripts depleted in pseudopodia of NIH/3T3 cells (fibroblast) in response to the chemotactic migration stimulus by lysophosphatidic acid (LPA) [PubChem=3988]. | 1.2142E-06

MSigDB Perturbation | Genes up-regulated in robust Cluster 2 (rC2) of hepatoblastoma samples compared to those in the robust Cluster 1 (rC1). | 1.29291E-06
MSigDB Perturbation | Genes down-regulated in the MM1S cells (multiple myeloma) after treatment with aplidin [PubChem=44152164], a marine-derived compound with potential anti-cancer properties. | 4.7874E-05
MSigDB Perturbation | Genes up-regulated in lesional skin biopsies from mycosis fungoides patients compared to the normal skin samples. | 0.000175863
MSigDB Perturbation | Genes up-regulated in endometrioid endometrial tumors from patients with lymph node metastases compared to those without the metastases. | 0.000383821
MSigDB Perturbation | Genes identified by method 2 as coordinately down-regulated late in HMEC cells (mammary epithelium) during acinar development in vitro. | 0.00043722
MSigDB Perturbation | Genes abnormally regulated in response to CD40L and IL4 [Gene ID=959, 3565] stimulation of B lymphocytes from patients with a hypomorphic mutation of IKBKG [Gene ID=8517]. | 0.000695664
MSigDB Perturbation | Down-regulated genes in Calu3 cells (non-small cell lung cancer, NSCLC) resistant to gemcitabine [PubChem=3461] compared to the parental line sensitive to the drug. | 0.001228845
MSigDB Perturbation | Up-regulated genes in Calu3 cells (non-small cell lung cancer, NSCLC) resistant to gemcitabine [PubChem=3461] in response to bexarotene [PubChem=82146]. | 0.001228845
MSigDB Predicted Promoter Motifs | Motif NNNNNRTCACGTGAYNNNNN matches ARNT: aryl hydrocarbon receptor nuclear translocator | 4.81859E-06
MSigDB Predicted Promoter Motifs | Motif GGCNKCCATNK (no known TF) | 6.94593E-06
MSigDB Predicted Promoter Motifs | Motif NNRYCACGTGRYNN (no known TF) | 8.6398E-06
MSigDB Predicted Promoter Motifs | Motif GCCATNTTG matches YY1: YY1 transcription factor | 9.81787E-05
MSigDB Predicted Promoter Motifs | Motif TTTSGCGC matches E2F1: E2F transcription factor 1; TFDP1: transcription factor Dp-1; RB1: retinoblastoma 1 (including osteosarcoma) | 0.000168443
MSigDB Predicted Promoter Motifs | Motif NNNNNNNCACGTGNNNNNNN matches MYC: v-myc myelocytomatosis viral oncogene homolog (avian); MAX: MYC associated factor X | 0.000320323
MSigDB Predicted Promoter Motifs | Motif NNNRRCCAATSRGNNN (no known TF) | 0.00034873
MSigDB Predicted Promoter Motifs | Motif NDDNNCACGTGNNNNN matches ARNT: aryl hydrocarbon receptor nuclear translocator | 0.000506458
MSigDB Predicted Promoter Motifs | Motif WCAANNNYCAG (no known TF) | 0.000594534
MSigDB Predicted Promoter Motifs | Motif GGCNNMSMYNTTG (no known TF) | 0.00064261
MSigDB Predicted Promoter Motifs | Motif NATCACGTGAY matches SREBF1: sterol regulatory element binding transcription factor 1 | 0.000821104
MSigDB Predicted Promoter Motifs | Motif TTTSGCGS (no known TF) | 0.00128181
MSigDB Predicted Promoter Motifs | Motif GGGGAGGG matches MAZ: MYC-associated zinc finger protein (purine-binding transcription factor) | 0.00204494
MSigDB miRNA Motifs | Targets of MicroRNA CTCAGGG,MIR-125B,MIR-125A | 3.68829E-05
InterPro | Rho GTP exchange factor | 1.0831E-06

Supplementary Table 5. GREAT gene ontology analysis to show the functional enrichment of Group I regions in Supplementary Fig. 3A.
# GREAT version 2.0.2 Species assembly: mm9
# Ontology | Term Name | Binom Raw P-Value

GO Molecular Function | chromatin binding | 1.01E-17
GO Molecular Function | transcription regulatory region DNA binding | 1.17E-12
GO Molecular Function | regulatory region DNA binding | 2.01E-12
GO Molecular Function | sequence-specific DNA binding | 3.06E-09
GO Biological Process | stem cell differentiation | 9.65E-24
GO Biological Process | stem cell maintenance | 2.08E-23
GO Biological Process | stem cell development | 4.78E-22
GO Biological Process | regulation of transcription from RNA polymerase II promoter | 8.83E-16
GO Biological Process | positive regulation of cellular metabolic process | 1.57E-15
GO Biological Process | blastocyst development | 4.03E-15
GO Biological Process | embryo development ending in birth or egg hatching | 1.66E-14
GO Biological Process | chordate embryonic development | 1.70E-14
GO Biological Process | negative regulation of cellular metabolic process | 2.08E-14
GO Biological Process | negative regulation of cellular biosynthetic process | 1.16E-13
GO Biological Process | blastocyst formation | 1.18E-13
GO Biological Process | embryo development | 1.33E-13
GO Biological Process | negative regulation of biosynthetic process | 2.39E-13
GO Biological Process | positive regulation of transcription, DNA-dependent | 2.87E-13
GO Biological Process | positive regulation of macromolecule biosynthetic process | 2.91E-13
GO Biological Process | negative regulation of metabolic process | 3.12E-13
GO Biological Process | positive regulation of gene expression | 3.26E-13
GO Biological Process | positive regulation of RNA metabolic process | 8.96E-13
GO Biological Process | positive regulation of nitrogen compound metabolic process | 1.02E-12
GO Biological Process | positive regulation of cellular biosynthetic process | 1.28E-12

GO Cellular Component | nucleoplasm | 3.57E-16
GO Cellular Component | female pronucleus | 7.02E-09
GO Cellular Component | transcription factor complex | 6.55E-08
Mouse Phenotype | prenatal growth retardation | 3.97E-23
Mouse Phenotype | embryonic lethality | 8.44E-22
Mouse Phenotype | abnormal embryogenesis/development | 1.83E-21
Mouse Phenotype | abnormal prenatal growth/weight/body size | 2.81E-19
Mouse Phenotype | abnormal extraembryonic tissue morphology | 6.35E-19
Mouse Phenotype | abnormal embryonic tissue morphology | 4.76E-17
Mouse Phenotype | complete embryonic lethality before somite formation | 1.00E-16
Mouse Phenotype | embryonic growth retardation | 4.31E-16
Mouse Phenotype | complete prenatal lethality | 5.23E-16
Mouse Phenotype | abnormal embryonic growth/weight/body size | 8.28E-16
Mouse Phenotype | embryonic lethality before somite formation | 1.35E-15
Mouse Phenotype | embryonic lethality between implantation and placentation | 3.62E-15
Mouse Phenotype | abnormal blastocyst morphology | 2.80E-13
Mouse Phenotype | abnormal triploblastic development | 4.15E-13
Mouse Phenotype | abnormal placenta morphology | 9.44E-13
Mouse Phenotype | partial embryonic lethality during organogenesis | 1.01E-12
Mouse Phenotype | abnormal intestinal epithelium morphology | 2.95E-12
Mouse Phenotype | embryonic lethality during organogenesis | 3.94E-12
Mouse Phenotype | abnormal prenatal body size | 4.09E-12
Mouse Phenotype | abnormal trophoblast layer morphology | 4.83E-12
Disease Ontology | seminoma | 8.48E-14
Disease Ontology | germinoma | 1.58E-12
Disease Ontology | squamous cell neoplasm | 6.65E-11
Disease Ontology | squamous cell carcinoma | 2.11E-10
Disease Ontology | epithelioma | 2.30E-10
Disease Ontology | carcinosarcoma | 1.80E-09
Disease Ontology | malignant mixed cancer | 4.61E-09
Disease Ontology | pancreatic ductal adenocarcinoma | 4.36E-07
Disease Ontology | endometrial carcinoma | 6.11996E-06
Disease Ontology | chronic leukemia | 9.07627E-05
Disease Ontology | invasive carcinoma | 0.000446338
Disease Ontology | colon adenocarcinoma | 0.001032948
Disease Ontology | large Intestine adenocarcinoma | 0.001192973
Disease Ontology | chronic lymphocytic leukemia | 0.004017054
Pathway Commons | C-MYC pathway | 7.30E-08
Pathway Commons | Canonical Wnt signaling pathway | 1.33E-07
Pathway Commons | Noncanonical Wnt signaling pathway | 2.11042E-06
Pathway Commons | Regulation of nuclear beta catenin signaling and target gene transcription | 2.17567E-06
Pathway Commons | Validated targets of C-MYC transcriptional activation | 3.6374E-06
Pathway Commons | Stabilization and expansion of the E-cadherin adherens junction | 3.91776E-06
Pathway Commons | E-cadherin signaling in the nascent adherens junction | 3.91776E-06
Pathway Commons | E-cadherin signaling events | 4.76015E-06
Pathway Commons | Wnt signaling network | 1.12911E-05
Pathway Commons | E2F transcription factor network | 1.47906E-05
Pathway Commons | Posttranslational regulation of adherens junction stability and disassembly | 1.99375E-05
Pathway Commons | Glypican 3 network | 2.40177E-05
Pathway Commons | Syndecan-4-mediated signaling events | 2.9713E-05
Pathway Commons | N-cadherin signaling events | 0.000142339
Pathway Commons | Regulation of RAC1 activity | 0.001457759
Pathway Commons | RAC1 signaling pathway | 0.001457759
Pathway Commons | RhoA signaling pathway | 0.001457759
Pathway Commons | Regulation of RhoA activity | 0.001457759
MGI Expression: Detected | TS4_compacted morula | 9.77E-53
MGI Expression: Detected | TS4_inner cell mass | 1.02E-50
MGI Expression: Detected | TS3_4-8 cell stage | 6.43E-50
MGI Expression: Detected | TS4_second polar body | 8.44E-50
MGI Expression: Detected | TS4_zona pellucida | 8.44E-50
MGI Expression: Detected | TS4_embryo | 2.67E-49
MGI Expression: Detected | TS3_4-cell stage | 3.44E-49
MGI Expression: Detected | Theiler_stage_4 | 8.75E-49
MGI Expression: Detected | Theiler_stage_3 | 2.55E-48
MGI Expression: Detected | TS5_inner cell mass | 5.57E-48
MGI Expression: Detected | TS3_zona pellucida | 1.33E-47
MGI Expression: Detected | TS5_embryo | 1.67E-47
MGI Expression: Detected | TS3_second polar body | 3.27E-47
MGI Expression: Detected | TS3_8-cell stage | 1.75E-45
MGI Expression: Detected | TS4_extraembryonic component | 3.34E-45
MGI Expression: Detected | Theiler_stage_5 | 6.80E-44
MGI Expression: Detected | Theiler_stage_2 | 4.67E-32
MGI Expression: Detected | TS5_trophectoderm | 1.02E-28
MGI Expression: Detected | TS28_oocyte | 7.41E-28
MGI Expression: Detected | TS28_female germ cell | 2.08E-27
MSigDB Perturbation | Genes whose expression pattern in adult male germ cell tumors (GCT) correlates with POU5F1 [Gene ID=5460]. | 1.61E-25
MSigDB Perturbation | Supplementary Table 2. Genelist comparing microarray expression profiles of spermatogonial cells, haGSCs and hES (H1) cells. Examples of expression rates of different hES cell enriched and germ cell specific genes, surface markers for germ cell selection and signal transduction in all three cell types (spermatogonial cells = SC). | 1.20E-21

MSigDB Perturbation | Genes down-regulated in hES cells (human embryonic stem cells) after treatment with the ALK [Gene ID=238] inhibitor SB-431542 [PubChem=4521392]. | 4.31E-18
MSigDB Perturbation | Set 'ES exp1': genes overexpressed in human embryonic stem cells according to 5 or more out of 20 profiling studies. | 4.91E-18
MSigDB Perturbation | Genes constituting the PluriNet protein-protein network shared by the pluripotent cells (embryonic stem cells, embryonical carcinomas and induced pluripotent cells). | 7.62E-18
MSigDB Perturbation | The 'core ESC-like gene module': genes coordinately up-regulated in a compendium of mouse embryonic stem cells (ESC) which are shared with the human ESC-like module. | 1.02E-14
MSigDB Perturbation | Set 'NOS targets': genes upregulated and identified by ChIP on chip as targets of the transcription factors NANOG [Gene ID=79923], OCT4 [Gene ID=5460], and Sox2 [Gene ID=6657] (NOS) in human embryonic stem cells. | 6.78E-13
MSigDB Perturbation | Genes predicting the embryonic carcinoma (EC) subtype of nonseminomatous male germ cell tumors (NSGCT). | 1.81E-12
MSigDB Perturbation | Set 'Nanog targets': genes upregulated and identified by ChIP on chip as Nanog [Gene ID=79923] transcription factor targets in human embryonic stem cells. | 1.88E-12
MSigDB Perturbation | Genes up-regulated in T24 (bladder cancer) cells in response to the photodynamic therapy (PDT) stress. | 4.31E-12
MSigDB Perturbation | Set 'Oct4 targets': genes upregulated and identified by ChIP on chip as OCT4 [Gene ID=5460] transcription factor targets in human embryonic stem cells. | 4.82E-12
MSigDB Perturbation | Up-regulated genes distinguishing between breast cancer tumors with mutated BRCA1 [Gene ID=672] from those with mutated BRCA2 [Gene ID=675]. | 6.64E-11
MSigDB Perturbation | Genes down-regulated in HL-60 cells (acute promyelocytic leukemia, APL) after treatment with the aminopeptidase inhibitor tosedostat (CHR-2797) [PubChem=15547703] for 6 h. | 3.19E-10
MSigDB Perturbation | Genes down-regulated in hES cells (human embryonic stem cells) after treatment with the ALK [Gene ID=238] inhibitor SB-431542 [PubChem=16079008]. | 4.27E-10
MSigDB Perturbation | Genes from the black module which are up-regulated in HAEC cells (primary aortic endothelium) after exposure to the oxidized 1-palmitoyl-2-arachidonyl-sn-3-glycerophosphorylcholine (oxPAPC). | 8.93E-10
MSigDB Perturbation | Genes changed in INA-6 cells (multiple myeloma, MM) by re-addition of IL6 [Gene ID=3569] after its initial withdrawal for 12h. | 1.39E-09
MSigDB Perturbation | Set 'ES exp2': genes overexpressed in human embryonic stem cells according to a meta-analysis of 8 profiling studies. | 1.44E-09
MSigDB Perturbation | Set 'Sox2 targets': genes upregulated and identified by ChIP on chip as SOX2 [Gene ID=6657] transcription factor targets in human embryonic stem cells. | 2.96E-09
MSigDB Perturbation | Candidate genes in significant regions of chromosomal copy number gains in a panel of melanoma samples. | 6.32E-09
MSigDB Perturbation | Genes up-regulated in UB27 cells (osteosarcoma) at 12 hr after inducing the expression of a mutated form of WT1 [Gene ID=7490]. | 8.21E-09
MSigDB Predicted Promoter Motifs | Motif SNNNCCNCAGGCN matches GTF3A: general transcription factor IIIA | 1.41E-09
MSigDB Predicted Promoter Motifs | Motif NNNCGGCCATCTTGNCTSNW matches YY1: YY1 transcription factor | 2.86E-07
MSigDB Predicted Promoter Motifs | Motif NNNRRCCAATSRGNNN (no known TF) | 3.43E-07
MSigDB Predicted Promoter Motifs | Motif GCTNWTTGK (no known TF) | 3.87E-07
MSigDB Predicted Promoter Motifs | Motif GSCCSCRGGCNRNRNN matches GTF3A: general transcription factor IIIA | 5.36E-07
MSigDB Predicted Promoter Motifs | Motif GCCATNTTN matches YY1: YY1 transcription factor | 1.4673E-06
MSigDB Predicted Promoter Motifs | Motif CAGCCAATGAG matches PCBP1: poly(rC) binding protein 1 | 1.5712E-06
MSigDB Predicted Promoter Motifs | Motif NKTSSCGC matches E2F1: E2F transcription factor 1 | 1.74866E-06
MSigDB Predicted Promoter Motifs | Motif GCCAYGYGSN matches MYC: v-myc myelocytomatosis viral oncogene homolog (avian); MAX: MYC associated factor X | 2.06892E-06
MSigDB Predicted Promoter Motifs | Motif GCCATNTTG matches YY1: YY1 transcription factor | 4.89446E-06
MSigDB Predicted Promoter Motifs | Motif TTTSGCGS matches E2F1: E2F transcription factor 1 | 5.62814E-06

MSigDB Predicted Promoter Motifs | Motif SWWCAAAGGG matches LEF1: lymphoid enhancer-binding factor 1; TCF1: transcription factor 1, hepatic; LF-B1, hepatic nuclear factor (HNF1), albumin proximal factor | 7.29348E-06
MSigDB Predicted Promoter Motifs | Motif ANNGACGCTNN matches FOXN1: forkhead box N1 | 2.14043E-05
MSigDB Predicted Promoter Motifs | Motif CANCCNNWGGGTGDGG (no known TF) | 3.75655E-05
MSigDB Predicted Promoter Motifs | Motif NTGGNNNNNNGCCAANN matches NF1: neurofibromin 1 (neurofibromatosis, von Recklinghausen disease, Watson disease) | 3.85028E-05
MSigDB Predicted Promoter Motifs | Motif NMGATANSG matches LMO2: LIM domain only 2 (rhombotin-like 1) | 5.49311E-05
MSigDB Predicted Promoter Motifs | Motif NGNVGTCANGCGTGNNSNNYN matches PAX4: paired box gene 4 | 6.1104E-05
MSigDB Predicted Promoter Motifs | Motif CATTGTYY matches SOX9: SRY (sex determining region Y)-box 9 (campomelic dysplasia, autosomal sex-reversal) | 8.28664E-05
MSigDB Predicted Promoter Motifs | Motif NNNNNGATANKGNN matches GATA1: GATA binding protein 1 (globin transcription factor 1) | 9.19915E-05
MSigDB Predicted Promoter Motifs | Motif NNNNAACAATRGNN matches SOX9: SRY (sex determining region Y)-box 9 (campomelic dysplasia, autosomal sex-reversal) | 0.000102179
MSigDB miRNA Motifs | Targets of MicroRNA AGGGCAG,MIR-18A | 0.000141249
MSigDB miRNA Motifs | Targets of MicroRNA TTTTGAG,MIR-373 | 0.000573999
MSigDB miRNA Motifs | Targets of MicroRNA CACGTTT,MIR-302A | 0.001428911

Supplementary Table 6. GREAT gene ontology analysis to show the functional enrichment of Group II regions in Fig. S3B.
# GREAT version 2.0.2 Species assembly: mm9
# Ontology | Term Name | Binom Raw P-Value

GO Biological Process | negative regulation of protein metabolic process | 2.23E-07
GO Biological Process | negative regulation of cellular protein metabolic process | 1.8475E-06
GO Biological Process | chromatin organization | 2.7537E-06
GO Biological Process | chromatin modification | 5.0712E-06
GO Biological Process | regulation of cell cycle | 8.62937E-06
GO Biological Process | peptidyl-tyrosine phosphorylation | 2.18057E-05
GO Biological Process | regulation of histone H3-K36 methylation | 0.000168946
GO Biological Process | histone modification | 0.000179224
GO Biological Process | regulation of protein serine/threonine kinase activity | 0.000190181
GO Biological Process | peptidyl-amino acid modification | 0.000201796
GO Biological Process | covalent chromatin modification | 0.0002133
GO Biological Process | regulation of transcription elongation from RNA polymerase II promoter | 0.000223624
GO Biological Process | G1/S transition of mitotic cell cycle | 0.000223858
GO Biological Process | negative regulation of catabolic process | 0.000225109
GO Biological Process | regulation of cell shape | 0.000242775
GO Biological Process | pathway-restricted SMAD protein phosphorylation | 0.000359368
GO Biological Process | negative regulation of protein catabolic process | 0.000390184
GO Biological Process | developmental growth | 0.00058609
GO Biological Process | leukemia inhibitory factor signaling pathway | 0.000754627
GO Biological Process | negative regulation of cellular component organization | 0.000771029
GO Cellular Component | PcG protein complex | 1.00237E-06
GO Cellular Component | chromatin | 1.34345E-05
GO Cellular Component | chromatin remodeling complex | 3.95675E-05
GO Cellular Component | nuclear chromatin | 0.00020864
GO Cellular Component | histone deacetylase complex | 0.001743676
Mouse Phenotype | myocarditis | 1.92E-10
Mouse Phenotype | abnormal notochord morphology | 4.92E-08
Mouse Phenotype | heart inflammation | 1.15E-07
Mouse Phenotype | abnormal hypersensitivity reaction | 6.03E-07
Mouse Phenotype | enlarged spleen | 1.52362E-06
Mouse Phenotype | increased susceptibility to type I hypersensitivity reaction | 1.5433E-06
Mouse Phenotype | abnormal neural tube closure | 9.34229E-06
Mouse Phenotype | abnormal maxilla morphology | 1.68632E-05
Mouse Phenotype | abnormal incisor morphology | 1.84564E-05
Mouse Phenotype | embryonic growth arrest | 1.94401E-05
Mouse Phenotype | increased autoantibody level | 2.15782E-05
Mouse Phenotype | abnormal T cell clonal deletion | 2.33173E-05
Mouse Phenotype | abnormal somite development | 3.6334E-05
Mouse Phenotype | anemia | 5.09885E-05
Mouse Phenotype | abnormal rostral-caudal axis patterning | 6.23074E-05
Mouse Phenotype | abnormal granulocyte morphology | 6.53716E-05
Mouse Phenotype | increased T cell number | 6.73978E-05
Mouse Phenotype | abnormal immune tolerance | 8.48964E-05
Mouse Phenotype | impaired neutrophil recruitment | 0.000110919
Mouse Phenotype | autoimmune response | 0.000115184
Disease Ontology | mediastinal neoplasm | 1.11804E-05
Disease Ontology | mycosis fungoides | 2.08301E-05
Disease Ontology | cerebrovascular disease | 2.78837E-05
Disease Ontology | tuberculous meningitis | 6.34775E-05
Disease Ontology | lichen planus | 6.98045E-05
Disease Ontology | epithelial ovarian cancer | 8.80655E-05
Disease Ontology | systemic inflammatory response syndrome | 0.000436503
Disease Ontology | T-cell leukemia | 0.001250942
Disease Ontology | multiple sclerosis | 0.00201673
Disease Ontology | demyelinating disease of central nervous system | 0.002113812
Pathway Commons | CXCR4-mediated signaling events | 0.000127833
Pathway Commons | TCR signaling in naïve CD4+ T cells | 0.000822541
Pathway Commons | Cell-Cell communication | 0.002029853

MGI Expression: Detected | TS4_compacted morula | 4.34E-07
MGI Expression: Detected | TS26_arm | 3.23443E-05
MGI Expression: Detected | TS13_head mesenchyme | 4.31917E-05
MGI Expression: Detected | TS22_neural retinal epithelium | 7.05649E-05
MGI Expression: Detected | TS13_future spinal cord; neural plate; neural fold | 7.24577E-05
MGI Expression: Detected | TS12_future spinal cord; neural tube | 7.31329E-05
MGI Expression: Detected | TS13_branchial arch | 0.000154952
MGI Expression: Detected | TS22_somite | 0.000181874
MGI Expression: Detected | TS20_lower leg | 0.000183865
MGI Expression: Detected | TS13_head mesenchyme; mesenchyme derived from neural crest | 0.000252777
MGI Expression: Detected | TS20_handplate; interdigital region; mesenchyme | 0.00027301
MGI Expression: Detected | TS24_hair | 0.000273519
MGI Expression: Detected | TS14_1st arch; mesenchyme; mesenchyme derived from neural crest | 0.000290923
MGI Expression: Detected | TS14_2nd arch; mesenchyme; mesenchyme derived from neural crest | 0.000290923
MGI Expression: Detected | TS23_metanephros; drainage component; pelvis | 0.000303126
MGI Expression: Detected | TS13_2nd arch | 0.000365732
MGI Expression: Detected | TS12_future spinal cord | 0.000464744
MGI Expression: Detected | TS15_tail | 0.000526716
MSigDB Perturbation | Genes down-regulated in HCT8/S11 cells (colon cancer) engineered to stably express NTN1 [Gene ID=1630] off a plasmid vector. | 2.48446E-05
MSigDB Perturbation | Genes up-regulated in luminal-like breast cancer cell lines compared to the basal-like ones. | 7.27769E-05
MSigDB Perturbation | Genes down-regulated in BJAB cell line (B lymphocyte) after expression of the viral microRNA miR-K12-11 which functions as an ortholog of cellular MIR155 [Gene ID=406947]. | 0.000367354
MSigDB Perturbation | Cluster 1: genes with similar expression profiles across follicular thyroid carcinoma (FTC) samples. | 0.000460581
MSigDB miRNA Motifs | Targets of MicroRNA CTCAGGG,MIR-125B,MIR-125A | 0.000193299
MSigDB miRNA Motifs | Targets of MicroRNA GCTGAGT,MIR-512-5P | 0.000266343
MSigDB miRNA Motifs | Targets of MicroRNA GTGTTGA,MIR-505 | 0.000365212

Supplementary Table 7. GREAT gene ontology analysis to show the functional enrichment of Group III regions in Fig. S3B.
# GREAT version 2.0.2 Species assembly: mm9
# Ontology | Term Name | Binom Raw P-Value

GO Biological Process | regulation of metanephric nephron tubule epithelial cell differentiation | 0.000140086
GO Biological Process | negative regulation of cell cycle process | 0.000162433
GO Biological Process | regulation of peptidyl-tyrosine phosphorylation | 0.000277011
GO Cellular Component | nuclear envelope | 3.8059E-05
GO Cellular Component | synaptic vesicle | 0.000130735
Mouse Phenotype | abnormal cell cycle | 3.52E-08
Mouse Phenotype | abnormal nucleus morphology | 5.36E-07
Mouse Phenotype | abnormal chromosome number | 2.72942E-05
Mouse Phenotype | female infertility | 2.89004E-05
Mouse Phenotype | pale liver | 5.22159E-05
Mouse Phenotype | cervical vertebral fusion | 6.08592E-05
Mouse Phenotype | abnormal white fat cell morphology | 6.29221E-05
Mouse Phenotype | abnormal cell content or morphology | 6.78518E-05
Mouse Phenotype | abnormal vagus nerve morphology | 8.41423E-05
Mouse Phenotype | uterus hypoplasia | 0.000106512
Mouse Phenotype | abnormal alveolar process | 0.000154283
Disease Ontology | familial hyperlipidemia | 2.1528E-06
MGI Expression: Detected | TS22_middle ear | 1.4403E-05
MGI Expression: Detected | TS19_reproductive system | 5.15583E-05
MGI Expression: Detected | TS23_forearm; mesenchyme; rest of mesenchyme | 0.000171988
MGI Expression: Detected | TS19_tail | 0.000422279
MGI Expression: Detected | TS21_rest of nephric duct of female | 0.00043017
MSigDB Perturbation | Genes up-regulated in MCF7 cells (breast cancer) engineered to conditionally express LMO4 [Gene ID=8543] by a Tet Off system. | 5.41852E-05
MSigDB miRNA Motifs | Targets of MicroRNA ATCATGA,MIR-433 | 1.92279E-05
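
For readers unfamiliar with the "Binom Raw P-Value" column reported in the GREAT tables, the following is a minimal sketch of how such a value can be computed, assuming the region-based binomial test described for GREAT (4): if the regulatory domains of genes annotated with a term cover a fraction p of the genome, and k of the n input regions fall inside those domains, the raw P value is the upper tail P(X >= k) of a Binomial(n, p) distribution. The function name `binom_upper_tail` and the numbers in the usage line are hypothetical illustrations, not values from this study, and the sketch is not the GREAT implementation itself.

```python
import math


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    Terms are accumulated in log space so that small tail
    probabilities do not underflow for large n.
    """
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p = math.log(p)
    log_q = math.log1p(-p)
    # log of C(n, i) * p^i * (1 - p)^(n - i) for i = k..n
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * log_p + (n - i) * log_q
        for i in range(k, n + 1)
    ]
    # log-sum-exp: factor out the largest term before summing
    m = max(log_terms)
    return math.exp(m) * sum(math.exp(t - m) for t in log_terms)


if __name__ == "__main__":
    # Hypothetical example: 40 of 1000 regions hit a term whose
    # regulatory domains cover 1% of the genome (assumed numbers).
    print(binom_upper_tail(40, 1000, 0.01))
```

These are raw P values; a sketch like this applies no multiple-testing correction across ontology terms.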

Supplementary Table 8. List of primers used for RT-qPCR analysis in this paper.

Gapdh-L ACCACAGTCCATGCCATCAC

Gapdh-R TCCACCACCCTGTTGCTGTA

Nanog-L CAGAAAAACCAGTGGTTGAAGACTAG

Nanog-R GCAATGGATGCTGGGATACTC

Sox2-L CACAACTCGGAGATCAGCAA

Sox2-R CTCCGGGAAGCGTGTACTTA

Oct4-L AGTCTGGAGACCATGTTTCTGAAGT

Oct4-R TACTCTTCTCGTTGGGAATACTCAATA

Sox17-L GCTAAGCAAGATGCTAGGCAAGT

Sox17-R TCATGCGCTTCACCTGCT

FoxA2-L CCCTTCTCTATCAACAACCTCATGT

FoxA2-R GGGTAGTGCATGACCTGTTCGT

Pax6-L CTGAGGAATCAGAGAAGACAGGC

Pax6-R ATGGAGCCAGATGTGAAGGAGG

Tbx6-L ACCGCTACCCTGATTTGGATA

Tbx6-R AGATGGGAGAAGGGGCAAAG

Vcam1-L TGGAGGTCTACTCATTCCCTGA

Vcam1-R TAGTCTCCCCCTTCAGTAATTCAA

Supplementary References

1. Portales-Casamar E, et al. (2010) JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles. Nucleic Acids Res 38(Database issue):D105-110.
2. Ji X, Li W, Song J, Wei L, & Liu XS (2006) CEAS: cis-regulatory element annotation system. Nucleic Acids Res 34(Web Server issue):W551-554.
3. Shin H, Liu T, Manrai AK, & Liu XS (2009) CEAS: cis-regulatory element annotation system. Bioinformatics 25(19):2605-2606.
4. McLean CY, et al. (2010) GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28(5):495-501.
5. Xie W, et al. (2009) Histone H3 lysine 56 acetylation is linked to the core transcriptional network in human embryonic stem cells. Mol Cell 33(4):417-427.
6. Xu F, Zhang K, & Grunstein M (2005) Acetylation in histone H3 globular domain regulates gene expression in yeast. Cell 121(3):375-385.
7. Chen X, et al. (2008) Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133(6):1106-1117.
8. Langmead B, Trapnell C, Pop M, & Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25.
9. Ferrari R, et al. (2012) Reorganization of the host epigenome by a viral oncogene. Genome Res.
10. Liu T, et al. (2011) Cistrome: an integrative platform for transcriptional regulation studies. Genome Biol 12(8):R83.
11. Eisen MB, Spellman PT, Brown PO, & Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 95(25):14863-14868.
12. de Hoon MJ, Imoto S, Nolan J, & Miyano S (2004) Open source clustering software. Bioinformatics 20(9):1453-1454.
13. Saldanha AJ (2004) Java Treeview--extensible visualization of microarray data. Bioinformatics 20(17):3246-3248.
14. Wysocka J (2006) Identifying novel proteins recognizing histone modifications using peptide pull-down assay. Methods 40(4):339-343.
15. Neumann H, et al. (2009) A method for genetically installing site-specific acetylation in recombinant histones defines the effects of H3 K56 acetylation. Mol Cell 36(1):153-163.
16. Kuryan BG, et al. (2012) Histone density is maintained during transcription mediated by the chromatin remodeler RSC and histone chaperone NAP1 in vitro. Proc Natl Acad Sci U S A 109(6):1931-1936.
17. Luger K, Rechsteiner TJ, Flaus AJ, Waye MM, & Richmond TJ (1997) Characterization of nucleosome core particles containing histone proteins made in bacteria. J Mol Biol 272(3):301-311.
18. Yu Y, et al. (2012) Histone H3 lysine 56 methylation regulates DNA replication through its interaction with PCNA. Mol Cell 46(1):7-17.
19. Carrozza MJ, Hassan AH, & Workman JL (2003) Assay of activator recruitment of chromatin-modifying complexes. Methods Enzymol 371:536-544.
20. Bernardo AS, et al. (2011) BRACHYURY and CDX2 mediate BMP-induced differentiation of human and mouse pluripotent stem cells into embryonic and extraembryonic lineages. Cell Stem Cell 9(2):144-155.
21. Botquin V, et al. (1998) New POU dimer configuration mediates antagonistic control of an osteopontin preimplantation enhancer by Oct-4 and Sox-2. Genes Dev 12(13):2073-2090.
