Supplementary Material

for

“A New Class of Temporarily Phenotypic Enhancers Identified by CRISPR/Cas9 Mediated Genetic Screening “

Yarui Diao, Bin Li, Zhipeng Meng, Inkyung Jung, Ah Young Lee, Jesse Dixon, Lenka Maliskova, Kun-liang Guan, Yin Shen, and Bing Ren

1. Supplementary Methods

2. Supplementary Figures

Fig. S1. Experimental design. Fig. S2. The number and quality score of positive hit called from screenings. Fig. S3. Characterization of regulatory elements identified by lentiCRISPR screening. Fig. S4. Characterization of bi-allelic mutants. Fig. S5. Representative FACS plots for clonal characterization of eGFP levels. Fig. S6. Characterization of pluripotency and differentiation potentials of clones with temp enhancer deletion. Fig. S7. Genotype information for mono-allelic mutant clones. Fig. S8. Temp enhancers are bound by multiple TFs. Fig. S9. Conservation study of identified regulatory elements, and characterization of the pluripotent and differentiation properties of KO clones.

3. Supplementary Tables

Table S1. A list of candidate regulatory regions (hg18) surveyed in this study.

Table S2. A list of sgRNA library sequences designed for lentiCRISPR screening.

Table S3. Data analysis of CRISPR/Cas9 screening experiments. We first identified sgRNA candidates that are enriched in eGFP- population by meeting the criteria of (a) each sgRNA should have a sequencing coverage of 50 reads and (b) each sgRNA should have at least two fold enrichment in eGFP- population. We then generated the high confident elements that control POU5F1 expression if they are at least supported by the enrichment two sgRNAs targeting the same elements in three different experiments.

1. Supplementary Methods

Array oligo synthesis and pooled library cloning The oligo pool were amplified with primers with 15 bp overlap with BbsI cutting site in LentiCRISPR plasmid: F: AAAGGACGAAACACC R: TTCTAGCTCTAAAAC The PCR product was size selected and gel-purified with NucleoSpin Gel and PCR Clean-Up Kit (Clontech, Cat# 740609), followed by In-fusion cloning (Clontech, Cat# 638909). The cloning product was electro-transformed into 5-alpha Electrocompetent E. coli (NEB, Cat#C2989K) and amplified on Agar plates. About 1 million of independent bacterial colonies growing on Agar plates were collected and subjected to plasmids extraction with EndoFree Plasmid Mega Kit (Qiagen, Cat#12381).

Lentiviral particle production, concentration, and titration. The lentiCRIPSR library was prepared as previously described (Meng et al. 2013) with minor modifications. Briefly, 5ug of lentiCRISPR plasmid library was co-transfected with 4 ug PsPAX2 and 1 ug pMD2.G (Addgene #12260 and #12259 ) into a 10-cm dish of HEK293T cells in DMEM (Life Technologies) containing 10% FBS (Life Technologies) by PolyJet transfection reagents (Signagen, Cat# SL100688). Medium was changed 6 hours after transfection. The supernatant of cell culture media was harvested at 24 hours and 48 hours after transfection, and filtered by Millex-HV 0.45 µm PVDF filters (Millipore, Cat# SLHV033RS). The viruses were further concentrated with 100, 000 NMWL Amicon Ultra-15 Centrifugal Filter Units (Amicon, Cat#UFC910008).

For viral titration, 0.5 million hESC POU5F1-eGFP cells were seeded per well on 6-well plate. 12 hours later, different amount (1ul, 2ul, 4ul, 8ul) of concentrated viral-containing media were added to the cell culture media to infect the hESC following the same protocol described in the lentiviral screening section. The same amount of non-infected cells were seeded and not treated with puromycin as the control. 24 hours post-infection, the viral infected cells were treated with 250ng/ml Puromycin (Life Technologies, Cat#A1113802) for another 72 hours. We counted the number of Puromycin resistant cells and the control cells to calculate the ration of infected cells, and then viral titer. In the screening, 12 million POU5F1-eGFP hESCs were used in each independent screening and infected with viral particles at low MOI (0.02) to make sure each infected cell get one viral particle. The library coverage in each screen is more than 120 times.

Lentiviral CRISPR/Cas9 screening To perform sgRNA screening, 12 million H1-eGFP cells were seeded on matrigel-coated petri dish overnight. Low titer of lentiviral particles (MOI 1-2 for experiment 1 and 2, and MOI 0.02 for experiment 3 and 4) were added into the media, supplemented with 8µg/ml polybrene. Immediately after viral infection, the plates was centrifuged at 2,000 rpm for 2 h at 37°C. The viral containing media was then replaced with fresh E8 media without polybrene. 24 hours after viral transduction, the cells were cultured in E8 media with 250ng/ml puromycin for 7 days, followed by another 3-day-culture in regular E8 media without puromycin before FACS sorting. 4 million of non-sorted cells and 1 million of FACS sorted GFP-low cells were collected for the further analysis. The genomic DNA was extracted, followed by two rounds of PCR amplification to prepare DNA library ready for deep sequencing. The sgRNA integrated into the genomic DNA was amplified with F1/R1 primers for 25 cycles, and F2/R2 (R2 index varies) primers for 10 cycles in the 1st step and the 2nd step PCR, respectively. The final PCR products were purified and sequenced.

F1: AATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCG R1: CTTTAGTTTGTATGTCTGTTGCTATTATGTCTACTATTCTTTCC F2:AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTT CCGATCTtcttgtggaaaggacgaaacaccg R2_index1: CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGT GCTCTTCCGATCtctactattctttcccctgcactgt R2_index2: CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGT GCTCTTCCGATCtctactattctttcccctgcactgt R2_index3: CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGT GCTCTTCCGATCtctactattctttcccctgcactgt R2_index4: CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGT GCTCTTCCGATCtctactattctttcccctgcactgt R2_index5: CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGT GCTCTTCCGATCtctactattctttcccctgcactgt R2_index6: CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGT GCTCTTCCGATCtctactattctttcccctgcactgt

Transfection and luciferase reporter assay Luciferase assays were conducted as previously described (Heintzman et al. 2007). Detailed information can be found be Supplemental Materials. Briefly, the DHS_37, DHS_65, DHS_108, and DHS_115 elements were PCR amplified from H1 hESC genomic DNA, and cloned into pGL3- vector (Promega) by In-fusion cloning. The sequences of the primers are:

DHS_37 _F GCATGAGCCACAGGAGGTAG DHS_37 _R CGCTTTCTCTCCCTCAACC

DHS_65 _F GAGGCAGCATCTAACCTTGC DHS_65 _R TCCTTACCATGTGGCATTTG

DHS_108 _F GAATTCCGAAGGAGGGGTAG DHS_108 _R CGTGCAATACGAACACATCA

DHS_115 _F GATGCTAGGGAATTCGATCCCCT DHS_115 _R ATCCGAGCTCTGCAGGATTTGCT

CTCF_38_F CCAAGCCTCCTGCTTCTGG CTCF_38_R TCTTACTTCCCACACCCTGAGC

To test the enhancer activity of these elements with native POU5F1 promoter and the proximal enhancer (DHS_115), the 360bp POU5F1 minimal promoter (Chia et al. 2010) (hg18 Chr 6:31,246,377-31,246,736) was synthesized (IDT gblock) to replace the SV-40 promoter in pGL3-promoter vector.

After validation of sequence by Sanger sequencing, the constructs were co-transfected with pRL-SV40 Renilla reporter vector in H1 hESCs with Fugene HD (Roche) at a 4:1 reagent to DNA ratio. Transfected cells were cultured for an additional 2 days prior to harvest for reporter assay. The Dual-Luciferase Reporter Assay kit (Promega Cat#:E1960) was used according to manufacturer’s protocol. The adjusted firefly luciferase activity of each sample was normalized to the average of active of 3 negative control regions.

RT-qPCR Total RNA was isolated using TRIzol (Thermo Fisher Scientific, Cat# 15596026) as per manufacturer’s instruction, and 1ug of RNA was reverse transcribed into cDNA by SuperScript® III First-Strand Synthesis System (Thermo Fisher Scientific, Cat# 18080051). The expression levels was quantified by real time PCR by KAPA SYBR® FAST qPCR Kits (KAPA biosystems, Cat# KK4600) on LightCycler 480 System (Roche). The following primers were used.

POU5F1_F CGAAAGAGAAAGCGAACCAGTATCGAGAAC POU5F1_R CGTTGTGCATAGTCGCTGCTTGATCGC SOX2_F ATGCACCGCTACGACGTGA SOX2_R CTTTTGCACCCCTCCCATTT NanoG_F TGATTTGTGGGCCTGAAGAAA NanoG_R GAGGCATCTCAGCAGAAGACA AFP_F TGGGACCCGAACTTTCCA AFP_R GGCCACATCCAGGACTAGTTTC T brachyury transcription factor_F TGCTTCCCTGAGACCCAGTT T brachyury transcription factor_R GATCACTTCTTTCCTTTGCATCAAG Sox1_F GTACAGCCCCATCTCCAACTC Sox1_R GGCTCCGACTTCACCAGAGAG 18sRNA F GGCCCTGTAATTGGAATGAGTC 18sRNA R CCAAGATCCAACTACGAGCTT

Embryoid body formation and differentiation Embryoid bodies (EB) were generated with AggreWell 400 (Stemcell Technologies, Cat# 27845) following the manufacture’s protocol. Briefly, hESC was dissociated into single cell and suspended into AggreWell™ Medium (Stemcell technologies, Cat# 05893) supplemented with ROCK inhibitor Y-27632 (Stemcell technologies, Cat# 72304). About 2.4 million cells were plated into each well to let each EB contain about 2000 cells. After 24 hours, the media was switched to STEMdiff™ APEL™ Medium (Stemcell technologies, Cat# 05210) to induce differentiation. The EB was maintained in STEMdiff APEL medium for 14 days and fresh media was changed on a daily basis.

ChIP-seq, 4C-seq experiments and data analysis The ChIP-seq and 4C-seq experiments and data analysis were performed following the the procedures described earlier (van de Werken et al. 2012). The 4 base cutter used for 4C-seq experiments are DpnII and NlaIII. The following primers including Illumina’s TruSeq adaptors are used to amplified the target sequence interacting with bait region (POU5F1 promoter, hg18 Chr 6:31,247,202-31,248,137) Pou5f1_4C_F: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCT CTTCCGATCTTGCCTAGGATTCTGGATGGA Pou5f1_4C_R21: CAAGCAGAAGACGGCATACGAGATTCCGAAACGTGACTGGAGTTCAG ACGTGTGCTCTTCCGATCTGCAGAGGAGGTGGAGAGTGA Pou5f1_4C_R20: CAAGCAGAAGACGGCATACGAGATAAGGCCACGTGACTGGAGTTCAG ACGTGTGCTCTTCCGATCTGCAGAGGAGGTGGAGAGTGA

Supplemental Fig. S1

A B 120

100 105

80 counts % ) 4 ( A 10 y c n

e 60

u 3 q

e 10 r

F 40 102 20 in total population Normalized sgRN 10 0

1 1 10 102 103 104 105

Bin (bp) Normalized sgRNA counts in eGFP- population

C 106 D 106 105 105

counts 4 A

10 counts 4 A 10 103 103 102 102 in total population Normalized sgRN

10 in total population Normalized sgRN 10

1 1 10 102 103 104 105 106 1 1 10 102 103 104 105 106 Normalized sgRNA counts in eGFP- population Normalized sgRNA counts in eGFP- population

SUPPLEMENTAL Fig. S1. Experimental design (A) Size Distributions of candidate regulatory regions surveyed in this study. (B-D) H1 POU5F1-eGFP cells were infected by lentiCRISPR library, selected under puromycin for 7 days, and cultured without puromycin for another 10-days. Genomic DNA was collected from the non-sorted control cells and FACS sorted eGFP negative cells, followed by PCR amplification of sgRNA sequence and deep sequencing. Scatter plots for sgRNA read counts in eGFP- cells compared to the control cells for experiment 2, 3, and 4 after LOESS normalization. Green line: 2-fold cutoff. Supplemental Fig. S2

A Rep_1(YD36) 19 3 Rep_2(YD38) 8 4 Rep_3(YD44) 22 14 Rep_4(YD46) 21 15 Rep_1&2 3 Rep_1&3 6 1 Rep_1&4 7 2 Rep_2&3 5 Rep_2&4 5 1 Rep_3&4 9 3 Rep_2&3&4 4 Rep_1&3&4 4 Rep_1&2&4 2 Rep_1&2&3 2

0 10 20 30 40 # of elements supported by >=2 effective sgRNAs # of elements supported by >=2 ineffective sgRNAs

B 100 n=83 n=45

80

60 sgRNAs quality score 40

20 All sgRNAs Enriched sgRNAs

SUPPLEMENTAL Fig. S2. The number and quality score of positive hit called from screenings. (A) The element was called positive in each replicate when hit by at least two significant sgRNAs. The bar graph shows the number of true (green) and false (blue) elements identified from each experiment and how these positive elements overlapped between replicates. (B) Box plot shows the distribution of quality scores of all sgRNAs targeting the six identified (left) compared to the ones enriched in the GFP- cells (right). We obtained quality scores of a total of 83 detectable sgRNAs from http://crispr.mit.edu (>50 reads in at least one experiment in either the control or eGFP- enriched sequencing libraries). The sgRNA enriched in the eGFP- cells are plotted on the right. Supplemental Fig. S3 A *** 1.5

1.0

0.5

0

Fold change of reproter activity PE

DHS_37_PECTCF38_PEDHS_65_PEDHS_108_PE

B Scale 5 kb hg18 Chr6: 31,241,000 31,242,000 31,243,000 31,244,000 31,245,000 31,246,000 31,247,000 31,248,000 31,249,000 TCF19 PSORS1C3 TCF19 POU5F1 POU5F1 POU5F1 POU5F1 POU5F1

DHS

CTCF

H3K27ac

H3K4me1

H3K4me3

Chr6:31247817-31247860 ACCCTTGAATGGGCCTGGATGGCTCCCCTGGGGACTGCTTCCTG Allele 1 ACCCTTGAATGGGC------CTGGGGACTGCTTCCTG Allele 2 ACCCTTGAATGGGC------TTGCGGACTGCTTCCTG

SUPPLEMENTAL Fig. S3. Characterization of regulatory elements identified by lentiCRISPR screening. (A) DHS_37, CTCF_38, DHS_65 or DHS_108 sequence was cloned into the PE plasmid, a luciferase reporter construct containing the 360bp minimal POU5F1 promoter and the proximal enhancer DHS_115. H1 POU5F1-eGFP cells were transfected with indicated reporter constructs. 48 hours after transfection, the cells were subjected to luciferase reporter assay to test the effect of given element on PE reporter activity. P-value was calculated by t-test. ** P =0.004. (B) Genotyping results of an eGFP- clone derived from the lentiCRISPR screening showed bi- allelic mutation of DHS_115 region. Supplemental Fig. S4

A Scale 500 bases hg18 B Chr6: 30,754,500 30,755,000 30,755,500 PPP1R18 PPP1R18 WT clone 1 WT clone 2 DHS DHS_37 clone 3 -/- DHS_37 clone 4 -/- CTCF DHS_65 clone 6 -/- DHS_108 clone 2 -/- H3K27ac

H3K4me1

H3K4me3 DHS_37_C3_P1 DHS_37_C3_P2 DHS_37_C4_P1 DHS_37_C4_P2 eGFP DHS_37_C1_P1 DHS_37_C1_P2

Scale 500 bases hg18 Chr6: 30,904,000 30,904,500 30,905,000 LINC00243

DHS

CTCF

H3K27ac

H3K4me1

H3K4me3 DHS_65_C6_P1 DHS_65_C6_P2 tKO_DHS_37_65_108_C9_P1 tKO_DHS_37_65_108_C9_P2 tKO_DHS_37_65_108_C16_P1 tKO_DHS_37_65_108_C16_P2

Scale 500 bases hg18 Chr6: 31,234,200 31,234,300 31,234,400 31,234,500 31,234,600 31,234,700 31,234,800 31,234,900 31,235,000 TCF19 TCF19

DHS

CTCF

H3K27ac

H3K4me1

H3K4me3 DHS_108_C2_P1 DHS_108_C2_P2 dKO_DHS_37_108_C4_P1 dKO_DHS_37_108_C4_P2 dKO_DHS_37_108_C7_P1 dKO_DHS_37_108_C7_P2

SUPPLEMENTAL Fig. S4. Characterization of bi-allelic mutants. (A) Genomic DNA was isolated from each mutant clones and the genotypes were confirmed by Sanger sequencing. Genotyping information for all the mutants used in this study with deletions on the loci of DHS_37, DHS_65, and DHS_108 are showing on top panel, middle panel and bottom panel, respectively. DKO: DHS_37 and DHS_108 double knockout clones. DKO clones were derived from DHS_37 bi-allelic mutant clone C1 by transfection of sgRNA pair targeting DHS_108. TKO: DHS_37, DHS_65, and DHS_108 triple knockout clones, derived from DHS_37_108 bi-allelic mutant clone C4 by transfection of sgRNA pair targeting DHS_65. (B) The bi-allele KO clones were cultured for 14 days and subjected to FACS analyses. All clones exhibit recovered eGFP levels similar to the parental cells after long-term culture. Supplemental Fig. S5 A

B

C

D

live cells singlets Tra1-60+ and SSEA4+ GFP+ GFP-

SUPPLEMENTAL Fig. S5. Representative FACS plots for clonal characterization of eGFP levels. WT and temp enhancer KO mutants were dissociated by Accutase and stained with PE- Tra1-60 and APC-SSEA4 antibody (markers for hESCs). The cells were subjected to FACS analysis as indicated, and the un-stained cells were used as negative control for gating. (A) FACS plots gating for a wild type, unstained clone (PE-YG-A indicates Tra1-60 and APC-A indicates SSEA4). (B) FACS plots gating for a wild type clone stained with Tra1-60 and SSEA4. (C) FACS plots gating for a mutant clone stained with Tra1-60 and SSEA4. (D) Step-wise gating for live, singlets, Tra1-60 and SSEA4+, eGFP- and eGFP+ cells. Supplemental Fig. S6

A Undifferentiated cells

1.2 WT

0.8 DHS_37 KO

Fold change DHS_65 KO 0.4 DHS_108 KO 0 POU5F1 SOX2 NANOG

B 100kb DHS37 DHS65 CTCF38 DHS108 WT rep1 clone 1 rep2 rep1 clone 6 rep2

TRIM40 HCG17 HLA-E GNL1 MRPS18B NRM IER3 LINC00243 DDR1 MUC21 HCG22 CDSN HCG27 TRIM40 HLA-L TRIM39 PRR3 ATAT1 NRM DDR1 MUC21 C6orf15 TCF19 TRIM10 HCG18 PRR3 ATAT1 NRM DDR1 MUC22 PSORS1C1 TRIM10 HCG18 ABCF1 DHX16 TUBB DDR1 PSORS1C2 TRIM15 HCG18 ABCF1 DHX16 TUBB DDR1 CCHCR1 TRIM26 HCG18 MIR877 NRM DDR1 CCHCR1 TRIM26 TRIM39 PPP1R10 PPP1R18 DDR1 CCHCR1 TRIM39-RPP21 PPP1R10 PPP1R18 DDR1 TCF19 RPP21 ATAT1 NRM DDR1 POU5F1 RPP21 C6orf136 NRM MIR4640 DPCR1 POU5F1 RPP21 C6orf136 NRM GTF2H4 POU5F1 NRM VARS2 POU5F1 MDC1 VARS2 POU5F1 MDC1-AS1 VARS2 PSORS1C3 TUBB SFTA2 TUBB TUBB TUBB TUBB FLOT1

C Embryoid body 1.2 WT

0.8 DHS_37 KO

DHS_65 KO Fold change 0.4 DHS_108 KO 0 AFP T SOX1

SUPPLEMENTAL Fig. S6. Characterization of pluripotency and differentiation potentials of clones with temp enhancer deletion. (A) The eGFP negative cells in the indicated mutant lines were FACS sorted, followed by RT-qPCR analysis. The eGFP- single KO cells still maintain normal level of transcription of other stem cell marker including SOX2 and NANOG. (B) A genome browser snapshot of H3K27ac ChIP-seq signals at POU5F1 in WT and two DHS_37-/- clones. (A) Embryoid bodies (EB) were generated from FACS sorted eGFP negative cells described in (A). 24 hours after EB formation, the media was switched to differentiation media to induce EB differentiation for 14 days. Total RNA isolated from EBs was analyzed by qPCR. RT-qPCR data showed similar expression levels differentiation marks of the three germ layers in the mutant compared to the embryonic bodies derived from WT cells. Supplemental Fig. S7

A 3’ 5’ POU5F1 p1 allele

Known genotypes GCAAG AAGTG CTCAG CAATT GCCTC

POU5F1 specific PCR primer Common PCR primer 3’ 5’ POU5F1 p2 allele

Known genotypes GCCAG AAATG CTTAG CAGTT GCATC

H1 POU5F1-eGFP Native allele genotyping GCCAG AAATG CTTAG CAGTT GCATC

B Scale 500 bases hg18 chr6: 30,904,000 30,904,500 30,905,000 LINC00243

DHS

CTCF

H3K27ac

H3K4me1

H3K4me3 DHS_65_C7_P1 DHS_65_C7_P2 DHS_65_C8_P1 DHS_65_C8_P2

C Scale 500 bases hg18 chr6: 31,234,200 31,234,300 31,234,400 31,234,500 31,234,600 31,234,700 31,234,800 31,234,900 31,235,000 TCF19 TCF19

DHS

CTCF

H3K27ac

H3K4me1

H3K4me3 DHS_108_C1_P1 DHS_108_C1_P2 DHS_108_C4_P1 DHS_108_C4_P2 DHS_108_C6_P1 DHS_108_C6_P2 SUPPLEMENTAL Fig. S7. Genotype information for mono-allelic mutant clones (A) Schemtic of phasing eGFP (P1) and non-eGFP (P2) alleles of H1 POU5F1-eGFP line. We performed PCR from genomic DNA samples of individual clones of a region in the 3' UTR between primer pairs (indicated by black arrows) that would be broken by the inserted transgene, so the only allele that can be amplified is the native one. We then infer what the SNPs on the non- targeted allele are to deduce whether P1 or P2 is the targeted vs. non-targeted allele. (B) Genotyping information for DHS_65 mono-allelic KO clones. (C) Genotyping information for DHS_108 mono-allelic KO clones. are patterns A SUPPLEMENTAL Fig.S8.Tempenhancersareboundbymu highlighted withyellowboxes. Supplemental Fig.S8 genome

obtained Transcription binding peaks (ENCODE) RefSeq SIN3AK20 ZNF143 RAD21 RNAPII HDAC2 TEAD4 JUND SIN3A MAFK EGR1 CHD2 CTCF NRF1 TAF1 ATF3 ATF2

SRF TBP YY1 SP2 SP1 at C6orf136 C6orf136 C6orf136 ATAT1 ATAT1 ATAT1

four br 0 3 _ _

D D owser from HX16 HX16 DHS37 PPP1R18 PPP1R18 non-canonical 100kb MDC1-AS1 NRM NRM NRM NRM NRM NRM NRM NRM ENCODE MDC1 snapshot TUBB TUBB TUBB TUBB TUBB TUBB TUBB FLOT1 IER3 is

dataset. elements

shown LINC00243 DHS65

TF for together peaks PolII MIR4640 DDR1 DDR1 DDR1 DDR1 DDR1 DDR1 DDR1 DDR1 DDR1

GTF2H4 w that binding VARS2 VARS2 VARS2 ith SFTA2

are RefS DPCR1 (top associated eq MUC21 MUC21 gene green MUC22 ltiple TFs. annotation.

CTCF38 track) with HCG22 the and PSORS1C1

C6orf15 novel

TF CDSN multiple PSORS1C2 CCHCR1 CCHCR1 CCHCR1 peaked DHS108 elements PSORS1C3 POU5F1 POU5F1 POU5F1 POU5F1 POU5F1 TCF19 TCF19

TF

regions HCG27

peak

are Supplemental Fig. S9

100kb

DHS108 CTCF38 DHS65 DHS37 orthologous orthologous orthologous orthologous

Pou5f1 Cdsn Gm9573 Sfta2 Ddr1 4833427F10Rik Ier3 Mdc1 Dhx16 Pou5f1 2300002M23Rik Dpcr1 Gtf2h4 Flot1 Nrm Atat1 Tcf19 Vars2 Tubb5 Ppp1r18 Atat1 Tcf19 Ddr1 Ppp1r18 Atat1 Tcf19 Ddr1 Ppp1r18 Mrps18b Cchcr1 Ddr1 2310061I04Rik Psors1c2 Orangutan Dog Opossum Chicken Stickleback Vertabrate conservation

H3K4me3

H3K9ac

H3K27ac

SUPPLEMENTAL Fig. S9. Conservation study of identified regulatory elements, and characterization of the pluripotent and differentiation properties of KO clones. Genome browser snapshot of mouse Pou5f1 locus. Orthologous of four elements are shown (highlighted in orange) with vertebrate alignments and conservation scores. Three active histone modification marks are shown together.