Supporting Information

Kim et al. 10.1073/pnas.1018279108

[Fig. S1 plot. A: IMR90 vs. hES, human HOXA locus, positions 27000000 to 27400000, with CBS5, HOXA7, and HOXA9 marked. B: mNPC vs. mES, mouse Hoxa locus, positions 52000000 to 52400000, with CBS5, Hoxa7, and Hoxa9 marked. Axes: normalized reads (y, two scales per panel) against genomic position (x).]

Fig. S1. Heterochromatin structure in the homeobox (HOX) A locus. A shows overlaid chromatin immunoprecipitation coupled to deep sequencing (ChIP-Seq) results for trimethylated histone H3 lysine 27 (H3K27me3) in chromatin from primary lung fibroblasts (IMR90, black lines) and human embryonic stem cells (hES, gray lines) across the human HOXA locus. The y axes represent the number of recovered sequence tags within a 1-kb window from the ChIP-Seq data. CCCTC-binding factor (CTCF) binding sites and HOXA genes are represented by blue vertical lines and yellow filled circles at the x axis, respectively. B shows the corresponding ChIP-Seq results for H3K27me3 in mouse embryonic stem cells (mES, gray line) and derived neuronal progenitor cells (mNPC, black line) across the mouse HOXA locus.
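The per-window tag counting described in this legend can be illustrated with a minimal sketch. It assumes each sequence tag is reduced to a single genomic coordinate; the function name, the half-open interval convention, and the example positions are hypothetical and are not taken from the study. The additional normalization step implied by the "Normalized reads" axis is not specified in the legend and is not attempted here.

```python
WINDOW_BP = 1000  # 1-kb window, as stated in the legend


def count_tags_per_window(tag_positions, start, end, window=WINDOW_BP):
    """Count tags falling in consecutive windows covering [start, end).

    tag_positions: iterable of integer genomic coordinates (assumption:
    one coordinate per tag). Tags outside [start, end) are ignored.
    """
    n_windows = (end - start + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for pos in tag_positions:
        if start <= pos < end:
            counts[(pos - start) // window] += 1
    return counts


# Hypothetical tag positions, invented for illustration only.
example_tags = [27000010, 27000950, 27001200, 27003999, 27004000]
example_counts = count_tags_per_window(example_tags, 27000000, 27004000)
```

The last example tag sits exactly at the end coordinate and is therefore excluded under the half-open convention chosen for this sketch.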

Kim et al. www.pnas.org/cgi/doi/10.1073/pnas.1018279108 1 of 7

[Fig. S2 plot. Panels: A, IMR90 - H3K4me2; B, IMR90 - H3K4me3; C, IMR90 - H3K36me3; D, hES - H3K4me3; E, hES - H3K36me3. Axes: normalized reads (y) against genomic position 27000000 to 27400000 (x).]

Fig. S2. Euchromatin structure in the homeobox (HOX) A locus. A–C are chromatin immunoprecipitation coupled to deep sequencing (ChIP-Seq) results for dimethylated histone H3 lysine 4 (H3K4me2), trimethylated histone H3 lysine 4 (H3K4me3), and trimethylated histone H3 lysine 36 (H3K36me3) in the HOXA locus from chromatin of primary lung fibroblasts (IMR90). D and E are ChIP-Seq results for H3K4me3 and H3K36me3 from human embryonic stem cells (hES) in the same locus. The y axes represent the number of normalized sequence tags from ChIP-Seq experiments.

[Fig. S3 diagram. A: 3C primers in the human HOXA locus; coordinate track labeled chr7: 27000000 to 27350000; tracks for CTCF binding sites (CBS), EcoRI primers (u10, 14, 18, and 40 labeled), and RefSeq genes (HOXA1 to HOXA13). B: 3C primers in the mouse Hoxa locus; coordinate track labeled chr7: 52000000 to 52350000; tracks for CTCF binding sites (CBS), EcoRI primers (17, 25, 43, 51, 71, and 79 labeled), and RefSeq genes (HOXA1 to HOXA13 and Evx1).]

Fig. S3. Primers for chromosome conformation capture (3C) analysis of the homeobox (HOX) A locus. The top track indicates genomic coordinates. The second track shows the locations of CCCTC-binding factor (CTCF) binding sites mapped in our study. The third track displays the locations of the 3C primers designed to detect captured ligation junctions at EcoRI sites. Notable primers are indicated by their primer ID. The fourth track shows all RefSeq genes in the locus. A represents the human HOXA locus and B shows the mouse HOXA locus. Primers 14 (p14) and 43 (p43) in A and B, respectively, are anchor primers located at CTCF binding site 5 (CBS5); p18 and p51 are used as control anchor primers; p40 and p79 indicate the sites that interact with the CBS5 anchor primer in each species, which are also heterochromatin boundary sites; pU10 (human) and p17 or p25 (mouse) are upstream sites that interact with the CBS5 anchor primers and are located in the euchromatin boundary.

[Fig. S4 plot. Title: 3C assay - MseI digestion. y axis: Frequency (0 to 0.1). x axis: u10, p40, control.]

Fig. S4. Chromosome conformation capture (3C) assay using MseI restriction enzyme. Interaction frequencies between the anchor primer (p14), located at CCCTC-binding factor binding site 5, and the major looping sites (u10 and p40) and a control site (p18) in the homeobox (HOX) A locus were determined by real-time PCR and normalized. The 3C chromatin was prepared by digestion with MseI restriction enzyme.

[Fig. S5 plot. Panels: A, CTCF KD - mRNA expression (Control vs. CTCF KD); B, RAD21 KD - mRNA expression (Control vs. RAD21 KD); C, OCT4 OE - mRNA expression (Control vs. OCT4 OE). y axis: relative copy number (logarithmic scale, 1 to 100000). x axis: HOXA9, HOXA10, HOXA11, HOXA13.]

Fig. S5. Homeobox (HOX) A mRNA expression in the heterochromatin domain. The mRNA expression level (relative copy number) of HOXA genes (HOXA9–13) located in the heterochromatin domain was measured in primary lung fibroblast cells infected with CCCTC-binding factor (CTCF) knockdown (KD) virus (A), radiation repair 21 homolog (RAD21) KD virus (B), or octamer binding factor 4 (OCT4) overexpression (OE) virus (C). The y axes represent the cDNA copy number of HOXA genes normalized by the cDNA copy number of the control gene, gamma actin (ACTG1): HOXA copy no. / ACTG1 copy no. × 10^8.
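The normalization in this legend can be written out as a short sketch. It assumes the scale factor in the legend is 10^8 (ten to the eighth power) and that copy numbers are available as plain numbers; the function name and the input values are hypothetical and do not come from the study.

```python
SCALE = 1e8  # assumption: the legend's scale factor read as 10^8


def relative_copy_number(hoxa_copies, actg1_copies, scale=SCALE):
    """Return HOXA copy no. / ACTG1 copy no. x scale.

    Both inputs are cDNA copy numbers measured for the same sample.
    """
    if actg1_copies <= 0:
        raise ValueError("ACTG1 copy number must be positive")
    return hoxa_copies / actg1_copies * scale


# Hypothetical copy numbers, invented for illustration only.
example = relative_copy_number(5.0, 1.0e6)
```

Scaling by a large constant keeps the ratios in a range that is convenient to plot on a logarithmic axis, which matches the axis range shown in the figure.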

[Fig. S6 plot. Title: Heterochromatin. Series: pGIPz, RAD21 KD, CTCF KD. y axis: Enrichment (0 to 4). x axis: CBS4, CBS5, CBS6, CBS7.]

Fig. S6. Heterochromatin structure at the homeobox (HOX) A locus. The level of trimethylated lysine 27 on histone H3 (H3K27me3) between CCCTC-binding factor (CTCF) binding site (CBS) 4 and CBS7, including CBS5, in the HOXA locus was measured by ChIP-qPCR using an H3K27me3 antibody in primary lung fibroblast cells infected with CTCF knockdown (KD), RAD21 KD, or control (pGIPz) virus.

Table S1. Primers for ChIP-qPCR at the CCCTC-binding factor (CTCF) binding site

Primer ID   Forward sequence        Reverse sequence
Human
CBS4        ATCTCTGAGGGGCCAGTACA    TATTTATTGCGACCGTGCTG
CBS4.5      GCCTGGGAACAATACATGCT    AGGAAGTCCACAGTGGGTTG
CBS5        CGGAAGCCTCTTGCATGG      TTATTGGCATTGCCTCCTCT
CBS5.5      ATGGGTAAGGGGGAGTATGC    CATGTGGGGAGGAGATAGGA
CBS6        ACAAAGGCCAAGAATCATGC    GGAGCTGGTTTCCGTCTCTC
CBS7        TCTGTAGCTCCAGCGGTTTT    CTGGGGCTCCTGATCCTAAT
Control     GCATCTCCTCTCGCAGTTG     GGAACTCCGGCTCTGCTG
Mouse
mCBS4       CTGCTTCTGGGGATTCTGAG    ATTTATTGCGACCGAACAGG
mCBS5       CGACTGCTGCTCACACAAAT    AATTCCAGGAGATGGCAGTG
mCBS6       GCTGAGTTCCTTTCGTCTGG    GGGAGGCCTAAGTGGAAAAG
mCBS7       ATACTGCACGGTCCAAGGTC    GTGTTCACCTCCGACTCCAT
Control     TGACCATGTTCCCTGTCAAA    TGTTAGTGGAGTCGCAGGTG

Shown are the primer sets for CTCF binding sites (CBS4, 5, 6, and 7) and a random site used as a control for ChIP-qPCR analysis of the human and mouse homeobox (HOX) A locus, respectively.

Table S2. Primers for RT-qPCR analysis of homeobox (HOX) A genes

Primer ID   Forward sequence        Reverse sequence
Human
HOXA6       AAAGCACTCCATGACGAAGG    GTCTGGTAGCGCGTGTAGGT
HOXA7       CCAATTTCCGCATCTACCC     GAACTCCTTCTCCAGCTCCA
HOXA9       GCGCCTTCTCTGAAAACAAT    CAGTTCCAGGGTCTGGTGTT
HOXA10      CCTTCCGAGAGCAGCAAAG     CCTTCTCCAGCTCCAGTGTC
HOXA11      GGCCACACTGAGGACAAGG     AGAACTCCCGTTCCAGCTCT
HOXA13      CTGGAACGGCCAAATGTACT    GCTTCTTTCTCCCCCTCCTA
ActG        GCAAAGACCTGTACGCCAAC    ACACCGAGTACTTGCGCTCT
Mouse
mHOXA6      ACCGACCGGAAGTACACAAG    GTCTGGTAGCGCGTGTAGGT
mHOXA7      GAAGCCAGTTTCCGCATCTA    CGTCAGGTAGCGGTTGAAAT
mHOXA9      ACAATGCCGAGAATGAGAGC    GTTCCAGCGTCTGGTGTTTT
mHOXA10     CAGCCCCTTCAGAAAACAGT    TCTTTGCTGTGAGCCAGTTG
mHOXA11     GGCCACACTGAGGACAAGG     GAACTCTCGCTCCAGCTCTC
mHOXA13     CTGGAACGGCCAAATGTACT    CCTATAGGAGCTGGCGTCTG
mActG       CATGGGCCAGAAAGACTCAT    CTTCTCCATGTCGTCCCAGT

Shown are the primer sets for real-time PCR for HOXA genes (HOXA6, 7, 9, 10, 11, and 13) and the control gene, ACTG1, in the human and mouse HOXA locus, respectively.

Table S3. Chromosome conformation capture primers for the human homeobox (HOX) A locus

Primer ID   Position    Primer sequence
u1          27000840    TGTTCCCAGCTGTTTACCAA
u2          27047173    TTTCTTGAAATTACAGATCATTATGC
u3          27065747    GATCACACACATCAACCCTGA
u4          27070348    TTTTGAGCTTCACATGCAATTT
u5          27086427    GGCCAATTGCTTCTCCATTA
u6          27098403    CGCAAAGTTCAGCCTTCTCT
u7          27100318    TTCTGTTGGCAAAGGGAACT
u8          27107806    AGCTATTGTGCTGCCTTTCC
u10         27113002    GCACGGCGTTACCAGAGC
4           27120449    ACAAACGGCTCTCACAAAGG
5           27132231    TGGCTCCATCCTGGGTATTA
6           27137247    GTGATGGATGCTGCTGTCC
7           27141602    TTTCATGGCACCAGTAAGCA
9           27146115    CTTTTGGGCAGAGGGAAAGT
11          27152680    ACAATCCGCATTAGGCTCTG
13          27157717    ATCCCATCCTCTCCTCCAAC
14          27167568    GGGCTTTGGTGGAATATCCT
18          27194283    GTGTCCGCAGGAGACAAAG
21          27202762    GGCATTGCTGCTACAAAACA
26          27211768    GGTCCATCTCGCCCTAGACT
31          27231571    GCCTGAGCTTTTCTCAGCAG
33          27238340    CAAGGATGAGGCAGGGACTA
34          27248670    GCGCGTTAAGCTCTCTTTTC
36          27256720    ACTGCCAGCAGATTGAGCTT
37          27265151    GGAGCAACATTCAAGATTGTTTAGTA
39          27269728    ACTCACCGATGAACCCAGTT
40          27272123    GAGGAGGAAACTGAGGCCAGG
41          27272933    GAATGTCCTTTTGGTGTCGTGTCT
43          27294792    GCAAGAACTGGTGGTGGATT
44          27299836    TGATTGGAATAGTTTCAGAAGGAATGG
45          27302003    CAGGAACAGATCATGGGAGA
49          27315977    CTTTTATTTCATTGAGCAGTGGTTTG

The first column gives the primer ID numbers. Each primer is located at a unique EcoRI restriction digestion site across the human HOXA locus. The second column indicates the position of the primer on chromosome 7 (based on the hg18 University of California at Santa Cruz release of the human genome), and the third column is the primer sequence.

Table S4. Chromosome conformation capture primers for the mouse homeobox (HOX) A locus

Primer ID   Position    Primer sequence
2           52026579    GGTCCTCAGGAGAAACCAAA
3           52037704    AAGGGTGGGTTGTGAAAACA
4           52044823    TTTCTCCAGGAAAGCCTCAA
5           52048788    GATTGGAGGTAGTGCCGTGT
6           52052251    GAAGCAGGGAAACCCACTTA
7           52052408    GGCCCAACGTGTATTGACTC
12          52079272    CACTGCACTTTTGCCTCAAG
15          52099905    AGCTGGAGATTCATGTGGAGA
17          52102345    TGGTCACTGGCTCTTCTGC
19          52106572    CCGATGTGGATGAAGGAGTT
20          52114831    GGGACCGCGCTACTATTAAA
21          52118410    AAATCCACAGAATGGCAGATG
22          52118615    CCCATCTTCTTTAGGCGTGA
24          52120432    ACACTGTTGACCAGCGAATG
25          52121812    TTTCAGTATGGGGAGGATGC
26          52125013    GCAGCGTCCACAAACTTAAA
27          52128989    CAGGCAATAGCTCCCTTTCA
29          52137257    TGAGCTCCATCCTGGGTATT
30          52138974    GATCTGGCTCTCTGGGAGTG
33          52146675    CCTGGCCAATCAGAATCACT
35          52152751    GGCATGAGCTATTTCGATCC
36          52156544    TGCGTGGAGTTGATGAGTTT
37          52158193    CCTGCCAGCTTTTCATTCTC
38          52162384    TCGGTTCTCTCCTCCAACAC
39          52163899    GTCGTCCAAGAATCCCTCAG
40          52164022    CCCCTGGACTGTACTCAGGA
41          52165962    CGTCAGGTAGCGGTTGAAAT
42          52166256    GGCCAGACAAGCAAGAACAT
43          52172095    GCTGTTTCCGCAGTCTCTTT
45          52179117    GTAGTCGGGCGACACAGAAC
46          52179843    AAAATGACAGAGCCCCACAC
47          52182845    TCTTTGCTGTGAGCCAGTTG
48          52188274    CTTTTCTCCTTTGCCCTTCC
49          52189335    ACAGGGGAAGCATCTACACG
50          52193140    TCAAAGCTTTTGCAGGCTCT
51          52202191    GCTAGGCCTGATGCTCTGAC
53          52213462    GTCTGGGAAGAACGAAGCAG
56          52218310    CTGGGAGACAAAGAGGAAGG
58          52232377    GCCATCATTGGTCTGACCTT
60          52240299    GCTCTGAGGAAACGGATACC
62          52241178    CTTGTGGGGTGAAGGTGTCT
67          52248706    ATGGATCACCAAGCCTCAAG
69          52251199    AGTTGTCTGCCCGTTCTTGT
70          52260688    AGGATTCTCAGCCACCCTCT
71          52262316    TTCTCCCAAACACCAAGTCC
73          52263359    CCTTAAGCGCTCCCTTTTCT
74          52265391    CACTCCTGCAATAGGTGCAA
76          52271071    AACACCTTTGGGATCTGCTG
78          52272005    GCCAGAGATGGTAACAAATGG
79          52273660    GCACTGAAGTGGAGGAAAGG

The first column gives the primer ID numbers. Each primer is located at a unique EcoRI restriction digestion site across the mouse HOXA locus. The second column indicates the position of the primer on chromosome 5 (based on the mm9 University of California at Santa Cruz release of the mouse genome), and the third column is the primer sequence.
