Genome-wide analysis reveals Sall4 to be a major regulator of pluripotency in murine-embryonic stem cells

Jianchang Yangᵃ, Li Chaiᵇ, Taylor C. Fowlesᵃ, Zaida Alipioᵃ, Dan Xuᵃ, Louis M. Finkᵃ, David C. Wardᵃ,¹, and Yupo Maᵃ,¹

ᵃDivision of Laboratory Medicine, Nevada Cancer Institute, One Breakthrough Way, Las Vegas, NV 89135; and ᵇDepartment of Pathology, Joint Program in Transfusion Medicine, Brigham and Women’s Hospital/Children’s Hospital Boston, Harvard Medical School, 75 Francis Street, Boston, MA 02115

Contributed by David C. Ward, September 18, 2008 (sent for review June 2, 2008)

Embryonic stem cells have potential utility in regenerative medicine because of their pluripotent characteristics. Sall4, a zinc-finger transcription factor, is expressed very early in embryonic development with Oct4 and Nanog, two well-characterized pluripotency regulators. Sall4 plays an important role in governing the fate of stem cells through transcriptional regulation of both Oct4 and Nanog. By using chromatin immunoprecipitation coupled to microarray hybridization (ChIP-on-chip), we have mapped global gene targets of Sall4 to further investigate regulatory processes in W4 mouse ES cells. A total of 3,223 genes were identified that were bound by the Sall4 protein on duplicate assays with high confidence, and many of these have major functions in developmental and regulatory pathways. Sall4 bound approximately twice as many annotated genes within promoter regions as Nanog and approximately four times as many as Oct4. Immunoprecipitation revealed a heteromeric protein complex(es) between Sall4, Oct4, and Nanog, consistent with binding site co-occupancies. Decreasing Sall4 expression in W4 ES cells decreases the expression levels of Oct4, Sox2, c-Myc, and Klf4, four proteins capable of reprogramming somatic cells to an induced pluripotent state. Further, Sall4 bound many genes that are regulated in part by chromatin-based epigenetic events mediated by polycomb-repressive complexes and bivalent domains. This suggests that Sall4 plays a diverse role in regulating stem cell pluripotency during early embryonic development through integration of transcriptional and epigenetic controls.

induced pluripotent stem cells | epigenetic regulation | Oct4 | Nanog | Sox2

Sall4 is a zinc-finger transcription factor that was originally cloned based on homology to Drosophila spalt (sal) (1–3). In Drosophila, sal is a homeotic gene essential in the development of posterior-head and anterior-tail segments (4). Human SALL4 mutations are associated with the Duane-radial ray syndrome (Okihiro syndrome), a human autosomal-dominant disease involving multiple organ defects (3, 5, 6). Sall4 homozygous knockout mice die at an early embryonic stage (7, 8). Our group and others have recently shown that mouse Sall4 plays an essential role in maintaining the self-renewal and pluripotent properties of ES cells and in governing the fate of the inner-cell mass through transcriptional modulation of Oct4 (also known as Pou5f1) and Nanog (8–10).

ES cells are derived from the inner cell mass of the developing embryo, and ES-cell pluripotency is regulated in part by Oct4, Sox2, and Nanog, as well as through 2 polycomb-repressive complexes (PRCs) (11, 12). Sall4 is expressed by cells of the early embryo, exhibiting an expression pattern similar to Oct4 (8, 9). In recent studies, Sall4 has also been used as part of a gene signature for pluripotency and an enhancer for somatic cell reprogramming (13, 14). However, the complete mechanism whereby Sall4 controls pluripotency and differentiation in ES cells is unknown. The studies reported here demonstrate that Sall4 interacts with core transcription factors, genes in multiple signal transduction pathways, and genes relating to epigenetic processes associated with PRCs as well as bivalent histone methylations. These observations suggest that Sall4 is an essential regulator of cell pluripotency and differentiation.

Author contributions: J.Y. and Y.M. designed research; J.Y., Z.A., and D.X. performed research; J.Y., L.C., T.C.F., Z.A., D.X., L.M.F., D.C.W., and Y.M. analyzed data; and J.Y., L.C., T.C.F., L.M.F., D.C.W., and Y.M. wrote the paper.
The authors declare no conflict of interest.
Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE11305).
¹To whom correspondence may be addressed. E-mail: [email protected] or yma@nvcancer.org.
This article contains supporting information online at www.pnas.org/cgi/content/full/0809321105/DCSupplemental.
© 2008 by The National Academy of Sciences of the USA

Results

Sall4 Is a Major Transcriptional Regulator in ES Cells. A growing body of evidence has shown that Sall4 plays a vital role in maintaining ES cell pluripotency and in governing ES cell-fate decisions (9, 10, 14, 15). This prompted us to investigate the global downstream targets of Sall4 in mouse ES cells. By using a duplicate set of ChIP-on-chip assays, we performed a global analysis of Sall4 binding sites in the mouse ES-cell line W4. This cell line was chosen because it was previously used to generate a conditional Sall4 knockout ES-cell line (9). The majority of transcription factor binding sites in humans are known to occur within ≈1–2 kb of the transcription start site (15). Thus, promoter tiling arrays (NimbleGen, build MM8) spanning 2.5 kb of promoter regions (2 kb upstream and 500 bp downstream from the transcription start site) were selected for hybridization to chromatin-immunoprecipitated DNA obtained by using an affinity-purified anti-Sall4 antibody (16).

Successful ChIP assays critically depend on the specificity of the antibody used. Therefore, we rigorously characterized the antibody used in these immunoprecipitation assays. First, western blot analysis was used to compare the Sall4 antibody preparation with a commercially available anti-HA antibody to demonstrate specificity for either WT Sall4 or a Sall4-HA fusion protein. Initially, in mouse fibroblast cells transfected with Sall4-HA, we were able to detect the fusion protein by using an anti-HA antibody, whereas in untransfected fibroblast cells, no Sall4 band was detected [supporting information (SI) Fig. S1A, Lanes 0 and 1]. Although no expression was observed in fibroblasts, experiments in W4 ES cells were able to detect expression of endogenous Sall4 [Lanes 2 and 3 (14)]. The endogenous band observed in ES cells was also successfully absorbed (Lanes 4 and 5).

Next we sought to determine whether our antibody was applicable in ChIP experiments. ChIP-PCR of DNA fragments obtained by using the anti-Sall4 antibody was able to detect enrichment of the peaks identified by the ChIP-on-chip assay. By using heterozygous Sall4 ES cells overexpressing Sall4-HA,

19756–19761 | PNAS | December 16, 2008 | vol. 105 | no. 50 | www.pnas.org/cgi/doi/10.1073/pnas.0809321105

immunoprecipitation of the HA-tag identified 88% (23:26) of the genes identified by the anti-Sall4 antibody (Fig. S1B). This suggests that our anti-Sall4 antibody is both sensitive and specific for the Sall4 protein when used in immunoprecipitation. We have also used this antibody for immunohistochemistry to detect Sall4 protein in different tissue samples (Fig. S1C) and for flow cytometry to identify cell populations corresponding to leukemic blasts in patient bone marrow samples that uniquely express Sall4 [Fig. S1D (16)].

Following binding site determination by NimbleGen, the ChIP-on-chip duplicate assays identified roughly 5,200 Sall4-bound genes in array 1, and 4,400 Sall4-bound genes in array 2. The overall false discovery rate was < 0.20. Comparison of the data from arrays 1 and 2 showed that 3,223 gene promoters gave positive hybridization signals on both arrays. Of the 1,000 genes exhibiting the most intense hybridization signals on array 2, 947 were also positive on array 1. When only the top 200 genes were considered, the concordance rate was 98.5%. In contrast, when the 800 lowest intensity signals on each array were analyzed, only 37.2% (array 1) and 52.6% (array 2) of the signals were concordant in both assays. Therefore, we selected only the 3,223 genes that were positive on both arrays for further analysis.

We next validated a subset of the putative Sall4 binding sites by using a ChIP-PCR strategy. A total of 55 genes were interrogated. Primer pairs were prepared for a randomly selected set of hybridization-positive genes with varying degrees of signal intensity. If a selected gene did not initially produce an amplicon level above background, a new primer set was designed 200–300 bases distal to the first primer site, and the quantitative real-time PCR (Q-RT-PCR) assay was repeated. In some cases, a third primer set was used before designating that gene to be a false positive. In addition, ChIP-PCR using primers located adjacent to true positive loci was shown to give negative amplification results, further demonstrating the specificity of Sall4 binding-site identification. Based on the Q-RT-PCR data (52:55 positive), we concluded that ≈94.5% of the 3,223 genes common to both arrays are true positive SALL4 binding sites in the mouse ES cell line (Fig. S2).

The full list of the 3,223 Sall4-bound genes and their respective array hybridization data can be found within the supplemental data (Dataset S1) and on the gene expression omnibus (GEO) as accession number GSE11305. The number of promoter sequences binding Sall4 is quite high, but other transcription factor proteins, such as E2F1, have been reported to bind over 5,000 gene promoters (17). Myc has recently been reported to bind a similar number of genes in mouse ES cells (18).

We then sought to determine the distribution of Sall4 binding sites within the mapped regions of the genome by using DAVID (19). Analysis of over-represented annotations for promoter regions bound by Sall4 revealed significant representation of a broad variety of genes that may be important for stem cell functions (Fig. 1A). These include developmental genes and genes necessary for signal transduction and other regulatory processes. Further classification of the developmental genes revealed over-representation of genes associated with organ development, pattern specification, and brain development (Fig. 1B). Sall4 bound to promoter regions of 11 members of the Hox gene family, and 42 other homeobox or homeobox-like genes (Table S1). The binding of Sall4 to promoter regions of vital developmental genes and others that govern ES cell fate supports the phenotypic consequence of Sall4 reduction in ES cells. This also suggests that Sall4 plays a vital role in ES cells that may be similar to Oct4 and Nanog. This hypothesis is supported by 3 lines of evidence: (a) Sall4 is expressed very early in the developing embryo and is subsequently down-regulated in most differentiated tissues, (b) both over- and under-expression of Sall4 cause ES-cell differentiation, demonstrating the necessity for tight regulation of Sall4 expression, and (c) the finding that Sall4 modulates expression of both Oct4 and Nanog (9, 10). However, the magnitude of the Sall4 transcriptional network is quite striking and suggests that Sall4 may play a central role in embryonic development.

Fig. 1 (graphic not reproduced). Panel A, GO annotations: Cell Communication, Signal Transduction, Nuclear Protein, DNA Binding, Transcription Regulation, and Developmental Protein (each P < 0.001; x axis, gene number, 0 to 700). Panel B: Organ Development, Pattern Specification (P < 0.05), and Brain Development (P < 0.05) (x axis, gene number, 0 to 200). Panel C, pathway labels: NF-κB, Apoptosis, Wnt/β-catenin, PTEN, PDGF, TGF-β, and Sonic Hedgehog (x axis, gene number, 0 to 25). Example gene sets printed beside the bars: Map4k4, Tlr4, Tlr7, Traf2; Birc2, Birc4, Casp6, Tnf; Dkk1, Frat1, Tcf4, Wif1; Akt3, Casp3, FoxG1, FoxO1; Fos, Pitx2, Smad3, Tgfb1; Brca1, Casp6, Ccnd1, Ccnd2; Abl1, Abl2, Pdgfra, Elk1; Dyrk1a, Hhip, Prkaca, Stk36.

Fig. 1. Sall4 is a major regulator in mouse ES cells. (A) Sall4 bound to promoters that over-represent a broad classification of GO annotations. These included various potential regulatory and developmental annotations. Analysis was done with DAVID, and the x axis represents the gene number. (B) Further classification of developmentally important genes over-represented in the Sall4 binding pool. For the organ development annotation, the over-representation was insignificant (P < 0.074) but notable. The x axis represents the gene number. P values are inset following each bar and were calculated by using Fisher’s Exact Test based on over-representation in comparison to the genome. (C) Sall4 binds promoter regions belonging to a variety of pathways that have definitive roles in development, suggesting that Sall4 may control a wide variety of developmental processes. Listed genes are only representative of the Sall4-bound genes in each pathway. P values for this analysis are not presented because of the low number of genes within each pathway. Classification was done by using Ingenuity Pathway Analysis (www.ingenuity.com).

Sall4 Targets Important Signals That Control ES-Cell Differentiation and Lineage Specification. Numerous signaling pathways play important roles in maintaining pluripotency during embryogenesis. For example, the Wnt/β-catenin pathway has important roles in embryogenesis and cancer (20–23). TGF-β signaling is necessary to maintain ES cell pluripotency, and PTEN signaling plays important roles in the maintenance of hematopoietic stem cell self-renewal. Fig. 1C shows the number of genes bound by Sall4 within several developmentally important pathways and examples of the genes bound within each pathway. This suggests that Sall4 may play a broad role in regulation of ES-cell pluripotency through interactions with key signaling pathways.

Magnitude of the Sall4 Transcriptional Network in ES Cells. Recently, ChIP-on-chip studies have been performed on the gatekeeper

Fig. 2. Coimmunoprecipitation and co-occupancy of Sall4, Oct4, and Nanog. (A) Transient transfection of W4 ES cells with a Sall4-HA construct exhibited protein expression detected by both anti-HA and anti-Sall4 antibodies (the latter data not shown) in the cell extract (left). Oct4 and Nanog are detected by using respective antibodies in the whole ES cell extract (input). Immunoprecipitation of Sall4-HA with an anti-HA antibody revealed both Oct4 and Nanog bands, whereas immunoprecipitation with an IgG antibody detected neither protein. (B) Venn diagram showing the overlapping target genes of Sall4, Oct4, and Nanog as determined by ChIP-Chip and ChIP-PET experiments. These complexes may function in the regulation of stem cell pluripotency.

Fig. 3. Decreased expression of iPS genes in Sall4+/− ES cells. Ectopic expression of 4 key transcription factors, Oct4, Sox2, c-Myc, and Klf4, produces iPS cells. (A) Sall4 binds to promoter regions of Oct4, Sox2, c-Myc, and Klf4 as shown by ChIP-PCR. (B) Following adenovirus induced removal of 1 Sall4 allele, expression of all 4 transcription factors is decreased as measured by Q-RT-PCR. The Sall4/Gapdh ratio in control cells was set at 1. The values are the mean of triplicate reactions, and the bars indicate SD. Axes (graphic not reproduced): (A) fold enrichment relative to control (log2) for c-Myc, Klf4, Sox2, and Oct4; (B) relative expression of Oct4, Sox2, c-Myc, Klf4, and Sall4 in Sall4+/− and WT ES cells.
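The co-occupancy percentages summarized in Fig. 2B can be checked against the promoter counts reported in the text. The Python sketch below is illustrative only: the function name `fraction_shared` and the constant names are assumptions introduced here, and the counts are those stated for Oct4 and Nanog (18) and for Sall4 in this study.

```python
def fraction_shared(shared: int, bound_by_factor: int) -> float:
    """Fraction of one factor's bound genes that are also bound by Sall4."""
    if bound_by_factor <= 0:
        raise ValueError("bound_by_factor must be positive")
    if not 0 <= shared <= bound_by_factor:
        raise ValueError("shared must lie between 0 and bound_by_factor")
    return shared / bound_by_factor


# Counts reported in the text (constant names are illustrative, not from the paper).
SALL4_BOUND = 3223        # promoters positive on both Sall4 arrays
OCT4_BOUND = 783          # Oct4 promoter binding sites
NANOG_BOUND = 1284        # Nanog promoter binding sites
SALL4_OCT4_SHARED = 92    # genes co-occupied by Sall4 and Oct4
SALL4_NANOG_SHARED = 198  # genes co-occupied by Sall4 and Nanog

oct4_fraction = fraction_shared(SALL4_OCT4_SHARED, OCT4_BOUND)
nanog_fraction = fraction_shared(SALL4_NANOG_SHARED, NANOG_BOUND)
sall4_vs_oct4 = SALL4_BOUND / OCT4_BOUND
sall4_vs_nanog = SALL4_BOUND / NANOG_BOUND

print(f"Oct4 targets shared with Sall4:  {oct4_fraction:.1%}")
print(f"Nanog targets shared with Sall4: {nanog_fraction:.1%}")
print(f"Sall4 vs Oct4 bound promoters:   {sall4_vs_oct4:.1f}x")
print(f"Sall4 vs Nanog bound promoters:  {sall4_vs_nanog:.1f}x")
```

Rounded to whole percentages, the two fractions come to 12% and 15%, in line with the values given in the text, and the Sall4 promoter count is roughly 4.1 times that of Oct4 and 2.5 times that of Nanog.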

genes, Oct4 and Nanog. This enabled us to compare genes bound by Oct4 and Nanog with those bound by Sall4. Interestingly, ChIP-on-chip assays showed that Oct4 had 783 promoter binding sites, whereas Nanog had 1,284 binding sites within the mouse genome (18). The ChIP-on-chip data presented here revealed that Sall4 bound ≈3,200 gene promoters. Given the similar expression patterns of the transcription factors Sall4, Oct4, and Nanog, this is remarkable. These observations suggest that Sall4 may play a similar, but broader role in regulating ES-cell properties. However, the roles of each in vivo are not completely understood.

Interaction and Co-Occupation of Sall4 with Oct4 and Nanog in ES Cells. Wu et al. elegantly demonstrated that Sall4 and Nanog form a regulatory complex in ES cells (10). Liang et al. (24) recently showed that Sall4 forms a complex (or complexes) with both Oct4 and Nanog by using mass spectrometry and immunoprecipitation of endogenous proteins. We have confirmed these observations by immunoprecipitation experiments using ES cells transiently transfected with Sall4-HA. Western blotting detected an overexpression of Sall4-HA protein by both anti-HA (Fig. 2A) and anti-Sall4 antibodies (data not shown). Immunoprecipitation with an anti-HA antibody produced a unique endogenous 45-kDa protein, Oct4, in the precipitate. By contrast, an IgG-negative control failed to generate the Oct4 band in the same extract, indicating a specific Sall4–Oct4 interaction. By using the same method, the Sall4–Nanog interaction was confirmed in the same anti-HA-pulldown precipitate (Fig. 2A).

Because these transcription factors physically interact, one would expect them to colocalize to some of the same gene promoters (25). A gene bound by any two of these proteins we will refer to as "co-occupied". However, Oct4 and Sall4 co-occupied only 92 common genes representing 12% of genes bound by Oct4. Similarly, Sall4 binding was identified at 198 Nanog target genes, representing only 15% of Nanog’s bound genes (Fig. 2B). This suggests that Sall4–Oct4 and Sall4–Nanog interactions may form functional complexes only at select promoter regions (9, 10). There are only 45 genes that are co-occupied by Oct4, Sall4, and Nanog (Table S2). However, this group includes developmentally important genes, such as Dkk1, Msx2, Fbxl10, and Epc1.

Sall4+/− ES Cells Exhibit Decreased Expression of iPS Genes. Recent studies have shown that ectopic expression of Oct4, Sox2, Klf4, and c-Myc is capable of reprogramming somatic cells to confer a pluripotent state, termed induced pluripotent stem (iPS) cells (26). It has previously been demonstrated that Sall4 binds to Oct4 and regulates its expression (9). We show here that the Sall4 protein binds to the promoter regions of Oct4, c-Myc, Sox2, and Klf4 through ChIP-PCR (Fig. 3A).

To determine the relationship between binding and Sall4 function, we used Q-RT-PCR to measure mRNA levels from Sall4+/− ES cells. Expression levels of all 4 transcription factors decreased in Sall4+/− ES cells indicating that Sall4 plays an activating role on these genes (Fig. 3B). Because Sall4 is not expressed in the majority of differentiated tissues including fibroblasts, this suggests that exogenous expression of Sall4 may play a role in reprogramming somatic cells to confer a pluripotent state. This hypothesis has recently been supported by others (27).

Sall4 Binds to Genes Associated with H3K27 Methylation Domains as well as to Target Genes of PRC1 and PRC2. Numerous studies have implicated epigenetic modifications as a means for regulating stem cell pluripotency (28–30). We have previously shown that Bmi-1, a polycomb group member, is a downstream target of SALL4 (31), thus, we focused on other polycomb-associated genes for analysis. Although various covalent modifications can

influence chromatin remodeling, here we investigate the combined roles of Sall4, methylation of histone 3 on lysine 27 (H3K27), and PRCs. PRCs are key modulators of stem-cell pluripotency and consist of 2 distinct groups (11, 12). PRC1 consists of > 10 proteins, including Bmi1, Rnf2, Phc1, and the HPC proteins, whereas PRC2 contains Ezh2, Eed, Suz12, and RbAp46/48 (32). Representative ChIP-on-chip assays have been performed for PRC1 genes Rnf2 and Phc1, and PRC2 group members Suz12 and Eed (11). PRCs maintain ES-cell pluripotency by facilitating H3K27 methylation, a modification that represses gene expression (11). The majority of H3K27 trimethylation and PRC binding sites are within ≈1 kb of the transcription start site, and both are frequently present on gene promoters. However, not all H3K27 methylated domains are associated with PRCs. Thus, Sall4 may bind and potentially regulate expression of a subset of genes associated with H3K27 methylated domains.

Binding of Sall4 to Genes Associated with Histone Methylations (GAHMs) occurred at 17% (422:2557) of previously identified H3K27 methylation domains within 1 kb of annotated transcription start sites of mouse ES cells. We expect PRCs to be associated with some of the Sall4-bound GAHMs. Sall4, PRC1, and PRC2 co-occupied 160 GAHMs (Fig. 4A). There were also GAHMs co-occupied by Sall4 and one of the polycomb proteins, with 29 and 69 bound by PRC1 and PRC2, respectively. Interestingly, Sall4 bound 164 GAHMs that were not occupied by either PRC1 or PRC2. To determine the function of this subset of genes, we categorized them based on overrepresentation by using DAVID. As expected, GAHM-H3K27 and PRCs had extremely significant roles in development (P < 0.001). Surprisingly, GAHM-H3K27 and Sall4 also had notable roles in development (P < 0.07; Fig. 4B). This reveals a system in which regulation of GAHM-H3K27 may be controlled by dynamic involvement of both Sall4 and PRCs.

Fig. 4. The role of Sall4 in H3K27 methylation regulation. (A) Sall4 binds to 422 GAHM that are methylated at H3K27. Some of the GAHMs are also bound by PRC1 (Rnf2, Phc1) and PRC2 (Suz12, Eed). One hundred sixty of these genes are associated with Sall4, PRC1, and PRC2 (inner orange circle) with 69 PRC2-bound genes and 29 PRC1-bound genes also bound by Sall4 (outer orange circle). However, Sall4 binds 164 of GAHM-H3K27 that do not bind the polycomb group proteins (outer gray circle). (B) Two hundred fifty-eight genes are bound by one or more polycomb group protein(s) and Sall4. Of these, 81 have developmental functions that display significantly over-represented (P < 0.001) binding to genes associated with various developmental processes (orange). Binding of Sall4 to GAHM-H3K27, but neither PRC1 nor PRC2 occurs for 164 genes, with 23 genes having developmental processes that are notable but not statistically significant (P < 0.07; white). This suggests that two or more mechanisms may interact to regulate cell fate through histone methylation H3K27. The x axis represents the gene number. These GO annotations are not mutually exclusive, and P values were determined by using Fisher’s Exact Test. Panel B bars (graphic not reproduced): Development and Morphogenesis, each shown for Sall4-PRC bound (P < 0.001) and Sall4 bound (P < 0.07) genes; x axis, 0 to 90.

Many Sall4 Targets Harbor Bivalent Domains. It has been reported that dual epigenetic markers, coined "bivalent domains", consisting of methylations at H3K27 and at histone 3 on lysine 4 (H3K4), exist for a large set of developmental genes within Highly Conserved Noncoding Elements (HCNEs) (33). ES-cell pluripotency is hypothesized to be maintained, in part, through a balance of H3K4 gene activation and H3K27 gene repression at these bivalent domains (33). To explore the role that Sall4 may play in this epigenetic mechanism, Sall4 binding sites were compared with bivalent domains identified within HCNEs. We found that Sall4 co-occupied 27% (37:135) of Genes Associated with Bivalent Histone Methylations (GABHMs), including 11 Hox gene family members (Fig. 5, Table S3). In contrast, Oct4 and Nanog each bind only 12% of GABHMs (Fig. S3). Surprisingly, there are no genes that are bound by Sall4, Oct4, and Nanog. Only 11 of the GABHMs are co-occupied by any 2 proteins, suggesting that Sall4, Oct4, and Nanog may play independent roles in methylation regulation. Further, these 3 transcription factors account for binding to only 39% of identified bivalent domains. It remains to be determined what other genes emerge as further epigenetic regulators.

Fig. 5. Sall4 target genes are associated with bivalent domains. Venn diagram displaying the GAHMs bound by Sall4 within HCNEs. Notably, the majority of Sall4-bound genes within characterized HCNEs are marked by bivalent histone methylation domains including a cluster of homeobox genes (see Table S3).

Discussion

We have shown that Sall4 binds ≈3,200 genes within their promoter regions in mouse ES cells. An analogous ChIP-chip assay performed by using chromatin-precipitated DNA obtained by using Oct4 and Nanog antibodies revealed 783 and 1,284 bound genes, respectively. Given the similar gene expression patterns of Sall4 and Oct4, the magnitude of Sall4 binding is remarkable. Although extensive functional studies need to be done on Sall4, Oct4, and Nanog, it appears that the role of Sall4 may be significant in determining stem cell fate.

Sall4, Oct4, and Nanog have been shown to form heteromeric protein complexes that may regulate ES-cell gene expression in complex ways. Transient combinatorial binding of Sall4, Oct4, and Nanog may determine cell fate with different combinations of these proteins controlling different aspects of pluripotency. Although trimeric protein complexes may exist, there are relatively few genes bound by all 3 transcription factors. The Sall4–Oct4 complex binds genes that have statistically significant roles in some developmental processes associated with stem cell activities (P < 0.05). In contrast, although Sall4–Nanog complexes bind genes that have similar developmental and transcriptional functions, this dimeric protein combination also binds genes important for organ development and pattern specification at statistically significant frequencies (P < 0.05; Fig. S4 and Table S4). These data suggest that the binding of these 3 transcription factors at select promoter regions may dynamically control transcription required for the stem cell state, although it is likely that many other proteins also play important roles.

Another interesting observation is that down-regulation of Sall4 also causes down-regulation of Oct4, Sox2, Klf4, and c-Myc, 4 genes that induce reprogramming of somatic cells to induced pluripotent stem cells. This suggests a mechanism by which Sall4 could be a key regulator for the reprogramming process. This interpretation was recently supported by Wong et al., who used cell fusion to demonstrate that Sall4 can enhance somatic cell reprogramming (27). Nevertheless, the importance of Sall4 in somatic cell reprogramming and the role it plays in regulating this gene quartet remain to be determined.

Interestingly, recent work by Dr. Austin Smith’s group has suggested that inhibiting a cell’s intrinsic signaling pathways is sufficient to prevent differentiation (23). These pathways include those that proceed through mitogen-activated protein kinases (ERK1/2) and glycogen synthase kinases (GSK3). Work from our lab and from others suggests that Sall4 may interact with Stat3, and preliminary data indicate that Stat3 is an upstream regulator of Sall4. This would implicate Sall4 in this complex regulatory loop. How Sall4 interacts with other microenvironmental signals is unknown at this time.

Important questions remain to be answered regarding evidence for an Oct4–Sall4–Nanog complex and the regulatory role that it may play in ES cell pluripotency maintenance. Further, evidence presented here indicates that Sall4 may play an important role in regulating chromatin remodeling. This connects the independent processes of transcriptional regulation and epigenetic regulation and may provide insights into an integrated control process involved in determining stem cell fate.

Materials and Methods

Cell Culture. Embryonic stem cells from the W4 mouse cell line (Gene Targeting Core Facility, University of Iowa) were cultured with irradiated mouse embryonic fibroblast feeders or under feeder-free conditions as described previously (9). For W4 clone EA231, Sall4+/− ES cells were cultured with the antibiotic G418 at a concentration of 125 μg/ml.

ChIP-on-chip Assays. A complete ChIP-on-chip assay protocol was provided by NimbleGen Systems, Inc. In brief, W4 ES cells were cross-linked with formaldehyde and lysed, and then the DNA was sheared by sonication. A sonication regime consisting of 8 pulses lasting 20 seconds each was used, with 90 seconds on ice between pulses. The Misonix Sonicator 3000 was used at power level 3.5 for the sonication procedure. Following immunoprecipitation with an affinity-purified anti-Sall4 antibody (16), ChIP-purified DNA was blunt-ended, ligated to linkers, and subjected to low-cycle PCR amplification. Resultant ChIP-DNA was then hybridized to duplicate promoter tiling arrays (RefSeq arrays, build MM8) each containing 19,457 promoter annotations produced by NimbleGen. The mouse promoter array is a single-array design containing 2.5 kb of each RefSeq promoter region. The promoter region is covered by 50–75 mer probes at roughly 100-bp spacing dependent on the sequence composition of the region. The arrays were hybridized and the data extracted according to NimbleGen standard procedures. Data extraction was done by using NimbleScan, which searches for 4 or more probes above a specified cutoff value ranging from 90–15% using a 500-bp sliding window. The cutoff value is a percentage of the hypothetical maximum determined by using the mean plus 6 standard deviations and is decreased in 1% increments from 90–15%. The data are then randomized 20 times to evaluate the possibility of false positives, and each peak is assigned a false discovery rate based on this randomization. Confirmation of the predicted binding sites was performed by using ChIP-PCR analysis of the amplicons applied to the arrays (Fig. S1A). Negative control primers were designed adjacent to Sall4-bound peaks (Fig. S2B).

Coimmunoprecipitation and Western Blotting. For Oct4–Sall4 and Nanog–Sall4 interactions, plasmid pcDNA3/Sall4-HA was transfected into W4 ES cells to express the Sall4-HA protein by using Lipofectamine 2000 reagent (Invitrogen). Coimmunoprecipitations were performed following the Catch and Release v2.0 High Throughput Immunoprecipitation Assay Kit (Upstate) as recommended. For western blots, the membrane was incubated with Oct-3/4 (H-134), Nanog (M-149) (both from Santa Cruz Biotechnology, Inc), or Sall4 antibodies at a 1:300 dilution at 4 °C overnight. Detection was done by using SuperSignal West Pico solutions (Pierce).

ACKNOWLEDGMENTS. This work was supported in part by National Institutes of Health Grants R01HL087948, NIH R21CA131522, and P20 RR016464 (to Y.M.), The Leukemia and Lymphoma Society Special Fellow Award (to J.Y.), and Harvard Stem Cell Institute (L.C.).
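The peak search described under ChIP-on-chip Assays (4 or more probes above a cutoff within a 500-bp sliding window, with the cutoff lowered from 90% to 15% of a hypothetical maximum equal to the mean plus 6 standard deviations) can be illustrated with a short Python sketch. This is a simplified reading of that description, not NimbleScan itself: the function name `find_peaks`, the use of the population standard deviation, the "greater than or equal to" comparison, the rule that a probe is assigned to at most one peak, and the toy data are all assumptions made here. The randomization step used to assign false discovery rates is not included.

```python
from statistics import mean, pstdev


def find_peaks(positions, ratios, window_bp=500, min_probes=4,
               high_pct=90, low_pct=15):
    """Return (start, end, cutoff_pct) tuples for candidate peaks.

    A peak is reported when at least `min_probes` probes at or above the
    current cutoff lie within `window_bp` of one another. The cutoff starts
    at `high_pct` percent of (mean + 6 SD) and drops in 1% steps.
    """
    hypothetical_max = mean(ratios) + 6 * pstdev(ratios)
    probes = sorted(zip(positions, ratios))
    claimed = set()  # probe indices already assigned to a peak
    peaks = []
    for pct in range(high_pct, low_pct - 1, -1):
        cutoff = hypothetical_max * pct / 100.0
        above = [i for i, (_, ratio) in enumerate(probes)
                 if ratio >= cutoff and i not in claimed]
        left = 0
        for right in range(len(above)):
            # Shrink the window until it spans at most window_bp.
            while probes[above[right]][0] - probes[above[left]][0] > window_bp:
                left += 1
            if right - left + 1 >= min_probes:
                hit = above[left:right + 1]
                peaks.append((probes[hit[0]][0], probes[hit[-1]][0], pct))
                claimed.update(hit)
                left = right + 1  # start a fresh window after this peak
    return peaks


# Toy promoter: 20 probes at 100-bp spacing, 4 adjacent enriched probes.
toy_positions = [100 * i for i in range(20)]
toy_ratios = [0.1] * 20
for i in (8, 9, 10, 11):
    toy_ratios[i] = 3.0

toy_peaks = find_peaks(toy_positions, toy_ratios)
print(toy_peaks)
```

In this toy example the four enriched probes span 300 bp, so they fall inside a single window and are reported once, at the first cutoff percentage low enough for all four to pass.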

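The over-representation P values cited for Figs. 1 and 4 were calculated with Fisher’s Exact Test in DAVID. As a hedged illustration of the underlying calculation, the Python sketch below computes a one-sided P value from the hypergeometric upper tail. The function name `enrichment_p` and the annotation counts in the example are hypothetical; only the array size (19,457 promoter annotations) and the bound-gene count (3,223) come from the text. DAVID reports a modified Fisher’s exact P value (the EASE score), so its output need not match this sketch exactly.

```python
from math import comb


def enrichment_p(hits_in_list, list_size, hits_in_background, background_size):
    """One-sided Fisher's exact P value for over-representation.

    Probability of drawing at least `hits_in_list` annotated genes when
    `list_size` genes are sampled without replacement from a background of
    `background_size` genes, `hits_in_background` of which are annotated.
    """
    max_hits = min(list_size, hits_in_background)
    total = comb(background_size, list_size)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, list_size - k)
        for k in range(hits_in_list, max_hits + 1)
    )
    return tail / total


# Background and list sizes from the text; annotation counts are hypothetical.
ARRAY_PROMOTERS = 19457      # promoter annotations on each array
SALL4_BOUND = 3223           # genes positive on both arrays
ANNOTATED_ON_ARRAY = 200     # hypothetical size of a GO category
ANNOTATED_AND_BOUND = 50     # hypothetical number of those bound by Sall4

p_value = enrichment_p(ANNOTATED_AND_BOUND, SALL4_BOUND,
                       ANNOTATED_ON_ARRAY, ARRAY_PROMOTERS)
print(f"one-sided P = {p_value:.3g}")
```

Exact integer arithmetic is used so that the tail sum does not lose precision for large gene counts; the single division at the end converts the result to a floating-point P value.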
www.pnas.org/cgi/doi/10.1073/pnas.0809321105

Yang et al. PNAS | December 16, 2008 | vol. 105 | no. 50 | 19761