Yeo and Ng Genome Medicine 2011, 3:68 http://genomemedicine.com/content/3/10/68

REVIEW Transcriptomic analysis of pluripotent stem cells: insights into health and disease Jia-Chi Yeo1,2 and Huck-Hui Ng1,2,3,4,5,*

Abstract types, termed ‘pluripotency’, allows researchers to study early mammalian development in an artificial setting and Embryonic stem cells (ESCs) and induced pluripotent offers opportunities for regenerative medicine, whereby stem cells (iPSCs) hold tremendous clinical potential ESCs could generate clinically relevant cell types for because of their ability to self-renew, and to tissue repair. However, this same malleability of ESCs di erentiate into all cell types of the body. This unique also renders it a challenge to obtain in vitro differentiation capacity of ESCs and iPSCs to form all cell lineages of ESCs to specific cell types at high efficacy. erefore, is termed pluripotency. While ESCs and iPSCs are harnessing the full potential of ESCs requires an in-depth pluripotent and remarkably similar in appearance, understanding of the factors and mechanisms regulating whether iPSCs truly resemble ESCs at the molecular ESC pluripotency and cell lineage decisions. level is still being debated. Further research is therefore Early studies on ESCs led to the discovery of the core needed to resolve this issue before iPSCs may be safely pluripotency factors Oct4, and Nanog [1], and, applied in humans for cell therapy or regenerative increasingly, the use of genome-level screening assays has medicine. Nevertheless, the use of iPSCs as an in vitro revealed new insights by uncovering additional trans- human genetic disease model has been useful in cription factors, transcriptional cofactors and studying the molecular pathology of complex genetic remodeling complexes involved in the maintenance of diseases, as well as facilitating genetic or drug screens. pluripotency [1]. e study of ESC transcriptional regu- Here, we review recent progress in transcriptomic lation is also useful in the understanding of human approaches in the study of ESCs and iPSCs, and discuss diseases. ESCs, for instance, are known to share certain how deregulation of these pathways may be involved cellular and molecular signatures similar to those of in the development of disease. Finally, we address the cancer cells [2], and deregulation of ESC-associated importance of these advances for developing new trans criptional regulators has been implicated in many therapeutics, and the future challenges facing the human developmental diseases. clinical application of ESCs and iPSCs. Despite the promising potential, the use of human Keywords Embryonic stem cells, gene expression, ESCs (hESCs) in clinical applications has been slow induced pluripotent stem cells, pluripotency, because of ethical, immunological and tumorigenicity regenerative medicine, therapy, transcriptional concerns [3]. ese ethical and immunogenicity issues regulation, transcriptomics. were seemingly overcome by the creation of induced pluripotent stem cells (iPSCs), whereby exogenous expres sion of Oct4, Sox2, and c- in differentiated cells could revert them to pluripotency [4]. However, the Stem cell transcriptomics and transcriptional question of whether these iPSCs truly resemble ESCs is networks still actively debated and remains unresolved [5]. Never- Embryonic stem cells (ESCs) have the unique ability to theless, the application of iPSCs as an in vitro human self-renew and differentiate into cells of all three germ genetic disease model has been successful in revealing layers of the body. is capacity to form all adult cell novel molecular disease pathologies, as well as facilitating genetic or drug screenings [6]. In this review, we describe recent advances in under- *Correspondence: [email protected] 1Gene Regulation Laboratory, Genome Institute of Singapore, 60 Biopolis Street, standing the ESC and iPSC transcriptional network, and Genome, Singapore 138672 also discuss how deregulation of ESC pathways is Full list of author information is available at the end of the article implicated in human diseases. Finally, we address how the knowledge gained through transcriptional studies of © 2010 BioMed Central Ltd © 2011 BioMed Central Ltd ESCs and iPSCs has impacted translational medicine. Yeo and Ng Genome Medicine 2011, 3:68 Page 2 of 12 http://genomemedicine.com/content/3/10/68

Transcriptomic approaches for studying stem cells Table 1. Transcriptomic approaches for studying stem cells The transcriptome is the universe of expressed transcripts Objective Method Reference within a cell at a particular state [7]; and understanding DNA sequencing NGS [65] the ESC transcriptome is key towards appreciating the mRNA expression analysis Microarray [58] mechanism behind the genetic regulation of pluripotency RNA-seq [93] and differentiation. The methods used to study gene expression patterns can be classified into two groups: (1) miRNA expression analysis Microarray [11] those using hybridization-based approaches, and (2) RNA-seq [18] those using sequencing-based approaches (Table 1). lncRNA expression analysis Microarray [12,13] For hybridization-based methods, the commonly used Identification of alternative splicing isoforms Microarray [22,94] ‘DNA microarray’ technique relies on hybridization RNA-seq [95] between expressed transcripts and microarray printed Mapping of protein-DNA binding ChIP-chip [9,10,24] oligo­nucleotide (oligo) probes from annotated gene regions [7]. In addition to allowing the identification of ChIP-PET [23] highly expressed genes, microarrays also enable the study ChIP-seq [15] of gene expression changes under various conditions. DNA methylation profiling BS-seq [68] However, microarrays have their limitations, whereby MethylC-seq [68] prior knowledge of genomic sequences is required, and DIP-seq [16] cross-hybridization of oligo probes may lead to false Mapping of long-range chromatin interactions ChIA-PET [17] identification [7]. Subsequently, later versions of micro­ arrays were modified to include exon-spanning probes 3C [29] for alternative-spliced isoforms, as well as ‘tiling arrays’, Identification of RNA-protein interactions RIP-seq [19] which comprise oligo probes spanning large genomic RIP and regions to allow for the accurate mapping of gene trans­ direct RNA [14] quantification cripts [7,8]. Indeed, conventional microarrays and tiling arrays have been instrumental in advancing our under­ BS-seq, bisulfite sequencing; ChIA-PET, chromatin interaction analysis with paired-end tag sequencing; ChIP-chip, chromatin immunoprecipitation standing of ESC transcriptional regulation (Table 1) on chip; ChIP-PET, chromatin immunoprecipitation with paired-end tag through the mapping of ESC-associated - sequencing; ChIP-seq, chromatin immunoprecipitation and sequencing; DIP- seq, DNA immunoprecipitation and sequencing; MethylC-seq, methylcytosine factor binding sites (chromatin immunoprecipitation sequencing; NGS, next-generation sequencing; RIP, RNA-binding protein (ChIP)-chip) [9,10], identification of microRNA (miRNA) immunoprecipitation; RIP-seq, RNA-binding protein immunoprecipitation and regulation in ESCs [11], as well as the identification of sequencing; RNA-seq, RNA sequencing; 3C, conformation capture. long non-coding RNA (lncRNA) [12] and long intergenic non-coding RNA (lincRNA) [13,14]. Transcriptomics has been instrumental in the study of Sequence-based transcriptomic analysis on the other alternative splicing events. It has been suggested that hand involves direct sequencing of the cDNA. Initially, around 95% of all multi-exon human genes undergo Sanger sequencing techniques were used to sequence alternative splicing to generate different protein variants gene transcripts, but these methods were considered for an assortment of cellular processes [20], and that expensive and low throughput [7]. However, with the alter­native splicing contributes to higher eukaryotic development of next-generation sequencing (NGS), such complexity [21]. In mouse ESCs (mESCs) undergoing as the 454, Illumina and SOLiD platforms, it is now embryoid body formation, exon-spanning microarrays possible to perform affordable and rapid sequencing of have identified possible alternative splicing events in massive genomic information [8]. Importantly, NGS when genes associated with pluripotency, lineage specification coupled with transcriptome sequencing (RNA-seq) offers and cell-cycle regulation [22]. More interestingly, it was high-resolution mapping and high-throughput transcrip­ found that alternative splicing of the Serca2b gene during tome data, revealing new insights into transcriptional ESC differentiation resulted in a shorter Serca2a isoform events such as alternative splicing, cancer fusion-genes with missing miR-200 targeting sites in its 3’-UTR. Given and non-coding RNAs (ncRNAs). This versatility of NGS that miR-200 is highly expressed in cardiac lineages, and for ESC research is evident through its various appli­ca­ that Serca2a protein is essential for cardiac function, the tions (Table 1), such as chromatin immunoprecipitation results suggest that during mESC differentiation some coupled to sequencing (ChIP-seq) [15], methylated DNA genes may utilize alternative splicing to bypass lineage- immunoprecipitation coupled to sequencing (DIP-seq) specific miRNA silencing [22]. With the largely [16], identification of long-range chromatin interactions [17], uncharacterized nature of alternative splicing in ESCs, miRNA profiling [18], and RNA-binding protein immuno­ and the availability of high-throughput sequencing tools, precipitation coupled to sequencing (RIP-seq) [19]. it would be of interest to further dissect these pathways. Yeo and Ng Genome Medicine 2011, 3:68 Page 3 of 12 http://genomemedicine.com/content/3/10/68

Esrrb ESC Klf2/4/5 transcription factors Nestin Sall4 Pax6 Ectoderm miR-9 Stat3 Signaling Tcf3 factors miR-124 Transcriptional regulatory core

HoxA1 Mediator Transcriptional cofactors Mesoderm T Oct4 Cohesin PcG miR-155 Sox2 Nanog Wdr5 Trithorax group proteins Foxa2 Rbbp5 Endoderm Gata6 miR-375 Jarid2 Polycomb Pcl2 group proteins

Cdx2 Trophectoderm miR-290 Eomes miRNA miR-302 network Lin28 Non-coding RNA Xist Non-coding RNA lincRNA network

Repressed Active

Figure 1. The embryonic stem cell transcriptional regulatory circuit. The embryonic stem cell (ESC) transcription factors Oct4, Sox2 and Nanog form an autoregulatory network by binding their own promoters as well as promoters of the other core members. These three core factors maintain an ESC gene expression profile by occupying: (1) actively transcribed genes, such as ESC-specific transcription factors; (2) signaling transcription factors; (3) chromatin modifiers; (4) ESC-associated microRNA (miRNA); and (5) other non-coding RNA, such as long intergenic non-coding RNA (lincRNA). Conversely, Oct4, Sox2 and Nanog, in concert with Polycomb group proteins (PcG), bind lineage-specific and non-coding RNA genes, such as Xist, to repress lineage gene expression and inhibit ESC differentiation.

Transcriptional networks controlling ESCs transcription factors using ChIP-based transcriptomics The core transcriptional regulatory network led to the discovery of transcription factors associating In ESCs, the undifferentiated state is maintained by the into an ‘Oct4’ or ‘Myc’ module [10,15]. core transcription factors Oct4, Sox2 and Nanog [1]. Early mapping studies revealed that Oct4, Sox2 and The expanded pluripotency network Nanog co-bind gene promoters of many mESC and hESC Apart from Oct4, Sox2 and Nanog, the Oct4 module also genes [23,24]. Importantly, the core transcription factors includes the downstream transcription factors of the LIF, were found to maintain pluripotency by: (1) activating BMP4 and Wnt signaling pathways: Stat3, Smad1 and other pluripotency factors, while simultaneously repress­ Tcf3 [15,25]. Indeed, Stat3, Smad1 and Tcf3 co-occupy ing lineage-specific genes via Polycomb group proteins; certain regulatory regions with Oct4, Sox2 and Nanog, and (2) activating their own gene expression, as well as thus establishing the pathway in which external signaling that of each other. Therefore, with this autoregulatory can affect ESC transcriptional regulation [15,25]. Mass and feed-forward system, Oct4, Sox2 and Nanog con­ spectrometry has also facilitated the study of protein- stitute the ESC core transcriptional network (Figure 1) protein interaction networks of core transcription factors [23,24]. Subsequent studies on additional ESC-related [26,27], revealing that Oct4 can interact with a diverse Yeo and Ng Genome Medicine 2011, 3:68 Page 4 of 12 http://genomemedicine.com/content/3/10/68

population of proteins, including transcriptional regula­ ability of Oct4, Sox2 and Nanog to regulate tors, chromatin-binding proteins and modifiers, protein- modifications expands our understanding of how the modifying factors, and chromatin assembly proteins. core transcriptional factors regulate chromatin structure Importantly, knockdown of Oct4 protein levels is known to ultimately promote a pluripotent state. to cause the loss of co-binding activity of other trans­ cription factors [15,27], suggesting that Oct4 serves as a Pluripotent regulation of non-coding platform for the binding of its interacting protein RNA partners onto their target genes. ncRNAs are a diverse group of transcripts, and are classi­ The Myc module consists of transcription factors such fied into two groups: (a) lncRNAs for sequences more as c-Myc, n-Myc, Zfx, and Rex1, and is associated than 200 nucleotides in length; and (b) short ncRNAs for with self-renewal and cellular metabolism [10,15]. Approxi­ transcripts of less than 200 bases [38]. mately one-third of all active genes in ESCs are bound by miRNAs that are about 22 nucleotides in length are both c-Myc and the core transcription factors [28]. How­ considered to be short ncRNAs. In ESCs, miRNA expres­ ever, unlike Oct4, Sox2 and Nanog, which can recruit sion is also regulated by the core transcription factors RNA polymerase II via coactivators such as the Mediator (Figure 1), whereby the promoters of miRNA genes, complex [29], c-Myc rather appears to control the trans­ which are preferentially expressed in ESCs, are bound by criptional­ pause release of RNA polymerase II, via Oct4, Sox2, Nanog and Tcf3 factors. Similarly, miRNA recruit­ment of a cyclin-dependent kinase, p-TEFb [28]. It genes involved in lineage specification were occupied by is therefore proposed that Oct4-Sox2-Nanog selects ESC core transcription factors in conjunction with Polycomb genes for expression by recruiting RNA polymerase II, group proteins, to exert transcriptional silencing [39]. while c-Myc serves to regulate gene expression efficiency Examples of these silenced miRNA genes include let-7, by releasing transcriptional pause [1]. This may thus which targets pluripotency factors Lin28 and Sall4 [11], account for the reason why overexpression of c-Myc is as well as miR-145, which is expressed during hESC able to improve the efficiency of iPSC generation, and differentiation to suppress the pluripotency factors how c-Myc could be oncogenic. In fact, the Myc module OCT4, SOX2 and KLF4 in hESCs [40]. rather than the Oct4 module in ESCs was recently found The lncRNA Xist, which performs a critical role in to be active in various cancers, and may serve as a useful X‑chromosome inactivation, is silenced by the core ESC tool in predicting cancer prognosis [9]. factors along intron 1 of the mESC Xist gene (Figure 1) Besides targeting transcription factors to regulate gene [41]. Similarly, ESC transcription factors also regulate the expression, Oct4 is also known to affect the ESC chroma­ expression of the Xist antisense gene Tsix [42,43]. tin landscape. Jarid2 [30-34] and Pcl2/Mtf2 [30,31,34-35] However, it was found that deletion of Xist intron 1 have been identified as components of the Polycomb containing the Oct4-binding sites in ESCs did not result Repressive Complex 2 (PRC2) in ESCs, and regulated by in Xist derepression [44]. Epiblast-derived stem cells and the core ESC transcription factors [10,15]. From these hESCs that express Oct4 are known to possess an inactive studies, Jarid2 is suggested to recruit PRC2 to its genomic X-chromosome [45], and interestingly, pre-X inactivation targets, and can also control PRC2 histone methyl­trans­ hESCs have been derived from human blastocysts cultured ferase activity [30-34]. The second protein Pcl2 shares a under hypoxic conditions [46]. Therefore, it is likely that subset of PRC2 targets in ESCs [34-35] and appears to the ESC transcriptional network indirectly regulates promote histone H3 27 trimethylation [35]. Knock­ X‑chromosome activation status via an intermediary down of Pcl2 promotes self-renewal and impairs differen­ effector. tiation, suggesting a repressive function of Pcl2 by sup­ Recently, lincRNAs have been demonstrated to both pres­sing the pluripotency-associated factors Tbx3, Klf4 maintain pluripotency and suppress lineage specification, and Foxd3 [35]. Oct4 has also been demonstrated to hence integrating into the molecular circuitry governing physically interact with Wdr5, a core member of the ESCs [14]. Pluripotency factors such as Oct4, Sox2, mam­malian Trithorax complex, and cooperate in the Nanog and c-Myc have also been found to co-localize at trans­criptional activation of self-renewal genes [36]. As lincRNA promoters, indicating that lincRNA expression Wdr5 is needed for histone H3 lysine 4 trimethylation is under the direct regulation of the ESC transcriptional (H3K4me3), Oct4 depletion notably caused a decrease in network. Interestingly, mESC lincRNAs have been found both Wdr5 binding and H3K4me3 levels at Oct4-Wdr5 to bind multiple ubiquitous chromatin complexes and co-bound promoters. This indicates that Oct4 may be RNA-binding proteins, leading to the proposal that responsible for directing Wdr5 to ESC genes and main­ lincRNAs function as ‘flexible scaffolds’ to recruit differ­ taining H3K4me3 open chromatin [36]. As chromatin ent protein complexes into larger units. By extension of structure and transcriptional activity can be altered via this concept, it is possible that the unique lincRNA addition or removal of histone modifications [37], the signature of each cell type may serve to bind protein Yeo and Ng Genome Medicine 2011, 3:68 Page 5 of 12 http://genomemedicine.com/content/3/10/68

complexes to create a cell-type-specific gene expression studies is needed to resolve this issue. Here, we summar­ profile. ize and present the key findings that address this topic. Initially, it was believed that hiPSCs were similar to Cellular reprogramming and iPSCs hESCs [52,57], but subsequent studies argued otherwise The importance of the transcriptional regulatory network as differential gene expression [58], as well as DNA in establishing ESC self-renewal and pluripotency was methylation patterns [59], could be distinguished between elegantly demonstrated by Takahashi and Yamanaka [4], hiPSCs and hESCs (Table 2). However, these differences whereby introduction of four transcription factors Oct4, were proposed to be a consequence of comparing cells of Sox2, Klf4 and c-Myc (OSKM) could revert differentiated different genetic origins [60], laboratory-to-laboratory cells back to pluripotency as iPSCs. iPSCs were later variation [61], and the iPSC passage number [62]. Later, demonstrated to satisfy the highest stringency test of hiPSCs were described to contain genomic abnormalities, pluripotency via tetraploid complementation to form including gene copy number variation [63,64], point viable ‘all-iPSC’ mice [47]. muta­tions [65] and chromosomal duplications [66] However, reprogramming is not restricted to the four (Table 2). However, whether these genomic instabilities OSKM factors only. Closely related family members of are inherent in hiPSCs only, or a consequence of culture- the classical reprogramming factors such as Klf2 and Klf5 induced mutations, as previously described in hESCs, is can replace Klf4, Sox1 can substitute for Sox2, and c-Myc still not certain [67]. Extended passages of iPSCs can be replaced by using N-myc and L-myc [48]. However, appeared to reduce such aberrant genomic abnormalities, Oct4 cannot be replaced by its close homologs Oct1 and possibly via growth outcompetition by healthy iPSCs Oct6 [48], but can be substituted using an unrelated [64], but this was contradicted by a separate study that orphan nuclear , Nr5a2, to form mouse iPSCs found that parental epigenetic signatures are retained in [49]. Similarly, another orphan , Esrrb, iPSCs even after extended passaging [68]. Indeed, this was demonstrated to replace Klf4 during iPSC generation ‘epigenetic memory’ phenomenon was also reported in [50]. Human iPSCs (hiPSCs), aside from the classical two earlier studies, whereby donor cell epigenetic memory OSKM factors [51], can also be generated using a led to an iPSC differentiation bias towards donor-cell- different cocktail of factors comprising OCT4, SOX2, related lineages [62,69]. The mechanism behind this NANOG and LIN28 [52]. Recently, the maternally residual donor cell memory found in iPSCs was attri­ expressed transcription factor Glis1 replaced c-Myc to buted to incomplete promoter DNA methylation [70]. generate both mouse iPSCs and hiPSCs [53]. Glis1 is Surprisingly, knockdown of incompletely repro­grammed­ highly expressed in unfertilized eggs and zygotes but not somatic genes was found to reduce hiPSC generation, in ESCs; thus, it remains to be determined if other suggesting that somatic memory genes may play an active maternally expressed genes could similarly reinitiate role in the reprogramming process [70].Differences in pluripotency. ncRNA expression were also found between iPSCs and While certain transcription factors may be replaced ESCs (Table 2). For instance, the aberrantly silenced with chemicals during the reprogramming process, they imprinted Dlk1-Dio3 gene in iPSCs results in the all still require at least one transcription factor [54]. differential expression of its encoded ncRNA Gtl2 and Recently, however, the creation of hiPSCs and mouse Rian, and 26 miRNAs, and consequent failure to generate iPSCs via miRNA without additional protein-encoding ‘all-iPSC’ mice [60]. Upregulation of lincRNAs specifi­ factors was reported [55,56]. By expressing the miR-302- cally in hiPSCs was also reported [13]. Expression of miR-367 clusters, iPSCs can be generated with two lincRNA-RoR with OSKM could also enhance iPSC orders of magnitude higher efficiency compared with formation by twofold, suggesting a critical function of conventional OSKM reprogramming [55]. Similarly, lincRNA in the reprogramming process [13]. iPSCs could be formed by transfecting miR-302, miR-200 As these reported variations between hESCs and and miR-369 into mouse adipose stromal cells, albeit at hiPSCs could be attributed to small sample sizes, a recent lower efficiency[56] . The ability of miRNAs to reprogram large-scale study by Bock et al. [71] profiled the global somatic cells is intriguing, and it would be of great transcription and DNA methylation patterns of 20 differ­ interest to determine the gene targets of these repro­ ent hESC lines and 12 hiPSC lines. Importantly, the study gramming miRNAs. revealed that hiPSCs and hESCs were largely similar, and that the observed hiPSC differences were similar to Expression profiling of ESCs and iPSCs normally occurring variation among hESCs. Additionally, The question of whether pluripotent iPSCs truly resemble Bock et al. established a scoring algorithm to predict ESCs is an actively debated and evolving field, with lineage and differentiation propensity of hiPSCs. As evidence arguing both for and against iPSC-ESC simi­ traditional methods of screening hiPSC quality rely on larity. As such, further research using better controlled time-consuming and low-throughput teratoma assays, Yeo and Ng Genome Medicine 2011, 3:68 Page 6 of 12 http://genomemedicine.com/content/3/10/68

Table 2. Transcriptomic comparisons between induced pluripotent stem cells and embryonic stem cells Characteristic Mouse iPSCs Human iPSCs mRNA expression Distinct from mESCs at lower passages, donor cell gene Distinct from hESCs at lower passages [58], with residual donor expression still present [62,69]; closely resemble mESCs at gene expression [70,96,97]; closely resemble hESCs at late late passages [62] passages [58,98] miRNA expression miRNA encoded within the imprinted Dlk1-Dio3 locus is Small number of differences reported [58,99], but variation aberrantly silenced [60] between hESCs and hiPSCs comparable to somatic and cancer cells [100] lncRNA expression Not determined Differences in lincRNA expression reported. lincRNA-RoR enhances reprogramming by twofold [13] DNA methylation status Distinct from mESCs at lower passages, donor cell DNA Differences in DNA methylation reported [59,68,70], but not in methylation pattern still present [62,69]; closely identical to all hiPSCs [71] mESCs at late passages [62] Genome status Not determined Possess gene copy number deletions and duplications [63-64], somatic coding mutations [65], and chromosomal duplications [66] hESC, human embryonic stem cell; hiPSC, human induced pluripotent stem cell; iPSC, induced pluripotent stem cell; lincRNA, long-intergenic non-coding RNA; lncRNA, long non-coding RNA; mESC, mouse embryonic stem cell; miRNA, microRNA. the hiPSC genetic scorecard offers researchers a quick abnormalities. The Mediator-cohesin complex, which assessment of the epigenetic and transcriptional status of occupies 60% of active mESC genes, is responsible for pluripotent cells. This may be especially useful for the regulating gene expression by physically linking gene rapid monitoring of cell-line quality during large-scale enhancers to promoters though chromatin loops [29]. production of iPSCs [71]. Notably, the binding pattern of Mediator-cohesin onto gene promoters differs among cell types, indicative of Deregulation of transcriptional networks in cell-type-specific gene regulation[29] . In hESCs, Mediator disease was also revealed to be important in the maintenance of Blastocyst-derived ESCs possess an innate ability for pluripotent stem cell identity during a genome-wide indefinite self-renewal, and can be considered a primary siRNA screen, suggesting an evolutionarily conserved untransformed cell line. Unlike primary cell cultures with role [75]. Given this important gene regulatory function limited in vitro lifespans, or immortalized/tumor-derived of the Mediator-cohesin complex in mESCs and hESCs, cell lines that do not mimic normal cell behavior, ESCs mutations in these proteins are associated with disorders thus offer a good model for studying cellular pathways. such as schizophrenia, and Opitz-Kaveggia and Lujan ESC transcriptomics have indeed advanced our under­ syndromes [29]. Interestingly, the Cornelia de Lange syn­ standing into the molecular mechanisms affecting certain drome, which causes mental retardation due to gene dys­ human diseases. regulation rather than chromosomal abnormalities, is For instance, it was previously reported that cancer associated with mutations in cohesin-loading factor cells possess an ESC-like transcriptional program, suggest­ Nipbl [29]. Therefore, it is proposed that such develop­ ing that ESC-associated genes may contribute to tumor mental syndromes may arise as a result of the failure to formation [72]. However, this expression signature was form appropriate enhancer-promoter interactions. shown to be a result of c-Myc, rather than from the core Mutations in core ESC transcription factor SOX2 and pluripotency factors (Table 3) [9]. As c-Myc somatic copy- the ATP-chromatin remodeler CHD7 result in develop­ number duplications are the most frequent in cancer mental defects such as SOX2 anophthalmia (congenital [73], the finding that c-Myc releases RNA poly­merase II absence of eyeballs) and CHARGE syndrome, respectively from transcriptional pause [28] offers new understanding­ [76]. Although a direct association between CHARGE into the transcriptional regulatory role of c-Myc in ESCs syndrome and ESCs is not known, mESC studies revealed and cancer cells. Another pluripotency-associated factor, that Chd7 co-localizes with core ESC factors and p300 Lin28, which suppresses the maturation of pro-differen­ protein at gene enhancers to modulate expression of tiation let-7 miRNA, is also highly expressed in poorly ESC-specific genes [77]. It is thus possible that CHARGE differentiated and low prognosis tumors [74]. Importantly, syndrome may arise due to CHD7 enhancer-mediated let-7 silences several , such as c-Myc, K-Ras, gene dysregulation. In neural stem cells, Chd7 is able to Hmga2 and the gene encoding cyclin-D1, suggesting that bind with Sox2 at the Jag1, Gli3 and Mycn genes, which Lin28 deregulation may promote oncogenesis [74]. are mutated in the developmental disorders Alagille, Aside from cancer, mutations in ESC-associated Pallister-Hall and Feingold syndromes [78]. Similarly, transcriptional regulators can cause developmental Chd7 has been described to interact with the PBAF Yeo and Ng Genome Medicine 2011, 3:68 Page 7 of 12 http://genomemedicine.com/content/3/10/68

Table 3. Dysregulation of transcriptional networks in stem cells and disease Gene/protein Role in ESCs Role in disease c-MYC Involved in the expression of self-renewal genes [101]; recruits Most common gene duplication in cancer [73]; c-Myc appears to be p-TEFb to initiate transcriptional pause release of RNA polymerase responsible for the gene expression signature of cancer cells [9] II [29] LIN28 Maintains ESC pluripotency by binding and inhibiting the Highly expressed in poorly differentiated and low prognosis tumors; maturation of pro-differentiationlet-7 miRNA; LIN28 is also a hiPSC as let-7 silences the expression of oncogenes c-Myc, K-Ras, Hmga2 reprogramming factor [74] and the gene encoding cyclin-D1, Lin28 suppression of let-7 miRNA may thus promote oncogenesis [74] SOX2 A core ESC transcription factor together with Oct4 and Nanog. Mutation in SOX2 causes anophthalmia (congenital loss of eyeballs) Regulates the expression of pluripotency genes, and suppresses in humans. Proposed to cooperate with CHD7 to regulate genes lineage-specific genes [23,24]; Sox2 is also an iPSC reprogramming involved in Alagille, Pallister-Hall and Feingold syndromes [76] factor [4] CHD7 Binds with core ESC factors and p300 at gene enhancers to Mutations in CHD7 result in CHARGE syndrome; proposed to modulate ESC-specific gene expression [77] cooperate with SOX2 to regulate genes involved in Alagille, Pallister- Hall and Feingold syndromes [76] Mediator Physically links the Oct4/Sox2/Nanog-bound gene enhancers to Mutations in Mediator are associated with Opitz-Kaveggia, Lujan, active gene promoters via chromatin looping [29]; necessary for and transposition of the great arteries syndromes; also implicated in normal gene activity schizophrenia, colon cancer progression [1] and uterine leiomyomas [102] Cohesin Proposed to bind and stabilize the Oct4/Sox2/Nanog enhancer- Cohesin mutations implicated in Cornelia de Lange syndrome, promoter chromatin loops [1]; necessary for normal gene activity whereby patients exhibit developmental defects and mental retardation due to dysregulation of gene expression [29] Nipbl Binds with mediator complex to allow loading of cohesion and Nipbl mutations implicated in Cornelia de Lange syndrome, whereby formation of stable chromatin loop [29] patients exhibit developmental defects and mental retardation due to dysregulation of gene expression [29]

ESC, embryonic stem cell; hiPSC, human induced pluripotent stem cell; iPSC, induced pluripotent stem cell; miRNA, microRNA. complex to control neural crest formation [79]. Therefore, drug design and screening under disease conditions these data hint that Chd7 may partner different proteins (Figure 2) [6]. One such example is the drug roscovitine, to cooperatively regulate developmental genes. Although which was found to restore the electrical and Ca2+ signal­ the mechanism behind gene regulation by Chd7 and its ing in Timothy syndrome cardiomyocytes [81]. interacting partners is not well understood, the use of The self-renewing ability of hiPSCs means that a ESCs may serve as a useful system to further probe Chd7 potentially unlimited source of patient-specific cells can function during development and disease. be generated for regenerative purposes (Figure 2). Impor­ tantly, hiPSCs, when coupled with gene targeting Clinical and therapeutic implications approaches to rectify genetic mutations, can be differen­ The development of hiPSC technology offers the unique tiated into the desired cell type and reintroduced to the opportunity to derive disease-specific hiPSCs for the in patient (Figure 2) [5]. However, unlike mESCs, hESCs vitro study of human disease pathogenesis (Figure 2). A and hiPSCs cannot be passaged as single cells and have major advantage of using disease-specific hiPSCs is that very poor homologous recombination ability [84]. they allow the capture of the patient’s genetic background Circum­venting this problem may require the conversion and, together with the patient’s medical history, will of hiPSCs into a mESC-like state, which is more amenable enable the researcher to uncover the disease genotypic- to gene targeting [85]. Alternatively, recent reports of phenotypic relationship [6]. A number of patient-derived successful gene targeting in human pluripotent stem cells hiPSC disease models have been established, including using zinc-finger nucleases (ZFNs)[86] , and transcription those for Hutchinson Gilford Progeria, Timothy syn­ activator-like effector nucleases (TALENs) [87], presents drome, schizophrenia and Alzheimer’s disease [5,80-83], another option for genetically altering hiPSCs for cell and these have been useful in understanding the cellular therapy. Albeit that there are concerns of off-target effects, mechanisms behind these illnesses. For example, trans­ the advantage of using nuclease-targeting approaches is criptional profiling of schizophrenia neurons derived that they do not necessitate the conversion of hESCs and from iPSCs have identified 596 differentially expressed hiPSCs into mESC-like states prior to genomic genes, 75% of which were not previously implicated in manipulation. schizophrenia [82]. This highlights the potential of While it has been assumed that iPSCs generated from disease-specific iPSCs in unlocking hidden pathways. an autologous host should be immune-tolerated, Zhao et al. Additionally, the use of disease cell lines can facilitate [88] recently demonstrated that iPSCs were immunogenic Yeo and Ng Genome Medicine 2011, 3:68 Page 8 of 12 http://genomemedicine.com/content/3/10/68

Patient

Tissue biopsy

Transplant Treatment Reprogramming

iPSCs

Regeneration and differentiation Gene targeting Disease modeling

Disease pathogenesis Drug screening

Corrected iPSC

Figure 2. The application of induced pluripotent stem cell technology for therapeutic purposes. Patient-derived somatic cells can be isolated through tissue biopsies and converted into induced pluripotent stem cells (iPSCs) through reprogramming. From there, iPSCs can be expanded into suitable quantities before di erentiation into desired tissue types for transplantation purposes. Gene targeting of patient-derived iPSCs can also be done through homologous recombination or via gene-editing nucleases to correct genetic mutations. Upon successful modication, the genetically corrected iPSCs can then be expanded, di erentiated and transplanted back into the patient for cell therapy. iPSCs from patients harboring genetic diseases can similarly be used as an in vitro disease model to study disease pathogenesis, or for drug development and screening. Data gained through the study of disease-specic cell culture models will enable the identication of critical molecular and cellular pathways in disease development, and allow for the formulation of e ective treatment strategies.

and could elicit a T-cell immune response when trans- iPSC-derived differentiated cells may not be immuno- planted into syngeneic mice. However, it should be genic. It would thus be necessary to experimentally verify distinguished that in the Zhao et al. study undiffer- if iPSC-derived differentiated cells are immunogenic in entiated iPSCs were injected into mice, rather than syngeneic hosts. differentiated iPSC-derived cells, which are the clinically relevant cell type for medical purposes. Furthermore, the Conclusions and future challenges immune system is capable of ‘cancer immunosurveillance’ Understanding and exploiting the mechanisms that to identify and destroy tumorigenic cells [89]. Hence, it govern pluripotency are necessary if hESCs and hiPSCs may be possible that the observed iPSC immunogenicity are to be successfully translated to benefit clinical and could have arisen through cancer immunosurveillance medical applications. One approach for understanding against undifferentiated tumor-like iPSCs, and that hESCs and hiPSCs would be to study their transcriptomes, Yeo and Ng Genome Medicine 2011, 3:68 Page 9 of 12 http://genomemedicine.com/content/3/10/68

and, through various approaches, we have learnt how the Acknowledgements The authors thank all the members of the Ng laboratory for their comments core pluripotency factors create an ESC gene expression on this manuscript. signature by regulating other transcription factors and controlling chromatin structure and ncRNA expression. Author details 1Gene Regulation Laboratory, Genome Institute of Singapore, 60 Biopolis Current methodologies to generate iPSCs are ineffi­ Street, Genome, Singapore 138672. 2School of Biological Sciences, Nanyang cient, suggesting that significant and unknown epigenetic Technological University, 60 Nanyang Drive, Singapore 637551. 3Department barriers to successful reprogramming remain [90]. of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 8 Medical Drive, Singapore 117597. 4Department of Biological However, defining these barriers is difficult, as existing Sciences, National University of Singapore, 14 Science Drive 4, Singapore transcriptomic studies rely on average readings taken 117597. 5NUS Graduate School for Integrative Sciences and Engineering, across a heterogeneous cell population. This therefore National University of Singapore, 28 Medical Drive, Singapore 117456. masks essential rate-limiting transcriptional and epi­ Published: 27 October 2011 genetic remodeling steps in iPSC formation. Future studies in elucidating the iPSC generation process may References 1. Young RA: Control of the embryonic stem cell state. Cell 2011, 144:940-954. thus adopt a single-cell approach [91], which will offer 2. Ben-David U, Benvenisty N: The tumorigenicity of human embryonic and the resolution needed to define key reprogramming induced pluripotent stem cells. Nat Rev Cancer 2011, 11:268-277. steps. Future efforts should also be focused upon improv­ 3. Barrilleaux B, Knoepfler PS: Inducing iPSCs to escape the dish. Cell Stem Cell 2011, 9:103-111. ing hiPSC safety for human applications, through the use 4. Takahashi K, Yamanaka S: Induction of pluripotent stem cells from mouse of stringent genomic and functional screening strategies embryonic and adult fibroblast cultures by defined factors. Cell 2006, on hiPSCs and their differentiated tissues [3]. Only with 126:663-676. 5. Wu SM, Hochedlinger K: Harnessing the potential of induced pluripotent well-defined and non-tumorigenic iPSC-derived tissue stem cells for regenerative medicine. Nat Cell Biol 2011, 13:497-505. would we then be able to assess the transplant potential 6. Zhu H, Lensch MW, Cahan P, Daley GQ: Investigating monogenic and of iPSCs in personalized medicine. complex diseases with pluripotent stem cells. Nat Rev Genet 2011, In addition to generating disease-specific iPSCs from 12:266-275. 7. Wang Z, Gerstein M, Snyder M: RNA-seq: a revolutionary tool for patients, the use of gene-modifying nucleases to create transcriptomics. Nat Rev Genet 2009, 10:57-63. hESCs harboring specific genetic mutations may be a 8. Marguerat S, Bahler J: RNA-seq: from technology to biology. Cell Mol Life Sci forward approach towards studying human disease 2010, 67:569-579. 9. Kim J, Woo AJ, Chu J, Snow JW, Fujiwara Y, Kim CG, Cantor AB, Orkin SH: patho­genesis [86]. With the recent creation of approxi­ A Myc network accounts for similarities between embryonic stem and mately 9,000 conditional targeted alleles in mESCs [92], it cancer cell transcription programs. Cell 2010, 143:313-324. would be of tremendous scientific and clinical value to 10. Kim J, Chu J, Shen X, Wang J, Orkin SH: An extended transcriptional network for pluripotency of embryonic stem cells. Cell 2008, 132:1049-1061. likewise establish a hESC knockout library to study the 11. Melton C, Judson RL, Blelloch R: Opposing microRNA families regulate role of individual genes in disease and development. self-renewal in mouse embryonic stem cells. Nature 2010, 463:621-626. Furthermore, while SNP and haplotype mapping may be 12. Dinger ME, Amaral PP, Mercer TR, Pang KC, Bruce SJ, Gardiner BB, Askarian- Amiri ME, Ru K, Soldà G, Simons C, Sunkin SM, Crowe ML, Grimmond SM, useful in associating diseases with specific genetic loci, Perkins AC, Mattick JS: Long noncoding RNAs in mouse embryonic stem the use of ZFNs or TALENs to recreate these specific cell pluripotency and differentiation. Genome Res 2008, 18:1433-1445. gene variations in hESCs may offer an experimental 13. Loewer S, Cabili MN, Guttman M, Loh YH, Thomas K, Park IH, Garber M, Curran M, Onder T, Agarwal S, Manos PD, Datta S, Lander ES, Schlaeger TM, Daley GQ, means of verifying the relationship of SNPs or haplotypes Rinn JL: Large intergenic non-coding RNA-RoR modulates reprogramming with diseases. of human induced pluripotent stem cells. Nat Genet 2010, 42:1113-1117. 14. Guttman M, Donaghey J, Carey BW, Garber M, Grenier JK, Munson G, Young G, Lucas AB, Ach R, Bruhn L, Yang X, Amit I, Meissner A, Regev A, Rinn JL, Root Abbreviations DE, Lander ES: lincRNAs act in the circuitry controlling pluripotency and CHARGE, Coloboma of the eye, Heart defects, Atresia of the choanae, differentiation. Nature 2011, 477:295-300. Retardation of growth and/or development, Genital and/or urinary 15. Chen X, Xu H, Yuan P, Fang F, Huss M, Vega VB, Wong E, Orlov YL, Zhang W, abnormalities, and Ear abnormalities and deafness; ChIP, chromatin Jiang J, Loh YH, Yeo HC, Yeo ZX, Narang V, Govindarajan KR, Leong B, Shahab immunoprecipitation; ChIP-chip, chromatin immunoprecipitation on chip; A, Ruan Y, Bourque G, Sung WK, Clarke ND, Wei CL, Ng HH: Integration of ChIP-seq, chromatin immunoprecipitation and sequencing; DIP-seq, DNA external signaling pathways with the core transcriptional network in immunoprecipitation and sequencing; ESC, embryonic stem cell; hESC, embryonic stem cells. Cell 2008, 133:1106-1117. human embryonic stem cell; hiPSC, human induced pluripotent stem cell; 16. Williams K, Christensen J, Pedersen MT, Johansen JV, Cloos PA, Rappsilber J, H3K4me3, histone H3 lysine 4 trimethylation; iPSC, induced pluripotent stem Helin K: TET1 and hydroxymethylcytosine in transcription and DNA cell, lincRNA, long intergenic non-coding RNA; lncRNA, long non-coding methylation fidelity. Nature 2011, 473:343-348. RNA; mESC, mouse embryonic stem cell; miRNA, microRNA; NGS, next- 17. Handoko L, Xu H, Li G, Ngan CY, Chew E, Schnapp M, Lee CW, Ye C, Ping JL, generation sequencing; ncRNA, non-coding RNA; oligo, oligonucleotide; Mulawadi F, Wong E, Sheng J, Zhang Y, Poh T, Chan CS, Kunarso G, Shahab A, OSKM, Oct4, Sox2, Klf4 and c-Myc; PRC2, Polycomb repressive complex Bourque G, Cacheux-Rataboul V, Sung WK, Ruan Y, Wei CL: CTCF-mediated 2; RIP-seq, RNA-binding protein immunoprecipitation and sequencing; functional chromatin interactome in pluripotent cells. Nat Genet 2011, RNA‑seq, RNA sequencing; siRNA, short interfering RNA; SNP, single nucleotide 43:630-638. polymorphism; TALEN, transcription activator-like effector nuclease; UTR, 18. Morin RD, O’Connor MD, Griffith M, Kuchenbauer F, Delaney A, Prabhu AL, untranslated region; ZFN, zinc-finger nuclease. Zhao Y, McDonald H, Zeng T, Hirst M, Eaves CJ, Marra MA: Application of massively parallel sequencing to microRNA profiling and discovery in Competing interests human embryonic stem cells. Genome Res 2008, 18:610-621. The authors declare that they have no competing interests. 19. Zhao J, Ohsumi TK, Kung JT, Ogawa Y, Grau DJ, Sarma K, Song JJ, Kingston RE, Yeo and Ng Genome Medicine 2011, 3:68 Page 10 of 12 http://genomemedicine.com/content/3/10/68

Borowsky M, Lee JT: Genome-wide identification of polycomb-associated Volkert TL, Gupta S, Love J, Hannett N, Sharp PA, Bartel DP, Jaenisch R, Young RNAs by RIP-seq. Mol Cell 2010, 40:939-953. RA: Connecting microRNA genes to the core transcriptional regulatory 20. Chen M, Manley JL: Mechanisms of alternative splicing regulation: insights circuitry of embryonic stem cells. Cell 2008, 134:521-533. from molecular and genomics approaches. Nat Rev Mol Cell Biol 2009, 40. Xu N, Papagiannakopoulos T, Pan G, Thomson JA, Kosik KS: MicroRNA-145 10:741-754. regulates OCT4, SOX2, and KLF4 and represses pluripotency in human 21. Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, Armour CD, Santos embryonic stem cells. Cell 2009, 137:647-658. R, Schadt EE, Stoughton R, Shoemaker DD: Genome-wide survey of human 41. Navarro P, Chambers I, Karwacki-Neisius V, Chureau C, Morey C, Rougeulle C, alternative pre-mRNA splicing with exon junction microarrays. Science Avner P: Molecular coupling of Xist regulation and pluripotency. Science 2003, 302:2141-2144. 2008, 321:1693-1695. 22. Salomonis N, Schlieve CR, Pereira L, Wahlquist C, Colas A, Zambon AC, 42. Donohoe ME, Silva SS, Pinter SF, Xu N, Lee JT: The pluripotency factor Oct4 Vranizan K, Spindler MJ, Pico AR, Cline MS, Clark TA, Williams A, Blume JE, interacts with Ctcf and also controls X-chromosome pairing and counting. Samal E, Mercola M, Merrill BJ, Conklin BR: Alternative splicing regulates Nature 2009, 460:128-132. mouse embryonic stem cell pluripotency and differentiation. Proc Natl 43. Navarro P, Oldfield A, Legoupi J, Festuccia N, Dubois A, Attia M, Acad Sci U S A 2010, 107:10514-10519. Schoorlemmer J, Rougeulle C, Chambers I, Avner P: Molecular coupling of 23. Loh YH, Wu Q, Chew JL, Vega VB, Zhang W, Chen X, Bourque G, George J, Tsix regulation and pluripotency. Nature 2010, 468:457-460. Leong B, Liu J, Wong KY, Sung KW, Lee CW, Zhao XD, Chiu KP, Lipovich L, 44. Barakat TS, Gunhanlar N, Pardo CG, Achame EM, Ghazvini M, Boers R, Kenter Kuznetsov VA, Robson P, Stanton LW, Wei CL, Ruan Y, Lim B, Ng HH: The Oct4 A, Rentmeester E, Grootegoed JA, Gribnau J: RNF12 activates Xist and is and Nanog transcription network regulates pluripotency in mouse essential for X chromosome inactivation. PLoS Genet 2011, 7:e1002001. embryonic stem cells. Nat Genet 2006, 38:431-440. 45. Guo G, Yang J, Nichols J, Hall JS, Eyres I, Mansfield W, Smith A:Klf4 reverts 24. Boyer LA, Lee TI, Cole MF, Johnstone SE, Levine SS, Zucker JP, Guenther MG, developmentally programmed restriction of ground state pluripotency. Kumar RM, Murray HL, Jenner RG, Gifford DK, Melton DA, Jaenisch R, Young Development 2009, 136:1063-1069. RA: Core transcriptional regulatory circuitry in human embryonic stem 46. Lengner CJ, Gimelbrant AA, Erwin JA, Cheng AW, Guenther MG, Welstead GG, cells. Cell 2005, 122:947-956. Alagappan R, Frampton GM, Xu P, Muffat J,Santagata S, Powers D, Barrett CB, 25. Cole MF, Johnstone SE, Newman JJ, Kagey MH, Young RA: Tcf3 is an integral Young RA, Lee JT, Jaenisch R, Mitalipova M: Derivation of pre-X inactivation component of the core regulatory circuitry of embryonic stem cells. Genes human embryonic stem cells under physiological oxygen concentrations. Dev 2008, 22:746-755. Cell 2010, 141:872-883. 26. Pardo M, Lang B, Yu L, Prosser H, Bradley A, Babu MM, Choudhary J: An 47. Zhao XY, Li W, Lv Z, Liu L, Tong M, Hai T, Hao J, Guo CL, Ma QW, Wang L, Zeng expanded Oct4 interaction network: implications for stem cell biology, F, Zhou Q: iPS cells produce viable mice through tetraploid development, and disease. Cell Stem Cell 2010, 6:382-395. complementation. Nature 2009, 461:86-90. 27. van den Berg DL, Snoek T, Mullin NP, Yates A, Bezstarosti K, Demmers J, 48. Nakagawa M, Koyanagi M, Tanabe K, Takahashi K, Ichisaka T, Aoi T, Okita K, Chambers I, Poot RA: An Oct4-centered protein interaction network in Mochiduki Y, Takizawa N, Yamanaka S: Generation of induced pluripotent embryonic stem cells. Cell Stem Cell 2010, 6:369-381. stem cells without Myc from mouse and human fibroblasts. Nat Biotechnol 28. Rahl PB, Lin CY, Seila AC, Flynn RA, McCuine S, Burge CB, Sharp PA, Young RA: 2008, 26:101-106. c-Myc regulates transcriptional pause release. Cell 2010, 141:432-445. 49. Heng JC, Feng B, Han J, Jiang J, Kraus P, Ng JH, Orlov YL, Huss M, Yang L, 29. Kagey MH, Newman JJ, Bilodeau S, Zhan Y, Orlando DA, van Berkum NL, Lufkin T, Lim B, Ng HH: The nuclear receptor Nr5a2 can replace Oct4 in the Ebmeier CC, Goossens J, Rahl PB, Levine SS, Taatjes DJ, Dekker J, Young RA: reprogramming of murine somatic cells to pluripotent cells. Cell Stem Cell Mediator and cohesin connect gene expression and chromatin 2010, 6:167-174. architecture. Nature 2010, 467:430-435. 50. Feng B, Jiang J, Kraus P, Ng JH, Heng JC, Chan YS, Yaw LP, Zhang W, Loh YH, 30. Landeira D, Sauer S, Poot R, Dvorkina M, Mazzarella L, Jorgensen HF, Pereira Han J, Vega VB, Cacheux-Rataboul V, Lim B, Lufkin T, Ng HH: Reprogramming CF, Leleu M, Piccolo FM, Spivakov M, Brookes E, Pombo A, Fisher C, Skarnes of fibroblasts into induced pluripotent stem cells with orphan nuclear WC, Snoek T, Bezstarosti K, Demmers J, Klose RJ, Casanova M, Tavares L, receptor Esrrb. Nat Cell Biol 2009, 11:197-203. Brockdorff N, Merkenschlager M, Fisher AG: Jarid2 is a PRC2 component in 51. Takahashi K, Tanabe K, Ohnuki M, Narita M, Ichisaka T, Tomoda K, Yamanaka S: embryonic stem cells required for multi-lineage differentiation and Induction of pluripotent stem cells from adult human fibroblasts by recruitment of PRC1 and RNA Polymerase II to developmental regulators. defined factors. Cell 2007, 131:861-872. Nat Cell Biol 2010, 12:618-624. 52. Yu J, Vodyanik MA, Smuga-Otto K, Antosiewicz-Bourget J, Frane JL, Tian S, 31. Li G, Margueron R, Ku M, Chambon P, Bernstein BE, Reinberg D: Jarid2 and Nie J, Jonsdottir GA, Ruotti V, Stewart R, Slukvin II, Thomson JA: Induced PRC2, partners in regulating gene expression. Genes Dev 2010, 24:368-380. pluripotent stem cell lines derived from human somatic cells. Science 2007, 32. Pasini D, Cloos PA, Walfridsson J, Olsson L, Bukowski JP, Johansen JV, Bak M, 318:1917-1920. Tommerup N, Rappsilber J, Helin K: JARID2 regulates binding of the 53. Maekawa M, Yamaguchi K, Nakamura T, Shibukawa R, Kodanaka I, Ichisaka T, Polycomb repressive complex 2 to target genes in ES cells. Nature 2010, Kawamura Y, Mochizuki H, Goshima N, Yamanaka S: Direct reprogramming 464:306-310. of somatic cells is promoted by maternal transcription factor Glis1. Nature 33. Peng JC, Valouev A, Swigut T, Zhang J, Zhao Y, Sidow A, Wysocka J: Jarid2/ 2011, 474:225-229. Jumonji coordinates control of PRC2 enzymatic activity and target gene 54. Gonzalez F, Boue S, Izpisua Belmonte JC: Methods for making induced occupancy in pluripotent cells. Cell 2009, 139:1290-1302. pluripotent stem cells: reprogramming a la carte. Nat Rev Genet 2011, 34. Shen X, Kim W, Fujiwara Y, Simon MD, Liu Y, Mysliwiec MR, Yuan GC, Lee Y, 12:231-242. Orkin SH: Jumonji modulates polycomb activity and self-renewal versus 55. Anokye-Danso F, Trivedi CM, Juhr D, Gupta M, Cui Z, Tian Y, Zhang Y, Yang W, differentiation of stem cells. Cell 2009, 139:1303-1314. Gruber PJ, Epstein JA, Morrisey EE: Highly efficient miRNA-mediated 35. Walker E, Chang WY, Hunkapiller J, Cagney G, Garcha K, Torchia J, Krogan NJ, reprogramming of mouse and human somatic cells to pluripotency. Cell Reiter JF, Stanford WL: Polycomb-like 2 associates with PRC2 and regulates Stem Cell 2011, 8:376-388. transcriptional networks during mouse embryonic stem cell self-renewal 56. Miyoshi N, Ishii H, Nagano H, Haraguchi N, Dewi DL, Kano Y, Nishikawa S, and differentiation. Cell Stem Cell 2010, 6:153-166. Tanemura M, Mimori K, Tanaka F, Saito T, Nishimura J, Takemasa I, Mizushima 36. Ang YS, Tsai SY, Lee DF, Monk J, Su J, Ratnakumar K, Ding J, Ge Y, Darr H, T, Ikeda M, Yamamoto H, Sekimoto M, Doki Y, Mori M: Reprogramming of Chang B, Wang J, Rendl M, Bernstein E, Schaniel C, Lemischka IR: Wdr5 mouse and human cells to pluripotency using mature . Cell mediates self-renewal and reprogramming via the embryonic stem cell Stem Cell 2011, 8:633-638. core transcriptional network. Cell 2011, 145:183-197. 57. Park IH, Zhao R, West JA, Yabuuchi A, Huo H, Ince TA, Lerou PH, Lensch MW, 37. Gaspar-Maia A, Alajem A, Meshorer E, Ramalho-Santos M: Open chromatin in Daley GQ: Reprogramming of human somatic cells to pluripotency with pluripotency and reprogramming. Nat Rev Mol Cell Biol 2011, 12:36-47. defined factors. Nature 2008, 451:141-146. 38. Pauli A, Rinn JL, Schier AF: Non-coding RNAs as regulators of 58. Chin MH, Mason MJ, Xie W, Volinia S, Singer M, Peterson C, Ambartsumyan G, embryogenesis. Nat Rev Genet 2011, 12:136-149. Aimiuwu O, Richter L, Zhang J, Khvorostov I, Ott V, Grunstein M, Lavon N, 39. Marson A, Levine SS, Cole MF, Frampton GM, Brambrink T, Johnstone S, Benvenisty N, Croce CM, Clark AT, Baxter T, Pyle AD, Teitell MA, Pelegrini M, Guenther MG, Johnston WK, Wernig M, Newman J, Calabrese JM, Dennis LM, Plath K, Lowry WE: Induced pluripotent stem cells and embryonic stem Yeo and Ng Genome Medicine 2011, 3:68 Page 11 of 12 http://genomemedicine.com/content/3/10/68

cells are distinguished by gene expression signatures. Cell Stem Cell 2009, 74. Viswanathan SR, Daley GQ: Lin28: a microRNA regulator with a macro role. 5:111-123. Cell 2010, 140:445-449. 59. Doi A, Park IH, Wen B, Murakami P, Aryee MJ, Irizarry R, Herb B, Ladd-Acosta C, 75. Chia NY, Chan YS, Feng B, Lu X, Orlov YL, Moreau D, Kumar P, Yang L, Jiang J, Rho J, Loewer S, Miller J, Schlaeger T, Daley GQ, Feinberg AP: Differential Lau MS, Huss M, Soh BS, Kraus P, Li P, Lufkin T, Lim B, Clarke ND, Bard F, Ng HH: methylation of tissue- and cancer-specific CpG island shores distinguishes A genome-wide RNAi screen reveals determinants of human embryonic human induced pluripotent stem cells, embryonic stem cells and stem cell identity. Nature 2010, 468:316-320. fibroblasts. Nat Genet 2009, 41:1350-1353. 76. Puc J, Rosenfeld MG: SOX2 and CHD7 cooperatively regulate human 60. Stadtfeld M, Apostolou E, Akutsu H, Fukuda A, Follett P, Natesan S, Kono T, disease genes. Nat Genet 2011, 43:505-506. Shioda T, Hochedlinger K: Aberrant silencing of imprinted genes on 77. Schnetz MP, Handoko L, Akhtar-Zaidi B, Bartels CF, Pereira CF, Fisher AG, chromosome 12qF1 in mouse induced pluripotent stem cells. Nature 2010, Adams DJ, Flicek P, Crawford GE, Laframboise T, Tesar P, Wei CL, Scacheri PC: 465:175-181. CHD7 targets active gene enhancer elements to modulate ES cell-specific 61. Newman AM, Cooper JB: Lab-specific gene expression signatures in gene expression. PLoS Genet 2010, 6:e1001023. pluripotent stem cells. Cell Stem Cell 2010, 7:258-262. 78. Engelen E, Akinci U, Bryne JC, Hou J, Gontan C, Moen M, Szumska D, Kockx C, 62. Polo JM, Liu S, Figueroa ME, Kulalert W, Eminli S, Tan KY, Apostolou E, Stadtfeld van Ijcken W, Dekkers DH, Demmers J, Rijkers EJ, Bhattacharya S, Philipsen S, M, Li Y, Shioda T, Natesan S, Wagers AJ, Melnick A, Evans T, Hochedlinger K: Pevny LH, Grosveld FG, Rottier RJ, Lenhard B, Poot RA: Sox2 cooperates with Cell type of origin influences the molecular and functional properties of Chd7 to regulate genes that are mutated in human syndromes. Nat Genet mouse induced pluripotent stem cells. Nat Biotechnol 2010, 28:848-855. 2011, 43:607-611. 63. Laurent LC, Ulitsky I, Slavin I, Tran H, Schork A, Morey R, Lynch C, Harness JV, 79. Bajpai R, Chen DA, Rada-Iglesias A, Zhang J, Xiong Y, Helms J, Chang CP, Zhao Lee S, Barrero MJ, Ku S, Martynova M, Semechkin R, Galat V, Gottesfeld J, Y, Swigut T, Wysocka J: CHD7 cooperates with PBAF to control multipotent Izpisua Belmonte JC, Murry C, Keirstead HS, Park HS, Schmidt U, Laslett AL, neural crest formation. Nature 2010, 463:958-962. Muller FJ, Nievergelt CM, Shamir R, Loring JF: Dynamic changes in the copy 80. Liu GH, Barkho BZ, Ruiz S, Diep D, Qu J, Yang SL, Panopoulos AD, Suzuki K, number of pluripotency and cell proliferation genes in human ESCs and Kurian L, Walsh C, Thompson J, Boue S, Fung HL, Sancho-Martinez I, Zhang K, iPSCs during reprogramming and time in culture. Cell Stem Cell 2011, Yates J 3rd, Izpisua Belmonte JC: Recapitulation of premature ageing with 8:106-118. iPSCs from Hutchinson-Gilford progeria syndrome. Nature 2011, 64. Hussein SM, Batada NN, Vuoristo S, Ching RW, Autio R, Närvä E, Ng S, Sourour 472:221-225. M, Hämäläinen R, Olsson C, Lundin K, Mikkola M, Trokovic R, Peitz M, Brüstle 81. Yazawa M, Hsueh B, Jia X, Pasca AM, Bernstein JA, Hallmayer J, Dolmetsch RE: O, Bazett-Jones DP, Alitalo K, Lahesmaa R, Nagy A, Otonkoski T: Copy number Using induced pluripotent stem cells to investigate cardiac in variation and selection during reprogramming to pluripotency. Nature Timothy syndrome. Nature 2011, 471:230-234. 2011, 471:58-62. 82. Brennand KJ, Simone A, Jou J, Gelboin-Burkhart C, Tran N, Sangar S, Li Y, Mu Y, 65. Gore A, Li Z, Fung HL, Young JE, Agarwal S, Antosiewicz-Bourget J, Canto I, Chen G, Yu D, McCarthy S, Sebat J, Gage FH: Modelling schizophrenia using Giorgetti A, Israel MA, Kiskinis E, Lee JH, Loh YH, Manos PD, Montserrat N, human induced pluripotent stem cells. Nature 2011, 473:221-225. Panopoulos AD, Ruiz S, Wilbert ML, Yu J, Kirkness EF, Izpisua Belmonte JC, 83. Qiang L, Fujita R, Yamashita T, Angulo S, Rhinn H, Rhee D, Doege C, Chau L, Rossi DJ, Thomson JA, Eggan K, Daley GQ, Goldstein LS, Zhang K: Somatic Aubry L, Vanti WB, Moreno H, Abeliovich A: Directed conversion of coding mutations in human induced pluripotent stem cells. Nature 2011, Alzheimer’s disease patient skin fibroblasts into functional neurons. Cell 471:63-67. 2011, 146:359-371. 66. Mayshar Y, Ben-David U, Lavon N, Biancotti JC, Yakir B, Clark AT, Plath K, Lowry 84. Zwaka TP, Thomson JA: Homologous recombination in human embryonic WE, Benvenisty N: Identification and classification of chromosomal stem cells. Nat Biotechnol 2003, 21:319-321. aberrations in human induced pluripotent stem cells. Cell Stem Cell 2010, 85. Buecker C, Chen HH, Polo JM, Daheron L, Bu L, Barakat TS, Okwieka P, Porter A, 7:521-531. Gribnau J, Hochedlinger K, Geijsen N: A murine ESC-like state facilitates 67. Maitra A, Arking DE, Shivapurkar N, Ikeda M, Stastny V, Kassauei K, Sui G, transgenesis and homologous recombination in human pluripotent stem Cutler DJ, Liu Y, Brimble SN, Noaksson K, Hyllner J, Schulz TC, Zeng X, Freed cells. Cell Stem Cell 2010, 6:535-546. WJ, Crook J, Abraham S, Colman A, Sartipy P, Matsui S, Carpenter M, Gazdar 86. Soldner F, Laganière J, Cheng AW, Hockemeyer D, Gao Q, Alagappan R, AF, Rao M, Chakravarti A: Genomic alterations in cultured human Khurana V, Golbe LI, Myers RH, Lindquist S, Zhang L, Guschin D, Fong LK, Vu embryonic stem cells. Nat Genet 2005, 37:1099-1103. BJ, Meng X, Urnov FD, Rebar EJ, Gregory PD, Zhang HS, Jaenisch R: 68. Lister R, Pelizzola M, Kida YS, Hawkins RD, Nery JR, Hon G, Antosiewicz- Generation of isogenic pluripotent stem cells differing exclusively at two Bourget J, O’Malley R, Castanon R, Klugman S, Downes M, Yu R, Stewart R, Ren early onset Parkinson point mutations. Cell 2011, 146:318-331. B, Thomson JA, Evans RM, Ecker JR: Hotspots of aberrant epigenomic 87. Hockemeyer D, Wang H, Kiani S, Lai CS, Gao Q, Cassady JP, Cost GJ, Zhang L, reprogramming in human induced pluripotent stem cells. Nature 2011, Santiago Y, Miller JC, Zeitler B, Cherone JM, Meng X, Hinkley SJ, Rebar EJ, 471:68-73. Gregory PD, Urnov FD, Jaenisch R: Genetic engineering of human 69. Kim K, Doi A, Wen B, Ng K, Zhao R, Cahan P, Kim J, Aryee MJ, Ji H, Ehrlich LI, pluripotent cells using TALE nucleases. Nat Biotechnol 2011, 29:731-734. Yabuuchi A, Takeuchi A, Cunniff KC, Hongguang H, McKinney-Freeman S, 88. Zhao T, Zhang ZN, Rong Z, Xu Y: Immunogenicity of induced pluripotent Naveiras O, Yoon TJ, Irizarry RA, Jung N, Seita J, Hanna J, Murakami P, Jaenisch stem cells. Nature 2011, 474:212-215. R, Weissleder R, Orkin SH, Weissman IL, Feinberg AP, Daley GQ: Epigenetic 89. Vesely MD, Kershaw MH, Schreiber RD, Smyth MJ: Natural innate and memory in induced pluripotent stem cells. Nature 2010, 467:285-290. adaptive immunity to cancer. Annu Rev Immunol 2011, 29:235-271. 70. Ohi Y, Qin H, Hong C, Blouin L, Polo JM, Guo T, Qi Z, Downey SL, Manos PD, 90. Plath K, Lowry WE: Progress in understanding reprogramming to the Rossi DJ, Yu J, Hebrok M, Hochedlinger K, Costello JF, Song JS, Ramalho- induced pluripotent state. Nat Rev Genet 2011, 12:253-265. Santos M: Incomplete DNA methylation underlies a transcriptional 91. Smith ZD, Nachman I, Regev A, Meissner A: Dynamic single-cell imaging of memory of somatic cells in human iPS cells. Nat Cell Biol 2011, 13:541-549. direct reprogramming reveals an early specifying event. Nat Biotechnol 71. Bock C, Kiskinis E, Verstappen G, Gu H, Boulting G, Smith ZD, Ziller M, Croft GF, 2010, 28:521-526. Amoroso MW, Oakley DH, Gnirke A, Eggan K, Meissner A: Reference maps of 92. Skarnes WC, Rosen B, West AP, Jackson D, Severin J, Biggs P, Fu J, Nefedov M, human ES and iPS cell variation enable high-throughput characterization de Jong PJ, Stewart AF, Bradley A: A conditional knockout resource for the of pluripotent cell lines. Cell 2011, 144:439-452. genome-wide study of mouse gene function. Nature 2011, 474:337-342. 72. Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, Regev A, Weinberg RA: 93. Islam S, Kjallquist U, Moliner A, Zajac P, Fan JB, Lonnerberg P, Linnarsson S: An embryonic stem cell-like gene expression signature in poorly Characterization of the single-cell transcriptional landscape by highly differentiated aggressive human tumors. Nat Genet 2008, 40:499-507. multiplex RNA-seq. Genome Res 2011, 21:1160-1167. 73. Beroukhim R, Mermel CH, Porter D, Wei G, Raychaudhuri S, Donovan J, 94. Yeo GW, Xu X, Liang TY, Muotri AR, Carson CT, Coufal NG, Gage FH: Barretina J, Boehm JS, Dobson J, Urashima M, Mc Henry KT, Pinchback RM, Alternative splicing events identified in human embryonic stem cells and Ligon AH, Cho YJ, Haery L, Greulich H, Reich M, Winckler W, Lawrence MS, neural progenitors. PLoS Comput Biol 2007, 3:1951-1967. Weir BA, Tanaka KE, Chiang DY, Bass AJ, Loo A, Hoffman C, Prensner J, Liefeld 95. Wu JQ, Habegger L, Noisa P, Szekely A, Qiu C, Hutchison S, Raha D, Egholm M, T, Gao Q, Yecies D, Signoretti S, et al.: The landscape of somatic copy- Lin H, Weissman S, Cui W, Gerstein M, Snyder M: Dynamic transcriptomes number alteration across human cancers. Nature 2010, 463:899-905. during neural differentiation of human embryonic stem cells revealed by Yeo and Ng Genome Medicine 2011, 3:68 Page 12 of 12 http://genomemedicine.com/content/3/10/68

short, long, and paired-end sequencing. Proc Natl Acad Sci U S A 2010, 100. Neveu P, Kye MJ, Qi S, Buchholz DE, Clegg DO, Sahin M, Park IH, Kim KS, Daley 107:5254-5259. GQ, Kornblum HI, Shraiman BI, Kosik KS: MicroRNA profiling reveals two 96. Marchetto MC, Yeo GW, Kainohana O, Marsala M, Gage FH, Muotri AR: distinct -related human pluripotent stem cell states. Cell Stem Cell 2010, Transcriptional signature and memory retention of human-induced 7:671-681. pluripotent stem cells. PLoS One 2009, 4:e7076. 101. Sridharan R, Tchieu J, Mason MJ, Yachechko R, Kuoy E, Horvath S, Zhou Q, 97. Ghosh Z, Wilson KD, Wu Y, Hu S, Quertermous T, Wu JC: Persistent donor cell Plath K: Role of the murine reprogramming factors in the induction of gene expression among human induced pluripotent stem cells pluripotency. Cell 2009, 136:364-377. contributes to differences with human embryonic stem cells. PLoS One 102. Mäkinen N, Mehine M, Tolvanen J, Kaasinen E, Li Y, Lehtonen HJ, Gentile M, 2010, 5:e8975. Yan J, Enge M, Taipale M, Vahteristo P, Aaltonen LA: MED12, the Mediator 98. Guenther MG, Frampton GM, Soldner F, Hockemeyer D, Mitalipova M, complex subunit 12 gene, is mutated at high frequency in uterine Jaenisch R, Young RA: Chromatin structure and gene expression programs leiomyomas. Science 2011. of human embryonic and induced pluripotent stem cells. Cell Stem Cell 2010, 7:249-257. doi:10.1186/gm284 99. Wilson KD, Venkatasubrahmanyam S, Jia F, Sun N, Butte AJ, Wu JC: MicroRNA Cite this article as: Yeo J-C, Ng H-H: Transcriptomic analysis of pluripotent profiling of human-induced pluripotent stem cells. Stem Cells Dev 2009, stem cells: insights into health and disease. Genome Medicine 2011, 3:68. 18:749-758.