US 20160264934A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2016/0264934 A1 GALLOURAKIS et al. (43) Pub. Date: Sep. 15, 2016

(54) METHODS FOR MODULATING AND Publication Classification ASSAYING MI6AIN STEM CELL POPULATIONS (51) Int. Cl. CI2N5/0735 (2006.01) (71) Applicants: THE GENERAL, HOSPITAL AOIN I/02 (2006.01) CORPORATION, Boston, MA (US); CI2O I/68 (2006.01) The Regents of the University of GOIN 33/573 (2006.01) California, Oakland, CA (US) CI2N 5/077 (2006.01) CI2N5/0793 (2006.01) (72) Inventors: Cosmas GIALLOURAKIS, Boston, (52) U.S. Cl. MA (US); Alan C. MULLEN, CPC ...... CI2N5/0606 (2013.01); CI2N5/0657 Brookline, MA (US); Yi XING, (2013.01); C12N5/0619 (2013.01); C12O Torrance, CA (US) I/6888 (2013.01); G0IN33/573 (2013.01); A0IN I/0226 (2013.01); C12N 2501/72 (73) Assignees: THE GENERAL, HOSPITAL (2013.01); C12N 2506/02 (2013.01); C12O CORPORATION, Boston, MA (US); 2600/158 (2013.01); C12Y 201/01062 The Regents of the University of (2013.01); C12Y 201/01 (2013.01) California, Oakland, CA (US) (57) ABSTRACT (21) Appl. No.: 15/067,780 The present invention generally relates to methods, assays and kits to maintain a human stem cell population in an (22) Filed: Mar 11, 2016 undifferentiated state by inhibiting the expression or function of METTL3 and/or METTL4, and mA fingerprint methods, assays, arrays and kits to assess the cell state of a human stem Related U.S. Application Data cell population by assessing mA levels (e.g. mA peak inten (60) Provisional application No. 62/131,490, filed on Mar. sities) of a set of target disclosed herein to determine if 11, 2015. the stem cell is in an undifferentiated or differentiated state.

Patent Application Publication Sep. 15, 2016 Sheet 2 of 13 US 2016/0264934 A1

X- & FG. 2B FG, 2C

ACN s

ACN fic lett3 iype via FIG. 2E FG. 2F g 8 hasg g 08 3 -i-. is

3. s C - - 883-i- if 3 : s s 2 c - a. ;3. : Six a w kE E. : s as E a s 33is is s 8: isN see * : s i $2 s Patent Application Publication Sep. 15, 2016 Sheet 3 of 13 US 2016/0264934 A1

F.G. 3A 88 Narc 33 8 : 50 -. 8 3 3: 8. W 33 is: SS S. & E. c 8 i. s: 30 : g

aw& " is a 20 3 4 s: is 8.3. is 10 a c 2

8 && Š.s:

FG. 3B iQ sri A {E} 3. a. 80 3 c ; is $ 503 s s : 9 8 S. 4 Aay 8 3.: saE. i.; SSes 30 3.g sis is: 3 sx 20 asa as . t

R s 838. F.G. 3D &:::::y: 3. g a 38 ------8

33.------w838 ye ie:S F.G. 3E

s: Patent Application Publication Sep. 15, 2016 Sheet 4 of 13 US 2016/0264934 A1

FG. 4A FG. AB & 838 & 3A RP : 8.-- 8 & ...-a-...--...------...------...------. 1.s x $ 8. s s: s a.

xy 3. : l a. š : ------k . & 3:

s ; ---8. - : s 8 s 3g2 33.3%. 38te:8:ty atog FG. AD FIG. 4C fat gets S. & its aba RNA

sex. 8; Riccie.8

Boss a risihyatic: pya:32 ki3.3s pya:38 x8.3

FG. A.E x- F.G. 4F mA-8-Nanostring RA ests 8er tiavoiridot treat sett

8 Wild type 32. c B.S OS is x &lett:3 -i- Qs 3 8.8 85 3. E3 8 3.4 83.3 8s 8 V . 3.3 $3.3 c $ 3. 4 3 : 0.2 3. 8 t . 2. 5 3.3 B {} Narog Octa 8 Wild typs & Metti3 :- Patent Application Publication Sep. 15, 2016 Sheet 5 of 13 US 2016/0264934 A1

F.G. 5A

8: 8:8 F.G. SD

------&-.

F.G. SE

3. s 8 & s s 3. : s s

F.G. SG

F.G. SH FIG. 5 F.G. 5,

Patent Application Publication Sep. 15, 2016 Sheet 6 of 13 US 2016/0264934 A1

& rSSC & ESS |----+--~~~~~~~~~ aga is: 88. s

F.G. 6D F.G. 6E Rossa Sesii. iiia Speci 8. FIG. 6F

is: 88s &

...... S.S.C.

A38:8

w kissets: 8kse:

Patent Application Publication Sep. 15, 2016 Sheet 8 of 13 US 2016/0264934 A1

FIG. 8

Patent Application Publication Sep. 15, 2016 Sheet 9 of 13 US 2016/0264934 A1

Ajacg FIG.9B

se

k s

::::::

8 : SO FIG. 9F Excisgth its

8: 8. 88scing

s &

8 hSassass -8-38.-2S 250 583. 750. 5' UTai cis 3' uri; Esince cast exor-exxiiikoi by FG. 9. RNA Polymerase E

3st Eartile w

2st quartile

& st cartile

s

3.3 -- fr&A frtified Patent Application Publication Sep. 15, 2016 Sheet 10 of 13 US 2016/0264934 A1

F.G. 10A

fiss448 3 3.8 mala-Han SRNA taigeied region gr8A targeted fegion

Adeiti3 /

S. gasysexes s ...... Seri areas 38ere stated Eferentiate: six type & Aitk: -- i.18:34atti-i- is FIG. OF FG. LOG FIG. OH 5 exile.iii.333

as: -late:3:2 s g is 4.8 i. s. 3 aš& s ;: 3s is . . s: 3 0:5 E3 Š s esses Asti ges 8 Patent Application Publication Sep. 15, 2016 Sheet 11 of 13 US 2016/0264934 A1

FIG. 11A

KANS: i

FG, 11B mRNA level in Metii3 -f- derived tumors compared to wildtype derived tumors. O 35 3. s ; :

e

a

r

3. 4. 3.

Patent Application Publication Sep. 15, 2016 Sheet 12 of 13

FG, 2A

LNCE:08:MA s

LINQRoš

50 Set isfai exis asses FG, 12E

it:8 scietscore

Patent Application Publication Sep. 15, 2016 Sheet 13 of 13 US 2016/0264934 A1

F.G. 13A F.G. 13D

***&&$33.33%

F.G. 13B F.G. 13E

FG, 13C

28

23.3

82.3.g US 2016/0264934 A1 Sep. 15, 2016

METHODS FOR MODULATING AND functional characterization of the YTH domain family of ASSAYING MI6AN STEM CELL “reader which specifically bind mA sites and POPULATIONS recruit the linked transcripts to RNA decay bodies (Kang et al., 2014; Wang et al., 2014a). CROSS REFERENCE TO RELATED 0006 Whereas the DNA methylome undergoes dramatic APPLICATIONS reprogramming during early embryonic life, the developmen tal origins and functions of mA in mammals are incom 0001. This application claims benefit under 35 U.S.C. pletely understood. Furthermore, the degree of evolutionary S119(e) of U.S. Provisional Application No. 62/131,490 filed conservation of mA sites is not known in ESCs. Therefore, on Mar. 11, 2015, the contents of each of which are incorpo there is a need in the art for effective and efficient methods for rated herein by reference in their entireties. assessing mA mRNA methylome in stem cells and human GOVERNMENT SUPPORT stem cells, for example, to characterize and validate cells, including human pluripotent stem cells, and for determining 0002 This invention was made, in part, with government the quality and cell State of a human stem cell populations, support under NIH Grant Number DK090122 awarded by e.g., prior to its use, e.g., in therapeutic administration, dis National Institutes of Health. The Government of the U.S. has ease modeling, drug development and screening and toxicity certain rights in the invention. assays etc. FIELD OF THE INVENTION SUMMARY OF THE INVENTION 0003. The present invention relates to arrays and methods 0007. The present invention is directed to, in part, meth for characterizing stem cell populations assessing transcrip ods, compositions and kits to maintain a stem cell population, tionwide distribution of mA methylation to characterize and Such as a human stem cell population, in an undifferentiated permit selection of stem cell lines for further use, and to state, comprising contacting the stem cell population with an modulation of METTL3, e.g., inhibition to maintain stem inhibitor of METTL3 or METTL4. In some embodiments, cells in an undifferentiated state or activation of METTL3 to the methods, compositions and kits as disclosed herein relate promote differentiation along endoderm lineages. to methods to prevent a stem cell population differentiating along an endoderm lineage. Other aspects of the technology BACKGROUND OF THE INVENTION described herein relates to methods, compositions and kits to 0004 Reversible chemical modifications on messenger promote a stem cell population to differentiate along an endo have emerged as prevalent phenomena that may open derm lineage. Moreover, another aspect of the technology a new field of “RNA epigenetics', akin to the diverse roles described herein relates to methods, assays, arrays and kits that DNA modifications play in epigenetics (reviewed by Fu for performing mA analysis of RNA from stem cell popula and He, 2012; Sibbritt et al., 2013). N6-methyl-adenosine tions to characterize the cell state of the cell population, (mA) is the most prevalent modification of mRNAs in which can be used, for example, as a quality control for the Somatic cells, and dysregulation of this modification has stem cell population. In some embodiments, the stem cell already been linked to obesity, cancer, and other human dis population is a human stem cell population, e.g., a hESC cell eases (Sibbritt et al., 2013). mA has been observed in a wide population or other human stem cell line. range of organisms, and the known methylation complex is 10008 N6-methyl-adenosine (mA) is the most abundant conserved across eukaryotes (Bokar et al., 1997, Bujnicki, covalent modification on messenger RNAS in Somatic cells 2002 #375). In budding yeast, the mA methylation program and is linked to human diseases, but its functions in mamma is activated by starvation and required for sporulation (Agar lian development are poorly understood. Furthermore, while wala et al., 2012; Clancy et al., 2002; Schwartz et al., 2013; them"A RNA modification pathway is linked to developmen Shah and Clancy, 1992). In Arabidopsis, the methylase tal decisions in lower eukaryotes, little is known concerning responsible for mA modification, MTA, is essential for the dynamic extent, conservation and potential function(s) of embryonic development, plant growth and patterning (Bodi the mA modification in human development. Herein, the et al., 2012: Zhong et al., 2008), and the Drosophila homolog inventors demonstrate a genome-wide analysis of mA modi IME4 is expressed in ovaries and testes and is essential for fications in human embryonic stem cells (hESCs) differenti viability (Hongay and Orr-Weaver, 2011). ated towards endoderm. mA sites are observed on thousands 0005 While mA has been suggested to affect almost all of transcripts including those encoding master regulators of aspects of RNA metabolism, the molecular function of this hESC identity and differentiation. A comparative genomic modification remains incompletely understood (Niu et al., analysis of mA maps in mouse and human ESCs reveals a 2013). Importantly, mA modification(s) are reversible in conserved set of methylated genes and sites of modification. mammalian cells. The fat-mass and obesity associated pro Moreover, human endoderm differentiation is distinguished tein, FTO, has mA demethylase activity (Jia et al., 2011)and, by the dynamic regulation of rn6A peak intensities. Impor ALKBH5, also a member of the alphaketoglutarate-depen tantly, we demonstrate that hESCs are reliant on the mA dent dioxygenases family, has also been shown to act methyltransferase component METTL3 for normal endo as mA demethylase, with particular importance in spermatic derm differentiation. Thus, the inventors reveal a novel layer development (Zheng et al., 2013) Manipulating global mA of hESC regulation at the epitranscriptomic level. levels has implicated mA modifications in a variety of cel 0009. Further, it is to be understood that m6A modification lular processes including nuclear RNA export, control of also is involved in differentiation to other cell types, such as, protein translation and splicing (Dominissini et al., 2012; but not limited thereto, iPSCs, adult stem cells, Sertoli cells Gulati et al., 2013: Hess et al., 2013; Zheng et al., 2013). and neural stem cells, for example. Recently, it has been suggested that mA modification may 0010 Moreover, the inventors have performed global also play a role in controlling transcript stability based on the sequence analysis of mRNAS immuneprecipitated with a US 2016/0264934 A1 Sep. 15, 2016 mA RNA-specific antibody to define the mRNA methylome strates the transcriptome flexibility and is required for human in human embryonic stem cells. In particular, the inventors stem cells to differentiate to specific lineages. In particular, have discovered a function of mA by mapping the mA the inventors have discovered that mA-modifications in the methylome in both mouse and human embryonic stem cells RNA (in mRNA transcripts, non-coding regions and in non (ESCs). The inventors discovered that thousands of messen coding RNAS) of human stem cell populations serve as stem ger and long noncoding RNAS have conserved mA modifi cells internal “quality control” as the mA marks the mRNA cation, including transcripts encoding multiple core pluripo as having passed a quality control test in the cell, as stem cells tency transcription factors, including but not limited to Nanog cannot differentiate without mA-modifications on key tran and Sox2. mA was discovered to be enriched over 3' untrans Scripts. lated regions at defined sequence motifs, and importantly 0014 Thus, a key concept of the technology described marks unstable transcripts, including transcripts that need to herein relates to the discovery that inhibition of the METTL3 be turned over upon differentiation. Importantly, the inven enzyme prevents human stem cells from differentiating. tors have discovered that the mA-modified mRNAs include Stated a different way, the inventors have discovered a pro multiple core pluripotency factors and transcripts involved in cess which “locks hESCs into their pluripotent state (see development and the cell cycle, and were frequently located FIG. 5). Depleting METTL3 or METTL4 levels (e.g., using near stop codons, at the beginning of 3' untranslated regions RNAi) and/or inhibiting METTL3 or METTL4 enzyme func (3'UTR) and in the long internal exons, indicating that mGA tion, (e.g., using METTL3 or METTL4 small molecule site is tied to functional roles in regulating the RNA life cycle inhibitors) allows human stem cell populations to remain in a and marks the RNA for turn-over. In particular, the inventors pluripotent, undifferentiated state, and prevents them from discovered that while unmodified transcripts and méA-modi spontaneously differentiating along specific lineages. This is fied transcripts had similar rates of transcription, the mA useful for maintaining human stem cell populations for long mRNAs had shorter half-lives and reduced translation effi periods of time, e.g., in culture and after multiple passages ciencies, demonstrating a role form"A-modification in influ without the risk of the human stem cell line differentiating encing human stem cell RNA turn-over and the fate of the and/or changing phenotype. Furthermore, if a specific hESC transcript. or iPSC cell subclone is identified that has particular benefi 0011. To date, the functions of mA in mammalian cells cial properties, inhibition of METTL3 and/or METTL4 is have only been examined by RNAi knockdown. Depletion of useful to propogate the stem cell line and prevent them from METTL3 and METTL14 in human cancer cell lines led differentiating, therefore enabling consistency amoung ali decreased cell viability and apoptosis, leading to the interpre quots of a stem cell population. Importantly, while much of tation that mA is important for cell viability (Dominissini et the field of stem cell research focuses on methods to differ al., 2012; Liu et al., 2014). entiate stem cells into specific lineages, there limited options 0012 Here, the inventors assessed the conservation of the on methods to keep a stem cell population in an undifferen mA methylome at the level of targets and function in tiated state. This is useful as stem cells are typically cultured human ESCs. Using genetic inactivation or depletion of in a defined media to prevent differentiation, however, and mouse and human Mettl3 (one of the known mA methy Some cells spontaneously differentiate regardless of the cul lases), the inventors discovered a decrease in mA levels (i.e. ture media used. mA erasure) on select target genes, a prolonged Nanog 0015. Another aspect of the technology disclosed herein expression upon differentiation, and impaired ESC's exit relates to the use of the intensity of méA sites of methylation from self-renewal towards differentiation into several lin (i.e., méA peak intensity) as a quantitative metric or measure eages in vitro and in vivo. Importantly, the inventors demon to distinguish cell states. Stated another way, the intensity of strate that inhibition or knock-down of Mettl3 in human ESC m6A sites of methylation (i.e., m6A peak intensity) of a set of increased self-renewal and proliferation, but reduced their specific target gene, e.g., at least 10 or more selected from ability to different ate along specific lineages, in particular Table 1 or Table 2, can be used to “fingerprint a cell state, endoderm lineages. This is in contrast to the report by Wang e.g., determine the cell state of the stem cell population, i.e., and colleagues (Wang et al., 2014, Nat. Cell Biol. 16, 191 to determine if the stem cell population is pluripotent (i.e., in 198) which report Mettl3 and Mettl4 knockdown in mouse an undifferentiated pluripotent state) or if the human stem cell ESCs lead to decreased self-renewal and regeneration, and population has differentiated along a cell lineage pathway. ectopic differentiation (see., review articles Jalkanen et al., Importantly, using the intensity of méA sites of methylation Cell Stem Cell, 2014, 15(669-670), “Stem cell RNA epige (i.e., méA peak intensity) of specific target genes is indepen netics: M'Arking your territory” and Zhao et al., Genome dent of levels, which is the current standard Biology, 2015, 16: 45, “Fate by RNA methylation: mGA of analysis of stem cell populations. steers stem cell pluripotency'.). Furthermore, Geula et al., 0016. Accordingly, another aspect of the technology (Science, 2015; 347(6225); 1002-1006) show that in native described herein relates to methods, compositions, assays, pluripotent mouse ESCs, knockdown of Mettl3 blocked dif arrays and kits to characterize a stem cell population, Such as ferentiation, whereas knockdown of Mettl3 in differentiation a human stem cell population, comprising performing mA primed mouse ESCs (mESCs) reduced stem cell self-re analysis on the RNA obtained from the population of stem newal. This is in contrast with the present invention which cells, and assessing the intensity of the mA levels of the demonstrate that knock-down of METTL3 in human ESCs mRNA of at least 10 genes selected from any of those in Table led to the unexpected finding of increased self-renewal and 1, or Table 2 as disclosed herein. proliferation, and that mA and Mettl3 in particular are not 0017. Another aspect of the technology described herein required for ESC growth but rather, are required for stem cells relates to methods, compositions, assays, arrays and kits for to adopt new cell fates. assessing mA levels in the RNA obtained from a population 0013 Thus, the inventors have discovered that, in human of stem cells, e.g., human stem cells. In some embodiments, stem cell populations in particular, mA on RNA demon the method comprises (i) measuring them"A levels of least 10 US 2016/0264934 A1 Sep. 15, 2016 mRNA transcripts selected from any of those listed in Table 1 set enrichment analysis for mA modified genes. FIG. 1D or Table 2, for example by contacting an array with RNA shows a sequence motif identified after analysis of mA isolated from a cell population, where the array comprises at enrichment regions. FIG. 1E shows the normalized distribu least 10 or more oligonucleotides that hybridize to at least 10 tion of mA peaks across 5' UTR, CDS and 3'UTR of mRNAs mRNA transcripts, or to at least 103'UTR or other untrans for peaks common to all samples. FIG.1F shows the graphi lated regions of at least 10 genes selected from any of those cal representation of frequency of mA peaks and methyla listed in Table 1 or Table 2, and (ii) contacting the array with tion motifs in genes, divided into 5 distinct regions. FIG. 1G at least one reagent which binds to móA in the RNA, such as shows multi-exon coding and non-coding RNAS exhibit an anti-mA antibody, or fragment thereof, such as an anti enrichment of mA sites near the last exon-exon splice junc mA antibody which is fluorescently labeled or otherwise has tion. The distribution of mA peaks across the length of the a detectable label, therefore allowing the measurements of the mRNAs (n=5070) and non-coding RNAs (n=51) is shown. levels of méA in the at least selected 10 mRNA transcripts, or FIG. 1H is a scatter plot representation of mA enrichment to at least 103'UTR or other untranslated regions of at least 10 score (on the X axis) and gene expression level (on the Y axis) genes selected from any of those listed in Table 1 or Table 2. for each mA peak. FIG. 1I shows a Box plot representing the 0018. A further aspect of the technology described herein half-life for transcripts with at least one modification site and relates to methods, compositions, assays, arrays and kits for transcripts with no modification site identified. use in a method for determining the cell state of a stem cell 0023 FIGS. 2A-2F show characterization of Mettl3 population comprising performing the assay of claim 10, and knock out cells. FIG. 2A is a western blot for Mettl3 and comparing the levels of méA (i.e., peak intensities) of at least PARP in wild type and two cell lines with CRISPR induced 10 genes selected from any of Table 1 in the RNA from the loss of protein. DD, DNA damaging agent. Actin is used as stem cell population with the levels of méA (i.e., peak inten loading control. FIG. 2B shows mA ratio determined by sities) in a reference stem cell population, and based on this 2D-TLC in wildtype and Mettl3 KO. FIG. 2C shows alkaline comparison, determining the cell state of the stem cell popu phosphatase staining of wildtype and Mettl3 knockout cells. lation. FIG. 2D is a box plot representation of colony radius for wild 0019. Another aspect of the present invention relates to a type and Mettl3 mutant cells. Experiments were performed in kit comprising: (i) anarray composition for characterizing the triplicate, with at least 50 colonies measured for each repli cell State of a population of stem cells, comprising at least 10 cate. FIG. 2E shows nanog staining of colonies of wild type oligonucleotides that hybridize to the RNA (i.e. mRNA tran and two cell lines with CRISPR induced loss of protein. FIG. scripts, 3'UTR or other untranslated RNAs) of at least 10 2F is a cell proliferation assay showing wildtype and two cell genes selected from any of those in Table 1 or Table 2 as lines with CRISPR induced loss of Mettl3 protein. disclosed herein; and (ii) at least one regent to detect the m6A (0024 FIGS. 3A-3F show mettl3 loss of function impairs in RNA, such as, for example, an anti-mA antibody, or ESC ability to differentiate. FIG.3A shows the percentage of fragment thereof, for example an anti-méA antibody or frag embryoid bodies with beating activity in Mettl3 KO and wild ment thereof which is detectably labeled (e.g., with a flores type control cells (right panel). Representative images of cent label, colorimetric marker etc.). bodies stained for MHC and DAPI (center panel) and mRNA 0020. In some embodiments, the kit comprises a computer levels of Nanog and Myhé in Mettl3 KO cells in relation to readable medium comprising instructions on a computer to wild type control cells. * represents p-value-0.05. FIG. 3B compare the measured levels of méA (i.e., peak intensities) shows the percentage of colonies with Tujl projections in from the test stem cell population with reference levels of the Mettl3 KO and wild type control cells (right panel). Repre same RNA transcripts assessed. In some embodiments, the kit sentative images of bodies stained for Tuj1 and DAPI (center comprises instructions to access to a Software program avail panel) and mRNA levels of Nanog and Tuj1 in Mettl3 KO able online (e.g., on a cloud) to compare the measured levels cells in relation to wild type control cells. * represents of the m6A (i.e., peak intensities) from the test stem cell p-values 0.05. FIG. 3C shows the weight differences between population, e.g., human stem cell population, with reference teratomas generated from wild type and Mettl3 knock out levels of méA for the same RNAS assessed from a reference cells. Tumors are paired by animal (n=5). FIG. 3D shows the stem cell population, e.g., human stem cell population. representative sections of teratomas stained with hematoxy lin and eosin at low magnification. The bar represents 1000 BRIEF DESCRIPTION OF THE DRAWINGS FIG.3E shows immunohistochemistry images with antibody 0021. This patent or application file contains at least one against Ki67. FIG. 3F shows immunohistochemistry images drawing executed in color. Copies of this patent or patent with antibody against Nanog. The bar represents 100 um. application publication with color drawing(s) will be pro (0025 FIGS. 4A-4F shows the impact of loss of Mettl3 on vided by the Office upon request and payment of the neces the mESC methylome. FIG. 4A shows the cumulative distri sary fee. bution function of log2 peak intensity of méA modified sites. 0022 FIGS. 1A-H show topology and characterization of FIG. 4B shows the sequencing read density for input (grey) mA target genes. FIG. 1A shows UCSC Genome browser and after mA IP (red) for Nanog. Blue thick boxes represent plots of mA-seq reads along indicated mRNAs. Grey reads the open reading frame while the blue line represents the are from non-immunoprecipitated control input libraries and untranslated regions. FIG. 4C is a heatmap representing IP redreads anti-mA immunoprecipitation libraries. The y-axis enrichment values for peaks with statistically significant dif represents normalized number of reads. Blue thick boxes ference between wild type and Mettl3 mutant. FIG. 4D is a represent the open reading frame while the blue line repre model of genes involved in maintenance of stem cell State sents the untranslated regions. FIG. 1B is a model of genes (adapted from Young et al., 2011), representing transcripts involved in maintenance of stem cell state (adapted from with loss of méA modification in Mettl3-f- cells. FIG. 4E Young et al., 2011). Red hexagons represent modified shows the percentage of input recovered after mA IP mea mRNAs. FIG. 1C is a heatmap with log 10 (p-vlaue) of gene sured by nanostring. FIG. 4F shows the mRNA levels of US 2016/0264934 A1 Sep. 15, 2016

Nanog and Oct4 after PolII inhibition relative to untreated purple and redreads are from the anti-mARIP of mESCs and sample in wild type and Mettl3 KO cells. hESCs (TO) respectively. FIG. 6D shows representative 0026 FIGS.5A-5J show mA-seq profiling of hESC dur examples of species-specific mA modifications in mouse ing endoderm differentiation. FIG. 5A shows mA-seq was ESCs. FIG. 6E shows species-specific mA modifications in performed in resting (i.e. undifferentiated) human H1-ESCs human ESCs. FIG. 6F shows representative examples of con (TO) and after 48 hrs of Activin A induction towards endo served mA modifications at the gene and site level are rep derm (mesoendoderm) (T48). FIG. 5B is a Venn diagram of resented. Genes such CHD6 have a conserved mA peak the overlap between high-confidence T0 and T48 mA peaks location at its 3'UTR as well as mouse and human specific and methylated genes (parenthesis). FIG. 5C shows a mA peaks at conserved but distinct exons. sequence motif identified after analysis of mA enrichment (0028 FIGS. 7A-7F shows METTL3 is required for nor regions. FIG. 5D shows UCSC Genome browser plots of mal human ESC endoderm differentiation. mA-seq reads along indicated RNAs. Grey reads are from (0029 Model of METTL3 function(s). FIG. 7A shows non-immunoprecipitated control input libraries and red (TO) hESC cells transfected with anti-METTL3 shRNA (KD) as or blue (T48) reads are from anti-mA immunoprecipitation well control shRNA and stable hESC colonies were obtained libraries. The y-axis represents normalized number of reads. after drug selection. Two independent clones were Subjected Blue thick boxes represent the open reading frame while the to endodermal differentiation with Activin A and examined at blue line represents the untranslated regions. Key regulators various indicated time points. A schematic of the trends of of stem cell maintenance (left) and master regulators of endo gene expression for indicated markers of stem maintenance derm differentiation (right) are represented. FIG.5E shows a and endoderm differentiation is also shown. FIG. 7B shows Scatterplot of mA peak intensities between two different Knockdown of METTL3 leads to a reduction in METTL3 time points (TO versus T48) of the same biological replicate mRNA levels. qRT-PCR for METTL3 mRNA was performed with only “high-confidence' TO or T48 specific peaks Sup from RNA extracted from hESC cells with control shRNA ported by both biological replicates highlighted. FIG. 5F versus anti-METTL3 shRNA (KD) across the three indicated shows UCSC Genome browser plots of mA-seq reads along time points during endodermal differentiation (n=2 indepen indicated mRNAs in undifferentiated (TO) versus differenti dent generally ES cell knockdown and control clones shown: ated cells (T48). The grey reads are from non-immunopre error bars represent standard deviation of qPCRX3 per time cipitated control input libraries. The red and blue reads are point). FIG. 7C shows knockdown of METTL3 leads to a from the anti-m'A RIP of T=0 and T-48 samples respec reduction in mA levels. Ananti-m''A dot blot was performed tively. The y-axis represents normalized number of reads. on 10x fold dilutions of polyA selected RNA from hESC cells Blue thick boxes represent the open reading frame while the derived from control shRNA versus anti-METTL3 shRNA blue line represents the untranslated regions. FIG.5G shows clones. FIG.7D shows knockdown of METTL3 prevents the that differential intensities of mA peaks (DMPIs) identify normal reduction of stem maintenance/marker genes. qRT hESC cell states T0 vs T48 hrs. Z score scaled Log 2 peak PCR was performed for indicated genes and time points. (n=2 intensities of DMPIs are color-coded according to the legend. independent generally ES cell knockdown and control clones The peaks and samples are both clustered by average linkage shown; error bars represent standard deviation of qPCRX3 per hierarchical clustering using 1-Pearson correlation coeffi time point). FIG. 7E shows knockdown of METTL3 leads to cient of log 2 peak intensity as the distance metric. FIG. 5H a delayed and reduced induction of endodermal marker show the number of peaks perexon normalized by the number genes. qRT-PCR was performed on indicated genes and time of motifs (on sense strand) in the exon. The error bars repre points (n=2 independent generately ES cell knockdown and sent standard deviations from 1000 times of bootstrapping. control clones shown; error bars represent standard deviation FIG.5I show the normalized distribution of mA peaks across of qPCRx3 per time point). FIG. 7F shows that mGA marks the 5' UTR, CDS, and 3'UTR of mRNAs for T0 and T48 mA transcripts for faster turn-over. Upon transition to new cell peaks. FIG. 5J is a box plot representing the half-life for fate, mGA marked transcripts are readily removed to allow the transcripts, with transcripts separated according to enrich expression of new gene expression networks. In the absence ment score. Genes with higher levels of mA enrichment in ofmóA, the unwanted presence of transcripts will disturb the hESCs tend to exhibit lower mRNA stability in human proper balanced required for cell fate transitions. induced pluripotent cells (iPSCs). 0030 FIG. 8 is a schematic representation showing that 0027 FIGS. 6A-6F show the evolutionary conservation selected mRNA transcripts (i.e., core pluripotent factor tran and divergence of the mA epi-transcriptomes of human and scripts) are mA and translated for a time period, allowing mouse ESCs. FIG. 6A is a Venn diagram showing a 62% self-renewal and proliferation of the pluripotent human stem overlap between methylated genes in M. musculus (purple) cell, whereas after differentiation, the non mA mRNA tran and H. sapiens (red) embryonic stem cells (p value=3.5x10 Scripts are predominantly translated. 92; Fisher exact test). FIG. 6B shows the mA peaks that could 0031 FIGS. 9A-9K shows topology and characterization be mapped to orthologous genomic windows between mouse if néA target genes and is related to FIG. 1. FIG. 9A shows and human were identified. The intensities of mA-seq sig m6A enrichment determined by qRT-PCR. Vertical axis rep nals in human and mouse ESCs were shown for mA peaks resents percentage of recovery. Error bars represent standard found to be unique in mouse (blue), unique in human (red), deviation of the AACT value. ** represents p-values 0.01. and conserved between human and mouse (black). FIG. 6C is FIG.9B is a histogram representing motif density in méA a boxplot of peak intensities of mA sites conserved ("com peaks (Blue) and a random control group of windows (Red). mon') or not conserved (“specific') in mouse and human FIG. 9C shows metagene representation of read density ESCs. (p values=1.3x10' and 8.7x10° respectively). FIG. obtained in input and after méA enrichment for genes with at 6D, FIG. 6E and FIG. 6F show UCSC Genome browser plots least one modification. Black thick box represents the open ofm'A-Seq reads along indicated mRNAS. The grey reads are reading frame while the black line represents the untranslated from non-immunoprecipitated control input libraries and the regions. The CDS and 3' UTR are divided in 100 bins, while US 2016/0264934 A1 Sep. 15, 2016

the 5' UTR is divided in 50 bins. FIG. 9D shows the exon scription. Related to FIG. 5D. FIG. 12B shows multi-exon length distribution of methylated vs unmethylated internal coding and non-coding RNAs exhibit enrichment of méA exons of coding genes is shown. FIG.9E shows the number of sites near the last exon-exon splice junction. The distribution peaks per exon normalized by exon length is shown for dif of méA peaks across the length of the mRNAs (n=9489) and ferent bins of exon length. The error bars represent standard noncoding RNAs (n=207) is shown. The 5' most (first) exon, deviations from 1000 times of bootstrapping. FIG.9F shows all internal exons, and the 3' most (last) exon are divided into the number of peaks per exon normalized by the number of 10bins and the percentage of méA peaks that fall within each motifs (on sense strand) in the exon is shown. The error bars bin are shown (FIG. 12B is related to FIG. 5I). FIG. 12C represent standard deviations from 1000 times of bootstrap shows the density of méA-seq read coverage increases ping. FIG.9G shows the density of méA-seq read coverage sharply downstream of the last exon-exon splice junction in increases sharply downstream of the last exon-exon splice both coding (n=5231) and non-coding RNAs (n=68) (FIG. junction in both coding and non-coding RNAs. FIG. 9H 12C is related to FIG.5I). FIG. 12D shows single-exon genes shows the percentage of méA peaks that fall into normalized tend to have more m6A sites at their 3' end. The percentage of bins across the 5' UTR, CDS, and 3'UTR of single-exon genes m6A peaks that fall into normalized bins across the 5' UTR, is shown. CDS, and 3'UTR of single-exon genes is shown for hESC 0032 FIG.91 shows pie charts representing the fraction of cells (TO and T48 combined, n=137) as well as in merged data genes with móA modification for each quartile of expression. (“All merged'; n=200) from hESCs, 293T (Meyer et al., Black area represents modified genes. FIG. 9J shows the 2012) and HepG2 (Dominissini et al., 2012). (FIG. 12 D is average coverage of Pol2 signal at the transcriptional start site related to FIG.5I). FIG. 12E is a scatter plot representation of of modified and unmodified genes. FIG. 9K is a box plot m6A enrichment score (on the X axis) and gene expression representing translation efficiency as measured by ribosome level in FPKM (on the Y axis) for each móA peak (FIG. 12E profile. is related to FIG.5J). FIG. 12F shows méA peak intensity is 0033 FIGS. 10A-10H show the characterization of Mettl3 not correlated with nascent RNA transcription based on paus knockout cells (FIG. 10 is related to FIG. 2F). FIG. 10A is ing index. The m6A enrichment scores vs GRO-seq deter representative example of DNA sequencing of mutations mined the Pol II traveling ratio is plotted. The pausing index induced by CRISPR genome engineering. The grey areas equal GRO-seq density at promoter defined as -300 and +300 indicate codons in the open reading frame. Representation of of TSS divided by GRO-seq density in the gene body defines the Mettl3 , and Mettl3 protein, with the CRISPR tar as +300 to end of the gene. (FIG. 12F is related to FIG.5J). geted region marked in red. FIG. 10B shows representative FIG. 12G shows mRNA half-life is anti-correlated with móA examples of 2D-TLC plates for mESC wild type and enrichment in genes. (FIG. 12G is related to FIG. 5J). Mettl3-/- mutant. Nucleotide positions are indicated in the 0036 FIGS. 13 A-13E show METTL3 is required for nor leftmost panel. FIG. 10C is a Western blot for Mettl14 in wild mal human ESC endoderm differentiation (and is related to type and two cell lines with Mettl3 KO cell lines. Actin is used FIG. 7). FIG. 13A shows staining for SOX1 and DNA of as loading control. FIG. 10D shows FACS plots of Annexin V neural stem cells in METTL3 knock down (KD) and control and Aqua Live/Dead fixable Viability dye for Wild type and cells. FIG. 13B shows knockdown of METTL3 leads to a two Mettl3 KO cell lines. FIG. 10E shows quantification of reduction in METTL3 mRNA levels. qRT-PCR for METTL3 colony morphologies for Wild type and two Mettl3 KO cell mRNA was performed from RNA extracted from control WT lines. Experiment performed in triplicate, with at least 50 hESC cells versus hESCs with anti-METTL3 shRNA (KD) colonies counted per replicate. Error bars represent standard clone #3 across the three indicated time points during endo deviation. FIG. 10F is a Western blot for Mettl3 in wild type derm differentiation. Error bars represent standard deviation and two independent Mettl3 shRNAs. Actin is used as loading across 3 replicates per time point. (FIG. 13B is related to FIG. control. FIG. 10G shows the m6A ratio, determined by 7B). FIG. 13C shows knockdown of METTL3 leads to a 2D-TLC, in wild type and Mettl3 shRNA line. FIG. 10H functional reduction in m6A levels. Ananti-méAdot blot was shows a cell proliferation assay of wild type and two inde performed on 10x fold dilution of polyA selected RNA from pendent Mettl3 shRNA lines. wildtype (WT) hESC cells versus anti-METTL3 knockdown 0034 FIGS. 11A-11B shows Mettl3 loss of function (KD) clone #3. (FIG. 13C is related to FIG. 7C). FIG. 13D impairs ESC ability to differentiate (and is related to FIGS. 2E shows knockdown of METTL3 leads to a delayed and and 2F). FIG. 11A shows representative sections ofteratomas reduced induction of endodermal marker genes. qRT-PCR stained with hematoxylin and eosin (left), and immunohis was performed on indicated genes and time points. Error bars tochemistry with antibody against Nanog (center) and Ki67 represent standard deviation across 3 replicates per time (right). The bar represents 100 um. (related to FIG. 3D). FIG. point. (FIG. 13D is related to FIGS. 7D and 7E). FIG. 13E 11B shows relative mRNA levels between mettl3-f- derived shows knockdown of METTL3 leads prevents the normal tumors and wild-type derived tumors for Oct4, Nanog, Ki67, reduction of stem maintenance/marker genes. qRT-PCR was Myhé, Tuj1 and Sox 17. Error bars represent standard devia performed for indicated genes and time points. Error bars tion of the AACT value. represent standard deviation across 3 replicates per time 0035 FIGS. 12A-12G show mA-seq profiling of hESC point. (FIG. 13E is related to FIGS. 7D and 7E). during endoderm differentiation (and is related to FIG. 5.) FIG. 12A shows representative examples of méA location in DETAILED DESCRIPTION OF THE INVENTION multi-exon non-coding RNAS and single-exon mRNAS. 0037. The present invention is directed to, in part, meth UCSC Genome browser plots of méA-seq reads (red) along ods, compositions and kits to maintain a stem cell population, indicated RNAs in undifferentiated hESCs (i.e.TO). The grey Such as a human stem cell population, in an undifferentiated reads are from non-immunoprecipitated control input librar state, comprising contacting the stem cell population with an ies. The read density is calculated from the average of the two inhibitor of METTL3 or METTL4. In some embodiments, replicate T0 samples. Arrow indicates the direction of tran the methods, compositions and kits as disclosed herein relate US 2016/0264934 A1 Sep. 15, 2016 to methods to prevent a stem cell population differentiating levels of mA (i.e., mA peak intensity) of a subset of RNA along an endoderm lineage. Other aspects of the technology transcripts can accurately predict the cell State of a human described herein relates to methods, compositions and kits to stem cell population. promote a stem cell population to differentiate along an endo 0042 Another aspect of the present invention relates to a derm lineage. Moreover, another aspect of the technology method for assessing méA levels in set of RNA transcripts in described herein relates to methods, assays, arrays and kits a population of stem cells, which is useful to predict the for performing méA analysis of RNA from stem cell popu functionality and Suitability of a stem cell line, e.g., a pluri potent stem cell line for a desired use. lations to characterize the cell state of the cell population, I0043. In some embodiments, the level of mA (i.e., mA which can be used, for example, as a quality control for the peak intensity) of a subset of RNA transcripts measured in the stem cell population. In some embodiments, the stem cell methods, arrays, assays, kits and systems as disclosed herein population is a human stem cell population, e.g., a hESC cell includes at least 10, or at least 20 genes selected from any population or other human stem cell line. combination of the genes listed in Table 1 or Table 2. 0038. The present invention is also directed to an array 0044. In some embodiments, the differentiation assays, comprising nucleic acid sequences that hybridize to a set of methods, systems and kits as disclosed herein can be used to RNA sequences (RNA transcripts, including mRNA tran characterize and determine the differentiation potential of a scripts and 3'UTR regions, and untranslated RNA variety of stem cell lines, e.g., a pluripotent stem cell lines, sequences), or Subsets thereof, which can be used to assess the Such as, but not limited to embryonic stem cells, adult stem m6A levels for use in characterizing the cell state of a stem cells, autologous adult stem cells, iPS cells, and other pluri cell population, e.g., human stem cell population. Aspects of potent stem cell lines, such as reprogrammed cells, direct the present invention relate to arrays, assays, systems, kits and reprogrammed cells or partially reprogrammed cells. In some methods to rapidly and inexpensively assess méA levels (i.e., embodiments, a stem cell line is a human stem cell line. In m6A peak intensities) in a set of RNA sequences (e.g., RNA Some embodiments, a stem cell line, e.g., a pluripotent stem transcripts, including mRNA transcripts and 3'UTR regions, cell line is a genetically modified stem cell line. In some and untranslated RNA sequences) to assess stem cell popu embodiments, where the stem cell line, e.g., a pluripotent lations, including human stem cell populations, for their gen stem cell line is for therapeutic use or for transplantation into eral quality (e.g., pluripotent capacity and cell state) and a subject, a stem cell line is an autologous stem cell line, e.g., differentiation capacity. derived from a subject to which a population of stem cells will 0039. As disclosed herein in the Examples, the inventors be transplanted back into, and in alternative embodiments, a have discovered the function of mA in human embryonic stem cell line, e.g., a pluripotent stem cell line is an allogeneic stem cells (ESCs), and surprisingly discovered that mA is pluripotent stem cell line. present on transcripts encoding multiple core pluripotency DEFINITIONS transcription factors, including but not limited to Nanog and SoX2, and was also enriched in 3' untranslated regions at 0045. For convenience, certain terms employed herein, in defined sequence motifs, and importantly marks unstable the specification, examples and appended claims are col transcripts, including transcripts that need to be turned over lected here. Unless stated otherwise, or implicit from context, upon differentiation. Using genetic inactivation or depletion the following terms and phrases include the meanings pro of human Mettl3 in hESCs, the inventors discovered a vided below. Unless explicitly stated otherwise, or apparent decrease in mA levels on select target genes, a prolonged from context, the terms and phrases below do not exclude the Nanog expression upon differentiation, and impaired ESCs meaning that the term or phrase has acquired in the art to exit from self-renewal towards differentiation into several which it pertains. The definitions are provided to aid in lineages in vitro and in vivo. In contrast to prior reports of describing particular embodiments, and are not intended to Mettl3 knockdown in mESCs, knockdown of Mettl3 inhESC limit the claimed invention, because the scope of the inven lead to the unexpected result of increased self-renewal and tion is limited only by the claims. Unless otherwise defined, proliferation of hESC, and reduced ability to differentiate all technical and Scientific terms used herein have the same along specific lineages, in particular endoderm lineages. meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. 0040. Thus, the inventors have discovered that, in human 0046. The term “nucleic acid' or “nucleic acid sequence' stem cell populations in particular, mA on RNA demon as used herein is defined as a molecule comprised of two or strates the transcriptome flexibility and is required for human more deoxyribonucleotides or ribonucleotides. The exact stem cells to differentiate to specific lineages. In particular, length of the sequence will depend on many factors, which in the inventors have discovered that mA-modifications in the turn depends on the ultimate function or use of the sequence. RNA (in mRNA transcripts, non-coding regions and in non The sequence can be generated in any manner, including coding RNAS) of human stem cell populations serve as stem chemical synthesis, DNA replication, reverse transcription, cells internal “quality control” as the mA marks the mRNA or a combination thereof. Due to the amplifying nature of the as having passed a quality control test in the cell, as stem cells present invention, the number of deoxyribonucleotide or cannot differentiate without mA-modifications on key tran ribonucleotide bases within a nucleic acid sequence can be Scripts. virtually unlimited. The term "oligonucleotide,” as used 0041 As disclosed herein in the Examples, the inventors herein, is interchangeably synonymous with the term have surprisingly discovered that inhibition of METTL3 and/ “nucleic acid sequence'. or METTL4 in human stem cell populations can be used to 0047. As used herein, oligonucleotide sequences that are maintain the cells in a pluripotent state, and promote self complementary to one or more of the genes described herein, renewal and proliferation. Also disclosed herein in the refers to oligonucleotides that are capable of hybridizing Examples, the inventors have surprisingly discovered that the under stringent conditions to at least part of the nucleotide US 2016/0264934 A1 Sep. 15, 2016 sequence of said genes. Such hybridizable oligonucleotides coefficient, or a classification probability or can simply be will typically exhibit at least about 75% sequence identity at expressed as the expression level difference, or the aggregate the nucleotide level to said genes, preferably about 80% or of the expression level differences, between a cell sample 85% sequence identity or more preferably about 90% or 95% expression profile and a baseline template. or more sequence identity to said genes. 0055. The term “expression” refers to the cellular pro 0048. The term “primer' as used herein refers to a cesses involved in producing RNA and proteins and as appro sequence of nucleic acid which is complementary or Substan priate, Secreting proteins, including where applicable, but not tially complementary to a portion of the target gene of inter limited to, for example, transcription, translation, folding, est. Typically 2 primers (e.g., a 3' primer and a 5' primer) are modification and processing. "Expression products include complementary to different portions of the target gene of RNA transcribed from a gene and polypeptides obtained by interest and can be used to amplify a portion of the mRNA of translation of mRNA transcribed from a gene. the target gene by RT-PCR. 0056. As used herein, the terms “measuring méA levels.” 0049. The phrase “Bind(s) substantially” refers to comple “obtaining méA level.” and “detecting méA levels and the mentary hybridization between a probe nucleic acid and a like, includes methods that quantify méA levels on RNA target nucleic acid and embraces minor mismatches that can species, for example, a transcript of a gene, or non-coding be accommodated by reducing the stringency of the hybrid RNA. In some embodiments, the assay provides an indicator ization media to achieve the desired detection of the target of the cell state of a stem cell population (e.g., if it is an polynucleotide sequence. undifferentiated state or differentiated state). In some 0050. The phrase “hybridizing specifically to” refers to the embodiments, the indicator is a numerical value (e.g., the binding, duplexing or hybridizing of a molecule Substantially value from a t-test from the comparison of the average ACt for to or only to a particular nucleotide sequence or sequences each target gene measured as compared to reference ACt of under stringent conditions when that sequence is present in a the same gene for a reference m6A level or peak intensity, as complex mixture (e.g., total cellular) DNA or RNA. disclosed herein in the Examples). In some embodiments, the 0051. The term “biomarker” means any gene, protein, or assay can provide a “yes” or “no result without necessarily an EST derived from that gene, the expression or level of providing quantification, indicating that the stem cell popu which changes between certain conditions. Where the expres lation analysed is in an undifferentiated (i.e., pluripotent) sion of the gene correlates with a certain condition, the gene state or not, respectively. Alternatively, a measured móA lev is a biomarker for that condition. els or méA peak intensity can be expressed as any quantitative 0.052 As used herein, the term “gene' has its meaning as value, for example, a fold-change in m6A peak intensity, up understood in the art. However, it will be appreciated by those or down, relative to a control level of méA peak intensity of of ordinary skill in the art that the term “gene' can include the same gene in another sample, or a log ratio of expression, gene regulatory sequences (e.g., promoters, enhancers, etc.) or any visual representation thereof. Such as, for example, a and/or intron sequences. It will further be appreciated that “heatmap' where a color intensity is representative of the definitions of gene include references to nucleic acids that do m6A peak intensity for a given RNA species. not encode proteins but rather encode functional RNA mol 0057 The terms “móA” and “mA are used interchange ecules such as tRNAS. For clarity, the term gene generally ably herein and refers to N(6)-methyladenosine residues in refers to a portion of a nucleic acid that encodes a protein; the RNA species in a cell, including mA modifications in any term can optionally encompass regulatory sequences. This region of a mRNA molecule (including coding regions and definition is not intended to exclude application of the term non-coding regions such as untranslated 3'UTR and STOP 'gene' to non-protein coding expression units but rather to codons), and untranslated RNA molecules, such as linc RNA clarify that, in most cases, the term as used in this document and miRNA molecules or other multi-exon non-coding RNAs refers to a protein coding nucleic acid. In some cases, the gene and single-exon mRNAS. includes regulatory sequences involved in transcription, or 0058. The term “méAintensity profile' or “méA signature message production or composition. In other embodiments, profile' as used herein is intended to refer to the m6A levels the gene comprises transcribed sequences that encode for a of a gene, or a set of genes, in a stem cell population. In one protein, polypeptide or peptide. In keeping with the terminol embodiments the term “gene profile” refers to the m6A peak ogy described herein, an "isolated gene' can comprise tran intensity levels or of a set of 10 or more genes listed in Table scribed nucleic acid(s), regulatory sequences, coding 1 or Table 2, or any selection of the genes of between 10-20, sequences, or the like, isolated Substantially away from other or 20-30, or 30-50, or 50-100, or 100-200, or 200-300, or Such sequences, such as other naturally occurring genes, 300-400, or 400-600 listed in Table 1 or Table 2, which are regulatory sequences, polypeptide or peptide encoding described herein. sequences, etc. In this respect, the term “gene' is used for 0059. The term “differential expression” in the context of simplicity to refer to a nucleic acid comprising a nucleotide the present invention means the gene is up-regulated or down sequence that is transcribed, and the complement thereof. regulated in comparison to its normal variation of expression 0053. The term “signature' as used herein refers to the in a pluripotent stem cell. Statistical methods for calculating m6A levels present on a set of target genes (or RNA species or differential expression of genes are discussed elsewhere mRNA transcipts). herein. 0054 The term a “similarity value” is a number that rep 0060. The term “genes of Table 1 or Table 2 is used resents the degree of similarity between two things being interchangeably herein with "gene listed in Table 1 or Table compared. For example, a similarity value can be a number 2 and refers to the RNA species or gene products of genes that indicates the overall similarity between a cell sample listed in Table 1 and/or Table 2, respectively. By “gene prod expression profile using specific phenotype-related biomark uct” is meant any product of transcription or translation of the ers and a control specific to that template. The similarity value genes, whether produced by natural or artificial means. In can be expressed as a similarity metric, such as a correlation Some embodiments, the genes referred to herein are those US 2016/0264934 A1 Sep. 15, 2016

listed in Table 1. The same applies to “genes of Table 2, but conditions; factors such as the length and nature (DNA, RNA, refers to the gene products of genes listed in Table 2. base composition) of the probe and nature of the target (DNA, 0061. The term “hybridization” or “hybridizes” as used RNA, base composition, present in Solution or immobilized, herein involves the annealing of a complementary sequence etc.) and the concentration of the salts and other components to the target nucleic acid (the sequence to be detected). The (e.g., the presence or absence of formamide, dextran Sulfate, ability of two polymers of nucleic acid containing comple polyethylene glycol) are considered and the hybridization mentary sequences to find each other and anneal through base Solution can be varied to generate conditions of either low or pairing interaction is a well-recognized phenomenon. The high stringency hybridization different from, but equivalent initial observations of the “hybridization' process by Marmur to, the above listed conditions. and Lane, Proc. Natl. Acad. Sci. USA, 46:453 (1960) and 0066. The term "solid surface' as used herein refers to a Doty et al., Proc. Natl. Acad. Sci. USA, 46:461 (1960) have material having a rigid or semi-rigid Surface. Such materials been followed by the refinement of this process into an essen will preferably take the form of chips, plates (e.g., microtiter tial tool of modern biology. plates), slides, Small beads, pellets, disks or other convenient 0062. The terms “complementary' or “substantially forms, although other forms can be used. In some embodi complementary” as used herein refer to the hybridization or ments, at least one surface of the solid surface will be sub base pairing between nucleotides or nucleic acids. Such as, for stantially flat. In other embodiments, a roughly spherical instance, between the two strands of a double stranded DNA shape is preferred. molecule or between an oligonucleotide primer and a primer 0067. The term “reprogramming as used herein refers to binding site on a single Stranded nucleic acid to be sequenced a process that alters or reverses the differentiation state of a or amplified. Complementary nucleotides are, generally, A differentiated cell (e.g. a somatic cell). Stated another way, and T (or A and U), or C and G. Two single stranded RNA or reprogramming refers to a process of driving the differentia DNA molecules are said to be substantially complementary tion of a cell backwards to a more undifferentiated or more when the nucleotides of one strand, optimally aligned with primitive type of cell. Complete reprogramming involves appropriate nucleotide insertions or deletions, pair with at complete reversal of at least some of the heritable patterns of least about 80% of the nucleotides of the other strand, usually nucleic acid modification (e.g., methylation), chromatin con at least about 90% to 95%, and more preferably from about 98 densation, epigenetic changes, genomic imprinting, etc., that to 100%. Alternatively, substantial complementarity exists occur during cellular differentiation as a Zygote develops into when an RNA or DNA strand will hybridize under selective an adult. Reprogramming is distinct from simply maintaining hybridization conditions to its complement. Typically, selec the existing undifferentiated state of a cell that is already tive hybridization will occur when there is at least about 65% pluripotent or maintaining the existing less than fully differ complementarity over a stretch of at least 14 to 25 nucle entiated State of a cell that is already a multipotent cell (e.g., otides, preferably at least about 75%, more preferably at least a hematopoietic stem cell). Reprogramming is also distinct about 90% complementarity. See M. Kanehisa, Nucleic from promoting the self-renewal or proliferation of cells that Acids Res., 12:203 (1984), incorporated herein by reference. are already pluripotent or multipotent. The term “at least a portion of as used herein, refers to the 0068. The term “induced pluripotent stem cell” or “iPSC complimentarity between a circular DNA template and an or “iPS cell refers to a cell derived from a complete reversion oligonucleotide primer of at least one . or reprogramming of the differentiation state of a differenti 0063 Partially complementary sequences will hybridize ated cell (e.g. a somatic cell). As used herein, an iPSC is fully under low stringency conditions. This is not to say that con reprogrammed and is a cell which has undergone complete ditions of low Stringency are Such that non-specific binding is epigenetic reprogramming. As used herein, an iPSC is a cell permitted; low stringency conditions require that the binding which cannot be further reprogrammed to a more immature of two sequences to one another be a specific (i.e., selective) state (e.g., an iPSC cell is terminally reprogrammed). interaction. The absence of non-specific binding can be tested 0069. The term “pluripotent” as used herein refers to a cell by the use of a second target which lacks even a partial degree with the capacity, under different conditions, to differentiate of complementarity (e.g., less than about 30% identity); in the to cell types characteristic of all three germ cell layers (endo absence of non-specific binding the probe will not hybridize derm, mesoderm and ectoderm). A pluripotent stem cell typi to the second non-complementary target. cally has the potential to divide in vitro for a long period of 0064. The term “stringency” refers to the degree of speci time, e.g., greater than one year or more than 30 passages. ficity imposed on a hybridization reaction by the specific (0070. The term “differentiated cell” refers to any primary conditions used for a reaction. When used in reference to cell that is not, in its native form, pluripotent as that term is nucleic acid hybridization, stringency typically occurs in a defined herein. The term a “differentiated cell also encom range from about T-5°C. (5° C. below the T of the probe) passes cells that are partially differentiated, such as multipo to about 20° C., 25° C. below T. As will be understood by tent cells, or cells that are stable non-pluripotent partially those of skill in the art, a stringent hybridization can be used reprogrammed cells. It should be noted that placing many to identify or detect identical polynucleotide sequences or to primary cells in culture can lead to some loss of fully differ identify or detect similar or related polynucleotide sequences. entiated characteristics. However, such cells are included in Under'stringent conditions a nucleic acid sequence of inter the term differentiated cells and the loss of fully differentiated est will hybridize to its exact complement and closely related characteristics does not render these cells non-differentiated sequences. Suitably stringent hybridization conditions for cells (e.g. undifferentiated cells) or pluripotent cells. The nucleic acid hybridization of a primer or short probe include, transition of a differentiated cell to pluripotency requires a e.g., 3xSSC, 0.1% SDS, at 50° C. reprogramming stimulus beyond the stimuli that lead to par 0065. When used in reference to nucleic acid hybridiza tial loss of differentiated character in culture. Reprogrammed tion the art knows well that numerous equivalent conditions cells also have the characteristic of the capacity of extended can be employed to comprise either low or high Stringency passaging without loss of growth potential, relative to pri US 2016/0264934 A1 Sep. 15, 2016

mary cell parents, which generally have capacity for only a tially identical population of ancestor cells. The cell line can limited number of divisions in culture. In some embodiments, have been or can be capable of being maintained in culture for the term “differentiated cell also refers to a cell of a more an extended period (e.g., months, years, for an unlimited specialized cell type derived from a cell of a less specialized period of time). Cell lines include all those cell lines recog cell type (e.g., from an undifferentiated cell or a repro nized in the art as such. It will be appreciated that cells acquire grammed cell) where the cell has undergone a cellular differ mutations and possibly epigenetic changes over time Such entiation process. that at least some properties of individual cells of a cell line (0071. As used herein, the term “adult cell refers to a cell can differ with respect to each other. found throughout the body after embryonic development. 0078. The term “lineages” as used herein describes a cell 0072. In the context of cell ontogeny, the term “differen with a common ancestry or cells with a common develop tiate', or “differentiating is a relative term meaning a “dif mental fate. By way of an example only, stating that a cell that ferentiated cell' is a cell that has progressed further down the is of endoderm origin or is of "endodermal lineage' means the developmental pathway than its precursor cell. Thus in some cell was derived from an endodermal cell and can differenti embodiments, a reprogrammed cell as this term is defined ate along the endodermal lineage restricted pathways, such as herein, can differentiate to lineage-restricted precursor cells one or more developmental lineage pathways which give rise (such as a mesodermal stem cell), which in turn can differen to definitive endoderm cells, which in turn can differentiate tiate into other types of precursor cells further down the into liver cells, thymus, pancreas, lung and intestine. pathway (Such as antissue specific precursor, for example, a cardiomyocyte precursor), and then to an end-stage differen 007.9 The terms “decrease”, “reduced, “reduction', tiated cell, which plays a characteristic role in a certain tissue “decrease' or “inhibit are all used herein generally to mean type, and can or cannot retain the capacity to proliferate a decrease by a statistically significant amount. However, for further. avoidance of doubt, “reduced, “reduction' or “decrease' or 0073. The term "embryonic stem cell” is used to refer to “inhibit means a decrease by at least 10% as compared to a the pluripotent stem cells of the inner cell mass of the embry reference level, for example a decrease by at least about 20%, onic blastocyst (see U.S. Pat. Nos. 5,843,780, 6,200,806, or at least about 30%, or at least about 40%, or at least about which are incorporated herein by reference). Such cells can 50%, or at least about 60%, or at least about 70%, or at least similarly be obtained from the inner cell mass of blastocysts about 80%, or at least about 90% or up to and including a derived from Somatic cell nuclear transfer (see, for example, 100% decrease (e.g. absent level as compared to a reference U.S. Pat. Nos. 5,945,577, 5,994,619, 6,235,970, which are sample), or any decrease between 10-100% as compared to a incorporated herein by reference). The distinguishing char reference level. acteristics of an embryonic stem cell define an embryonic 0080. The terms “increased, “increase' or “enhance' or stem cell phenotype. Accordingly, a cell has the phenotype of “activate” are all used hereinto generally mean an increase by an embryonic stem cell if it possesses one or more of the a statically significant amount; for the avoidance of any unique characteristics of an embryonic stem cell Such that doubt, the terms “increased, “increase' or "enhance' or that cell can be distinguished from other cells. Exemplary “activate” means an increase of at least 10% as compared to a distinguishing embryonic stem cell characteristics include, reference level, for example an increase of at least about 20%, without limitation, gene expression profile, proliferative or at least about 30%, or at least about 40%, or at least about capacity, differentiation capacity, karyotype, responsiveness 50%, or at least about 60%, or at least about 70%, or at least to particular culture conditions, and the like. about 80%, or at least about 90% or up to and including a 0074 The term “phenotype” refers to one or a number of 100% increase or any increase between 10-100% as com total biological characteristics that define the cellor organism pared to a reference level, or at least about a 2-fold, or at least under aparticular set of environmental conditions and factors, about a 3-fold, or at least about a 4-fold, or at least about a regardless of the actual genotype. 5-fold or at least about a 10-fold increase, or any increase 0075. The term “cell culture medium' (also referred to between 2-fold and 10-fold or greater as compared to a ref herein as a “culture medium' or “medium') as referred to erence level. herein is a medium for culturing cells containing nutrients I0081. The term “statistically significant’ or “signifi that maintain cell viability and support proliferation. The cell cantly refers to statistical significance and generally means a culture medium can contain any of the following in an appro two standard deviation (2 SD) or greater difference in a value priate combination: salt(s), buffer(s), amino acids, glucose or of the marker. The term refers to statistical evidence that there other Sugar(s), antibiotics, serum or serum replacement, and is a difference. It is defined as the probability of making a other components such as peptide growth factors, etc. Cell decision to reject the null hypothesis when the null hypothesis culture media ordinarily used for particular cell types are is actually true. Statistical significance can be determined by known to those skilled in the art. t-test or using a p-value. 0076. The term “self-renewing media' or “self-renewing 0082. As used herein, the term “DNA is defined as deox culture conditions' refers to a medium for culturing stem yribonucleic acid. cells which contains nutrients that allow a stem cell line to propagate in an undifferentiated State. Self-renewing culture 0083. The term “differentiation as used herein refers to media is well known to those of ordinary skill in the art and is the cellular development of a cell from a primitive stage ordinarily used for maintenance of stem cells as embroid towards a more mature (i.e. less primitive) cell. bodies (EBs), where the stem cells divide and replicate in an 0084. The term “directed differentiation' as used herein undifferentiated state. refers to forcing differentiation of a cell from an undifferen 0077. The term “cell line” refers to a population of largely tiated (e.g. more primitive cell) to a more mature cell type (i.e. or substantially identical cells that has typically been derived less primitive cell) via genetic and/or environmental manipu from a single ancestor cell or from a defined and/or Substan lation. In some embodiments, a reprogrammed cell as dis US 2016/0264934 A1 Sep. 15, 2016

closed herein is subject to directed differentiation into spe ing domain of a monoclonal antibody. For example, an anti cific cell types, such as neuronal cell types, muscle cell types body can include a heavy (H) chain variable region and the like. (abbreviated herein as VH), and a light (L) chain variable 0085. The term “disease modeling as used herein refers region (abbreviated herein as VL). In another example, an to the use of laboratory cell culture or animal research to antibody includes two heavy (H) chain variable regions and obtain new information about human disease or illness. In two light (L) chain variable regions. The term “antibody Some embodiments, a reprogrammed cell produced by the reagent encompasses antigen-binding fragments of antibod methods as disclosed herein can be used in disease modeling ies (e.g., single chain antibodies, Fab and SFab fragments, experiments. F(ab')2, Fa fragments, Fv fragments, scFV, and domain anti I0086. The term “drug screening as used herein refers to body (dAb) fragments (see, e.g. de Wildt et al., Eur J. Immu the use of cells and tissues in the laboratory to identify drugs nol. 1996; 26(3):629-39; which is incorporated by reference with a specific function. herein in its entirety)) as well as complete antibodies. An 0087. The term “marker” as used interchangeably with antibody can have the structural features of IgA, IgG, IgE, “biomarker and describes the characteristics and/or pheno IgD, IgM (as well as Subtypes and combinations thereof). type of a cell. Markers can be used for selection of cells Antibodies can be from any source, including mouse, rabbit, comprising characteristics of interest. Markers will vary with pig, rat, and primate (human and non-human primate) and specific cells. Markers are characteristics, whether morpho primatized antibodies. Antibodies also include midibodies, logical, functional or biochemical (enzymatic) characteristics humanized antibodies, chimeric antibodies, and the like. of the cell of a particular cell type, or molecules expressed by (0091. The VH and VL regions can be further subdivided the cell type. Preferably, such markers are gene transcripts or into regions of hyperVariability, termed “complementarity their translation products (e.g., proteins). However, a marker determining regions” (“CDR), interspersed with regions that can consist of any molecule found in a cell including, but not are more conserved, termed “framework regions” (“FR). limited to, proteins (peptides and polypeptides), lipids, The extent of the framework region and CDRs has been polysaccharides, nucleic acids and steroids. Examples of precisely defined (see, Kabat, E.A., et al. (1991) Sequences of morphological characteristics or traits include, but are not Proteins of Immunological Interest, Fifth Edition, U.S. limited to, shape, size, and nuclear to cytoplasmic ratio. Department of Health and Human Services, NIH Publication Examples of functional characteristics or traits include, but No. 91-3242, and Chothia, C. et al. (1987) J. Mol. Biol. are not limited to, the ability to adhere to particular substrates, 196:901-917; which are incorporated by reference herein in ability to incorporate or exclude particular dyes, ability to their entireties). Each VH and VL is typically composed of migrate under particular conditions, and the ability to differ three CDRs and four FRs, arranged from amino-terminus to entiate along particular lineages. Markers can be detected by carboxy-terminus in the following order: FR1, CDR1, FR2, any method available to one of skill in the art. Markers can CDR2, FR3, CDR3, FR4. also be the absence of a morphological characteristic or absence of proteins, lipids etc. Markers can be a combination 0092. The terms “antigen-binding fragment' or “antigen of a panel of unique characteristics of the presence and binding domain', which are used interchangeably herein are absence of polypeptides and other morphological character used to refer to one or more fragments of a full length anti istics. body that retain the ability to specifically bind to a target of 0088 As used herein an “antibody' refers to IgG, IgM, interest. Examples of binding fragments encompassed within IgA, Ig) or IgE molecules or antigen-specific antibody frag the term “antigen-binding fragment of a full length antibody ments thereof (including, but not limited to, a Fab, F(ab'). Fv, include (i) a Fab fragment, a monovalent fragment consisting disulphide linked Fv, scfv, single domain antibody, closed of the VL, VH, CL and CH1 domains; (ii) a F(ab')2 fragment, conformation multispecific antibody, disulphide-linked Scfv, a bivalent fragment including two Fab fragments linked by a diabody), whether derived from any species that naturally disulfide bridge at the hinge region; (iii) an Fd fragment produces an antibody, or created by recombinant DNA tech consisting of the VH and CH1 domains; (iv) an Fv fragment nology; whether isolated from serum, B-cells, hybridomas, consisting of the VL and VH domains of a single arm of an transfectomas, yeast or bacteria. antibody, (v) a dAb fragment (Ward et al., (1989) Nature 0089. As described herein, an “antigen' is a molecule that 341:544-546; which is incorporated by reference herein in its is bound by a binding site comprising the complementarity entirety), which consists of a VH or VL domain; and (vi) an determining regions (CDRs) of an antibody agent. Typically, isolated complementarity determining region (CDR) that antigens are bound by antibody ligands and are capable of retains specific antigen-binding functionality. raising an antibody response in vivo. An antigen can be a 0093. As used herein, the term “specific binding refers to polypeptide, protein, nucleic acid or other molecule or por a chemical interaction between two molecules, compounds, tion thereof. The term “antigenic determinant” refers to an cells and/or particles wherein the first entity binds to the epitope on the antigen recognized by an antigen-binding mol second, target entity with greater specificity and affinity than ecule, and more particularly, by the antigen-binding site of it binds to a third entity which is a non-target. In some said molecule. embodiments, specific binding can refer to an affinity of the 0090. As used herein, the term “antibody reagent” refers to first entity for the second target entity which is at least 10 a polypeptide that includes at least one immunoglobulin Vari times, at least 50 times, at least 100 times, at least 500 times, able domain or immunoglobulin variable domain sequence at least 1000 times or greater than the affinity for the third and which specifically binds to a given antigen. An antibody nontarget entity. A reagent specific for a given target is one reagent can comprise an antibody or a polypeptide compris that exhibits specific binding for that target under the condi ing an antigen-binding domain of an antibody. In some tions of the assay being utilized. In certain embodiments, embodiments, an antibody reagent can comprise a mono specific binding is indicated by a dissociation constant on the clonal antibody or a polypeptide comprising an antigen-bind order of s 10 M,<10 M, s10' Mor below. US 2016/0264934 A1 Sep. 15, 2016

0094. As used herein, “expression level” refers to the num to as “substantially complementary' with respect to a second ber of mRNA molecules and/or polypeptide molecules sequence herein, the two sequences can be fully complemen encoded by a given gene that are present in a cell or sample. tary, or they can form one or more, but generally not more Expression levels can be increased or decreased relative to a than 5, 4, 3 or 2 mismatched base pairs upon hybridization for reference level. a duplex up to 30 base pairs, while retaining the ability to 0095. As used herein, the term “iRNA agent” or “RNAi hybridize under the conditions most relevant to their ultimate agent” refers to an agent that contains RNA as that term is application, e.g., inhibition of gene expression via a RISC defined herein, and which mediates the targeted cleavage of pathway. However, where two oligonucleotides are designed an RNA transcript via an RNA-induced silencing complex to form, upon hybridization, one or more single stranded (RISC) pathway. In one embodiment, an iRNA as described overhangs. Such overhangs shall not be regarded as mis herein inhibits the expression METTL3/Link a stem cell or matches with regard to the determination of complementarity. progenitor cell, e.g., HSC or a mammal. For example, a dsRNA comprising one oligonucleotide 21 0096. As used herein, “target sequence” refers to a con nucleotides in length and another oligonucleotide 23 nucle tiguous portion of the nucleotide sequence of a messenger otides in length, wherein the longer oligonucleotide com RNA (mRNA) molecule formed during the transcription of a prises a sequence of 21 nucleotides that is fully complemen gene, including mRNA that is a product of RNA processing of tary to the shorter oligonucleotide, can yet be referred to as a primary transcription product. The target portion of the “fully complementary for the purposes described herein. sequence will be at least long enough to serve as a specific 0100 “Complementary’ sequences, as used herein, can binding site for an iRNA agent and/or as a substrate for also include, or be formed entirely from, non-Watson-Crick iRNA-directed cleavage at or near that portion. For example, base pairs and/or base pairs formed from non-natural and the target sequence will generally be from 9-36 nucleotides in modified nucleotides, in as far as the above requirements with length, e.g., 15-30 nucleotides in length, including all Sub respect to their ability to hybridize are fulfilled. Such non ranges therebetween. As non-limiting examples, the target Watson-Crick base pairs includes, but are not limited to, G:U sequence can be from 15-30 nucleotides, 15-26 nucleotides, Wobble or Hoogstein base pairing. 15-23 nucleotides, 15-22 nucleotides, 15-21 nucleotides, 0101 The terms “complementary.” “fully complemen 15-20 nucleotides, 15-19 nucleotides, 15-18 nucleotides, tary' and “substantially complementary' herein can be used 15-17 nucleotides, 18-30 nucleotides, 18-26 nucleotides, with respect to the base matching between the sense strand 18-23 nucleotides, 18-22 nucleotides, 18-21 nucleotides, and the antisense strand of a dsRNA, or between the antisense 18-20 nucleotides, 19-30 nucleotides, 19-26 nucleotides, strand of an iRNA agent and a target sequence, as will be 19-23 nucleotides, 19-22 nucleotides, 19-21 nucleotides, understood from the context of their use. 19-20 nucleotides, 20-30 nucleotides, 20-26 nucleotides, 0102. As used herein, a polynucleotide that is “substan 20-25 nucleotides, 20-24 nucleotides, 20-23 nucleotides, tially complementary to at least part of a messenger RNA 20-22 nucleotides, 20-21 nucleotides, 21-30 nucleotides, (mRNA) refers to a polynucleotide that is substantially 21-26 nucleotides, 21-25 nucleotides, 21-24 nucleotides, complementary to a contiguous portion of the mRNA of 21-23 nucleotides, or 21-22 nucleotides. interest (e.g., an mRNA encoding METTL3). For example, a 0097. As used herein, the term “strand comprising a polynucleotide is complementary to at least a part of a mRNA sequence” refers to an oligonucleotide comprising a chain of if the sequence is substantially complementary to a non nucleotides that is described by the sequence referred to using interrupted portion of the mRNA. the standard nucleotide nomenclature. (0103. The term double-stranded RNA or “dsRNA, as 0098. As used herein, and unless otherwise indicated, the used herein, refers to an iRNA that includes an RNA molecule term “complementary, when used to describe a first nucle or complex of molecules having a hybridized duplex region otide sequence in relation to a second nucleotide sequence, that comprises two anti-parallel and Substantially comple refers to the ability of an oligonucleotide or polynucleotide mentary nucleic acid strands, which will be referred to as comprising the first nucleotide sequence to hybridize and having “sense' and “antisense' orientations with respect to a form a duplex structure under certain conditions with an target RNA. The duplex region can be of any length that oligonucleotide or polynucleotide comprising the second permits specific degradation of a desired target RNA through nucleotide sequence, as will be understood by the skilled a RISC pathway, but will typically range from 9 to 36 base person. Such conditions can, for example, be stringent con pairs in length, e.g., 15-30 base pairs in length. Considering a ditions, where stringent conditions can include: 400 mM duplex between 9 and 36 base pairs, the duplex can be any NaCl, 40 mM PIPES pH 6.4, 1 mM EDTA, 50° C. or 70° C. length in this range, for example, 9, 10, 11, 12, 13, 14, 15, 16, for 12-16 hours followed by washing. Other conditions, such 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,33, as physiologically relevant conditions as can be encountered 34, 35, or 36 and any Sub-range therein between, including, inside an organism, can apply. The skilled person will be able but not limited to 15-30 base pairs, 15-26 base pairs, 15-23 to determine the set of conditions most appropriate for a test base pairs, 15-22 base pairs, 15-21 base pairs, 15-20 base of complementarity of two sequences in accordance with the pairs, 15-19 base pairs, 15-18 base pairs, 15-17 base pairs, ultimate application of the hybridized nucleotides. 18-30 base pairs, 18-26 base pairs, 18-23 base pairs, 18-22 0099 Complementary sequences within an iRNA, e.g., base pairs, 18-21 base pairs, 18-20 base pairs, 19-30 base within a dsRNA as described herein, include base-pairing of pairs, 19-26 base pairs, 19-23 base pairs, 19-22 base pairs, the oligonucleotide or polynucleotide comprising a first 19-21 base pairs, 19-20 base pairs, 20-30 base pairs, 20-26 nucleotide sequence to an oligonucleotide or polynucleotide base pairs, 20-25 base pairs, 20-24 base pairs, 20-23 base comprising a second nucleotide sequence over the entire pairs, 20-22 base pairs, 20-21 base pairs, 21-30 base pairs, length of one or both nucleotide sequences. Such sequences 21-26 base pairs, 21-25 base pairs, 21-24 base pairs, 21-23 can be referred to as “fully complementary' with respect to base pairs, or 21-22 base pairs. dsRNAs generated in the cell each other herein. However, where a first sequence is referred by processing with Dicer and similar enzymes are generally US 2016/0264934 A1 Sep. 15, 2016

in the range of 19-22 base pairs in length. One strand of the dsRNA. However, it is self evident that under no circum duplex region of a dsDNA comprises a sequence that is Sub stances is a double stranded DNA molecule encompassed by stantially complementary to a region of a target RNA. The the term 'iRNA two strands forming the duplex structure can be from a single 0106. In one aspect, an RNA interference agent includes a RNA molecule having at least one self-complementary single stranded RNA that interacts with a target RNA region, or can be formed from two or more separate RNA sequence to direct the cleavage of the target RNA. Without molecules. Where the duplex region is formed from two wishing to be bound by theory, long double stranded RNA Strands of a single molecule, the molecule can have a duplex introduced into plants and invertebrate cells is broken down region separated by a single stranded chain of nucleotides into siRNA by a Type III endonuclease known as Dicer (Sharp (herein referred to as a “hairpin loop) between the 3'-end of et al., Genes Dev. 2001, 15:485). Dicer, a ribonuclease-III one strand and the 5'-end of the respective other strand form like enzyme, processes the dsRNA into 19-23 base pair short ing the duplex structure. The hairpin loop can comprise at interfering RNAs with characteristic two base 3' overhangs (Bernstein, et al., (2001) Nature 409:363). The siRNAs are least one unpaired nucleotide; in some embodiments the hair then incorporated into an RNA-induced silencing complex pin loop can comprise at least 3, at least 4, at least 5, at least (RISC) where one or more helicases unwind the siRNA 6, at least 7, at least 8, at least 9, at least 10, at least 20, at least duplex, enabling the complementary antisense Strand to guide 23 or more unpaired nucleotides. Where the two substantially target recognition (Nykanen, et al., (2001) Cell 107:309). complementary strands of a dsRNA are comprised by sepa Upon binding to the appropriate target mRNA, one or more rate RNA molecules, those molecules need not, but can be endonucleases within the RISC cleaves the target to induce covalently connected. Where the two strands are connected silencing (Elbashir, et al., (2001) Genes Dev. 15:188). Thus, covalently by means other than a hairpin loop, the connecting in one aspect the technology described herein relates to a structure is referred to as a “linker.” The term “siRNA is also single stranded RNA that promotes the formation of a RISC used herein to refer to a dsRNA as described above. complex to effect silencing of the target gene. 0104. The skilled artisan will recognize that the term 0107 As used herein, the term “nucleotide overhang “RNA molecule' or “ribonucleic acid molecule' encom refers to at least one unpaired nucleotide that protrudes from passes not only RNA molecules as expressed or found in the duplex structure of an iRNA, e.g., a dsRNA. For example, nature, but also analogs and derivatives of RNA comprising when a 3'-end of one strand of a dsRNA extends beyond the one or more ribonucleotide/ribonucleoside analogs or deriva 5'-end of the other strand, or vice versa, there is a nucleotide tives as described herein or as known in the art. Strictly overhang. A dsRNA can comprise an overhang of at least one speaking, a “ribonucleoside' includes a nucleoside base and nucleotide; alternatively the overhang can comprise at least a ribose sugar, and a “ribonucleotide' is a ribonucleoside with two nucleotides, at least three nucleotides, at least four nucle one, two or three phosphate moieties. However, the terms otides, at least five nucleotides or more. A nucleotide over “ribonucleoside' and “ribonucleotide' can be considered to hang can comprise or consist of a nucleotide/nucleoside ana be equivalent as used herein. The RNA can be modified in the log, including a deoxynucleotide/nucleoside. The overhang nucleobase structure or in the ribose-phosphate backbone (s) can be on the sense Strand, the antisense Strand or any structure, e.g., as described herein below. However, the mol combination thereof. Furthermore, the nucleotide(s) of an ecules comprising ribonucleoside analogs or derivatives must overhang can be present on the 5' end, 3' end or both ends of retain the ability to form a duplex. As non-limiting examples, either an antisense or sense strand of a dsRNA. an RNA molecule can also include at least one modified 0108. In one embodiment, the antisense strand of a dsRNA ribonucleoside including but not limited to a 2'-O-methyl has a 1-10 nucleotide overhang at the 3' end and/or the 5' end. modified nucleoside, a nucleoside comprising a 5' phospho In one embodiment, the sense strand of a dsRNA has a 1-10 rothioate group, a terminal nucleoside linked to a cholesteryl nucleotide overhang at the 3' end and/or the 5' end. In another derivative or dodecanoic acid bisdecylamide group, a locked embodiment, one or more of the nucleotides in the overhang nucleoside, an abasic nucleoside, a 2'-deoxy-2'-fluoro modi is replaced with a nucleoside thiophosphate. fied nucleoside, a 2-amino-modified nucleoside, 2-alkyl 0109. The terms “blunt’ or “bluntended as used herein in modified nucleoside, morpholino nucleoside, a phosphora reference to a dsRNA or dsDNA mean that there are no midate or a non-natural base comprising nucleoside, or any unpaired nucleotides or nucleotide analogs at a given terminal combination thereof. Alternatively, an RNA molecule can end of a dsRNA or dsDNA molecule, i.e., no nucleotide comprise at least two modified ribonucleosides, at least 3, at overhang. One or both ends of a dsRNA or dsDNA can be least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at blunt. Where both ends of a dsRNA or dsDNA are blunt, the least 10, at least 15, at least 20 or more, up to the entire length dsRNA or dsDNA is said to be blunt ended. To be clear, a of the dsRNA molecule. The modifications need not be the “blunt ended dsRNA or dsDNA is a dsRNA or dsDNA that same for each of such a plurality of modified ribonucleosides is blunt at both ends, i.e., no nucleotide overhang at either end in an RNA molecule. In one embodiment, modified RNAs of the molecule. Most often such a molecule will be double contemplated for use in methods and compositions described stranded over its entire length. In contrast “sticky ends’ refers herein are peptide nucleic acids (PNAs) that have the ability to dsDNA or dsRNA molecule that has at least 1 or more to form the required duplex structure and that permit or medi (typically 2-5 or more) nucleotide overhang. ate the specific degradation of a target RNA via a RISC pathway. 0110. The term “antisense strand’ or “guide strand refers to the strand of an iRNA, e.g., a dsRNA, which includes a 0105. In one aspect, a modified ribonucleoside includes a region that is Substantially complementary to a target deoxyribonucleoside. In Such an instance, an iRNA agent can sequence. As used herein, the term “region of complementa comprise one or more deoxynucleosides, including, for rity” refers to the region on the antisense strand that is sub example, a deoxynucleoside overhang(s), or one or more stantially complementary to a sequence, for example a target deoxynucleosides within the double stranded portion of a sequence, as defined herein. Where the region of complemen US 2016/0264934 A1 Sep. 15, 2016 tarity is not fully complementary to the target sequence, the processed to short, about 70-nucleotide “stem-loop struc mismatches can be in the internal or terminal regions of the tures' known as “pre-miRNA’ in the cell nucleus. This pro molecule. Generally, the most tolerated mismatches are in the cessing is performed in animals by a protein complex known terminal regions, e.g., within 5, 4, 3, or 2 nucleotides of the 5' as the Microprocessor complex, consisting of the nuclease and/or 3' terminus. Drosha and the double-stranded RNA binding protein Pasha. 0111. The term “sense strand, or “passenger strand as These pre-miRNAs are then processed to mature miRNAs in used herein, refers to the strand of an iRNA that includes a the cytoplasm by interaction with the endonuclease Dicer, region that is Substantially complementary to a region of the which also initiates the formation of the RNA-induced silenc antisense strand as that term is defined herein. ing complex (RISC). This complex is responsible for the gene 0112. The terms “microRNA’ or “miRNA or “mir or silencing observed due to miRNA expression and RNA inter “miR are used interchangeably herein, are endogenous ference. The pathway is different for miRNAs derived from RNAs, some of which are known to regulate the expression of intronic stem-loops; these are processed by Drosha but not by protein-coding genes at the posttranscriptional level. As used Dicer. In some instances, a given region of DNA and its herein, the term “microRNA’ refers to any type of micro complementary strand can both function as templates to give interfering RNA, including but not limited to, endogenous rise to at least two miRNAs. Mature miRNAs can direct the microRNA and artificial microRNA. “MicroRNA also cleavage of mRNA or they can interfere with translation of means a non-coding RNA between 18 and 25 nucleobases in the mRNA, either of which results in reduced protein accu length, which is the product of cleavage of a pre-miRNA by mulation, rendering miRNAS capable of modulating gene the enzyme Dicer. Examples of mature miRNAs are found in expression and related cellular activities. the miRNA database known as miRBase (http://microma. 0115 “Pri-miRNA” or “pri-miR' means a non-coding sanger.ac.uk/). In certain embodiments, microRNA is abbre RNA having a hairpin structure that is a substrate for the viated as “miRNA or “miR. Typically, endogenous double-stranded RNA-specific ribonuclease Drosha. A "pri microRNA are small RNAs encoded in the genome which are miRNA is a precursor to a mature miRNA molecule which capable of modulating the productive utilization of mRNA. A comprises: (i) a microRNA sequence and (ii) stem-loop com mature miRNA is a single-stranded RNA molecule of about ponent which are both flanked (i.e. surrounded on each side) 21-23 nucleotides in length which is complementary to a by “microRNA flanking sequences’, where each flanking target sequence, and hybridizes to the target RNA sequence to sequence typically ends in either a cap or poly-A tail. Pri inhibit expression of a gene which encodes a miRNA target microRNA, (also referred to as large RNA precursors), are sequence. miRNAs themselves are encoded by genes that are composed of any type of nucleic acid based molecule capable transcribed from DNA but not translated into protein (non of accommodating the microRNA flanking sequences and the coding RNA); instead they are processed from primary tran microRNA sequence. Examples of pri-miRNAs and the indi Scripts known as pri-miRNA to short stem-loop structures vidual components of Such precursors (flanking sequences called pre-miRNA and finally to functional miRNA. Mature and microRNA sequence) are provided herein. The nucle miRNA molecules are partially complementary to one or otide sequence of the pri-miRNA precursor and its stem-loop more messenger RNA (mRNA) molecules, and their main components can vary widely. In one aspect a pre-miRNA function is to downregulate gene expression. MicroRNA molecule can be an isolated nucleic acid; including sequences have been described in publications such as, Lim, microRNA flanking sequences and comprising a stem-loop et al., Genes & Development, 17, p. 991-1008 (2003), Lim et structure and a microRNA sequence incorporated therein. A al Science 299, 1540 (2003), Lee and Ambros Science, 294, pri-miRNA molecule can be processed in vivo or in vitro to an 862 (2001), Lau et al., Science 294,858-861 (2001), Lagos intermediate species caller “pre-miRNA, which is further Quintana et al. Current Biology, 12, 735-739 (2002), Lagos processed to produce a mature miRNA. Quintana et al. Science 294, 853-857 (2001), and Lagos 0116. A "pre-miRNA or “pre-miR' means a non-coding Quintana et al, RNA, 9, 175-179 (2003), which are incorpo RNA having a hairpin structure, which is the product of rated by reference. Multiple microRNAs can also be incor cleavage of a pri-miR by the double-stranded RNA-specific porated into the precursor molecule. ribonuclease known as Drosha A. The term “pre-miRNA 0113. A “mature microRNA (mature miRNA) typically refers to the intermediate miRNA species in the processing of refers to a single-stranded RNA molecules of about 21-23 a pri-miRNA to mature miRNA, where pri-miRNA is pro nucleotides in length, which regulates gene expression. miR cessed to pre-miRNA in the nucleus, whereupon pre-miRNA NAs are encoded by genes from whose DNA they are tran translocates to the cytoplasm where it undergoes additional scribed, but miRNAs are not translated into protein; instead processing in the cytoplasm to form mature miRNA. Pre each primary transcript (pri-miRNA) is processed into a short miRNAs are generally about 70 nucleotides long, but can be stem-loop structure (precursor microRNA) before undergo less than 70 nucleotides or more than 70 nucleotides. ing further processing into a functional mature miRNA. Mature miRNA molecules are partially complementary to 0117 The term “miRNA precursor means a transcript one or more messenger RNA (mRNA) molecules, and their that originates from a genomic DNA and that comprises a main function is to down-regulate gene expression. As used non-coding, structured RNA comprising one or more miRNA throughout, the term “microRNA’ or “miRNA’ includes both sequences. For example, in certain embodiments a miRNA mature microRNA and precursor microRNA. precursor is a pre-miRNA. In certain embodiments, a miRNA 0114. A mature miRNA is produced as a result of a series precursor is a pri-miRNA of miRNA maturation steps: first a gene encoding the miRNA 0118. As used herein, the phrase “inhibit the expression is transcribed. The gene encoding the miRNA is typically of refers to atan least partial reduction of gene expression of much longer than the processed mature miRNA molecule: a gene encoding METTL3 in a cell treated with METTL3 miRNAs are first transcribed as primary transcripts or “pri inhibitor (e.g., an iRNA composition as described herein) miRNA with a cap and poly-A tail, which is subsequently compared to the expression of METTL3 in an untreated cell. US 2016/0264934 A1 Sep. 15, 2016

0119) The terms “silence.” “inhibit the expression of.” puter; a Supercomputer, a mainframe; a Supermini-computer; "down-regulate the expression of 'suppress the expression a mini-computer, a workstation; a micro-computer, a server; of” and the like, in so far as they refer to METTL3, herein an interactive television; a hybrid combination of a computer refer to the at least partial Suppression of the expression of a and an interactive television; and application-specific hard gene encoding METTL3, as manifested by a reduction of the ware to emulate a computer and/or software. A computer can amount of mRNA encoding METTL3 which can be isolated have a single processor or multiple processors, which can from or detected in a first cell or group of cells in which that operate in parallel and/or not in parallel. A computer also gene is transcribed and which has or have been treated Such refers to two or more computers connected together via a that the expression of METTL3 is inhibited, as compared to a network for transmitting or receiving information between second cell or group of cells substantially identical to the first the computers. An example of Such a computer includes a cell or group of cells but which has or have not been so treated distributed computer system for processing information via (control cells). The degree of inhibition is usually expressed computers linked by a network. in terms of 0.124. The term “computer-readable medium' can refer to any storage device used for storing data accessible by a com puter, as well as any other means for providing access to data (aka in control cells - mRNA in treated cells by a computer. Examples of a storage-device-type computer mRNA in control cells }x 100% readable medium include, but is not limited to: a magnetic hard disk; a floppy disk; an optical disk, such as a CD-ROM 0120 Alternatively, the degree of inhibition can be given and a DVD; DATs, a USB drive, a magnetic tape; a memory in terms of a reduction of a parameter that is functionally chip. A computer-readable medium is a tangible media not a linked to gene expression, e.g., the amount of protein encoded signal, and does not include carrier waves or other waveforms by a gene, or the number of cells displaying a certain pheno for data transmission. type. In principle, gene silencing can be determined in any 0.125. The term “software' is used interchangeably herein cell expressing, either constitutively or by genomic engineer with “program” and refers to prescribed rules to operate a ing, and by any appropriate assay. However, when a reference computer. Examples of software include: Software; code seg is needed in order to determine whether a given iRNA (or ments; instructions; computer programs; and programmed gene editing procedure) inhibits the expression of the gene logic. encoding METTL3 by a certain degree and therefore is 0.126 The term a “computer system’ can refer to a system encompassed by the technology described herein, the assays having a computer, where the computer comprises a com provided in the Examples below shall serve as such reference. puter-readable medium embodying Software to operate the 0121 For example, in certain instances, expression of computer. METTL3 is suppressed by at least about 10%, 15%, 20%, I0127. The phrase “displaying or outputting or providing 25%, 30%, 35%, 40%, 45%, or 50% by administration of an an “indication of the result of the m6A levels or peak inten iRNA featured herein. In some embodiments, a gene encod sities, or a prediction result, means that the results of a gene ing METTL3 in a cell is suppressed by at least about 60%, expression are communicated to a user using any medium, 70%, or 80% or more than 80% by administration of an iRNA Such as for example, orally, writing, visual display, etc., com or gene editing procedures (i.e., CRISPR/Cas9 or CRISPR/ puter readable medium or computer system. It will be clear to Cpf1) as featured herein. In some embodiments, a gene one skilled in the art that outputting the result is not limited to encoding METTL3 is suppressed by at least about 85%, 90%, outputting to a user or a linked external component(s), such as 95%, 98%, 99% or more by administration of an iRNA (or a computer system or computer memory, but can alternatively gene editing procedures) as described herein. or additionally be outputting to internal components, such as 0122 “Introducing into a cell, when referring to an any computer readable medium. It will be clear to one skilled iRNA, means facilitating or effecting uptake or absorption in the art that the various sample classification methods dis into the cell, as is understood by those skilled in the art. closed and claimed herein, can, but need not be, computer Absorption or uptake of an iRNA can occur through unaided implemented, and that, for example, the displaying or output diffusive or active cellular processes, or by auxiliary agents or ting step can be done by, for example, by communicating to a devices. The meaning of this term is not limited to cells in person orally or in writing (e.g., in handwriting). vitro; an iRNA can also be “introduced into a cell, wherein I0128. As used herein the term “comprising or “com the cell is part of a living organism. In such an instance, prises' is used in reference to compositions, methods, and introduction into the cell will include the delivery to the respective component(s) thereof, that are essential to the organism. For example, for in vivo delivery, iRNA can be invention, yet open to the inclusion of unspecified elements, injected into a tissue site or administered systemically. In vivo whether essential or not. delivery can also be by a beta-glucan delivery system, Such as I0129. As used herein the term “consisting essentially of those described in U.S. Pat. Nos. 5,032,401 and 5,607,677, refers to those elements required for a given embodiment. The and U.S. Publication No. 2005/0281781 which are hereby term permits the presence of additional elements that do not incorporated by reference in their entirety. In vitro introduc materially affect the basic and novel or functional character tion into a cell includes methods known in the art such as istic(s) of that embodiment of the invention. electroporation and lipofection. Further approaches are 0.130. The term “consisting of refers to compositions, described herein below or are known in the art. methods, and respective components thereof as described 0123. The term “computer can refer to any non-human herein, which are exclusive of any element not recited in that apparatus that is capable of accepting a structured input, description of the embodiment. processing the structured input according to prescribed rules, I0131. As used in this specification and the appended and producing results of the processing as output. Examples claims, the singular forms “a,” “an,” and “the include plural of a computer include: a computer, a general purpose com references unless the context clearly dictates otherwise. Thus US 2016/0264934 A1 Sep. 15, 2016

for example, references to “the method’ includes one or more activity, or combinations thereof. The inhibition of METTL3 methods, and/or steps of the type described herein and/or can be done using a variety of methods known in the art which will become apparent to those persons skilled in the art including, but not limited to, genome editing, gene silencing, upon reading this disclosure and so forth. disruption of normal METTL3 protein activity, and combi 0.132. Other than in the operating examples, or where oth nations thereof. erwise indicated, all numbers expressing quantities of ingre 0.138. In some embodiments, METTL3 can be inhibited in dients or reaction conditions used herein should be under the stem cells and/or progenitor cells before the cells are stood as modified in all instances by the term “about.” The expanded and/or enriched. In some embodiments, the stem term “about when used in connection with percentages can cells and/or progenitor cells are expanded and/or enriched meant 1%. The present invention is further explained in detail prior to METTL3 inhibition. by the following, including the Examples, but the scope of the 0.139. In some embodiments, METTL3 and/or METTL4 invention should not be limited thereto. can control all stages of differentiation. Accordingly, the tech 0133. It is understood that the detailed description and the nology described herein of inhibiting METTL3 and/or Examples that follow are illustrative only and are not to be METTL4 function or gene expression for a certain period of taken as limitations upon the scope of the invention. Various time can be used to prevent differentiation of any cell type, changes and modifications to the disclosed embodiments, and/or keep a cell in a particular state of differentiation. For which will be apparent to those of skill in the art, can be made example, without being limited to theory, if we wanted to without departing from the spirit and scope of the present increase the number of hair stem cells on the scalp for a period invention. Further, all patents, patent applications, and pub of time (i.e. to expand the number of hair stem cells), then the lications identified are expressly incorporated herein by ref a METTL3 and/or METTL4 inhibitor can be applied to the erence for the purpose of describing and disclosing, for skin stem cell population, (e.g., on the scalp for a period of example, the methodologies described in Such publications time), after which the expanded stem cell population can be that might be used in connection with the present invention. allowed to differentiate and repopulate the scalp with hair. Put These publications are provided solely for their disclosure another way, manipulation of METTL3 and/or METTL4 may prior to the filing date of the present application. All State allow the expansion of a number of human stem cells, includ ments as to the date or representation as to the contents of ing adult human stem cells), which is useful for expanding these documents are based on the information available to the Small populations of stem cells, as well as isolated Stem cell applicants and do not constitute any admission as to the populations (e.g., isolated from a human Subject, or rare stem correctness of the dates or contents of these documents. cell populations). In other words, the technology described I. Modification of METTL3 and/or METTL4 herein oftemporarily inhibiting METTL3 and/or METTL4 in 0134 Herein, the inventors have surprisingly discovered a stem cell population can be used for production of industrial that, in human ESCs, méA is present on transcripts encoding scale stem cells populations from a limited, or Small quantity multiple core pluripotency transcription factors, including of initial stem cell population. but not limited to Nanog and Sox2, and is also enriched in 3 untranslated regions at defined sequence motifs, and impor METTL3 Antagonists tantly marks unstable transcripts, including transcripts that 0140. In some embodiments, the inhibition of METTL3 need to be turned over upon differentiation. When human comprises contacting the population of stem cells and/or pro Mettl3 was knocked down inhESCs, the inventors discovered genitor cells with an antagonist of METTL3. As used herein, a decrease in méA levels on select target genes, a prolonged the term “antagonist of METTL3’ refers to any agent that Nanog expression upon differentiation, and impaired ESCs decreases the level and/or activity of METTL3. The term exit from self-renewal towards differentiation into several “antagonist of METTL3’ refers to an agent which decreases lineages in vitro and in vivo. Importantly, knockdown of the expression and/or activity METTL3 in a stem cell popu Mettl3 in hESC lead to the unexpected result of increased lation by at least 10%, e.g. by at least 10%, at least 20%, at self-renewal and proliferation ofhESC, and reduced ability to least 30%, at least 40%, at least 50%, at least 60%, at least differentiate along specific lineages, in particular endoderm 70%, at least 80%, or at least 90%. Examples of antagonists of lineages. Thus, modulation of Mettl3 and/or Mettl4 can be METTL3 include, but are not limited to, an inorganic mol used to promote self-renewal and prevent differentiation (by ecule, an organic molecule, a nucleic acid, a nucleic acid inhibition of Mettl3 and/or Mettl4), or alternatively promote analog or derivative, a peptide, a peptidomimetic, a protein, differentiation into specific cell lineages (e.g., by increasing an antibody oran antigen-binding fragment thereof, and com mA on specific RNA species in a stem cell population). binations thereof. 0135 A. Inhibition of METTL3 and/or METTL4. 0.141. In some embodiments, the antagonist of METTL3 is 0136. One aspect of the technology as disclosed herein a nucleic acid or a nucleic acid analog or derivative thereof, relates to, in part, methods, compositions and kits to maintain also referred to as a nucleic acid agent herein. As will be a stem cell population, Such as a human stem cell population, appreciated by those skilled in the art, the depiction of a single in an undifferentiated State, comprising contacting the stem Strand also defines the sequence of the complementary strand. cell population with an inhibitor of METTL3 or METTL4. In Thus, a nucleic acid also encompasses the complementary Some embodiments, the methods, compositions and kits as Strand of a depicted single strand. disclosed herein relate to methods to prevent a stem cell 0142. Without limitation, the nucleic acid agent can be population differentiating along an endoderm lineage. single-stranded or double-stranded. A single-stranded 0.137 Mettl3 inhibition in a stem cell population, e.g., a nucleic acid agent can have double-stranded regions, e.g., human stem cell population can be performed by one of where there is internal self-complementarity, and a double ordinary skill in the art, for example, inhibition of METTL3 Stranded nucleic acid agent can have single-stranded regions. can result in a decrease in METTL3 protein level, a decrease The nucleic acid can be of any desired length. In particular in METTL3 mRNA level, a decrease in METTL3 protein embodiments, nucleic acid can range from about 10 to 100 US 2016/0264934 A1 Sep. 15, 2016

nucleotides in length. In various related embodiments, or in addition, the single-stranded oligonucleotide can modu nucleic acid agents, single-stranded, double-stranded, and late the expression of a target sequence via RISC mediated triple-stranded, can range in length from about 10 to about 50 cleavage of the target sequence, i.e., the single-stranded oli nucleotides, from about 20 to about 50 nucleotides, from gonucleotide acts as a single-stranded RNAi agent. A 'single about 15 to about 30 nucleotides, from about 20 to about 30 stranded RNAi agent as used herein, is an RNAi agent which nucleotides in length. In some embodiments, a nucleic acid is made up of a single molecule. A single-stranded RNAi agent is from about 9 to about 39 nucleotides in length. In agent can include a duplexed region, formed by intra-strand Some other embodiments, a nucleic acid agent is at least 30 pairing, e.g., it can be, or include, a hairpin or pan-handle nucleotides in length. 0143. The nucleic acid agent can comprise modified Structure. nucleosides as known in the art. Modifications can alter, for 0.147. In some embodiments, the iRNA agent is a small example, the stability, solubility, or interaction of the nucleic hairpin RNA or short hairpin RNA (shRNA), a sequence of acid agent with cellular or extracellular components that RNA that makes a tight hairpin turn that can be used to silence modify activity. In certain instances, it can be desirable to target gene expression via RNA interference (RNAi). modify one or both strands of a double-stranded nucleic acid agent. In some cases, the two strands will include different 0148 Without wishing to be bound by theory, METTL3 modifications. In other instances, multiple different modifi (also known by aliases methyltransferase like 3.M6A, cations can be included on each of the strands. The various “mRNA (2'-O-methyladenosine-N(6)-)-methyltransferase'. modifications on a given strand can differ from each other, MT-A70, “N6-adenosine-methyltransferase 70 kDa sub and can also differ from the various modifications on other unit'. Spo8) is a member of methyltransferase like family. Strands. For example, one strand can have a modification, and The amino acid sequence of human METTL3 has Accession a different strand can have a different modification. In other number NP 062826.2 and the following sequence: cases, one strand can have two or more different modifica tions, and the another strand can include a modification that differs from the at least two modifications on the first strand. (SEQ ID NO: 2) 0144. In some embodiments, the antagonist of METTL3 is MSDTWSSIQAHKKQLDSLRERLORRRKODSGHLDLRNPEAALSPTFRSDS a single-stranded and double-stranded nucleic acid agent that is effective in inducing RNA interference, referred to as PWPTAPTSGGPKPSTASAWPELATDPELEKKLLHHLSDLALTLPTDAWSI siRNA, RNAi agent, or iRNA agent herein. iRNA agents suitable for inducing RNA interference in METTL3 are dis CLAISTPDAPATODGVESLLOKFAAOELIEVKRGLLODDAHPTLVTYADH closed, for example, in WO2013/019857, the contents of which are incorporated herein by reference in their entirety. SKLSAMMGAVAEKKGPGEVAGTWTGOKRRAEODSTTWAAFASSLVSGLNS RNAi Inhibitors of METTL3 SASEPAKEPAKKSRKHAASDVDLEIESLLNOOSTKEQOSKKVSOEILELL 0145. In one embodiment, the iRNA agent includes NTTTAKEOSIVEKFRSRGRAOVOEFCDYGTKEECMKASDADRPCRKLHFR double-stranded ribonucleic acid (dsRNA) molecules for inhibiting the expression of a gene encoding METTL3 or RIINKHTDESLGDCSFLNTCFHMDTCKYWHYEIDACMDSEAPGSKDHTPS METTL4 in a cell, e.g., a cell in a population of human stem cells and/or progenitor cells, where the dsRNA includes an OELALTOSVGGDSSADRLFPPOWICCDIRYLDVSILGKFAVVMADPPWDI antisense Strand having a region of complementarity which is complementary to at least a part of an mRNA formed in the HMELPYGTLTDDEMRRLNIPVLODDGFLFLWVTGRAMELGRECLNLWGYE expression of a gene encoding METTL3 or METTL4, and where the region of complementarity is 30 nucleotides or less RVDEIIWWKTNOLORIIRTGRTGHWLNHGKEHCLVGVKGNPOGFNOGLDC in length, generally 19-24 nucleotides in length, and where DVIVAEWRSTSHKPDEIYGMIERLSPGTRKIELFGRPHNWOPNWITLGNO the dsRNA, upon contact with or introduction to a cell expressing the gene METTL3 or METTL4 inhibits the LDGIHLLDPDVVARFK ORYPDGIISKPKNL expression of the gene by at least 10% as assayed by, for example, a PCR or branched DNA (bDNA)-based method, or 0149 Inhibition of the METTL3 gene can be by gene by a protein-based method. Such as by immunoassay or West silencing RNAi molecules according to methods commonly ern blot. Expression of METTL3 or METTL4 in cell culture known by a skilled artisan. For example, a gene silencing can be assayed by measuring METTL3 or METTL4 mRNA levels, respectively, such as by b)NA or TaqManassay, or by siRNA oligonucleotide duplexes targeted specifically to measuring protein levels, such as by immunofluorescence human METTL3 (GenBank No: NM 019852.4) can readily analysis, using, for example, Western Blotting or flow cyto be used to knockdown METTL3 expression. METTL3 metric techniques. mRNA can be successfully targeted using siRNAs; and other 0146 In some embodiments, the iRNA agent is an anti siRNA molecules may be readily prepared by those of skill in sense oligonucleotide. One of skill in the artis well aware that the art based on the known sequence of the target mRNA. To single-stranded oligonucleotides can hybridize to a comple avoid doubt, the sequence of a human METTL3 is provided mentary target sequence and prevent access of the translation at, for example, GenBank Accession Nos. NM 019852.4 machinery to the target RNA transcript, thereby preventing (SEQID NO: 1). Accordingly, in avoidance of any doubt, one protein synthesis. The single-stranded oligonucleotide can of ordinary skill in the art can design nucleic acid inhibitors, also hybridize to a complementary RNA and the RNA target such as RNAi (RNA silencing) agents to mRNA nucleic acid can be subsequently cleaved by an enzyme such as RNase H sequence of human METTL3 of NM 019852.4 (SEQID NO: and thus preventing translation of target RNA. Alternatively, 1) which is as follows:

US 2016/0264934 A1 Sep. 15, 2016

0150. Without wishing to be bound by theory, METTL4 - Continued (also known by aliases methyltransferase like 4, FLJ23017 and HsT661) is a member of methyltransferase like family. LPIPDHKLIVSVPCTLHSHKPPLAEWLKDYIKPDGEYLELFARNLOPGWT The amino acid sequence of human METTL4 has Accession SWGNEVLKFOHVDYFIAVESGS number NP 073751.3 and the following sequence: 0151. Similarly, inhibition of the METTL4 gene can be by gene silencing RNAi molecules according to methods com (SEO ID NO: 7) monly known by a skilled artisan. For example, a gene silenc MSV WHOLSAGWLLDHLSFINKINYOLHOHHEPCCRKKEFTTSVHFESLOM ing siRNA oligonucleotide duplexes targeted specifically to human METTL4 (GenBank No: NM_022840.4) can readily DSWSSSGWCAAFIASDSSTKPENDDGGNYEMFTRKEWFRPELFDWTKPYI be used to knockdown METTL4 expression. METTL4 TPAVHKECOOSNEKEDLMNGVKKEISISIIGKKRKRCVWFNOGELDAMEY mRNA can be successfully targeted using siRNAs; and other siRNA molecules may be readily prepared by those of skill in HTKIRELILDGSLQLIQEGLKSGFLYPLFEKODKGSKPITLPLDACSLSE the art based on the known sequence of the target mRNA. To LCEMAKHLPSLNEMEHOTLOLVEEDTSVTEODLFLRVVENNSSFTKVITL avoid doubt, the sequence of a human METTL4 is provided at, for example, GenBank Accession Nos. NM 022840.4 MGOKYLLPPKSSFLLSDISCMOPLLNYRKTFDVIVIDPPWONKSVKRSNR (SEQID NO:8). Accordingly, in avoidance of any doubt, one of ordinary skill in the art can design nucleic acid inhibitors, YSYLSPLOIOOIPIPKLAAPNCLLWTWVTNROKHLRFIKEELYPSWSVEV such as RNAi (RNA silencing) agents to mRNA nucleic acid WAEWHWWKITNSGEFVFPLDSPHKKPYEGLILGRVOEKTALPLRNADVNV sequence of human METTL4 of NM_022840.4 (SEQID NO: 8) which is as follows:

(SEQ ID NO: 8) atgcgaccgc ct cqtcgctg. galaggctg.cg totggtc.gc gcc.ca.gctgc git caccc.cag 61 galactggggit Ctgtgggcca gtgtggcc.gt ct ctacgaag actggcacga CCC ctaaagt 121 tagg toggaa gacctgtggg cagcttgagc gcc.gaggagt gcc.ctgaacg Ctcaacticgc 181 cctggaaacg tttitt.ccgta Cagcaa.catg gC9gcgc.cca tact Ctta gaaaaggaga 241 aagctttitt C totgtggact ggaaggggca tttitt catga t cactattta gatgggtgct 3O1 gttitt catga ggaga.gtctg ggaaggcggc gtc.cgcttitt Ctgacaaggg aagaggctac 361 tttgtc.cttt taaggattica atgact tcct gacttggagg atgtggacct agtggctaga 421 cc caaggacc aaa.gcaagaa gtcgtggggg gccCaggaag acaggaggat Cacattggga 481 titccagacat aagat caggt titta accc cc tittggccaaa ttittggctga aaatgttgaa 541 titat caactic tdaaattaaa aagaaagttt at attaaaac attgcaattt to cittagaat 6O1 ttctgtatat attaa.catca togaatgataa attct citt.ca atgtgcatgt caggtttittg 661 tacttgtata toaaatctat citgtgtgitat galagtg tatgtttattgaaa tacaagatat 721 ttaagaagct gatctggaaa gttggattitt cattctagtt cotaattic cc agaggcttitt 781 ttaaaggaag ggaatgtctg tdgtacacca gttgtcagct gggtggittac tigatcatct 841 ttcttittatic aacaagataa act atcaact tcaccagoat catgaacctt gttgcc.gtaa 901 aaaggagttc act acttctg tt cactittga gt ct cittcaa atggattctg tdtcct cotc 961 tdgagtctgt gctgcattta ttgcttctga ct ct tccact aagc.ca.gaga atgatgatgg 1021 aggaaattat galaatgttca cacgaaaatt totttitt.cga cctdaactgt ttgatgtcac 1081 caaacctitat ataact coag ctgttcataa agaatgc.cag caaagtaatgaaaaggaaga 1141 totgatgaat ggtgttaaaa aagaaatcto catttctatt attgggaaga agcgtaaaag 1201 atgtgttgtt ttcaat Caag gtgaattgga tigctatggaa taccatacaa agat caggga 1261 gotgattittg gatggat citt tacagttgat coaggaaggit citcaaaagtg gttittctitta 1321 to cactttitt gaaaaacagg acaagggtag taagcc catt actttaccac ttgacgc.ctg 1381 cagtttgtca gaattatgtgaaatggcaaa go atttgcct tctctgaatgaaatggaaca 1441 to agacatta caattggtgg aagaggatac atctgttaca gaac aggatt tatttittgcg US 2016/0264934 A1 Sep. 15, 2016 19

- Continued SO agttgttgaa aacaacticta gctttacaaa agtgattact ttaatgggac agaaatacct

56 gctaccaccg aaaag cagtt ttct t t t at c tgacatttct tgitatgcaac Cact totaaa.

62 Ctataggaaa acatttgatg taattgttgat agatccacca tggcagaaca aatcagttaa

68 aagaagtaat agg tacagtt atttgtcacc cc togcaaata cagdaaatac Citat CCCtaa

74 attggctgct c caaactgtc ttcttgttac ttgggtgacc aatagacaga agcaccitacg

ttittataaag gaagaactitt atc cct cittg gtctgttggag gtagttgctg agtggcactg

86 ggtaaaaata accaatt cag gagaatttgt gttcc catta gatt citccac acaaaaag.cc

92 Ctacgaaggt cittatactgg ggagggttca agaaaaaact gctic taccat tgaggaatgc

98 agatgtaaac gtgct cocca titccagacca Calaattaatt gt cagogtgc cc td tact ct

2O4. to act Cacat aagccaccgc ttgctgaggit tittaaaagac tacatcaagc Cagatgggga

210 at atttggag ttgtttgctic gaaatttaca gcCaggttgg actagttggg gcaatgaagt

216 tot Calaattit Cagcatgtgg attatttitat tgctgttggag totggaagct gactatgat c

222 ttgattaaag tagtggttt C ttcattgttt CCtcaccact tt to cottaa. ttctaagtica

228 titt titt tatt ttgttaccala cc cat attct tagaatataa acaggacttg ttttitt toag

234 taagggacca gaagtgact a gcc tt catgt aattittalaga tgaattittac ttgagttgca

24 O Ctalacattct atgttattot agactataca aattalagtgg taag cagtta taalagacggc

246 aagac catgc tattgaaaaa gttcagaaaa catacaccgt. ggaccagagg tottaat cott

252 atc tatggat gtgttttgttg tgacc catac agtgttgtaa aaaacaCtta gaac cattat

258 totaaaaaat ggggg tattt cacattaaag to cagatttic tgcttcttitt taaa catcag

264 aggct Ctggc tacacagagg cctttgttct tt cotgg cat cagt ctgcag gacCaagcgg

27 O tggtggctica Cttgggalaga gccttgttgct ct coactittg ccacagtacc actgccacca

276 tgctgct cac titatgtcatc cacttggcc c ttgtatgacc tgaatttgca acct ctdgta

282 tactgttatg ttctggagaa aat attcaaa. gatctgccaa at actgcatt agtatactga

288 gtttatacag catttttgta gggittittaaa ttgcattcaa ggtcactitt c caag cactitt

294 ctggttittgc ttgtttittct agaagaaaat gaaaagctat to Cttataat aaac atggca

3 OO gcaagtaaac agtgtgattg tgaaaaaaat attatttata gattittctac aaataaatat

ttgtc.tacca agtaaaatat tttgactgaa atgattctitt gaaatgcata ttgatttatt

312 atgtattgac tttittaaaaa. ttgaggtata attitt Cacala aatt CtcCaa ttitt cagtgt

3.18 caaatticagt gaattittgaa aaCatatata cagttgtctg totgccacag tgat catgat

324 acagaac act t tott tacco tgaaaacttic to attitt to c ttittgcagtic aatc.ccctgc

330 to citat cott ggc.ccctggc aaacactggit ttgctttcta to attagttc tgctgtttga

336 gaattt cata taaatggaat catgcaatgt gtaatctatt gtgcctggct tottt cacgt.

342 agcattttga gaaaa.gcatt tatact attt acagattgtt gacaaatatt tatic Cactaa

348 gtaaaatgtt agactgaaat gattctittga caagcttgcc aatt tactga ttttgttcaaa

3.54 gaaaaatatg ttatttittga agtttgttca to ctittgagt gtgtgagtat agtat cagag

360 gcttaattitt gtatt tatgg agctatt cta acttgttatt taaaaggaaa aagg tattaa

366 actitgaagca aactt. Ct. Cat gat ct caaaa aaaaaaaaaa. aaaa. US 2016/0264934 A1 Sep. 15, 2016 20

0152. In some embodiments, the shRNA for targeting deliver shRNAS to cells in vitro and in vivo as described in METTL3 has a nucleotide sequence of that is substantially Rubinson, D. A., et al. (2003) Nat. Genet. 33:401–406) and complementary to at least part of the target sequence GCTG Stewart, S. A., et al. (2003) RNA 9:493-501). The RNA CACTTCAGACGAATTAT (SEQID NO:3) or a fragment of interference agents, e.g., the siRNAs or shRNAs, can be intro at least 10, at least 15, at least 20, or at least 25 contiguous duced along with components that perform one or more of the nucleotides thereof. In some embodiments, the siRNA to following activities: enhance uptake of the RNA interfering METTL3 is GCUACCGUAUGGGACAUUA (SEQID NO: agents, e.g., siRNA, by the cell, inhibit annealing of single 4) or a fragment of at least 10, at least 15, at least 20, or at least Strands, stabilize single strands, or otherwise facilitate deliv 25 contiguous nucleotides thereof. ery to the target cell and increase inhibition of the target gene, 0153. In some embodiments, an antagonist of METTL3 is e.g., METTL3. The dose of the particular RNA interfering an antigomir to a miRNA (also referred to as “miR). miRs agent will be in an amount necessary to effect RNA interfer that have been shown to target METTL3 include, but are not ence, e.g., post translational gene silencing (PTGS), of the limited to; miR-423-3p and miR-1226-3p, miR-330-5p, miR particular target gene, thereby leading to inhibition of target 668-3p, miR-1224-5p, and miR-1981, as disclosed in Chenet gene expression or inhibition of activity or level of the protein al., (Cell Stem Cell, 2015; 16(3), 289-301; “méA RNA encoded by the target gene. Methylation Is Regulated by MicroRNAs and Promotes Reprogramming to Pluripotency'). In some embodiments, an Oligonucleotide Modifications inhibitor of METTL3 is an antigomir to miR-423-3p and/or to 0.155. In some embodiments, RNAi agents that inhibit miR-1226-3p, i.e., an anti-miR-423-3p and/or anti-miR METTL3 for use in the aspects of the invention as disclosed 1226-3p, which decreases the METTL3 interaction or bind herein can include oligonucleotide modifications. Unmodi ing on the mRNA. In some embodiments, an anti-miR-423 fied oligonucleotides can be less than optimal in Some appli 3p comprises ACUGAGGGGCCUCAGACCGAGCU (SEQ cations, e.g., unmodified oligonucleotides can be prone to ID NO: 5) or a fragment of at least 10, at least 15, at least 20, degradation by e.g., cellular nucleases. However, chemical or at least 24 contiguous nucleotides thereof. In some modifications to one or more of the subunits of oligonucle embodiments, an anti-miR-1226-3p comprises CUAGG otide can confer improved properties, e.g., can render oligo GAACACAGGGCUGGUGA (SEQID NO: 6) or a fragment nucleotides more stable to nucleases. Typical oligonucleotide of at least 10, at least 15, at least 20, or at least 24 contiguous modifications can include one or more of: (i) alteration, e.g., nucleotides thereof. replacement, of one or both of the non-linking phosphate 0154 In general, any method of delivering a nucleic acid oxygens and/or of one or more of the linking phosphate molecule can be adapted for use with the nucleic acid agents oxygens in the phosphodiester interSugar linkage; (ii) alter described herein. Methods of delivering RNA interference ation, e.g., replacement, of a constituent of the ribose Sugar, agents, e.g., an siRNA, or vectors containing an RNA inter e.g., of the 2' hydroxyl on the ribose Sugar; (iii) wholesale ference agent, to the target cells, e.g., stem cells and/or pro replacement of the phosphate moiety with “dephospho” link genitor cells, for uptake include injection of a composition ers; (iv) modification or replacement of a naturally occurring containing the RNA interference agent, e.g., an siRNA, or base with a non-natural base; (V) replacement or modification directly contacting the cell with a composition comprising an of the ribose-phosphate backbone, e.g. peptide nucleic acid RNA interference agent, e.g., an siRNA. In another embodi (PNA); (vi) modification of the 3' end or 5' end of the oligo ment, RNA interference agent, e.g., an siRNA may be nucleotide, e.g., removal, modification or replacement of a injected directly into any blood vessel. Such as vein, artery, terminal phosphate group or conjugation of a moiety, e.g., Venule or arteriole, via, e.g., hydrodynamic injection or cath conjugation of a ligand, to either the 3' or 5' end of oligonucle eterization. Administration may be by a single injection or by otide; and (vii) modification of the Sugar, e.g., six membered two or more injections. The RNA interference agent is deliv rings. ered in a pharmaceutically acceptable carrier. One or more 0156 The terms replacement, modification, alteration, RNA interference agents may be used simultaneously. In one and the like, as used in this context, do not imply any process embodiment, specific cells are targeted with RNA interfer limitation, e.g., modification does not mean that one must ence, limiting potential side effects. The method can use, for start with a reference or naturally occurring ribonucleic acid example, a complex or a fusion molecule comprising a cell and modify it to produce a modified ribonucleic acid bur targeting moiety and an RNA interference binding moiety rather modified simply indicates a difference from a naturally that is used to deliver RNA interference effectively into cells. occurring molecule. As described below, modifications, e.g., For example, an antibody-protamine fusion protein when those described herein, can be provided as asymmetrical mixed with siRNA, binds siRNA and selectively delivers the modifications. siRNA into cells expressing an antigen recognized by the 0157. A modification described herein can be the sole antibody, resulting in silencing of gene expression only in modification, or the sole type of modification included on those cells that express the antigen. The siRNA or RNA multiple nucleotides, or a modification can be combined with interference-inducing molecule binding moiety is a protein or one or more other modifications described herein. The modi a nucleic acid binding domain or fragment of a protein, and fications described herein can also be combined onto an oli the binding moiety is fused to a portion of the targeting gonucleotide, e.g. different nucleotides of an oligonucleotide moiety. The location of the targeting moiety can be either in have different modifications described herein. the carboxyl-terminal or amino-terminal end of the construct 0158. Described herein are iRNA agents that inhibit the or in the middle of the fusion protein. A viral-mediated deliv expression of METTL3. In one embodiment, the iRNA agent ery mechanism can also be employed to deliver siRNAs to includes double-stranded ribonucleic acid (dsRNA) mol cells in vitro and in vivo as described in Xia, H. et al. (2002) ecules for inhibiting the expression of METTL3 in a cell ex Nat Biotechnol 20(10): 1006). Plasmid- or viral-mediated vivo, e.g., in HSPCs ex vivo obtained from blood or UCB, delivery mechanisms of shRNA may also be employed to where the dsRNA includes an antisense strand having a US 2016/0264934 A1 Sep. 15, 2016

region of complementarity which is complementary to at art as further discussed below, e.g., by use of an automated least a part of an mRNA formed in the expression of DNA synthesizer, such as are commercially available from, METTL3, and where the region of complementarity is 30 for example, BioSearch, Applied Biosystems, Inc. In one nucleotides or less in length, generally 19-24 nucleotides in embodiment, a gene encoding METTL3 is a human gene. In length, and where the dsRNA, upon contact with or introduc another embodiment the gene encoding METTL3 is a mouse tion to a cell expressing the gene encoding METTL3, inhibits or rat gene. the expression of the gene by at least 10% as assayed by, for 0162. In one aspect, a dsRNA will include at least two example, a PCR or branched DNA (bDNA)-based method, or nucleotide sequences, a sense and an anti-sense sequence, by a protein-based method. Such as by immunoassay or West wherein the sense strand is SEQID NO: 1. In this aspect, one ern blot. Expression of METTL3 in cell culture, such as a of the two sequences is complementary to the other of the two stem cell population, can be assayed by measuring mRNA sequences, with one of the sequences being Substantially levels of METTL3, such as by blNA or TaqManassay, or by complementary to a sequence of the METTL3 mRNA. As measuring protein levels, such as by immunofluorescence described elsewhere herein and as known in the art, the analysis, using, for example, Western Blotting or flow cyto complementary sequences of a dsRNA can also be contained metric techniques. as self-complementary regions of a single nucleic acid mol 0159. A dsRNA includes two RNA strands that are ecule, as opposed to being on separate oligonucleotides. complementary to hybridize to form a duplex structure under 0163 The skilled person is well aware that dsRNAs hav conditions in which the dsRNA will be used. One strand of a ing a duplex structure of between 20 and 23, but specifically dsRNA (the antisense Strand) includes a region of comple 21, base pairs have been hailed as particularly effective in mentarity that is substantially complementary, and generally inducing RNA interference (Elbashir et al., EMBO 2001, fully complementary, to a target sequence. The target 20:6877-6888). However, others have found that shorter or sequence can be derived from the sequence of METTL3 longer RNA duplex structures can be effective as well. In the mRNA, e.g., SEQ ID NO: 1 as disclosed herein. The other embodiments, a dsRNAs described hereincan include at least Strand (the sense Strand) includes a region that is complemen one Strand of a length of minimally 21 nt. It can be reasonably tary to the antisense strand, such that the two strands hybrid expected that shorter duplexes having one of the sequences of ize and form a duplex structure when combined under Suit Tables 2-7 minus only a few nucleotides on one or both ends able conditions. Generally, the duplex structure is between 15 can be similarly effective as compared to the dsRNAs and 30 inclusive, more generally between 18 and 25 inclusive, described above. Hence, dsRNAs having apartial sequence of yet more generally between 19 and 24 inclusive, and most at least 15, 16, 17, 18, 19, 20, or more contiguous nucleotides generally between 19 and 21 base pairs in length, inclusive. from one of the sequences of SEQID NO:3 or 4, and differing Similarly, the region of complementarity to the target in their ability to inhibit the expression of a gene encoding sequence is between 15 and 30 inclusive, more generally METTL3 by not more than 5, 10, 15, 20, 25, or 30%inhibition between 18 and 25 inclusive, yet more generally between 19 from a dsRNA comprising the full sequence, are contem and 24 inclusive, and most generally between 19 and 21 plated according to the technology described herein. nucleotides in length, inclusive. In some embodiments, the 0164. While a target sequence is generally 15-30 nucle dsRNA is between 15 and 20 nucleotides in length, inclusive, otides in length, there is wide variation in the suitability of and in other embodiments, the dsRNA is between 25 and 30 particular sequences in this range for directing cleavage of nucleotides in length, inclusive. As the ordinarily skilled per any given target RNA. Various software packages and the son will recognize, the targeted region of an RNA targeted for guidelines set out herein provide guidance for the identifica cleavage will most often be part of a larger RNA molecule, tion of optimal target sequences for any given gene target, but often an mRNA molecule. Where relevant, a “part of an an empirical approach can also be taken in which a “window' mRNA target is a contiguous sequence of an mRNA target of or “mask of a given size (as a non-limiting example, 21 sufficient length to be a substrate for RNAi-directed cleavage nucleotides) is literally or figuratively (including, e.g., in (i.e., cleavage through a RISC pathway). dsRNAs having silico) placed on the target RNA sequence to identify duplexes as short as 9 base pairs can, under some circum sequences in the size range that can serve as target sequences. stances, mediate RNAi-directed RNA cleavage. Most often a By moving the sequence “window” progressively one nucle target will be at least 15 nucleotides in length, preferably otide upstream or downstream of an initial target sequence 15-30 nucleotides in length. location, the next potential target sequence can be identified, 0160 One of skill in the art will also recognize that the until the complete set of possible sequences is identified for duplex region is a primary functional portion of a dsRNA, any given target size selected. This process, coupled with e.g., a duplex region of 9 to 36, e.g., 15-30 base pairs. Thus, in systematic synthesis and testing of the identified sequences one embodiment, to the extent that it becomes processed to a (using assays as described herein or as known in the art) to functional duplex of e.g., 15-30 base pairs that targets a identify those sequences that perform optimally can identify desired RNA for cleavage, an RNA molecule or complex of those RNA sequences that, when targeted with an iRNA RNA molecules having a duplex region greater than 30 base agent, mediate the best inhibition of target gene expression. pairs is a dsRNA. Thus, an ordinarily skilled artisan will Thus, it is contemplated that further optimization of inhibi recognize that in one embodiment, then, an miRNA is a tion efficiency can be achieved by progressively “walking the dsRNA. In another embodiment, a dsRNA is not a naturally window' one nucleotide upstream or downstream of the occurring miRNA. In another embodiment, an iRNA agent given sequences to identify sequences with equal or better useful to target expression of METTL3 is not generated in the inhibition characteristics. target cell by cleavage of a larger dsRNA. 0.165. Further, it is contemplated that for any sequence 0161. A dsRNA as described herein can further include identified by a sequence identifier NO: 3 or 4, can be further one or more single-stranded nucleotide overhangs. The optimization could be achieved by systematically either add dsRNA can be synthesized by standard methods known in the ing or removing nucleotides to generate longer or shorter US 2016/0264934 A1 Sep. 15, 2016 22 sequences and testing those and sequences generated by not limited to RNAs containing modified backbones or no walking a window of the longer or shorter size up or down the natural internucleoside linkages. RNAS having modified target RNA from that point. Again, coupling this approach to backbones include, among others, those that do not have a generating new candidate targets with testing for effective phosphorus atom in the backbone. For the purposes of this ness of iRNAS based on those target sequences in an inhibi specification, and as sometimes referenced in the art, modi tion assay as known in the art or as described herein can lead fied RNAs that do not have a phosphorus atom in their inter to further improvements in the efficiency of inhibition. Fur nucleoside backbone can also be considered to be oligo ther still, Such optimized sequences can be adjusted by, e.g., nucleosides. In particular embodiments, the modified RNA the introduction of modified nucleotides as described herein will have a phosphorus atom in its internucleoside backbone. or as known in the art, addition or changes in overhang, or 0168 Modified RNA backbones include, for example, other modifications as known in the art and/or discussed phosphorothioates, chiral phosphorothioates, phospho herein to further optimize the molecule (e.g., increasing rodithioates, phosphotriesters, aminoalkylphosphotriesters, serum stability or circulating half-life, increasing thermal methyl and other alkyl phosphonates including 3'-alkylene stability, enhancing transmembrane delivery, targeting to a phosphonates and chiral phosphonates, phosphinates, phos particular location or cell type, increasing interaction with phoramidates including 3'-amino phosphoramidate and ami silencing pathway enzymes, increasing release from endo noalkylphosphoramidates, thionophosphoramidates, thion Somes, etc.) as an expression inhibitor. oalkylphosphonates, thionoalkylphosphotriesters, and 0166 An iRNA as described herein can contain one or boranophosphates having normal 3'-5' linkages. 2'-5' linked more mismatches to the target sequence. In one embodiment, analogs of these, and those) having inverted polarity wherein an iRNA as described herein contains no more than 3 mis the adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or matches. If the antisense strand of the iRNA contains mis 2'-5' to 5'-2'. Various salts, mixed salts and free acid forms are matches to a target sequence, it is preferable that the area of also included. mismatch not be located in the center of the region of comple 0169. Representative U.S. patents that teach the prepara mentarity. If the antisense strand of the iRNA contains mis tion of the above phosphorus-containing linkages include, but matches to the target sequence, it is preferable that the mis are not limited to, U.S. Pat. Nos. 3,687,808; 4,469,863; 4,476, match be restricted to be within the last 5 nucleotides from 301:5,023,243; 5,177, 195; 5,188,897; 5,264,423: 5,276,019; either the 5' or 3' end of the region of complementarity. For 5,278.302: 5,286,717; 5,321,131; 5,399,676; 5,405,939; example, for a 23 nucleotide iRNA agent RNA strand which 5,453,496; 5,455,233; 5,466,677; 5,476,925; 5,519,126; is complementary to a region of a gene encoding METTL3, 5,536,821: 5,541,316; 5,550,111: 5,563,253: 5,571,799; the RNA strand generally does not contain any mismatch 5,587,361; 5,625,050; 6,028, 188: 6,124,445; 6,160,109: within the central 13 nucleotides. The methods described 6,169,170; 6,172,209; 6,239,265; 6,277,603; 6,326, 199: herein or methods known in the art can be used to determine 6,346,614; 6,444,423: 6,531,590; 6,534,639; 6,608,035: whether an iRNA containing a mismatch to a target sequence 6,683,167; 6,858,715; 6,867,294; 6,878,805; 7,015,315; is effective in inhibiting the expression of METTL3. Consid 7,041,816; 7,273,933; 7,321,029; and U.S. Pat. RE39464, eration of the efficacy of iRNAs with mismatches in inhibit each of which is herein incorporated by reference ing expression of METTL3 is important, especially if the (0170 Modified RNA backbones that do not include a particular region of complementarity to the METTL3 gene is phosphorus atom therein have backbones that are formed by known to have polymorphic sequence variation within the short chain alkyl or cycloalkyl internucleoside linkages, population. mixed heteroatoms and alkyl or cycloalkyl internucleoside 0167. In one embodiment, at least one end of a dsRNA has linkages, or one or more short chain heteroatomic or hetero a single-stranded nucleotide overhang of 1 to 4, generally 1 or cyclic internucleoside linkages. These include those having 2 nucleotides. dsRNAs having at least one nucleotide over morpholino linkages (formed in part from the Sugar portion of hang have unexpectedly Superior inhibitory properties rela a nucleoside); siloxane backbones; Sulfide, Sulfoxide and Sul tive to their blunt-ended counterparts. In yet another embodi fone backbones; formacetyl and thioformacetyl backbones: ment, the RNA of an iRNA, e.g., a dsRNA, is chemically methylene formacetyl and thioformacetylbackbones; alkene modified to enhance stability or other beneficial characteris containing backbones; Sulfamate backbones; methylene tics. The nucleic acids featured in the technology described imino and methylenehydrazino backbones; Sulfonate and Sul herein can be synthesized and/or modified by methods well fonamide backbones; amide backbones; and others having established in the art, such as those described in “Current mixed N, O, S and CH. Sub.2 component parts. protocols in nucleic acid chemistry.” Beaucage, S. L. et al. 0171 Representative U.S. patents that teach the prepara (Edrs.), John Wiley & Sons, Inc., New York, N.Y., USA, tion of the above oligonucleosides include, but are not limited which is hereby incorporated herein by reference. Modifica to, U.S. Pat. Nos. 5,034,506; 5,166,315; 5,185,444; 5,214, tions include, for example, (a) end modifications, e.g., 5' end 134: 5,216,141, 5,235,033; 5,64,562: 5,264,564; 5,405,938; modifications (phosphorylation, conjugation, inverted link 5.434,257; 5,466,677; 5,470,967: 5489,677: 5,541,307; ages, etc.) 3' end modifications (conjugation, DNA nucle 5,561.225; 5,596,086; 5,602,240; 5,608,046; 5,610,289: otides, inverted linkages, etc.), (b) base modifications, e.g., 5,618,704; 5,623,070; 5,663,312; 5,633,360; 5,677,437; and, replacement with stabilizing bases, destabilizing bases, or 5,677,439, each of which is herein incorporated by reference. bases that base pair with an expanded repertoire of partners, 0.172. In other embodiments, suitable RNA mimetics suit removal of bases (abasic nucleotides), or conjugated bases, able are contemplated for use in iRNAs, in which both the (c) Sugar modifications (e.g., at the 2' position or 4' position) Sugar and the internucleoside linkage, i.e., the backbone, of or replacement of the Sugar, as well as (d) backbone modifi the nucleotide units are replaced with novel groups. The base cations, including modification or replacement of the phos units are maintained for hybridization with an appropriate phodiester linkages. Specific examples of RNA compounds nucleic acid target compound. One Such oligomeric com useful in the embodiments described herein include, but are pound, an RNA mimetic that has been shown to have excel US 2016/0264934 A1 Sep. 15, 2016 lent hybridization properties, is referred to as a peptide Optionally, the nucleic acid comprises a targeting sequence of nucleic acid (PNA). In PNA compounds, the sugar backbone miR-103, miR-105, miR-107 and miR-155. Such miRNA ofan RNA is replaced with an amide containing backbone, in binding nucleic acids are referred to as miRNA decoys or particular an aminoethylglycine backbone. The nucleobases miRNA sponges. For example, mRNAs with multiple copies are retained and are bound directly or indirectly to aza nitro of the miRNA target can be engineered into the 3'UTR of the gen atoms of the amide portion of the backbone. Represen mRNA creating an miRNA “sponge.” The miRNA inhibitors tative U.S. patents that teach the preparation of PNA com function by sequestering the cellular miRNAs away from the pounds include, but are not limited to, U.S. Pat. Nos. 5,539, mRNAs that normally would be targeted by them. Such 082; 5,714,331; and 5,719,262, each of which is herein nucleic acid decoys can be delivered, e.g., by viral vectors, incorporated by reference. Further teaching of PNA com and expressed to inhibit the activity of any of miR-103, miR pounds can be found, for example, in Nielsen et al., Science, 105, miR-107 and miR-155. 1991, 254, 1497-1500. 0176 Ribozymes are nucleic acid molecules that are 0173 Antisense molecules or antisense oligonucleotides capable of catalyzing a chemical reaction, either intramolecu (ASOs) are designed to interact with a target nucleic acid larly or intermolecularly. Typically, ribozymes cleave RNA molecule through either canonical or non-canonical base or DNA substrates. There are a number of different types of pairing. The interaction of the antisense molecule and the ribozymes that catalyze chemical reactions which are based target molecule is designed to promote the destruction of the on ribozymes found in natural systems, such as hammerhead target molecule through, for example, RNAseH mediated ribozymes, and hairpin ribozymes. There are also a number of RNA-DNA hybrid degradation. Alternatively the antisense ribozymes that are not found in natural systems, but which molecule is designed to interrupt a processing function that have been engineered to catalyze specific reactions. See, for normally would take place on the target molecule. Such as example, U.S. Pat. Nos. 5,807,718, and 5,910,408. Represen transcription or replication. Antisense molecules can be tative examples of how to make and use ribozymes to catalyze designed based on the sequence of the target molecule. a variety of different reactions can be found in, for example, Numerous methods for optimization of antisense efficiency U.S. Pat. Nos. 5,837,855, 5,877,022, 5,972,704, 5,989,906, by finding the most accessible regions of the target molecule and 6,017,756. exist. See for example, Vermeulen et al., RNA 13: 723-730 (0177 Small Molecule Inhibitors of METTL3 (2007) and in WO2007/095387 and WO 2008/036825; Yue, 0178. In some embodiments, the antagonist of METTL3 is et al., Curr. Genomics, 10(7):478-92 (2009) and Lennox a small molecule. As used herein, the term “small molecule' Gene Ther. 18(12): 1111-20 (2011), which are incorporated refers to a natural or synthetic molecule having a molecular by reference herein in their entireties. mass of less than about 5kD, organic or inorganic compounds 0.174 Thus, antisense molecules that inhibit METTL3 having a molecular mass of less than about 5 kD, less than and/or METTL4 can be designed and made using standard about 2 kD, or less than about 1 kD. nucleic acid synthesis techniques or obtained from a commer 0179. In some embodiments, the antagonist of METTL3 cial entity, e.g., Regulus Therapeutics (San Diego, Calif.). can have an IC50 of less than 50 uM, e.g., the antagonist of Optionally, the antisense molecule is single-stranded and METTL3 can have an IC50 of from about 50 uM to about 5 comprises RNA and/or DNA. Optionally, the backbone of the nM, or less than 5 nM. For example, in some embodiments, an molecule is modified by various chemical modifications to antagonist of METTL3 has an IC50 of from about 50 uM to improve the in vitro and in vivo stability and to improve the in about 25uM, from about 25uM to about 10 uM, from about vivo delivery of antisense molecules. Modifications of anti 10 uM to about 5 uM, from about 5uM to about 1 uM, from sense molecules include, but are not limited to. 2'-O-methyl about 1 uM to about 500 nM, from about 500 nM to about 400 modifications, 2'-O-methyl modified ribose sugars with ter nM, from about 400 nM to about 300 nM, from about 300 nM minal phosphorothioates and a cholesterol group at the 3' end, to about 250 nM, from about 250 nM to about 200 nM, from 2'-O-methoxyethyl (2'-MOE) modifications, 2'-fluoro modi about 200 nM to about 150 nM, from about 150 nM to about fications, and 2',4' methylene modifications (referred to as 100 nM, from about 100 nM to about 50 nM, from about 50 “locked nucleic acids” or LNAs). Thus, inhibitory nucleic nM to about 30 nM, from about 30 nM to about 25 nM, from acids include, for example, modified oligonucleotides (2'-O- about 25 nM to about 20 nM, from about 20 nM to about 15 methylated or 2'-O-methoxyethyl), locked nucleic acids nM, from about 15 nM to about 10 nM, from about 10 nM to (LNA; see, e.g. Valoczi et al., Nucleic Acids Res. 32(22): e175 about 5 nM, or less than about 5 nM. (2004)), morpholino oligonucleotides (see, e.g., Kloosterman 0180. In some embodiments, the antagonist of METTL3 et al., PLoS Biol 5(8):e203 (2007)), peptide nucleic acids can be an anti-METTL3 antibody molecule or an antigen (PNAS), PNA-peptide conjugates, and LNA/2'-O-methylated binding fragment thereof. Suitable antibodies include, but are oligonucleotide mixmers (see, e.g., Fabiani and Gait, RNA not limited to, polyclonal, monoclonal, chimeric, humanized, 14:336-46 (2008)). Optionally, the antisense molecule is an recombinant, single chain, F. F. F. R. and F2 frag antagomir. Antagomirs are oligonucleotides comprising ments. In some embodiments, neutralizing antibodies can be 2'-O-methyl modified ribose sugars with terminal phospho used as anti-METTL3 antibodies. Antibodies are readily rothioates and a cholesterol group at the 3' end. raised in animals such as rabbits or mice by immunization 0175 miRs comprising LNA (typically identified in capi with the antigen. Immunized mice are particularly useful for tals, DNA in lower case, complete phosphorothioate back providing sources of B cells for the manufacture of hybrido bone, where a capital C denotes LNA methylcytosine, are mas, which in turn are cultured to produce large quantities of described in Lanford et al., Science 327(5962: 198-201 monoclonal antibodies. In general, an antibody molecule (2010), which is incorporated by reference herein in its obtained from humans can be classified in one of the immu entirety. See also Elmen et al., Nature 452:896-9 (2008); and noglobulin classes IgG, IgM, IgA, IgE and Ig), which differ Elmenet al., Nucleic Acids Res. 36:1153-1162 (2008), which from one another by the nature of the heavy chain present in are incorporated by reference herein in their entireties. the molecule. Certain classes have subclasses as well. Such as US 2016/0264934 A1 Sep. 15, 2016 24

IgG, IgG, and others. Furthermore, in humans, the light 0186 Gene Editing chain may be a kappa chain or a lambda chain. Reference 0187 While it is preferred that METTL3 and/or METTL4 herein to antibodies includes a reference to all such classes, inhibition in a stem cell population is reversible or transient, Subclasses and types of human antibody species. thereby allowing the cell to differentiate along a lineage at a 0181 Antibodies provide high binding avidity and unique later timepoint, in some embodiments, the inhibition of specificity to a wide range of target antigens and haptens. METTL3 comprises contacting the population of stem cells Monoclonal antibodies useful in the practice of the methods and/or progenitor cells with a genome-editing agent for tar disclosed herein include whole antibody and fragments geted excision of the METTL3 and/or METTL4 gene from at thereof and are generated in accordance with conventional least one stem cell. As used herein, the term 'genome-editing techniques, such as hybridoma Synthesis, recombinant DNA agent” refers to a compound or a composition that can modify techniques and protein synthesis. a nucleotide sequence in the genome of an organism. In some 0182. The METTL3 polypeptide, or a portion or fragment embodiments, the genome-editing agent can excise a specific thereof, can serve as an antigen, and additionally can be used nucleotide sequence from the target genome. In some as an immunogen to generate antibodies that immunospecifi embodiments, the genome-editing agent can disrupt the func cally bind the antigen, using standard techniques for poly tion of a specific nucleotide sequence, for example, by break clonal and monoclonal antibody preparation. Preferably, the ing one or more bonds in the sequence. Genome editing can antigenic peptide comprises at least 10 amino acid residues, be achieved through processes Such as nuclease-mediated or at least 15 amino acid residues, or at least 20 amino acid mutagenesis, chemical mutagenesis, radiation mutagenesis, residues, or at least 30 amino acid residues. or meganuclease-mediated mutagenesis. 0183 Useful monoclonal antibodies and fragments can be 0188 In some embodiment, the genome-editing agent derived from any species (including humans) or can be comprises a DNA-binding member and a nuclease, wherein formed as chimeric proteins which employ sequences from the DNA-binding member localizes the nuclease to a target more than one species. Human monoclonal antibodies or site which is then cut by the nuclease. "humanized murine antibody can also be used in accordance 0189 In some embodiments, the genome-editing agent is with the present invention. For example, murine monoclonal a CRISPR/Cas system. In some embodiments, the CRISPR/ antibody can be “humanized by genetically recombining the Cas system is CRISPR/Cas9, which is disclosed in U.S. Pat. nucleotide sequence encoding the murine Fv region (i.e., No. 8,697.359 and US Application 2015/0291966, which is containing the antigen binding sites) or the complementarily corporated herein in its entirety by reference. In alternative determining regions thereof with the nucleotide sequence embodiments, the CRISPR/Cas system is CRISPR/Cpf1, as encoding a human constant domain region and an Fc region. disclosed in Zetsche et al., 2015; Cell 163(3): 759-777 “Cpf1 Humanized targeting moieties are recognized to decrease the Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR immunoreactivity of the antibody or polypeptide in the host Cas System”, which is incorporated herein in its entirety by recipient, permitting an increase in the half-life and a reduc reference. The CRISPR/Casis an engineered nuclease system tion in the possibility of adverse immune reactions in a man based on a bacterial system that can be used for genome ner similar to that disclosed in European Patent Application engineering. It is based on part of the adaptive immune No. 0.411,893 A2. The murine monoclonal antibodies should response of many bacteria and archea. When a virus or plas preferably be employed in humanized form. Antigenbinding mid invades a bacterium, segments of the invader's DNA are activity is determined by the sequences and conformation of converted into CRISPR RNAs (crRNA) by the immune the amino acids of the six complementarily determining response. This crRNA then associates, through a region of regions (CDRs) that are located (three each) on the light and partial complementarity, with another type of RNA called heavy chains of the variable portion (Fv) of the antibody. The tracrRNA to guide the Cas9 or Cpfl nuclease to a region 25-kDa single-chain Fv (sclv) molecule, composed of a vari homologous to the crRNA in the target DNA called a able region (VL) of the light chain and a variable region (VH) “protospacer”. Cas9 cleaves the DNA to generate blunt ends of the heavy chain joined via a short peptide spacer sequence, at the double-strand break (DSB) at sites specified by a is one option for minimizing the size of an antibody agent. 20-nucleotide guide sequence contained within the crRNA ScFVS provide additional options for preparing and Screening transcript. Cas9 requires both the crRNA and the tracrRNA a large number of different antibody fragments to identify for site specific DNA recognition and cleavage. This system those that specifically bind. Techniques have been developed has now been engineered such that the crRNA and tracrRNA to display ScFv molecules on the Surface offilamentous phage can be combined into one molecule (the “single guide RNA), that contain the gene for the scFv. Schv molecules with a and the crRNA equivalent portion of the single guide RNA broad range or antigenic-specificities can be present in a can be engineered to guide the Cas9 nuclease to target any single large pool of ScFV-phage library. desired sequence (see Jineketal (2012) Science 337, p. 816 0184 Chimeric antibodies are immunoglobin molecules 821, Jinek et al. (2013), eLife 2:e(00471, and David Segal, characterized by two or more segments or portions derived (2013) eLife 2:e()0563). In alternative embodiments, the from different animal species. Generally, the variable region CRISPR/Cpf1 system is used, where Cpfl requires only one of the chimeric antibody is derived from a non-human mam RNA template in the gene-editing complex and cleaves the malian antibody, Such as murine monoclonal antibody, and DNA resulting in a 5 nt staggered cut distal to the 5' T-rich the immunoglobin constant region is derived from a human PAM, resulting in sticky ends (rather than bluntends as when immunoglobin molecule. Preferably, both regions and the Cas9 is used). In some embodiments, a replacement gene can combination have low immunogenicity as routinely deter be used in the place of a METTL3 gene, e.g., a marker gene or mined. in some embodiments, ancell death gene which is operatively 0185. Anti-METTL3 antibodies are commercially avail linked to an inducible promoter, thereby allowing specific able through vendors such as Thermo Scientific, Sigma Ald inducable cell death of the modified (i.e., METTL3 gene rich, Atlas Antibodies, and R&D Systems. deleted) cells with a drug to turn on expression from the US 2016/0264934 A1 Sep. 15, 2016 inducible promoter, should it be necessary to eliminate Such 0.195 Additionally, Cas proteins have been developed modified cells after they are transplanted into a subject. which comprise mutations in their cleavage domains to ren Accordingly, the CRISPR/Cas (cas9 or cpf1) system can be der them incapable of inducing a DSB, and instead introduce engineered to create a double strand break (i.e., blunt ends a nick into the target DNA. In particular, the Cas nuclease (i.e., using cas9)) or sticky ends (i.e., using cpf1)) at a desired comprises two nuclease domains, the HNH and RuvC-like, target in a genome, and repair of the double Strand break can for cleaving the sense and the antisense Strands of the target be influenced by the use of repair inhibitors to cause an DNA, respectively. The Cas nuclease can thus be engineered increase in error prone repair. Such that only one of the nuclease domains is functional, thus (0190. There are at least three types of CRISPR/Cas sys creating a Cas nickase. tems which all incorporate RNAs and Cas proteins. Types I (0196. The Cas9 related CRISPR/Cas system comprises and III both have Cas endonucleases that process the pre two RNA non-coding components: tracrRNA and a pre-cr crRNAs, that, when fully processed into crRNAs, assemble a RNA array containing nuclease guide sequences (spacers) multi-Cas protein complex that is capable of cleaving nucleic interspaced by identical direct repeats (DRs). To use a acids that are complementary to the crRNA. The Type II CRISPR/Cas system to accomplish genome editing, both CRISPR (exemplified by Cas9) is one of the most well char functions of these RNAs must be present (see Cong et al. acterized systems. The Cas9 protein has at least two nuclease (2013) Sciencexpress 1/10.1126/science 1231143). In some domains: one nuclease domain is similar to a HNH endonu embodiments, the tracrRNA and pre-crRNAs are supplied via clease, while the other resembles a Ruv endonuclease separate expression constructs or as separate RNAS. In other domain. The HNH-type domain appears to be responsible for embodiments, a chimeric RNA is constructed where an engi cleaving the DNA strand that is complementary to the crRNA neered mature crRNA (conferring target specificity) is fused while the Ruv domain cleaves the non-complementary to a tracrRNA (supplying interaction with the Cas9) to create Strand. a chimeric cr-RNA-tracrRNA hybrid (also termed a single 0191 In some embodiments, Cas protein can be a “func guide RNA). tional derivative' of a naturally occurring Cas protein. As (0197) The Cpf1 system, is related to the CRISPR/Cas9 used herein, a “functional derivative' of a native sequence system, although the Cpfl protein is very different from Cas9. polypeptide is a compound having a qualitative biological but is present in some bacteria with CRISPR. Cpf1 and Cas9 property in common with a native sequence polypeptide. work differently, in that Cas9 requires two RNA molecules to “Functional derivatives” include, but are not limited to, frag cut DNA; Cpf1 needs only one. The proteins also cut DNA at ments of a native sequence and derivatives of a native different places, offering researchers more options when sequence polypeptide and its fragments, provided that they selecting a site to edit. Cpfl also cuts DNA in a different way. have a biological activity in common with a corresponding Cas9 cuts both strands in a DNA molecule at the same posi native sequence polypeptide. A biological activity contem tion, leaving behind blunt ends. In contrast, Cpfl leaves one plated herein is the ability of the functional derivative to Strand longer than the other, creating a sticky end, reducing hydrolyze a DNA substrate into fragments. The term "deriva chances of abnormal/random DNA being inserted at the tive' encompasses both amino acid sequence variants of cleavage site, and also allowing better control of DNA to be polypeptide, covalent modifications, and fusions thereof. inserted at the Cpf1 cleavage site. Cuts left by Cas9 tend to be repaired by Sticking the two ends back together, that can leave 0.192 As used herein, “Cas polypeptide' encompasses a errors. In contrast, Cpfl Sticky end cleavage allows more full-length Cas polypeptide, an enzymatically active frag accurate and frequent insertions. ment of a Cas polypeptide, and enzymatically active deriva 0.198. In some embodiments, the genome-editing agent is tives of a Cas polypeptide or fragment thereof. Suitable a ZFN. A ZFN generally comprises a zinc finger DNA bind derivatives of a Cas polypeptide or a fragment thereof ing protein and a DNA-cleavage domain. As used herein, a include, but are not limited to, mutants, fusions, covalent “zinc finger DNA binding protein’ or “zinc finger DNA bind modifications of Cas protein or a fragment thereof. ing domain is a protein, or a domain within a larger protein, 0193 Cas proteins and Cas polypeptides can be obtained that binds DNA in a sequence-specific manner through one or from a cell or synthesized chemically or by a combination of more Zinc fingers, which are regions of amino acid sequence these two procedures. The cell can be a cell that naturally within the binding domain whose structure is stabilized produces Cas protein, or a cell that naturally produces Cas through coordination of a zinc ion. The term zinc finger DNA protein and is genetically engineered to produce the endog binding protein is often abbreviated as Zinc finger protein enous Cas protein at a higher expression level or to produce a (ZFP). Zinc finger binding domains can be “engineered to Cas protein from an exogenously introduced nucleic acid, bind to a predetermined nucleotide sequence. Non-limiting which encodes a Cas that is same or different from the endog examples of methods for engineering Zinc finger proteins are enous Cas. The cell can be a cell that does not naturally design and selection. A designed Zinc finger protein is a produce Cas protein and is genetically engineered to produce protein not occurring in nature whose design/composition a Cas protein. results principally from rational criteria. Rational criteria for (0194 The CRISPR/Cas system can also be used to inhibit design include application of Substitution rules and comput gene expression. Lei et al. (2013) Cell 152(5):1173-1183) erized algorithms for processing information in a database have shown that a catalytically dead Cas9 lacking endonu storing information of existing ZFP designs and binding data. clease activity, when coexpressed with a guide RNA, gener 0199. In some embodiments, the genome-editing agent is ates a DNA recognition complex that can specifically inter a TALEN. As used herein, the term “transcription activator fere with transcriptional elongation, RNA polymerase like effector nuclease' or "TAL effector nuclease' or binding, or transcription factor binding. This system, called “TALEN” refers to a class of artificial restriction endonu CRISPR interference (CRISPRi), can efficiently repress cleases that are generated by fusing a TAL effector DNA expression of targeted genes. binding domainto a DNA cleavage domain. In some embodi US 2016/0264934 A1 Sep. 15, 2016 26 ments, the TALEN is a monomeric TALEN that can cleave (Promega) for expression in mammalian cell lines such as double stranded DNA without assistance from another CHO, COS, HEK-293, Jurkat, and MCF-7; replication TALEN. The term “TALEN” is also used to refer to one or incompetent adenoviral vector vectors p Adeno X, p.A.d5F35, both members of a pair of TALENs that are engineered to pLP-Adeno-X-CMV (Clontech(R), p Ad/CMV/V5-DEST, work together to cleave DNA at the same site. TALENs that pAd-DEST vector (InvitrogenTM Inc.) for adenovirus-medi work together can be referred to as a left-TALEN and a ated gene transfer and expression in mammalian cells; right-TALEN, which references the handedness of DNA. pLNCX2, pIXSN, and pLAPSN retrovirus vectors for use 0200. In some embodiments, a combination of genome with the Retro-XTM system from Clontech for retroviral-me editing agents can be used. diated gene transfer and expression in mammalian cells; 0201 In some embodiments, a CRISPR/Cas, TALEN, or pLentia/V5-DESTTM, plentiG/V5-DESTTM, and pLentiG.2/ ZFN molecule (e.g. a peptide and/or peptide/nucleic acid V5-GW/lacZ (INVITROGENTM Inc.) for lentivirus-medi complex) can be introduced into a cell, e.g. a cultured stem ated gene transfer and expression in mammalian cells; aden cell or progenitor cell, such that the presence of the CRISPR/ ovirus-associated virus expression vectors such as paAV Cas, TALEN, or ZFN molecule is transient and will not be MCS and pAAV-IRES-hrGFP for adeno-associated virus detectable in the progeny that cell. In some embodiments, a mediated gene transfer and expression in mammalian cells. nucleic acid encoding a CRISPR/Cas, TALEN, or ZFN mol 0205 The vector may or may not be incorporated into the ecule (e.g. a peptide and/or multiple nucleic acids encoding cell genome. The constructs may include viral sequences for the parts of a peptide/nucleic acid complex) can be introduced transfection, if desired. Alternatively, the construct may be into a cell, e.g. a cultured stem cell or progenitor cell. Such incorporated into vectors capable of episomal replication, that the nucleic acid is present in the cell transiently and the e.g., EPV and EBV vectors. nucleic acid encoding the CRISPR/Cas, TALEN, or ZFN 0206. When one or more ZFPs, TALENs, CRISPR/Cas molecule as well as the CRISPR/Cas, TALEN, or ZFN mol molecules are introduced into the cell, the ZFPs, TALENs, ecule itself will not be detectable in the progeny of that cell. In CRISPR/Cas molecules can be carried on the same vector or some embodiments, a nucleic acid encoding a CRISPR/Cas, on different vectors. When multiple vectors are used, each TALEN, or ZFN molecule (e.g. a peptide and/or multiple vector can comprise a sequence encoding one or multiple nucleic acids encoding the parts of a peptide/nucleic acid ZFPs, TALENs, CRISPR/Cas molecules. complex) can be introduced into a cell, e.g. a cultured stem 0207. Non-viral based delivery methods can also be used cellor progenitor cell. Such that the nucleic acid is maintained to introduce nucleic acids encoding engineered ZFPs, in the cell (e.g. incorporated into the genome) and the nucleic CRISPR/Cas molecules, and/or TALENs into cells (e.g., stem acid encoding the CRISPR/Cas, TALEN, or ZFN molecule cells and/or progenitor cells). Methods of non-viral delivery and/or the CRISPR/Cas, TALEN, or ZFN molecule will be of nucleic acids include electroporation, Sonoporation, lipo detectable in the progeny of that cell. fection, microinjection, biolistics, Virosomes, liposomes, 0202 The genome-editing agents can be delivered to a immunoliposomes, polycation or lipid-nucleic acid conju target cell by any suitable means. In some embodiments, the gates, naked DNA, mRNA, artificial virions, and agent-en genome-editing agent (e.g., CRISPR/Cas, TALEN, or ZFN) hanced uptake of DNA. is a protein and can be delivered by any suitable means for 0208. Additional exemplary nucleic acid delivery systems delivering a protein into a cell Such as electroporation, include those provided by AmaxaR Biosystems (Cologne, Sonoporation, microinjection, liposomal delivery, and nano Germany), Maxcyte, Inc. (Rockville, Md.). BTX Molecular material-based delivery. Delivery Systems (Holliston, Mass.) and Copernicus Thera 0203 The genome-editing agent can also be encoded by a peutics Inc. (see for example U.S. Pat. No. 6,008.336). Lipo nucleotide sequence. In some embodiments, the genome fection is described in e.g., U.S. Pat. No. 5,049.386, U.S. Pat. editing agent can be delivered using a vector known to those No. 4,946,787; and U.S. Pat. No. 4,897,355) and lipofection of ordinary skill in the art. Viral vector systems which can be reagents are sold commercially (e.g., TransfectamTM and utilized in the present invention include, but are not limited to, LipofectinTM). Cationic and neutral lipids that are suitable for (a) adenovirus vectors; (b) retrovirus vectors; (c) adeno-as efficient receptor-recognition lipofection of polynucleotides Sociated virus vectors; (d) herpes simplex virus vectors; (e) include those of Felgner, WO 91/17424, WO 91/16024. SV40 vectors; (f) polyomavirus vectors; (g) papilloma virus 0209 More details about genome-editing techniques can vectors: (h) picornavirus vectors: (i) pox virus vectors such as be found, for example, in “Targeted Genome Editing Using an orthopox, e.g., vaccinia virus vectors oravipox, e.g. canary Site-Specific Nucleases: ZFNs, TALENs, and the CRISPR/ pox or fowlpox: (i) a helper-dependent orgutless adenovirus; Cas9 System” by Takashi Yamamoto (Springer, 2015), the (k) a lentiviral vector; (1) adenovirus vectors; and (m) herp contents of which are incorporated herein by reference for the esvirus vectors. See, also, U.S. Pat. Nos. 6,534,261; 6,607, teaching on genome editing. 882; 6,824,978; 6,933,113; 6,979,539; 7,013,219; and 7,163, 0210 B. Activation of METTL3 and/or METTL4 824, each of which are incorporated by reference herein in 0211. Other aspects of the technology described herein their entireties. Replication-defective viruses can also be relates to methods, compositions and kits to promote a stem advantageous. cell population to differentiate along an endoderm lineage, 0204. In some embodiments, a plasmid expression vector for example, by activation of mA methyltransferases, such can be used. Plasmid expression vectors include, but are not as METTL3 and/or METTL4 or by increasing mA RNA limited to, pcDNA3.1, pET vectors (NovagenR), pGEX vec levels in the stem cell population. Methods to increase activ tors (GE Life Sciences), and pMAL vectors (New England ity of METTL3 and/METTL4 are well known in the art, and labs. Inc.) for protein expression in E. coli host cell Such as include, for example, increasing or overexpressing METTL3 BL21, BL21 (DE3) and AD494(DE3)pLysS. Rosetta (DE3), and/or METTL4 in a population of stem cells, e.g., human and Origami (DE3) ((NovagenR); the strong CMV promoter stem cells. In some embodiments, the human stem cells are based pcDNA3.1 (InvitrogenTM Inc.) and pClneo vectors pluripotent stem cells. In alternative embodiments, methods US 2016/0264934 A1 Sep. 15, 2016 27 to increase m6A levels of target genes in stem cell populations “plasmids” which refer to circular double stranded DNA include, but are not limited to inhibitors of fat-mass and loops which, in their vector form are not bound to the chro obesity associated protein (FTO) and ALKBH5 (which are mosome, and typically comprise entities for stable or tran both mA demethylases). Inhibition of FTO and/or ALKBH5 sient expression or the encoded DNA. Other expression vec by inhibition of gene expression or function would increase tors can be used in the methods as disclosed herein for mA levels in the target genes and thus increase differentia example, but are not limited to, plasmids, episomes, bacterial tion of the stem cell population). artificial , yeast artificial chromosomes, bacte 0212 Methods to inhibition FTO and/or ALKBH5 are riophages or viral vectors, and Such vectors can integrate into known by persons of ordinary skill in the art and encom the host’s genome or replicate autonomously in the particular passed for use in the methods to promote differentiation of a cell. A vector can be a DNA or RNA vector. Other forms of stem cell population as disclosed herein. In some embodi expression vectors known by those skilled in the art which ments, an inhibitor of FTO is rhein, which inhibits FTO with serve the equivalent functions can also be used, for example an IC50 value of 30 uM using méA-containing 15-mer ss self replicating extrachromosomal vectors or vectors which RNA as Substrate and a high-performance liquid chromatog integrates into a host genome. raphy (HPLC)-based assay (as disclosed in Scott L. et al. A 0219 Vectors include, but are not limited to, plasmids, Genome-Wide Association Study of Type 2 Diabetes in Finns cosmids, phagemids, viruses, other vehicles derived from Detects Multiple Susceptibility Variants. Science 2007, 316, viral or bacterial sources that have been manipulated by the 1341-1345). Additionally, in some embodiments, an inhibitor insertion or incorporation of the nucleic acid sequences for of FTO is meclofenamic acid (MA), which is a highly selec producing the microRNA, and free nucleic acid fragments tive inhibitor of FTO (IC50:8 uM) over ALKBH5 (no inhi which can be attached to these nucleic acid sequences. Viral bition) using HPLC-based assays (Huang Y., et al. Meclofe and retroviral vectors are a preferred type of vector and namic Acid Selectively Inhibits FTO Demethylation of méA include, but are not limited to, nucleic acid sequences from Over ALKBH5. Nucleic Acids Res, 2015; 43(1):373-84). the following viruses: retroviruses, such as: Moloney murine 0213. In some embodiments, the method relates to leukemia virus; Murine stem cell virus, Harvey murine sar increasing the levels of the human METTL3 protein corre coma virus; marine mammary tumor virus; Rous sarcoma sponding to SEQ ID NO:2, or a portion or functional frag virus; adenovirus; adeno-associated virus; SV40-type ment thereof which is capable of increasing mA on RNA viruses; polyoma viruses; Epstein-Barr viruses; papilloma species in human stem cell populations to a similar level. viruses; herpes viruses; vaccinia viruses; polio viruses; and (e.g., at least 80%) of the level of méA that occurs with the RNA viruses such as any retrovirus. One of skill in the art can wild-type human METTL3 protein of SEQID NO: 2. In some readily employ other vectors known in the art. embodiments, human METTL3 mRNA of SEQID NO: 1 is 0220 Viral vectors are generally based on non-cytopathic introduced into a human stem cell population. eukaryotic viruses in which non-essential genes have been 0214. In some embodiments, the method relates to replaced with the nucleic acid sequence of interest. Non increasing the levels of the human METTL4 protein corre cytopathic viruses include retroviruses, the life cycle of sponding to SEQ ID NO:7, or a portion or functional frag which involves reverse transcription of genomic viral RNA ment thereof which is capable of increasing mA on RNA into DNA with subsequent proviral integration into host cel species in human stem cell populations to a similar level. lular DNA. (e.g., at least 80%) of the level of méA that occurs with the 0221 Retroviruses have been approved for human gene wild-type human METTL4 protein of SEQID NO: 7. In some therapy trials. Genetically altered retroviral expression vec embodiments, human METTL4 mRNA of SEQID NO: 8 is tors have general utility for the high efficiency transduction of introduced into a human stem cell population. nucleic acids in Viva. Standard protocols for producing rep 0215. In some embodiments, methods to increase m6A in lication-deficient retroviruses (including the steps of incor cell populations comprises contacting the cell population poration of exogenous genetic material into a plasmid, trans with a miR, such as, miR-423-3p and miR-1226-3p, which fection of a packaging cell lined with plasmid, production of increases METTL3 interaction with mRNA transcripts. recombinant retroviruses by the packaging cell line, collec 0216 Delivery of Nucleic Acid Inhibitors of METTL3/ tion of viral particles from tissue culture media, and infection METTL4 or mRNAs Expressing METTL3/METTL4 to a of the target cells with viral particles) are provided in Krie StemCell Population. gler, M., “Gene Transfer and Expression, A Laboratory 0217. In some embodiments, a nucleic inhibitor to Manual.” W.H. Freeman Co., New York (1990) and Murry, E. METTL3 and/or METTL4, or a nucleic acid encoding J. Ed. “Methods in Molecular L. Biology. Vol. 7, Humana METTL3 and/or METTL4 protein or a functional fragment Press, Inc., Cliffton, N.J. (1991). thereof is delivered into a specific target cell, e.g., a stem cell 0222. In some embodiments the “in vivo expression ele population using a vector and gene expression systems which ments' are any regulatory nucleotide sequence, such as a are known by persons of ordinary skill in the art. promoter sequence or promoter-enhancer combination, 0218. The term “vectors’ refers to a nucleic acid molecule which facilitates the efficient expression of the nucleic acid to capable of transporting another nucleic acid to which it has produce the microRNA. The in vivo expression element may, been linked; a plasmid is a species of the genus encompassed for example, be a mammalian or viral promoter, Such as a by “vector'. The term “vector' typically refers to a nucleic constitutive or inducible promoter and/or a tissue specific acid sequence containing an origin of replication and other promoter. Examples of which are well known to one of ordi entities necessary for replication and/or maintenance in a host nary skill in the art. Constitutive mammalian promoters cell. Vectors capable of directing the expression of genes include, but are not limited to, polymerase promoters as well and/or nucleic acid sequence to which they are operatively as the promoters for the following genes: hypoxanthine phos linked are referred to herein as “expression vectors'. In gen phoribosyltransferase (HPTR), adenine deaminase, pyruvate eral, expression vectors of utility are often in the form of kinase, and beta.-actin. Exemplary viral promoters which US 2016/0264934 A1 Sep. 15, 2016 28 function constitutively in eukaryotic cells include, but are not enhancers, nuclear localization signals, endosmolytic pep limited to, promoters from the simian virus, papilloma virus, tides, etc. Preferably, these elements are derived from the adenovirus, human immunodeficiency virus (HIV), Roussar tissue of interest to aid specificity. In general, the in vivo comavirus, cytomegalovirus, the long terminal repeats (LTR) expression element shall include, as necessary, 5' non-tran of moloney leukemia virus and other retroviruses, and the scribing and 5' non-translating sequences involved with the thymidine kinase promoter of herpes simplex virus. Other initiation of transcription. They optionally include enhancer constitutive promoters are known to those of ordinary skill in sequences or upstream activator sequences. the art. Inducible promoters are expressed in the presence of 0225 Mammalian expression vectors can comprise an ori an inducing agent and include, but are not limited to, metal gin of replication, a Suitable promoter, site, inducible promoters and steroid-regulated promoters. For transcriptional termination sequences, and 5' flanking non example, the metallothionein promoter is induced to promote transcribed sequences. DNA sequences derived from the transcription in the presence of certain metal ions. Other SV40 viral genome, for example, SV40 origin, early pro inducible promoters are known to those of ordinary skill in moter, enhancer, splice, and polyadenylation sites may be the art. used to provide the required non-transcribed genetic ele 0223 Examples of tissue-specific promoters include, but mentS. are not limited to, the promoter for creatine kinase, which has 0226. Other described ways to deliver a nucleic inhibitor been used to direct expression in muscle and cardiac tissue to METTL3 and/or METTL4, or a nucleic acid encoding and immunoglobulin heavy or light chain promoters for METTL3 and/or METTL4 protein or a functional fragment expression in B cells. Other tissue specific promoters include thereof) as disclosed herein is via vectors, such as lentiviral the human Smooth muscle alpha-actin promoter. Exemplary constructs, and introducing molecules into cells using elec tissue-specific expression elements for the liver include but troporation. In some embodiments, FIV lentivirus vectors are not limited to HMG-COA reductase promoter, sterol which are based on the feline immunodeficiency virus (FIV) regulatory element 1, phosphoenolpyruvate carboxy kinase retrovirus and the HIV lentivirus vector system, which is (PEPCK) promoter, human C-reactive protein (CRP) pro based on the human immunodeficiency virus (HIV), are used. moter, human glucokinase promoter, cholesterol L 7-alpha Alternatively, electroporation is also useful in the present hydroylase (CYP-7) promoter, beta-galactosidase alpha-2.6 invention, although it is generally only used to deliver siR sialylkansferase promoter, insulin-like growth factor binding NAs into cells in vitro. protein (IGFBP-1) promoter, aldolase B promoter, human 0227. In one embodiment, a vector encoding an nucleic transferrin promoter, and collagen type I promoter. Exem inhibitor to METTL3 and/or METTL4, or a nucleic acid plary tissue-specific expression elements for the prostate encoding METTL3 and/or METTL4 protein or a functional include but are not limited to the prostatic acid phosphatase fragment thereof is delivered into a specific target cell, e.g., a (PAP) promoter, prostatic secretory protein of 94 (PSP 94) stem cell population. Nucleic acid sequences necessary for promoter, prostate specific antigen complex promoter, and expression in mammalian cells often utilize a combination of human glandular kallikrein gene promoter (hgt-1). Exem one or more promoters, enhancers, and termination and poly plary tissue-specific expression elements for gastric tissue adenylation signals. includebut are not limited to the human H+/K+-ATPase alpha 0228. One can also use localization sequences to deliver Subunit promoter. Exemplary tissue-specific expression ele an inhibitor to METTL3 and/or METTL4, or a nucleic acid ments for the pancreas include but are not limited to pancre encoding METTL3 and/or METTL4 protein or a functional atitis associated protein promoter (PAP), elastase 1 transcrip fragment thereofintracellularly to a cell compartment of inter tional enhancer, pancreas specific amylase and elastase est. Typically, the delivery system first binds to a specific enhancer promoter, and pancreatic cholesterol esterase gene receptor on the cell. Thereafter, the targeted cell internalizes promoter. Exemplary tissue-specific expression elements for the delivery system, which is bound to the cell. For example, the endometrium include, but are not limited to, the uteroglo membrane proteins on the cell Surface, including receptors bin promoter. Exemplary tissue-specific expression elements and antigens can be internalized by receptor mediated for adrenal cells include, but are not limited to, cholesterol endocytosis after interaction with the ligand to the receptor or side-chain cleavage (SCC) promoter. Exemplary tissue-spe antibodies. (Dautry-Varsat, A., et al., Sci. Am. 250:52-58 cific expression elements for the general nervous system (1984)). This endocytic process is exploited by the present include, but are not limited to, gamma-gamma enolase (neu delivery system. Because this process may damage inhibitor ron-specific enolase, NSE) promoter. Exemplary tissue-spe to METTL3 and/or METTL4, or a nucleic acid encoding cific expression elements for the brain include, but are not METTL3 and/or METTL4 protein or a functional fragment limited to, the neurofilament heavy chain (NF-H) promoter. thereof, for example a RNAi or siRNA agent, or anti-miR as Exemplary tissue-specific expression elements for lympho it is being internalized, it may be desirable to use a segment cytes include, but are not limited to, the human CGL-1/ containing multiple repeats of the RNA interference-induc granzyme B promoter, the terminal deoxy transferase (TdT), ing molecule of interest. One can also include sequences or lambda 5. VpreB, and lek (lymphocyte specific tyrosine pro moieties that disrupt endosomes and lysosomes. See, e.g., tein kinase p561ck) promoter, the humans CD2 promoter and Cristiano, R.J., et al., Proc. Natl. Acad. Sci. USA 90:11548 its 3' transcriptional enhancer, and the human NK and T cell 11552 (1993); Wagner, E., et al., Proc. Natl. Acad. Sci. USA specific activation (NKG5) promoter. Exemplary tissue-spe 89:6099-6103 (1992); Cotten, M., et al., Proc. Natl. Acad. cific expression elements for the colon include, but are not Sci. USA 89:6094-6098 (1992). limited to, pp60c-Src tyrosine kinase promoter, organ-specific 0229. In some embodiments, inhibitor to METTL3 and/or neoantigens (OSNS) promoter, and colon specific antigen-P METTL4, or a nucleic acid encoding METTL3 and/or promoter. METTL4 protein or a functional fragment thereof can be 0224. Other elements aiding specificity of expression in a complexed with desired targeting moieties by mixing a RNAi tissue of interest can include secretion leader sequences, molecules with a targeting moiety in the presence of com US 2016/0264934 A1 Sep. 15, 2016 29 plexing agents. Examples of such complexing agents include, for performing méA analysis of RNA from stem cell popu but are not limited to, poly-amino acids; polyimines; poly lations to characterize the cell state of the cell population, acrylates; polyalkylacrylates, polyoxethanes, polyalkylcy which can be used, for example, as a quality control for the anoacrylates; cationized gelatins, albumins, starches, acry stem cell population. In some embodiments, the stem cell lates, polyethyleneglycols (PEG) and starches: population is a human stem cell population, e.g., a hESC cell polyalkylcyanoacrylates; DEAE-derivatized polyimines, population or other human stem cell line. pollulans, celluloses and Starches. In some embodiments, the 0234. Accordingly, another aspect of the technology complexing agents include chitosan, N-trimethylchitosan, described herein relates to methods, compositions, assays, poly-L-lysine, polyhistidine, polyornithine, polyspermines, arrays and kits to characterize a stem cell population, Such as protamine, polyvinylpyridine, polythiodiethylaminomethyl a human stem cell population, comprising performing mA ethylene P(TDAE), polyaminostyrene (e.g. p-amino), poly analysis on the RNA obtained from the population of stem (methylcyanoacrylate), poly(ethylcyanoacrylate), poly(bu cells, and assessing the intensity of the mA levels of the tylcyanoacrylate), poly(isobutylcyanoacrylate), poly mRNA of at least 10 genes selected from any of those in Table (isohexylcynaoacrylate), DEAE-methacrylate, DEAE 1, or Table 2 as disclosed herein. hexylacrylate, DEAE-acrylamide, DE AE-albumin and 0235 Another aspect of the technology described herein DEAE-dextran, polymethylacrylate, polyhexylacrylate, poly relates to methods, compositions, assays, arrays and kits for (D.L-lactic acid), poly(DL-lactic-co-glycolic acid (PLGA), assessing mA levels in the RNA obtained from a population alginate, and polyethyleneglycol (PEG), and polyethylen of stem cells, e.g., human stem cells. In some embodiments, imine. the method comprises (i) measuring them. A levels of least 10 0230. In alternative embodiments, inhibitor to METTL3 mRNA transcripts selected from any of those listed in Table 1 and/or METTL4, or a nucleic acid encoding METTL3 and/or or Table 2, for example by contacting an array with RNA METTL4 protein or a functional fragment thereof is com isolated from a cell population, where the array comprises at plexed to a complexing agent, e.g., Such as a protamine or an least 10 or more oligonucleotides that hybridize to at least 10 RNA-binding domain, such as an siRNA-binding fragment or mRNA transcripts, or to at least 103'UTR or other untrans nucleic acid binding fragment of protamine. Protamine is a lated regions of at least 10 genes selected from any of those polycationic peptide with molecular weight about 4000-4500 listed in Table 1 or Table 2, and (ii) contacting the array with Da. Protamine is a small basic nucleic acid binding protein, at least one reagent which binds to móA in the RNA, such as which serves to condense the animal’s genomic DNA for an anti-mA antibody, or fragment thereof, such as an anti packaging into the restrictive Volume of a sperm head (War mA antibody which is fluorescently labeled or otherwise has rant, R. W., et al., Nature 271:130-135 (1978); Krawetz, S.A., a detectable label, therefore allowing the measurements of the et al., Genomics 5:639-645 (1989)). The positive charges of levels of méA in the at least selected 10 mRNA transcripts, or the protamine can strongly interact with negative charges of to at least 103'UTR or other untranslated regions of at least 10 the phosphate backbone of nucleic acid, such as RNA, result genes selected from any of those listed in Table 1 or Table 2. ing in a neutral and stable interference RNA-protamine com 0236 A further aspect of the technology described herein plex. relates to methods, compositions, assays, arrays and kits for use in a method for determining the cell state of a stem cell 0231. In one embodiment, the protamine fragment is population comprising performing the assay of claim 10, and encoded by a nucleic acid sequence disclosed in International comparing the levels of méA (i.e., peak intensities) of at least Patent Application: PCT/US05/0291 11, which is incorpo 10 genes selected from any of Table 1 in the RNA from the rated herein in its entirety by reference. The methods, stem cell population with the levels of méA (i.e., peak inten reagents and references that describe a preparation of a sities) in a reference stem cell population, and based on this nucleic acid-protamine complex in detail are disclosed in the comparison, determining the cell state of the stem cell popu U.S. Patent Application Publication Nos. US200210132990 lation. and US200410023902, and are herein incorporated by refer 0237 Another aspect of the present invention relates to a ence in their entirety. kit comprising: (i) an array composition for characterizing the II. Fingerprinting of méA Levels and Analysis of StemCell cell State of a population of stem cells, comprising at least 10 Populations oligonucleotides that hybridize to the RNA (i.e. mRNA tran scripts, 3'UTR or other untranslated RNAs) of at least 10 0232 Another aspect of the technology disclosed herein genes selected from any of those in Table 1 or Table 2 as relates to the use of the intensity of méA sites of methylation disclosed herein; and (ii) at least one regent to detect the m6A (i.e., méA peak intensity) as a quantitative metric or measure in RNA, such as, for example, an anti-mA antibody, or to distinguish cell states. Stated another way, the intensity of fragment thereof, for example an anti-méA antibody or frag m6A sites of methylation (i.e., méA peak intensity) of a set of ment thereof which is detectably labeled (e.g., with a flores specific target gene, e.g., at least 10 or more selected from cent label, colorimetric marker etc.). Table 1 or Table 2, can be used to “fingerprint a cell state, 0238. In some embodiments, the kit comprises a computer e.g., determine the cell state of the stem cell population, i.e., readable medium comprising instructions on a computer to to determine if the stem cell population is pluripotent (i.e., in compare the measured levels of méA (i.e., peak intensities) an undifferentiated pluripotent state) or if the human stem cell from a test stem cell population with reference levels of the population has differentiated along a cell lineage pathway. same RNA transcripts assessed. In some embodiments, the kit Importantly, using the intensity of méA sites of methylation comprises instructions to access to a Software program avail (i.e., méA peak intensity) of specific target genes is idepen able online (e.g., on a cloud) to compare the measured levels dent of gene expression levels, which is the current standard of the m6A (i.e., peak intensities) from the test stem cell of analysis of stem cell populations. population, e.g., human stem cell population, with reference 0233. Accordingly, another aspect of the technology levels of méA for the same RNAS assessed from a reference described herein relates to methods, assays, arrays and kits stem cell population, e.g., human stem cell population. US 2016/0264934 A1 Sep. 15, 2016 30

TABLE 1.

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOOO64703 ENSMUSGOOOOOO279OS DDX20 9 chr1 1123O8858 1123O8958 ENSGOOOOOO86015 ENSMUSGOOOOOOO381O MAST2 O chr1 46SOO659 46SOO760 ENSGOOOOO168O36 ENSMUSGOOOOOOO6932 CTNNB1 1 chr3 41240966 41241066 ENSGOOOOO168O36 ENSMUSGOOOOOOO6932 CTNNB1 2 chr3 41280873 41280978 ENSGOOOOO168O36 ENSMUSGOOOOOOO6932 CTNNB1 3 chr3 41281,311 41281,411 ENSGOOOOO1851.27 ENSMUSGOOOOOOSOO88 C6orf120 4 chr6 1701 02894. 1701 02994 ENSGOOOOO1091.18 ENSMUSGOOOOOO37791 PHF12 S chr17 27239936 2724OO45 ENSGOOOOO109113 ENSMUSGOOOOOOO2O59 RAB34 6 chr17 27041474 27041574 ENSGOOOOOO42O88 ENSMUSGOOOOOO21177 TDP1 7 chr14 90429848 90429948 ENSGOOOOO2O576S ENSMUSGOOOOOO41935 C5orf51 8 chrs 41917289 41917389 ENSGOOOOO182272 ENSMUSGOOOOOOSS629 B4GALNT4 9 chr11 377163 377 270 ENSGOOOOO184708 ENSMUSGOOOOOO2O4S4 EIF4ENIF1 2O chr22 3.1835,776 3.1835876 ENSGOOOOO141682 ENSMUSGOOOOOO24521 PMAIP1 21 chr18 57570000 57570 100 ENSGOOOOO14SO41 ENSMUSGOOOOOO40325 VPRBP 22 chr3 51457387 S1457488 ENSGOOOOO14SO41 ENSMUSGOOOOOO40325 VPRBP 23 chr3 51475542 51475642 ENSGOOOOO157978 ENSMUSGOOOOOO37295 LDLRAP1 24 chr 2589349S 25893595 ENSGOOOOO185728 ENSMUSGOOOOOO47213 YTHDF3 25 chr8 64099.129 64O99229 ENSGOOOOO154370 ENSMUSGOOOOOO2O4SS TRIM11 26 chr 228582552 228582652 ENSGOOOOO2O5268 ENSMUSGOOOOOO69094 PDE7A 27 chr8 66631S25 66631625 ENSGOOOOO213024 ENSMUSGOOOOOO43858 NUP62 28 chr19 SO411493 SO411593 ENSGOOOOO134247 ENSMUSGOOOOOO27864 PTGFRN 29 chr 117SO4014 117SO4114 ENSGOOOOO1342.47 ENSMUSGOOOOOO27864 PTGFRN 30 chir 11752.9590 117529690 ENSGOOOOO1434.42 ENSMUSGOOOOOO38902 POGZ 31 chir 151377307 1513774O7 ENSGOOOOO1434.42 ENSMUSGOOOOOO38902 POGZ 32 chir 151377594. 151377,694 ENSGOOOOO161204 ENSMUSGOOOOOOO3234 ABCF3 33 chr3 18391.1477 1839.11577 ENSGOOOOO247596 ENSMUSGOOOOOO23277 TWF2 34 chr3 S2262944 52263O44 ENSGOOOOOO48649 ENSMUSGOOOOOO3S623 RSF 35 chr11 77378.075 773.78175 ENSGOOOOOOS7757 ENSMUSGOOOOOO28669 PITHD1 36 chr 241 13930 241.14030 ENSGOOOOO13SO48 ENSMUSGOOOOOO24754 TMEM2 37 ch9 743OOO31 743 OO134 ENSGOOOOO13SO48 ENSMUSGOOOOOO24754 TMEM2 38 chr 7436O151 7436O2S1 ENSGOOOOO142798 ENSMUSGOOOOOO28763 HSPG2 39 chr 22149583 22149683 ENSGOOOOO135912 ENSMUSGOOOOOO332S7 TTLL4 40 chr2 2196.03558. 2196.03658 ENSGOOOOOO92148 ENSMUSGOOOOOO35247 HECTD1 41 chr14 31576238 31576338 ENSGOOOOO177732 ENSMUSGOOOOOOS1817 SOX12 42 chr2O 307350 307450 ENSGOOOOO166484 ENSMUSGOOOOOOO 1034 MAPK7 43 chr17 19284.120 19284224 ENSGOOOOO1OS281 ENSMUSGOOOOOOO1918 SLC1AS 44 chr19 47278621 472.78721 ENSGOOOOO172819 ENSMUSGOOOOOOO1288 RARG 45 chr12 S360S118 S360S 218 ENSGOOOOOO90097 ENSMUSGOOOOOO23495 PCBP4 46 chr3 S1991.534 S1991634 ENSGOOOOO121210 ENSMUSGOOOOOO33767 KIAAO922 47 chr4 154557652 1545.57760 ENSGOOOOOO999.54 ENSMUSGOOOOOO71226 CECR2 48 chr22 18027962 18028062 ENSGOOOOOO999.54 ENSMUSGOOOOOO71226 CECR2 49 chr22 18028899 18028999 ENSGOOOOOO7S413. ENSMUSGOOOOOOO7411 MARK3 SO chr14 1039694OS 103969SOS ENSGOOOOO16937S ENSMUSGOOOOOO42557 SIN3A S1 chr15 75664O67 7S 664167 ENSGOOOOO16937S ENSMUSGOOOOOO42557 SIN3A S2 chr15 75664369 75664475 ENSGOOOOO16937S ENSMUSGOOOOOO42557 SIN3A S3 chr15 75684619 75684719 ENSGOOOOO111802 ENSMUSGOOOOOO3S958 TDP2 54 chr6 24651OO3 24651.104 ENSGOOOOO1426SS ENSMUSGOOOOOO2897S PEX14 SS chir 10689962 10690063 ENSGOOOOO1341.86 ENSMUSGOOOOOO27881 PRPF38B 56 chr O924.2126 O9242233 ENSGOOOOO135900 ENSMUSGOOOOOO26248 MRPL44 57 chir 224824419 224824519 ENSGOOOOO166326 ENSMUSGOOOOOO271.89 TRIM44 58 chr11 35685.077 35685.177 ENSGOOOOOO898.76 ENSMUSGOOOOOO3O986 DHX32 59 chr10 27569423 27569523 ENSGOOOOO123066 ENSMUSGOOOOOO18076 MED13L 60 chr12 16428930 16429.030 ENSGOOOOO123066 ENSMUSGOOOOOO18076 MED13L 61 chr12 1642928O 16429380 ENSGOOOOO123066 ENSMUSGOOOOOO18076 MED13L 62 chr12 16429524 16429624 ENSGOOOOO1326.80 ENSMUSGOOOOOO28060 KIAAO907 63 chir SS883746 SS883846 ENSGOOOOOO68001 ENSMUSGOOOOOO10047 HYAL2 64 chr3 50357457 503575.57 ENSGOOOOO11527S ENSMUSGOOOOOO3OO3.6 MOGS 65 chr2 74688.345 74688445 ENSGOOOOOOS86OO ENSMUSGOOOOOO3O880 POLR3E 66 chr16 2234S114 2234.5224 ENSGOOOOO1656.71 ENSMUSGOOOOOO21488 NSD1 67 chrS 76562562 7656.2672 ENSGOOOOO1656.71 ENSMUSGOOOOOO21488 NSD1 68 chrs 766381.51 766382S1 ENSGOOOOO1656.71 ENSMUSGOOOOOO21488 NSD1 69 chrs 7663878O 7663888O ENSGOOOOO1656.71 ENSMUSGOOOOOO21488 NSD1 70 chrS 76721213 76721313

US 2016/0264934 A1 Sep. 15, 2016 32

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol D) chromosome Start Human end ENSGOOOOO170881 ENSMUSGOOOOOO3707S RNF139 36 chr8 12549978O 12549988O ENSGOOOOO148143 ENSMUSGOOOOOO60206 ZNF462 37 chr 1096886.75 109688775 ENSGOOOOO148143 ENSMUSGOOOOOO60206 ZNF462 38 chr 109773391 109773491 ENSGOOOOO104332 ENSMUSGOOOOOO31548 SFRP1 39 chr8 41166322 41 166422 ENSGOOOOO1782S2 ENSMUSGOOOOOO66357 WDR6 40 chr3 49052670 49052770 ENSGOOOOO120709 ENSMUSGOOOOOO343OO FAMS3C 41 chrS 137682590 137682690 ENSGOOOOO10O376 ENSMUSGOOOOOO22434 FAM118A 42 chr22 4573646S 45736.569 ENSGOOOOO126883 ENSMUSGOOOOOOO18SS NUP214 43 chir 134O73152 134O73259 ENSGOOOOO161638 ENSMUSGOOOOOOOOSSS ITGAS 44 chr12 S478982O S478.9920 ENSGOOOOOO786.18 ENSMUSGOOOOOOS3510 NRD1 45 chr1 S2344004 52344104 ENSGOOOOO101.412 ENSMUSGOOOOOO27490 E2F1 46 chr2O 32264513 32264613 ENSGOOOOO171603 ENSMUSGOOOOOO399.53 CLSTN1 47 chr1 97901.76 97.90276 ENSGOOOOO171604 ENSMUSGOOOOOO46668 CXXCS 48 chrS 139060543 13906.0648 ENSGOOOOOO22567 ENSMUSGOOOOOO7902O SLC45A4 49 chr8 142228759 142228865 ENSGOOOOO16963S ENSMUSGOOOOOOSO240 HIC2 SO chr22 218OO152 2180O2S2 ENSGOOOOO16963S ENSMUSGOOOOOOSO240 HIC2 51 chr22 2180OS85 218OO686 ENSGOOOOO136940 ENSMUSGOOOOOOO9030 PDCL 52 chr 1255823.47 125582456 ENSGOOOOO136940 ENSMUSGOOOOOOO9030 PDCL S3 chir 125582639 125582739 ENSGOOOOO114O19 ENSMUSGOOOOOO32S31 AMOTL2 54 chr3 134O76O23 134076123 ENSGOOOOO1O3S07 ENSMUSGOOOOOO3O802 BCKDK SS chr16 31123671 31123.771 ENSGOOOOO146067 ENSMUSGOOOOOO21495 FAM193B 56 chrS 176951.261 176951.361 ENSGOOOOO146067 ENSMUSGOOOOOO21495 FAM193B 57 chrS 176951 694 176951794 ENSGOOOOO1357.63 ENSMUSGOOOOOO31976 URB2 58 chr 229770975 2297,71075 ENSGOOOOO1357.63 ENSMUSGOOOOOO31976 URB2 59 chr 229773,778 229773878 ENSGOOOOO163481 ENSMUSGOOOOOO26171 RNF2S 60 chr2 219528782. 219528886 ENSGOOOOO14O262 ENSMUSGOOOOOO32228 TCF12 61 chr15 575784.04 57578514 ENSGOOOOO1456.04 ENSMUSGOOOOOOS411S SKP2 62 chrS 36152995 36153095 ENSGOOOOO1O1407 ENSMUSGOOOOOO276SO TTI1 63 chr2O 36641249 36641349 ENSGOOOOO1O1407 ENSMUSGOOOOOO276SO TTI1 64 chr2O 36641767 36641867 ENSGOOOOO1391.82 ENSMUSGOOOOOOO8153 CLSTN3 65 chr12 73.10712 73.10812 ENSGOOOOO11336O ENSMUSGOOOOOO22191 DROSHA 66 chrS 31526727 31526827 ENSGOOOOO175931 ENSMUSGOOOOOO2O802 UBE2O 67 chr17 74392.433 743.92533 ENSGOOOOOO82213 ENSMUSGOOOOOO22195 C5orf22 68 chrS 3.1538511 31538611 ENSGOOOOO112983 ENSMUSGOOOOOOO3778 BRD8 69 chrS 1375,00558 13.7500658 ENSGOOOOOO86062 ENSMUSGOOOOOO28413 B4GALT1 70 chir 331.13372 33 113472 ENSGOOOOO176915. ENSMUSGOOOOOO295 O1. ANKLE2 71 chr12 133306S14 133306621 ENSGOOOOO176915. ENSMUSGOOOOOO295 O1. ANKLE2 72 chr12 133331392 133331492 ENSGOOOOO168137 ENSMUSGOOOOOO34269 SETDS 73 chr3 95123S4 95124.54 ENSGOOOOO168137 ENSMUSGOOOOOO34269 SETDS 74 chr3 9517516 9517616 ENSGOOOOO168137 ENSMUSGOOOOOO34269 SETDS 7S chr3 951.7778 9517878 ENSGOOOOO163166 ENSMUSGOOOOOO24384 IWS1 76 chr2 128.262332 128262432 ENSGOOOOO16O710 ENSMUSGOOOOOO27951 ADAR 77 chr1 154557261 154557369 ENSGOOOOO146247 ENSMUSGOOOOOO32253 PHIP 78 chr6 796SO447 796.50547 ENSGOOOOO1563O4 ENSMUSGOOOOOO22983 SCAF4 79 chr21 33O43670 33043770 ENSGOOOOO143970 ENSMUSGOOOOOO37486 ASXL2 80 chr2 25964998 2596SO98 ENSG00000188021 ENSMUSG00000050148 UBQLN2 81 chrX 56591658 56591758 ENSGOOOOO182372 ENSMUSGOOOOOO26317 CLN8 82 chr8 1728.659 1728759 ENSGOOOOO126461 ENSMUSGOOOOOO384O6 SCAF1 83 chr19 SO156594 SO156694 ENSGOOOOO145632 ENSMUSGOOOOOO21701 PLK2 84 chrS 57750268 57750368 ENSGOOOOO1689.18 ENSMUSGOOOOOO26288 INPPSD 85 chr2 234.115576 234115684 ENSGOOOOO16471S ENSMUSGOOOOOO3897O LMTK2 86 chirf 9782.3595 97823695 ENSGOOOOOO3OS82. ENSMUSGOOOOOO34708 GRN 87 chr17 42430159 4243O259 ENSGOOOOO173786 ENSMUSGOOOOOOO6782 CNP 88 chr17 40120545 4012O645 ENSGOOOOO1781.88 ENSMUSGOOOOOO30733 SH2B1 89 chr16 28878O41 288781.45 ENSGOOOOO121057 ENSMUSGOOOOOO18428 AKAP1 90 chr17 SS183430 55183530 ENSGOOOOO1254.84 ENSMUSGOOOOOO35666 GTF3C4 91 chr 1355.53562 135553663 ENSGOOOOO1987OO ENSMUSGOOOOOO41879 IPO9 92 chr1 2O1845316 20184S422 ENSGOOOOO182963 ENSMUSGOOOOOO3452O GJC1 93 chr17 4288.1776 4288.1876 ENSGOOOOO182963 ENSMUSGOOOOOO3452O GJC1 94 chr17 428823.63 428,82463 ENSGOOOOO1972S6 ENSMUSGOOOOOO321.94 KANK2 95 chr19 11303870 11303970 ENSGOOOOO123SS2 ENSMUSGOOOOOO4O4SS USP45 96 chr6 998.93940 998.94O40 ENSGOOOOO171SS2 ENSMUSGOOOOOOO7659 BCL2L1 97 chrC) 30309594 30309694 ENSGOOOOO1 OO1 OS ENSMUSGOOOOOO2O453 PATZ1 98 chr22 31722789 31722895 ENSGOOOOO1 OO1 OS ENSMUSGOOOOOO2O453 PATZ1 99 chr22 3174O743 31740843 ENSGOOOOO1 OO1 OS ENSMUSGOOOOOO2O453 PATZ1 2OO chr22 3174O937 31741037 US 2016/0264934 A1 Sep. 15, 2016 33

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOO18SO33 ENSMUSGOOOOOO30539 SEMA4B 2O1 chr15 907721.85 90.772292 ENSGOOOOO1433.63 ENSMUSGOOOOOO15711 PRUNE 2O2 chir 151OO6508 151 OO6608 ENSGOOOOO102967 ENSMUSGOOOOOO31730 DHODH 2O3 chr16 72058172 72058272 ENSGOOOOOO626SO ENSMUSGOOOOOO41408 WAPAL 204 chr10 882S9886 88259986 ENSGOOOOO143013 ENSMUSGOOOOOO28266 LMO4 205 chir 87810689 87810789 ENSGOOOOOO88367 ENSMUSGOOOOOO27624 EPB41L1 2O6 chr2O 34817404 34817SO4 ENSGOOOOO181SSS ENSMUSGOOOOOO44791 SETD2 2O7 chr3 47098795 47098895 ENSGOOOOO1624O2 ENSMUSGOOOOOO285.14 USP24 208 chr 55532974 55533074 ENSGOOOOO1624O2 ENSMUSGOOOOOO285.14 USP24 209 chr 55534353 55.534453 ENSGOOOOO1085.78 ENSMUSGOOOOOO2O840 BLMH 210 chr17 28575929 28576O29 ENSGOOOOO1586.36 ENSMUSGOOOOOO3S4O1 C11orf O 211 chr11 76261122 76261222 ENSGOOOOO172795. ENSMUSGOOOOOO24472 DCP2 212 chrs 112349067 11234.9167 ENSGOOOOO159322 ENSMUSGOOOOOO2S236 ADPGK 213 chr15 73O44722 73O44822 ENSGOOOOO159322 ENSMUSGOOOOOO2S236 ADPGK 214 chr15 73O44897 73O44997 ENSGOOOOO166068 ENSMUSGOOOOOO27351 SPRED1 215 chr15 38643376 38643476 ENSGOOOOO103356 ENSMUSGOOOOOO3O871 EARS2 216 chr16 23S46495 23546595 ENSGOOOOO107651 ENSMUSGOOOOOOSS319 SEC23IP 217 chr10 121658063 1216581.63 ENSGOOOOO111530 ENSMUSGOOOOOO2O114 CAND1 218 chr12 67699642 676997.42 ENSGOOOOO143379 ENSMUSGOOOOOO15697 SETDB1 219 chr 1SO923233 150923.341 ENSGOOOOO143379 ENSMUSGOOOOOO15697 SETDB1 22O chir 1509333.27 150933427 ENSGOOOOO143379 ENSMUSGOOOOOO15697 SETDB1 221 chr 1SO936842 150936.942 ENSGOOOOO171492 ENSMUSGOOOOOO46O79 LRRC8D 222 chir 904OO967 904O1067 ENSGOOOOO171940 ENSMUSGOOOOOOS2O56 ZNF217 223 chr2O 5.2192477 52192577 ENSGOOOOOO838.57 ENSMUSGOOOOOO7OO47 FAT 224 chr4 1875.17937 187518043 ENSGOOOOOO838.57 ENSMUSGOOOOOO7OO47 FAT 225 chr4 187521195 1875.21295 ENSGOOOOOO68097 ENSMUSGOOOOOOOO976 HEATR6 226 chr17 58120927 S8121 O27 ENSGOOOOOO68097 ENSMUSGOOOOOOOO976 HEATR6 227 chr17 S81212O3 S8121303 ENSGOOOOOO99381 ENSMUSGOOOOOO423O8 SETD1A 228 chr16 3097718O 3097728O ENSGOOOOOO99381 ENSMUSGOOOOOO423O8 SETD1A 229 chr16 3.0990852 30990952 ENSGOOOOOO99381 ENSMUSGOOOOOO423O8 SETD1A 230 chr16 3.0991343 30991443 ENSGOOOOOOO9954 ENSMUSGOOOOOOO2748 BAZ1B 231 chirf 72856467 72856567 ENSGOOOOOOO9954 ENSMUSGOOOOOOO2748 BAZ1B 232 chirf 72891680 728.91780 ENSGOOOOOOO9954 ENSMUSGOOOOOOO2748 BAZ1B 233 chirf 7289.1974 72892O74 ENSGOOOOOOO9954 ENSMUSGOOOOOOO2748 BAZ1B 234 chirf 7289.2449 72892S49 ENSGOOOOO1445.24 ENSMUSGOOOOOO2624O COPS7B 235 chr2 2326733OS 2326734OS ENSGOOOOO132383 ENSMUSGOOOOOOOO751 RPA1 236 chr17 1800S92 18OO692 ENSGOOOOO129474 ENSMUSGOOOOOO22178 AUBA 237 chr14 234506.06 23450706 ENSGOOOOOO70366 ENSMUSGOOOOOO38290 SMG6 238 chr17 22O2S89 22O2689 ENSGOOOOO152952 ENSMUSGOOOOOO32374 PLOD2 239 chr3 4578848O 4578858O ENSGOOOOOO10322 ENSMUSGOOOOOO21910 NISCH 240 chr3 S2S21541 S2521641 ENSGOOOOOO10322 ENSMUSGOOOOOO21910 NISCH 241 chr3 525261.56 525262.56 ENSGOOOOO184863. ENSMUSGOOOOOO48.271 RBM33 242 chirf 55567886 55567996 ENSGOOOOO184867 ENSMUSGOOOOOO33436 ARMCX2 243 chrX O09 12393 OO912493 ENSGOOOOO108219 ENSMUSGOOOOOO37824 TSPAN14 244 chr10 82.277982 82.278O82 ENSGOOOOO182544 ENSMUSGOOOOOO45665 MFSDS 245 chr12 S3648059 S36481.59 ENSGOOOOOO72274 ENSMUSGOOOOOO22797 TFRC 246 chr3 95778804 95778.905 ENSGOOOOO1468.34 ENSMUSGOOOOOO29726 MEPCE 247 chirf OOO29121 OOO29221 ENSGOOOOO164040 ENSMUSGOOOOOO4994O PGRMC2 248 chr4 29.192367 29.192467 ENSGOOOOO239306 ENSMUSGOOOOOOO6456 RBM14 249 chr11 6639.1816 66.391916 ENSGOOOOO198728 ENSMUSGOOOOOO2S223 LDB1 250 chr10 O3867637 O3867737 ENSGOOOOO181026 ENSMUSGOOOOOO30609 AEN 251 chr15 891.69752 891 69852 ENSGOOOOO142949 ENSMUSGOOOOOO33295 PTPRF 252 chir 440S6683 44056.783 ENSGOOOOO142949 ENSMUSGOOOOOO33295 PTPRF 253 chir 44087837 44087937 ENSGOOOOOO418O2. ENSMUSGOOOOOO22S38 LSG1 254 chr3 94362699 943,62799 ENSGOOOOO15146S ENSMUSGOOOOOO39128 CDC123 2S5 cr10 122381.67 12238.267 ENSGOOOOO151461 ENSMUSGOOOOOO43241 UPF2 256 chr10 1196,2806 1196,2906 ENSGOOOOOOO3393 ENSMUSGOOOOOO26O24 ALS2 257 chir 2O2S6S6O2 2O2S65702 ENSGOOOOO143924 ENSMUSGOOOOOO32624 EML4 258 chr2 42557252 42557352 ENSGOOOOO1233.58 ENSMUSGOOOOOO23O34 NR4A1 259 chr12 S244.8563 5244.8663 ENSGOOOOO1631.13 ENSMUSGOOOOOO3849S OTUD7B 260 chir 149915755 149915855 ENSGOOOOO114948 ENSMUSGOOOOOO25964 ADAM23 261 chr2 2O7482464 20748.2564 ENSGOOOOO109572 ENSMUSGOOOOOOO4319 CLCN3 262 chr4 17064-1332 17O641432 ENSGOOOOO167862 ENSMUSGOOOOOO18858 ICT1 263 chr17 73017.077 73017177 ENSGOOOOO15861S ENSMUSGOOOOOO46062 PPP1R1SB 264 chr 20437 S200 2043753OO ENSGOOOOO15861S ENSMUSGOOOOOO46062 PPP1R1SB 265 chir 204379076. 2043791.76 US 2016/0264934 A1 Sep. 15, 2016 34

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOO101.337 ENSMUSGOOOOOO68040 TM9SF4 266 chr2O 30753.247 30753350 ENSGOOOOO101.337 ENSMUSGOOOOOO68040 TM9SF4 267 chr2O 30753978 3.0754O78 ENSGOOOOO13781S ENSMUSGOOOOOO273O4 RTF1 268 chr15 41772942 41773O42 ENSGOOOOO1654.94 ENSMUSGOOOOOO41328 PCF11 269 chr11 82879692 82879792 ENSGOOOOO1654.94 ENSMUSGOOOOOO41328 PCF11 27O chr11 82880587 82880687 ENSGOOOOO1161.91 ENSMUSGOOOOOO26594 RALGPS2 271 chir 1788.85673 1788.85773 ENSGOOOOO117139 ENSMUSGOOOOOO42207 KDMSB 272 chir 202698936 202699.036 ENSGOOOOO1598.73 ENSMUSGOOOOOO2O482 CCDC117 273 chr2 291.82.192 29.182292 ENSGOOOOO17OO37 ENSMUSGOOOOOO32782 CNTROB 274 chr17 78.36272 78.36372 ENSGOOOOO104853 ENSMUSGOOOOOOO2981 CLPTM1 275 cr19 45496.192 45496292 ENSGOOOOO1173.18 ENSMUSGOOOOOOO7872 ID3 276 chr 23884686 23884786 ENSGOOOOOO86758 ENSMUSGOOOOOO25261 HUWE1 277 chrX 53574997 53575097 ENSGOOOOOO83093 ENSMUSGOOOOOO44702 PALB2 278 chr16 23641466 23641566 ENSGOOOOO140598 ENSMUSGOOOOOO385.63 EFTUD1 279 chr15 8244.3896 82443996 ENSGOOOOO156471 ENSMUSGOOOOOO2.1518. PTDSS1 280 chr8 97345895 97345995 ENSGOOOOO1472.57 ENSMUSGOOOOOOSS6S3 GPC3 281 chrX 132887658 132887758 ENSGOOOOO1368.48 ENSMUSGOOOOOO26883 DAB2IP 282 chir 124522263 1245223.63 ENSGOOOOO16312S ENSMUSGOOOOOO28106 RPRD2 283 chir 1SO444.686 1SO444787 ENSGOOOOO163251 ENSMUSGOOOOOO4SOOS FZDS 284 chr2 2O863.1902. 208.632OO2 ENSGOOOOO163251 ENSMUSGOOOOOO4SOOS FZDS 285 chr2 2O8632239 208.632339 ENSGOOOOO21S251 ENSMUSGOOOOOO79043 FASTKDS 286 chr2O 3127689 31.27789 ENSGOOOOO135862 ENSMUSGOOOOOO26478 LAMC1 287 chr 183111784 18311 1884 ENSGOOOOO141568 ENSMUSGOOOOOO3927S FOXK2 288 chr17 8OSS9458 80559558 ENSGOOOOO141568 ENSMUSGOOOOOO3927S FOXK2 289 chr17 80S601 OO 80S602OO ENSGOOOOO16SOO6 ENSMUSGOOOOOO28437 UBAP1 290 chir 34241944 34242044 ENSGOOOOO164284 ENSMUSGOOOOOO2458O GRPEL2 291 chrs 14873.0644. 148.730744 ENSGOOOOO2OS213 ENSMUSGOOOOOOSO199 LGR4 292 chr11 27389623 2738.9723 ENSGOOOOO1241.77 ENSMUSGOOOOOOS7133 CHD6 293 chr2O 40O33200 40O333OO ENSGOOOOO1241.77 ENSMUSGOOOOOOS7133 CHD6 294 chr2O 40033590 40033691 ENSGOOOOOO72071 ENSMUSGOOOOOO13O33 LPHN1 295 chr19 1427,3705 14273805 ENSGOOOOOO72071 ENSMUSGOOOOOO13O33 LPHN1 296 chr19 14273986 14274086 ENSGOOOOO16O299 ENSMUSGOOOOOOO1151 PCNT 297 chr21 47783554 4778.3654 ENSGOOOOOO7S702 ENSMUSGOOOOOO3702O WDR62 298 chr19 365.94530 36594630 ENSGOOOOO102921 ENSMUSGOOOOOO31652 N4BP1 299 chr16 4859SO33 48595133 ENSGOOOOO102921 ENSMUSGOOOOOO31652 N4BP1 3OO chr16 48595778 48595878 ENSGOOOOO1471.30 ENSMUSGOOOOOO31310 ZMYM3 3O1 chrX 70472763 70472863 ENSGOOOOO107021 ENSMUSGOOOOOO39678 TBC1D13 3O2 chir 131570315 131570415 ENSGOOOOO1321.53 ENSMUSGOOOOOO32480 DHX30 3O3 chr3 47888.285 47888385 ENSGOOOOO1381.62 ENSMUSGOOOOOO3O852 TACC2 3O4 chr10 12397O682 1239.70782 ENSGOOOOO1126SS ENSMUSGOOOOOO23972 PTK7 305 chr6 43.128845 43128945 ENSGOOOOO137522. ENSMUSGOOOOOO70426 RNF121 306 chr11 717074.04 71707504 ENSGOOOOO145982 ENSMUSGOOOOOO21420 FARS2 307 chr6 5369244 S369344 ENSGOOOOO197081 ENSMUSGOOOOOO23830 IGF2R 3O8 chr6 1605261.45 1605262.SS ENSGOOOOO121083. ENSMUSGOOOOOO2O483 DYNLL2 309 chr17 56166648 56166752 ENSGOOOOOO14919 ENSMUSGOOOOOO4OO18 COX15 31O chr10 101474207 101474307 ENSGOOOOOO824.58 ENSMUSGOOOOOOOO881 DLG3 311 chrX 69722124 69722224 ENSGOOOOO107341 ENSMUSGOOOOOO36241 UBE2R2 312 chir 33917308 33917408 ENSGOOOOOO376.37 ENSMUSGOOOOOO2892O FBXO42 313 chir 16577669 16577769 ENSGOOOOO124789 ENSMUSGOOOOOO21374 NUP153 314 chr6 17637651 17637751 ENSGOOOOO1696.41 ENSMUSGOOOOOOO1089 LUZP1 315 chir 23414329 23414429 ENSGOOOOO1696.41 ENSMUSGOOOOOOO1089 LUZP1 316 chr 23415379 234.15479 ENSGOOOOO1696.41 ENSMUSGOOOOOOO1089 LUZP1 317 chr 234.17824 234.17924 ENSGOOOOO2444.62 ENSMUSGOOOOOO89824 RBM12 318 chr2O 3424.1710 34241810 ENSGOOOOO2444.62 ENSMUSGOOOOOO89824 RBM12 319 chr2O 34242917 3424301.7 ENSGOOOOOO48028 ENSMUSGOOOOOO32267 USP28 32O chr11 113669811 1136 69911 ENSGOOOOO1321.28 ENSMUSGOOOOOO28.703 LRRC41 321 chir 467S1441 467S1541 ENSGOOOOO108528 ENSMUSGOOOOOO14606 SLC25A11 322 chr17 484.0589 484O689 ENSGOOOOOO15532 ENSMUSGOOOOOO2O868 XYLT2 323 chr17 48437SO4 48437604 ENSGOOOOO165934 ENSMUSGOOOOOO41781 CPSF2 324 chr14 92628.045 92628145 ENSGOOOOO172273 ENSMUSGOOOOOO32119 HINFP 325 chr11 1190OSO2O 1190OS120 ENSGOOOOO1326O4 ENSMUSGOOOOOO31921 TERF2 326 chr16 69400773 69400873 ENSGOOOOOOS1382 ENSMUSGOOOOOO32462 PIK3CB 327 chr3 138374167 138374.267 ENSGOOOOO15339S ENSMUSGOOOOOO21608 LPCAT1 328 chrS 1463681 1463,781 ENSGOOOOO128228 ENSMUSGOOOOOO22769 SDF2L1 329 chr22 21998.470 21998.570 ENSGOOOOO104081 ENSMUSGOOOOOO40093 BMF 330 chr15 40383240 40.383340 US 2016/0264934 A1 Sep. 15, 2016 35

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOO10O3.64 ENSMUSGOOOOOO36O46 KIAAO930 331 chr22 45592.538 45592.638 ENSGOOOOO166902 ENSMUSGOOOOOO24683 MRPL16 332 chr11 59573855 59573957 ENSGOOOOO1241.51 ENSMUSGOOOOOO27678 NCOA3 333 chr2O 46275903 46276OO3 ENSGOOOOO104885 ENSMUSGOOOOOO61589 DOT1L, 334 chr19 2222387 2222487 ENSGOOOOO104885 ENSMUSGOOOOOO61589 DOT1L, 335 cr19 2226,731 22268.31 ENSGOOOOO177613 ENSMUSGOOOOOOS3536 CSTF2T 336 chr10 S3458231 S34583.31 ENSGOOOOO1521.37 ENSMUSGOOOOOO41548 HSPB8 337 cr12 1196172O2 1196.173O2 ENSGOOOOO166908 ENSMUSGOOOOOO2S417 PIP4K2C 338 chr12 57995951 57996051 ENSGOOOOO 105722. ENSMUSGOOOOOO40857 ERF 339 chr19 42752707 42.752807 ENSGOOOOO 105722. ENSMUSGOOOOOO40857 ERF 340 chr19 42752961 427S3061 ENSGOOOOO1396.51 ENSMUSGOOOOOO46897 ZNF740 341 chr12 53581634 53581743 ENSGOOOOO172046 ENSMUSGOOOOOOO6676 USP19 342 chr3 49145673 491.45778 ENSGOOOOO187764 ENSMUSGOOOOOO21451 SEMA4D 343 chir 91993600 919.93707 ENSGOOOOO185619 ENSMUSGOOOOOO33623 PCGF3 344 chr4 7598.96 75.9996 ENSGOOOOO169925 ENSMUSGOOOOOO26918 BRD3 345 chir 1368.98.626 1368.98726 ENSGOOOOO126O12 ENSMUSGOOOOOO2S332 KDMSC 346 chrX 53222270 53222370 ENSGOOOOO126O12 ENSMUSGOOOOOO2S332 KDMSC 347 chrX S322349S 5322.3595 ENSGOOOOO122042 ENSMUSGOOOOOOO1687 UBL3 348 chr13 3O341200 3O341300 ENSGOOOOO1191.39 ENSMUSGOOOOOO24812 TP2 349 chr 71869441 71869S41 ENSGOOOOO108262 ENSMUSGOOOOOO11877 GIT1 3SO chr17 279.01620 27901720 ENSGOOOOO101773 ENSMUSGOOOOOO41238 RBBP8 351 chr18 2057,3357 205.73457 ENSGOOOOO137SO4. ENSMUSGOOOOOOS1451 CREBZF 352 chr11 85375536 85375636 ENSGOOOOO138231 ENSMUSGOOOOOO32469 DBR1 353 chr3 137880791 137880891 ENSGOOOOO186834 ENSMUSGOOOOOO48878 HEXIM1 354 chr17 43227588 43227688 ENSGOOOOO126947 ENSMUSGOOOOOO3346O ARMCX1 355 chrX 100808056 100808156 ENSGOOOOO1135.04 ENSMUSGOOOOOO17756 SLC12A7 356 chrS 1051,508 1051608 ENSGOOOOOO85377 ENSMUSGOOOOOO19849 PREP 357 chr6 105725839 105725939 ENSGOOOOO121274 ENSMUSGOOOOOO36779 PAPDS 358 chr16 SO263276 50263376 ENSGOOOOOO871.57 ENSMUSGOOOOOO1771S PGS1 359 chr17 763999.10 764OOO10 ENSGOOOOOO82781 ENSMUSGOOOOOO22817 ITGBS 360 chr3 12448237S 124482475 ENSGOOOOOO6O237 ENSMUSGOOOOOO45962 WNK1 361 chr12 99.4900 99SOOO ENSGOOOOO174953 ENSMUSGOOOOOO2777O DEHX36 362 chr3 153993949 153994O49 ENSGOOOOO156381 ENSMUSGOOOOOO37904 ANKRD9 363 chr14 102973218 102973318 ENSGOOOOO1984.08 ENSMUSGOOOOOO2S220 MGEAS 364 chr10 1O3S46098 103546,198 ENSGOOOOO1984.08 ENSMUSGOOOOOO2S220 MGEAS 365 chr10 103558723 103558823 ENSGOOOOO1983.31 ENSMUSGOOOOOOSOSSS HYLS1 366 chr11 125769870 125769.970 ENSGOOOOO118523. ENSMUSGOOOOOO19997 CTGF 367 chr6 132270307 1322704O7 ENSGOOOOO13327S ENSMUSGOOOOOOO3345 CSNK1 G2 368 chr19 196978O 196988O ENSGOOOOOO63978 ENSMUSGOOOOOO2911 O RNF4 369 chr4 2515571 2515671 ENSGOOOOO162923 ENSMUSGOOOOOO38733 WDR26 370 chir 224.577350 224.577450 ENSGOOOOO1971.22 ENSMUSGOOOOOO27646 SRC 371 chrC) 3603.1958 36032058 ENSGOOOOO173653 ENSMUSGOOOOOO24889 RCE 372 chr11 66613552 666.13652 ENSGOOOOO1338.95 ENSMUSGOOOOOO24947 MEN1 373 cr11 64571737 645.71837 ENSGOOOOO1338.95 ENSMUSGOOOOOO24947 MEN1 374 chr11 64572O37 64572138 ENSGOOOOO101.126 ENSMUSGOOOOOOS1149 ADNP 375 chrC) 49SO858O 49SO868O ENSGOOOOO17O604 ENSMUSGOOOOOO44030 IRF2BP1 376 chr19 46388328 46.388428 ENSGOOOOO17O606 ENSMUSGOOOOOO2O361 HSPA4 377 chrS 13244OO93 13244O193 ENSGOOOOO13683O ENSMUSGOOOOOO26796 FAM129B 378 chir 130269093 1302691.93 ENSGOOOOOO82641 ENSMUSGOOOOOO3861S NFE2L1 379 chr17 46128178 46128278 ENSGOOOOOO82641 ENSMUSGOOOOOO3861S NFE2L1 380 chr17 461360S6 461361.59 ENSGOOOOO1696.92 ENSMUSGOOOOOO26922 AGPAT2 381 chir 1395.68071. 139568171 ENSGOOOOO167258 ENSMUSGOOOOOOO3119 CDK12 382 chr17 37618686 37618789 ENSGOOOOO1232OO ENSMUSGOOOOOO22OOO ZC3H13 383 chir13 46541803 46541903 ENSGOOOOO119596 ENSMUSGOOOOOO21244 YLPM1 384 chr14 75248.186 752.48286 ENSGOOOOO119596 ENSMUSGOOOOOO21244 YLPM1 385 chr14 75264807 75264907 ENSGOOOOO119596 ENSMUSGOOOOOO21244 YLPM1 386 chr14 75266.182 7S266282 ENSGOOOOO148840 ENSMUSGOOOOOOSS491 PPRC1 387 chr10 10390686S 103906.96S ENSGOOOOO148843. ENSMUSGOOOOOO2SO47 PDCD11 388 chr10 105205321 105205421 ENSGOOOOO148842 ENSMUSGOOOOOO641OS CNNM2 389 chr10 10483.6827 104836927 ENSGOOOOOOO8O83 ENSMUSGOOOOOO38518 JARID2 390 chr6 15496499 15496599 ENSGOOOOOOO8O83 ENSMUSGOOOOOO38518 JARID2 391 chr6 1SSO1212 1SSO1312 ENSGOOOOO121236 ENSMUSGOOOOOO72244 TRIM6 392 chr11 S632143 S632243 ENSGOOOOO1548O3. ENSMUSGOOOOOO32633 FLCN 393 chr17 17116619 17116719 ENSGOOOOOO99899 ENSMUSGOOOOOO22721 TRMT2A 394 chr22 2O103732 20103832 ENSGOOOOO165526 ENSMUSGOOOOOO32044 RPUSD4. 395 chr11 126O72955 126O73055 US 2016/0264934 A1 Sep. 15, 2016 36

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOO101.138 ENSMUSGOOOOOO27498 CSTF1 396 chr2O 54978.760 54978.860 ENSGOOOOO17O633 ENSMUSGOOOOOO29474 RNF34 397 cr12 121855S06 121855606 ENSGOOOOO1745.79 ENSMUSGOOOOOO6641S MSL2 398 chr3 13587O121 1358.70221 ENSGOOOOO1745.79 ENSMUSGOOOOOO6641S MSL2 399 chr3 135870946 135871046 ENSGOOOOO1745.79 ENSMUSGOOOOOO6641S MSL2 400 chr3 135914097 135914197 ENSGOOOOO2O6SS7 ENSMUSGOOOOOO792.59 TRIM71 401 chr3 3293.2375 32932476 ENSGOOOOO1OOO84 ENSMUSGOOOOOO22702 HIRA 402 chr22 19318425 1931852S ENSGOOOOO15S287 ENSMUSGOOOOOO40414 SLC25A28 403 chr1O 101370706 101370806 ENSGOOOOO1986.46 ENSMUSGOOOOOO38369 NCOA6 404 chr2O 3333.7630 33337730 ENSGOOOOO1986.42 ENSMUSGOOOOOO70923 KLHL9 405 chr) 213335.60 21333669 ENSGOOOOO10O888 ENSMUSGOOOOOOS3754 CHD8 406 chr14 21853640 21853740 ENSGOOOOO10O888 ENSMUSGOOOOOOS3754 CHD8 407 chr14 21862O23 21862123 ENSGOOOOO1234.73 ENSMUSGOOOOOO28718 STIL 408 chr1 47716853 47716953 ENSGOOOOO155868 ENSMUSGOOOOOO2O397 MED7 409 chrS 156565759 156565859 ENSGOOOOO16OSS1 ENSMUSGOOOOOO17291 TAOK1 410 chr17 27869955 27870.055 ENSGOOOOO156983 ENSMUSGOOOOOOO1632 BRPF1 411 chr3 978O8O1 9780901 ENSGOOOOOO12232 ENSMUSGOOOOOO21978 EXTL3 412 chr8 28575287 28575387 ENSGOOOOO163946 ENSMUSGOOOOOO4O651 FAM208A 413 chr3 56657659 56657760 ENSGOOOOO163946 ENSMUSGOOOOOO4O651 FAM208A 414 chr3 56675500 56675600 ENSGOOOOO185624 ENSMUSGOOOOOO2S130 P4HB 415 chr17 798O1835 798O1936 ENSGOOOOOO77684 ENSMUSGOOOOOO2S764 PHF17 416 chr4 12978.3348 129783448 ENSGOOOOOO77684 ENSMUSGOOOOOO2S764 PHF17 417 chr4 129792890 129792990 ENSGOOOOOO77684 ENSMUSGOOOOOO2S764 PHF17 418 chr4 129793234 129793334 ENSGOOOOOOOS810 ENSMUSGOOOOOO33004 MYCBP2 419 chr13 77619357 7761.9457 ENSGOOOOO153827 ENSMUSGOOOOOO262.19 TRIP12 42O chr2 230723562 230723.662 ENSGOOOOO153827 ENSMUSGOOOOOO262.19 TRIP12 421 chr2 230724093 230724193 ENSGOOOOOO99889 ENSMUSGOOOOOOOO32S ARVCF 422 chr22 1995.7471 1995.7571 ENSGOOOOOO99889 ENSMUSGOOOOOOOO32S ARVCF 423 chr22 19957765 19957865 ENSGOOOOO196367 ENSMUSGOOOOOO4S482 TRRAP 424 chirf 98.609930 98.610030 ENSGOOOOO1276.03 ENSMUSGOOOOOO28649 MACF1 425 chr1 39851434 39851534 ENSGOOOOO1276.03 ENSMUSGOOOOOO28649 MACF1 426 chr1 39853O8O 398531.83 ENSGOOOOO132964 ENSMUSGOOOOOO2963S CDK8 427 chr13 268287SO 26828.860 ENSGOOOOO132964 ENSMUSGOOOOOO2963S CDK8 428 chr13 269782S6 26978.356 ENSGOOOOO161547 ENSMUSGOOOOOO3412O SRSF2 429 chr17 7473.3297 7473.3397 ENSGOOOOO2O6560 ENSMUSGOOOOOO14496 ANKRD28 430 chr3 1571 1481 15711581 ENSGOOOOO14SSSS ENSMUSGOOOOOO22272 MYO10 431 chrS 16701.316 16701416 ENSGOOOOOO72364 ENSMUSGOOOOOO49470 AFF4 432 chrS 132216686 132216786 ENSGOOOOOO72364 ENSMUSGOOOOOO49470 AFF4 433 chrS 132232254 1322323S4 ENSGOOOOO115306 ENSMUSGOOOOOO2O31S SPTBN1 434 chr2 S4858532 S4858632 ENSGOOOOO115306 ENSMUSGOOOOOO2O31S SPTBN1 435 chr2 54876826 54876926 ENSGOOOOO180901 ENSMUSGOOOOOO1694O KCTD2 436 chr17 73059.955 73060055 ENSGOOOOO134452 ENSMUSGOOOOOOS8594 FBXO18 437 chr1O S948372 S948472 ENSGOOOOO124486 ENSMUSGOOOOOO31010 USP9X 438 chrX 41075382 41075482 ENSGOOOOO124486 ENSMUSGOOOOOO31010 USP9X 439 chrX 41075663 41075763 ENSGOOOOO111737 ENSMUSGOOOOOO29518 RAB3S 440 chr12 12O534739 12O5348.39 ENSGOOOOO111737 ENSMUSGOOOOOO29518 RAB3S 441 chr12 12OS34962. 120535062 ENSGOOOOOO61938 ENSMUSGOOOOOO22791 TNK2 442 chr3 195590509 195590609 ENSGOOOOOO61938 ENSMUSGOOOOOO22791 TNK2 443 chr3 195594699 195594799 ENSGOOOOO132466 ENSMUSGOOOOOOSS2O4 ANKRD17 444 chr4 7395.7524 73957633 ENSGOOOOO131669 ENSMUSGOOOOOO37966 NINJ1 445 chr) 95884141 95884241 ENSGOOOOO1437.40 ENSMUSGOOOOOOO9894 SNAP47 446 chr1 227935784. 2279.35892 ENSGOOOOO1181.93 ENSMUSGOOOOOO41498 KIF14 447 chr1 2OOS227O2 200522807 ENSGOOOOO1158.16 ENSMUSGOOOOOO24081 CEBPZ 448 chr2 37454836 37454.936 ENSGOOOOO1158.16 ENSMUSGOOOOOO24081 CEBPZ 449 chr2 374.55261 37455361 ENSGOOOOOO91409 ENSMUSGOOOOOO271.11 ITGA6 450 chr2 1733691.02 1733692O2 ENSGOOOOOO908.63 ENSMUSGOOOOOOO3316 GLG1 451 chr16 74487002 744.871 O2 ENSGOOOOO1380.18 ENSMUSGOOOOOO7S703 EPT1 452 chr2 26612047 26612156 ENSGOOOOO128731 ENSMUSGOOOOOO3O4S1 HERC2 453 chr15 28356705 2835.6805 ENSGOOOOO141664 ENSMUSGOOOOOO38866 ZCCHC2 454 chr18 6O241930 60242O39 ENSGOOOOO1861.87 ENSMUSGOOOOOO33S4S ZNRF1 455 chr16 75141622 75141722 ENSGOOOOO116731 ENSMUSGOOOOOOS7637 PRDM2 456 chr1 14113255 14113359 ENSGOOOOOO884.48 ENSMUSGOOOOOO31508 ANKRD10 457 chr13 111532O21 111 S321.21 ENSGOOOOO1756O2. ENSMUSGOOOOOO95098 CCDC8SB 458 chr11 65658,550 656586SO ENSGOOOOO131016 ENSMUSGOOOOOO38587 AKAP12 459 chr6 151673318 151673418 ENSGOOOOO107929 ENSMUSGOOOOOO33499 LARP4B 460 chr1O 858.932 859.032

US 2016/0264934 A1 Sep. 15, 2016 38

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOOO74181 ENSMUSGOOOOOO38146 NOTCH3 S25 cr19 15272O88 15272188. ENSGOOOOO1876.78 ENSMUSGOOOOOO24427 SPRY4. 526 chrS 14169347S 141693575 ENSGOOOOO137OSS ENSMUSGOOOOOO28577 PLAA S27 chir 26905765 2690586S ENSGOOOOO1661.45 ENSMUSGOOOOOO2731S SPINT1 528 chr15 41137O2S 41 13712S ENSGOOOOO1661.45 ENSMUSGOOOOOO2731S SPINT1 529 cr15 41149216 41 149316 ENSGOOOOO164366 ENSMUSGOOOOOO21578 CCDC127 530 chrS 205397 2O5497 ENSGOOOOO164366 ENSMUSGOOOOOO21578 CCDC127 531 chrS 2OS809 2O5909 ENSGOOOOO172409 ENSMUSGOOOOOO27079 CLP1 532 chr11 S74284.32 574.28536 ENSGOOOOO19673O ENSMUSGOOOOOO21559 DAPK1 S33 ch9 903.21265 90.321365 ENSGOOOOO1989.52 ENSMUSGOOOOOOO1415 SMGS 534 chir 1562.35860 1562,35960 ENSGOOOOO160392 ENSMUSGOOOOOO49643 C19Crfaf S35 cr19 40827762 40827862 ENSGOOOOO146063 ENSMUSGOOOOOO40365 TRIM41 536 chrS 180651400 180651 SOO ENSGOOOOO143393 ENSMUSGOOOOOO38861 PI4KB 537 chir 15126SOO3 151265103 ENSGOOOOO1791.51 ENSMUSGOOOOOO38957 EDC3 S38 chr15 74924812 74924912 ENSGOOOOO168061 ENSMUSGOOOOOO24790 SAC3D1 539 chr11 6481.1958 64812058 ENSGOOOOOO683O8 ENSMUSGOOOOOO31154 OTUDS 540 chrX 48780O66 4878016.6 ENSGOOOOO168246 ENSMUSGOOOOOO44949 UBTD2 541 chrs 171638789 171638893 ENSGOOOOO168246 ENSMUSGOOOOOO44949 UBTD2 542 chrs 171639024 171639124 ENSGOOOOO166398 ENSMUSGOOOOOO66571 KIAAO3SS 543 chr19 348327OS 348328OS ENSGOOOOO166398 ENSMUSGOOOOOO66571 KIAAO3SS 544 chr19 348331.68 34833268 ENSGOOOOO1772OO ENSMUSGOOOOOOS6608 CHD9 545 chr16 5335832O S335842O ENSGOOOOO16343S ENSMUSGOOOOOOO3051 ELF3 546 chr 2O1984436 201984S43 ENSGOOOOO16343S ENSMUSGOOOOOOO3051 ELF3 547 chr 2O1984774 201984874 ENSGOOOOO1731.20 ENSMUSGOOOOOOS4611 KDM2A 548 chr11 67022S24 67022624 ENSGOOOOOO70961 ENSMUSGOOOOOO19943 ATP2B1 549 chr12 9004951O 90O4961O ENSGOOOOO116212 ENSMUSGOOOOOO28617 LRRC42 SSO chir 54417770 S4417870 ENSGOOOOO144674 ENSMUSGOOOOOO38708 GOLGA4 SS1 chr3 37365215 37365315 ENSGOOOOO103966 ENSMUSGOOOOOO27293 EHD4 SS2 chr15 42192710 421928.10 ENSGOOOOO110046 ENSMUSGOOOOOO24773 ATG-2A SS3 cr11 64662114 64662214 ENSGOOOOO1972.99 ENSMUSGOOOOOO3OS28 BLM SS4 chr15 91293060 912931 60 ENSGOOOOO129315 ENSMUSGOOOOOO11960 CCNT1 SSS cr12 49086996 49087103 ENSGOOOOO131711 ENSMUSGOOOOOOS2727 MAP1B SS6 chrS 71SO1060 715O1170 ENSGOOOOO1982.18 ENSMUSGOOOOOOO6673 QRICH1 SS7 chr3 49094397 49094497 ENSGOOOOO124571 ENSMUSGOOOOOO671SO XPOS SS8 chr6 43491SSO 434916SO ENSGOOOOO136068 ENSMUSGOOOOOO2S278 FLNB 559 chr3 S8109.194 S8109294 ENSGOOOOO114302 ENSMUSGOOOOOO326O1 PRKAR2A 560 chr3 48788924 48789.024 ENSGOOOOO142453 ENSMUSGOOOOOO32185 CARM1 561 chr19 11032S10 11032610 ENSGOOOOO16796S ENSMUSGOOOOOO24142 MLST8 562 chr16 2258975 2259075 ENSGOOOOO18O357 ENSMUSGOOOOOO40524 ZNF609 S63 chr15 64967232 64967332 ENSGOOOOO17995O ENSMUSGOOOOOOO2S24 PUF60 564 chr8 144898598. 144898,698 ENSGOOOOO116062 ENSMUSGOOOOOOOS370 MSH6 S6S chir 48027847 48027947 ENSGOOOOO112039 ENSMUSGOOOOOOO7S70 FANCE 566 chr6 3S423806 3S423906 ENSGOOOOO125834 ENSMUSGOOOOOO3788S STK35 567 chrC) 2097855 2097955 ENSGOOOOO132952 ENSMUSGOOOOOO41264 USPL1 568 chr13 31.232291 31232391 ENSGOOOOOO651.83 ENSMUSGOOOOOO33285 WDR3 569 chr 1185O2O58 118502158 ENSGOOOOO1367.09 ENSMUSGOOOOOO244OO WDR33 570 chir 128463860 128463960 ENSGOOOOO1367.09 ENSMUSGOOOOOO244OO WDR33 571 chir 128477586 128477688 ENSGOOOOOO60749 ENSMUSGOOOOOO74994 QSER1 572 chr11 32975575 32975677 ENSGOOOOO110074 ENSMUSGOOOOOO39048 FOXRED1 573 cr11 126147790 1261.47890 ENSGOOOOO197912 ENSMUSGOOOOOOOO738 SPG-7 574 chr16 89623369 89623469 ENSGOOOOO156273 ENSMUSGOOOOOO2S612 BACH1 575 cr1 3O714781 3O714881 ENSGOOOOO156273 ENSMUSGOOOOOO2S612 BACH1 576 chr1 30715069 30715170 ENSGOOOOO140829 ENSMUSGOOOOOO37993 DHX38 577 chr16 72146588 72146688 ENSGOOOOO10O3.30 ENSMUSGOOOOOO343S4 MTMR3 578 chr2 30421779 3O421879 ENSGOOOOO111300 ENSMUSGOOOOOO42719 NAA2S 579 cr12 11246731S 112467415 ENSGOOOOOO68.323 ENSMUSGOOOOOOOO134 TFE3 580 chrX 48887731 4.8887841 ENSGOOOOO111785 ENSMUSGOOOOOO3S62O RIC8B 581 chr12 1072O8767 1072O8867 ENSGOOOOO1831SS ENSMUSGOOOOOO42229 RABIF 582 chir 2028SOO72 20285O172 ENSGOOOOO11431S ENSMUSGOOOOOO22528 HES1 583 chr3 193856OS6 1938561S6 ENSGOOOOO136280 ENSMUSGOOOOOOOO378 CCM2 S84 chirf 45115631 45115731 ENSGOOOOO133704 ENSMUSGOOOOOO40029 IPO8 585 cr12 30783478 30783578 ENSGOOOOO167978 ENSMUSGOOOOOO3921.8 SRRM2 586 chr16 2817292 2817392 ENSGOOOOO130939 ENSMUSGOOOOOO28960 UBE4B S87 chir 1OO93638 10093.738 ENSGOOOOO130713 ENSMUSGOOOOOO39356 EXOSC2 588 chr 133579180 13357928O ENSGOOOOO1082.56 ENSMUSGOOOOOO37857 NUFIP2 S89 chr17 27613945 276.14045 US 2016/0264934 A1 Sep. 15, 2016 39

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID D Symbol ID) chromosome Start Human end ENSGOOOOO1082.56 ENSMUSGOOOOOO37857 NUFIP2 S90 chr17 276.14.146 27614246 ENSGOOOOO1082.56 ENSMUSGOOOOOO37857 NUFIP2 591 chr17 276.14471 276.14571 ENSGOOOOO257315 ENSMUSGOOOOOO944.10 ZBED6 592 chir 2O37681 04 2037682O4 ENSGOOOOOO758.56 ENSMUSGOOOOOO18974 SART3 593 cr12 10892OOSO 10892O1SO ENSGOOOOO159023. ENSMUSGOOOOOO28906 EPB41 594 chir 29314204 29314304 ENSGOOOOO107758 ENSMUSGOOOOOO21816 PPP3CB 595 chr10 75197869 75197969 ENSGOOOOO156599 ENSMUSGOOOOOO34O7S ZDHHCS 596 chr11 S74399.06 S744OOO6 ENSGOOOOO176986 ENSMUSGOOOOOO393.67 SEC24C 597 chr10 7553O82O 75530920 ENSGOOOOO1690.18 ENSMUSGOOOOOO3224.4 FEM1B 598 chr15 6858.2094 685821.94 ENSGOOOOO1690.18 ENSMUSGOOOOOO3224.4 FEM1B 599 chr15 68582606 685827O6 ENSGOOOOOOS8804 ENSMUSGOOOOOO28614 TMEM48 600 chir S4233519 S4233619 ENSGOOOOO1671.82 ENSMUSGOOOOOO18678 SP2 6O1 chr17 46OOS389 46.005489 ENSGOOOOO162714 ENSMUSGOOOOOO2O472 ZNF496 602 chr 247463981 247464081 ENSGOOOOO1465.76 ENSMUSGOOOOOO392.44 C7orf26 603 chirf 6639684 6639.787 ENSGOOOOOO2O2S6 ENSMUSGOOOOOO27SS1 ZFP64 604 chr2O 50768775 50768875 ENSGOOOOO1654.58 ENSMUSGOOOOOO32737 INPPL1 605 chr11 71948.631 71948.731 ENSGOOOOO19651O ENSMUSGOOOOOO29466 ANAPC7 606 chr12 110811954. 110812O63 ENSGOOOOO130764 ENSMUSGOOOOOO29028 LRRC47 6O7 chr 3697677 3697777 ENSGOOOOO12251S ENSMUSGOOOOOO41164 ZMIZ2 608 chrf 44807321 44807421 ENSGOOOOO1381.82 ENSMUSGOOOOOO24795 KIF2OB 609 chr1O 91497229 91497329 ENSGOOOOO142627 ENSMUSGOOOOOOO6445 EPHA2 610 chir 16451467 16451567 ENSGOOOOO1973.29 ENSMUSGOOOOOO2O134 PELI1 611 chr2 64321768 64321868 ENSGOOOOO1973.29 ENSMUSGOOOOOO2O134 PELI1 612 chr2 64.322106 64.322206 ENSGOOOOOOO6007 ENSMUSGOOOOOO33917 GDE 613 chr16 19514682 19514782 ENSGOOOOO156925. ENSMUSGOOOOOO67860 ZIC3 614 chrX 1366494-62. 136 6495.62 ENSGOOOOOO16864 ENSMUSGOOOOOO21916 GLT8D1 615 chr3 S2728812 527 28912 ENSGOOOOO160606 ENSMUSGOOOOOO19437 TLCD1 616 chr17 27051425 27051525 ENSGOOOOO182866 ENSMUSGOOOOOOOO409 LCK 617 chr 32751383 32751483 ENSGOOOOOO6S243. ENSMUSGOOOOOOO4591 PKN2 618 chr 89299.082 892991.82 ENSGOOOOOO789.02 ENSMUSGOOOOOO2S139 TOLLIP 619 chr11 1298132 1298.232 ENSGOOOOO168286 ENSMUSGOOOOOO36442 THAP11 62O chr16 67877695 67877795 ENSGOOOOO 105663 ENSMUSGOOOOOOO6307 MLL4.1 621 chr19 36223396 36223496 ENSGOOOOO149782 ENSMUSGOOOOOO2496O PLCB3 622 chr11 6403498O 64O3S08O ENSGOOOOO14892S ENSMUSGOOOOOO381.87 BTBD10 623 chr11 13410457 13410557 ENSGOOOOO176248 ENSMUSGOOOOOO2696S ANAPC2 624 chr) 140O8218O 140O82282 ENSGOOOOO162702 ENSMUSGOOOOOO41483 ZNF281 625 chir 200376434 200376534 ENSGOOOOO162702 ENSMUSGOOOOOO41483 ZNF281 626 chr 200377,690 2003,77790 ENSGOOOOO166135 ENSMUSGOOOOOO36450 HIFIAN 627 chr1O 102307974 1023O8.074 ENSGOOOOO1661.33 ENSMUSGOOOOOO27324 RPUSD2 628 chr15 40866435 40866535 ENSGOOOOO11S2O7 ENSMUSGOOOOOO29144 GTF3C2 629 chr2 275494OO 2754.9500 ENSGOOOOO11S2O7 ENSMUSGOOOOOO29144 GTF3C2 630 chr2 27549672 27549772 ENSGOOOOO119787 ENSMUSGOOOOOOS9811 ATL2 631 chr2 38523O44 38.523144 ENSGOOOOO13837S ENSMUSGOOOOOO39354 SMARCAL1 632 chr2 21728OO43 21728O143 ENSGOOOOO130772 ENSMUSGOOOOOO66O42 MED18 633 chr1 28661407 28661SOf ENSGOOOOO149503. ENSMUSGOOOOOO2466O INCENP 634 chr11 61897738 61897838 ENSGOOOOO182871 ENSMUSGOOOOOOO143S COL18A1 635 chr21 46932S4O 46932640 ENSGOOOOO154945 ENSMUSGOOOOOO2O864 ANKRD40 636 chr17 48773290 48.773390 ENSGOOOOO198783 ENSMUSGOOOOOO46O1 O ZNF830 637 chr17 3.3289068 3.3289168 ENSGOOOOO198783 ENSMUSGOOOOOO46O1 O ZNF830 638 chr17 3.328961O 3.328971 O ENSGOOOOO19878O ENSMUSGOOOOOO41817 FAM169A 639 chrS 74O77389 74O77498 ENSGOOOOO1434.57 ENSMUSGOOOOOO46519 GOLPH3L 640 chr1 15062O824 150620924 ENSGOOOOO181449 SOX2 641 ENSGOOOOO111704 NANOG 642 ENSGOOOOO175387 SMAD2 643 ENSGOOOOO166949 SMAD3 644 ENSGOOOOO136997 MYC 645 ENSGOOOOO137815 RTF1 646 ENSGOOOOO103479 RBL2 647 ENSGOOOOOOO8O83 JARID2 648 ENSGOOOOO131914 LIN28 649 ENSGOOOOO168036 CTNNB1 6SO ENSGOOOOO12S686 MED1 651 ENSGOOOOOO74266 EED 652 ENSGOOOOO245532 NEAT1 653 ENSGOOOOO2S8609 LINC-ROR 654 US 2016/0264934 A1 Sep. 15, 2016 40

TABLE 1-continued

hESC and mESC Common Peaks Table 1: List of genes for measuring mA levels in stem cell populations. Table 1 is related to FIG. 6 and provides the Ensemble Gene ID of human and mouse and chromosome coordinates of common mA peaks. SEQID NO: (for Human human Human Ensembl Mouse Ensembl Ggene Gene Gene Human Human Ggene ID ID Symbol ID) chromosome Start Human end

ENSGOOOOO2.79897 MEGAMIND. 655 TUNA (BIRC6 antisense RNA 2) ENSGOOOOO163508 EOMES 656 ENSGOOOOO125798 FOXA2 657

TABLE 2 DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOOO64703 DDX20 N Y ENSGOOOOO166O2S AMOTL1 N Y ENSGOOOOOO8601 S MAST2 N Y ENSGOOOOO166O2S AMOTL1 Y Y ENSGOOOOO160O87 UBE22 N Y ENSGOOOOO175216 CKAPS N Y ENSGOOOOO160688 FLAD1 N Y ENSGOOOOO196323 ZBTB44 N Y ENSGOOOOO143476 DTL Y Y ENSGOOOOO137SO4 CREBZF N Y ENSGOOOOO142599 RERE N Y ENSGOOOOO1827.04 TSKU N Y ENSGOOOOO142599 RERE Y Y ENSGOOOOO1827.04 TSKU N Y ENSGOOOOO179403 WWA1 N Y ENSGOOOOO18663S ARAP1 Y Y ENSGOOOOO2O3668 CHML Y Y ENSGOOOOOO7OO47 PHRF1 N Y ENSGOOOOO117523 PRRC2C N Y ENSGOOOOO173621 LRFN4 N Y ENSGOOOOO162377 SELRC1 N Y ENSGOOOOO133789 SWAP70 N Y ENSGOOOOO162377 SELRC1 Y Y ENSGOOOOO166261 ZNF2O2 N Y ENSGOOOOO2O4138 PHACTR4 N Y ENSGOOOOO166261 ZNF2O2 N Y ENSGOOOOO2O4138 PHACTR4 N Y ENSGOOOOO1421 O2 ATHL1 N Y ENSGOOOOO158769 F11R N Y ENSGOOOOO149428 HYOU1 Y Y ENSGOOOOO143337 TOR1AIP1 N Y ENSGOOOOO149428 HYOU1 N Y ENSGOOOOO143337 TOR1AIP1 Y Y ENSGOOOOO1621.94 C11orf248 N Y ENSGOOOOO143624 INTS3 N Y ENSGOOOOO134824 FADS2 N Y ENSGOOOOO168159 RNF187 N Y ENSGOOOOO168040 FADD N Y ENSGOOOOOO90273 NUDC N Y ENSGOOOOO188486 H2AFX N Y ENSGOOOOOO90273 NUDC N Y ENSGOOOOO188486 H2AFX N Y ENSGOOOOO117724 CENPF Y Y ENSGOOOOO149091 DGKZ Y Y ENSGOOOOO117724 CENPF N Y ENSGOOOOO175827 APOO1266.1 N Y ENSGOOOOO143294 PRCC N Y ENSGOOOOO11004.8 OSBP N Y ENSGOOOOO163374 YY1-AP1 N Y ENSGOOOOO121653 MAPK8IP1 N Y ENSGOOOOO158796 DEDD N Y ENSGOOOOO11006O PUS3 Y Y ENSGOOOOO136636 KCTD3 N Y ENSGOOOOO165458 INPPL1 Y Y ENSGOOOOO164011 ZNF691 N Y ENSGOOOOOO78902 TOLLIP N Y ENSGOOOOO16O710 ADAR N Y ENSGOOOOO160613 PCSK7 Y Y ENSGOOOOO2S846S RP11- N Y ENSGOOOOOO72S18 MARK2 N Y S74F21.3.1 ENSGOOOOO116667 C1 orf21 N Y ENSGOOOOO149016 TUT N Y ENSGOOOOO142949 PTPRF Y Y ENSGOOOOO184281 TSSC4 N Y ENSGOOOOO142949 PTPRF N Y ENSGOOOOOO89597 GANAB N Y ENSGOOOOO142949 PTPRF N Y ENSGOOOOO198561 CTNND1 N Y ENSGOOOOOO83444 PLOD1 N Y ENSGOOOOO165434 PGM2L1 N Y ENSGOOOOOO83444 PLOD1 N Y ENSGOOOOO196914 ARHGEF12 N Y ENSGOOOOO116863 ADPRHL2 N Y ENSGOOOOO110711 AIP N Y ENSGOOOOO160803 UBQLN4 N Y ENSGOOOOO137497 NUMA1 N Y US 2016/0264934 A1 Sep. 15, 2016 41

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change -2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO171492 LRRC8D ENSGOOOOO137497 NUMA1 ENSGOOOOO158195 WASF2 ENSGOOOOO137497 NUMA1 ENSGOOOOO18O198 RCC1 ENSGOOOOO2O5213 LGR4 ENSGOOOOO122482 ZNF644 ENSGOOOOO149532 CPSF7 ENSGOOOOOO2O129 NCDN ENSGOOOOO1498.23 C11orf2 ENSGOOOOO1531.87 HNRNPU ENSGOOOOO137S13 NARS2 ENSGOOOOO1571.84 CPT2 ENSGOOOOO166902 MRPL16 ENSGOOOOO1571.84 CPT2 ENSGOOOOOO76053 RBM7 ENSGOOOOO1074.04 DVL1 ENSGOOOOO174669 SLC29A2 ENSGOOOOO215717 TMEM167B ENSGOOOOO168569 TMEM223 ENSGOOOOO171603 CLSTN1 ENSGOOOOO234.857 RP11 831H9.16.1 ENSGOOOOO143079 CTTNBP2NL ENSGOOOOO120451 SNX19 ENSGOOOOO135823 STX6 ENSGOOOOO120451 SNX19 ENSGOOOOO135823 STX6 ENSGOOOOO135372 NAT10 ENSGOOOOO134690 CDCA8 ENSGOOOOO162236 STX5 ENSGOOOOOO66135 KDM4A ENSGOOOOO173898 SPTBN2 ENSGOOOOOO66135 KDM4A ENSGOOOOOO95139 ARCN1 ENSGOOOOO1856.30 PBX1 ENSGOOOOOO60749 QSER1 ENSGOOOOO130695 CEP85 ENSGOOOOO110074 FOXRED1 ENSGOOOOO116754 SRSF11 ENSGOOOOO167985 SDHAF2 ENSGOOOOO162783 IERS ENSGOOOOOOS9804 SLC2A3 ENSGOOOOO116128 BCL9 ENSGOOOOOOS7294 PKP2 ENSGOOOOO116128 BCL9 ENSGOOOOO196498 NCOR2 ENSGOOOOO168264 IRF2BP2 ENSGOOOOO189079 ARID2 ENSGOOOOO188157 AGRN ENSGOOOOO1116O2 TIMELESS ENSGOOOOO188157 AGRN ENSGOOOOO1116O2 TIMELESS ENSGOOOOO157870 FAM213B ENSGOOOOOO88986 DYNLL1 ENSGOOOOO213516 RBMXL1 ENSGOOOOOOO3056 M6PR ENSGOOOOO1606.79 CHTOP ENSGOOOOO177084 POLE ENSGOOOOO198492 YTHDF2 ENSGOOOOO181852 RNF41 ENSGOOOOO198492 YTHDF2 ENSGOOOOO182SOO ENSGOOOOO198492 YTHDF2 ENSGOOOOO151952 ENSGOOOOO143384 MCL1 ENSGOOOOO123094 ENSGOOOOO169641 LUZP1 ENSGOOOOO171792 ENSGOOOOO169641 LUZP1 ENSGOOOOO161813 ENSGOOOOO116698 SMG7 ENSGOOOOO161813 ENSGOOOOO116691 MIIP ENSGOOOOO1396.13 s AR C C 2 ENSGOOOOO143S45 RAB13 ENSGOOOOO173064 2orf51 ENSGOOOOO2S3368 TRNP1 ENSGOOOOO1752.15 DSP2 ENSGOOOOO143153 ATP1B1 ENSGOOOOO1752.15 DSP2 ENSGOOOOO197622 CDC42SE1 ENSGOOOOO183495 400 ENSGOOOOO185483 ROR1 ENSGOOOOO126746 F384 ENSGOOOOOOS4118 THRAP3 ENSGOOOOO170633 F34 ENSGOOOOOO82S12 TRAFS ENSGOOOOO170633 F34 ENSGOOOOO143390 RFX5 ENSGOOOOO1393.18 SP6 ENSGOOOOO154358 OBSCN ENSGOOOOO170855 AP1 ENSGOOOOO130764 LRRC47 ENSGOOOOO2S3719 TXN7L3B ENSGOOOOO130764 LRRC47 ENSGOOOOO166225 FRS2 ENSGOOOOOO8SSS2 GSF9 ENSGOOOOO1391.54 EBP2 ENSGOOOOO162702 ZNF281 ENSGOOOOO167548 s L L2 ENSGOOOOO162702 ZNF281 ENSGOOOOOO76108 ENSGOOOOO162702 ZNF281 ENSGOOOOO134287 ENSGOOOOO158710 TAGLN2 ENSGOOOOO1741 06 ENSGOOOOO204160 ZDHHC18 ENSGOOOOO171681 ENSGOOOOO204160 ZDHHC18 ENSGOOOOOO89094 ENSGOOOOO116560 SFPQ ENSGOOOOOO89094 ENSGOOOOOO23902 PLEKHO1 ENSGOOOOO247.077 ENSGOOOOO134247 PTGFRN ENSGOOOOO136O26 ENSGOOOOOO786.18 NRD1 ENSGOOOOO123066 ENSGOOOOO116584 ARHGEF2 ENSGOOOOO166860 ENSGOOOOO1426SS PEX14 ENSGOOOOO161638 ENSGOOOOO132688 NES ENSGOOOOO111266 ENSGOOOOO132688 NES ENSGOOOOOO874.48 US 2016/0264934 A1 Sep. 15, 2016 42

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO158966 CACHD1 ENSGOOOOOO81760 AACS ENSGOOOOO158966 CACHD1 ENSGOOOOO110871 COQ5 ENSGOOOOOOS8673 ZC3H11A ENSGOOOOO184O47 DIABLO ENSGOOOOO186283 TOR3A ENSGOOOOO111412 C12OrfA9 ENSGOOOOO19796S MPZL.1 ENSGOOOOO133639 BTG1 ENSGOOOOOOS3372 MRTO4 ENSGOOOOO111752 PHC1 ENSGOOOOO157933 SKI ENSGOOOOO150990 DHX37 ENSGOOOOO164008 C1 orf50 ENSGOOOOO166598 HSP90B1 ENSGOOOOOO85491 SLC25A24 ENSGOOOOO185591 SP1 ENSGOOOOO116871 MAP7D1 ENSGOOOOOO6O237 WNK1 ENSGOOOOO1987OO IPO9 ENSGOOOOO120800 UTP2O ENSGOOOOO162419 GMEB1 ENSGOOOOOO13573 DDX11 ENSGOOOOO160818 GPATCH4 ENSGOOOOO174718 C12Orf3S ENSGOOOOO1434.86 EIF2D ENSGOOOOOO828OS ERC1 ENSGOOOOO132716 DCAF8 ENSGOOOOO136014 USP44 ENSGOOOOO132716 DCAF8 ENSGOOOOO136014 USP44 ENSGOOOOO116990 MYCL1 ENSGOOOOO167272 POP5 ENSGOOOOO188976 NC2L ENSGOOOOOOSO4OS LIMA1 ENSGOOOOO1182OO CAMSAP2 ENSGOOOOOO891S4 GCN1L1 ENSGOOOOOOS4116 TRAPPC3 ENSGOOOOO110931 CAMKK2 ENSGOOOOO155380 SLC16A1 ENSGOOOOO110931 CAMKK2 ENSGOOOOO143061 IGSF3 ENSGOOOOO150977 RILPL2 ENSGOOOOO162923 WDR26 ENSGOOOOO12O647 CCDC77 ENSGOOOOO186603 ENSGOOOOO178498 DTX3 ENSGOOOOOO65526 ENSGOOOOO174437 ATP2A2 ENSGOOOOOO65526 ENSGOOOOO175727 MLXIP ENSGOOOOO182827 ENSGOOOOO102804 TSC22D1 ENSGOOOOOO78808 ENSGOOOOO102804 TSC22D1 ENSGOOOOO158109 ENSGOOOOOO433SS ZIC2 ENSGOOOOO116473 ENSGOOOOO187498 COL4A1 ENSGOOOOO160685 ZBTB7B ENSGOOOOO187498 COL4A1 ENSGOOOOO224870 RP4 ENSGOOOOO125249 RAP2A 758.18.2.1 ENSGOOOOO224870 RP4 ENSGOOOOO1361.22 BORA 758.18.2.1 ENSGOOOOO162512 SDC3 ENSGOOOOO150907 FOXO1 ENSGOOOOO215908 CROCCP2 ENSGOOOOO133104 SPG20 ENSGOOOOO242590 RP11 ENSGOOOOO234787 LINCOO458 S4O7.14.1 ENSGOOOOO1543OS MIA3 ENSGOOOOO134899 ERCC5 ENSGOOOOO127603 MACF1 ENSGOOOOO122042 UBL3 ENSGOOOOO198837 DENND4B ENSGOOOOO139514 SLC7A1 ENSGOOOOO213190 MLLT11 ENSGOOOOO139514 SLC7A1 ENSGOOOOO1989.52 SMGS ENSGOOOOO169062 UPF3A ENSGOOOOO143375 CGN ENSGOOOOO150510 FAM124A ENSGOOOOOO31698 SARS ENSGOOOOO1232OO ZC3H13 ENSGOOOOOO60656 PTPRU ENSGOOOOO1232OO C3H13 ENSGOOOOOO36549 ZZZ3 ENSGOOOOO1988.94 AA1737 ENSGOOOOO1961.82 STK40 ENSGOOOOO165898 CA2 ENSGOOOOO116237 ICMT ENSGOOOOO10O852 RHGAPS ENSGOOOOO116237 ICMT ENSGOOOOO2O5476 CDC85C ENSGOOOOO117713 ARID1A ENSGOOOOOO92148 ECTD1 ENSGOOOOO117713 ARID1A ENSGOOOOO100813 CIN1 ENSGOOOOO117713 ARID1A ENSGOOOOO100813 CIN1 ENSGOOOOO162714 ZNF496 ENSGOOOOO1971 O2 YNC1RH1 ENSGOOOOO1434.57 GOLPH3L ENSGOOOOOO89737 DX24 ENSGOOOOO18O398 MCFD2 ENSGOOOOO1006SO RSFS ENSGOOOOO135916 ITM2C ENSGOOOOO119596 LPM1 ENSGOOOOO247626 MARS2 ENSGOOOOO100461 BM23 ENSGOOOOO176946 THAP4 ENSGOOOOO100461 BM23 ENSGOOOOO115694 STK2S ENSGOOOOOOO6432 AP3K9 ENSGOOOOOO82258 CCNT2 ENSGOOOOO100441 : HNYN ENSGOOOOOO82258 CCNT2 ENSGOOOOOO15133 CCDC88C ENSGOOOOO163811 WDR43 ENSGOOOOO1OO938 GMPR2 US 2016/0264934 A1 Sep. 15, 2016 43

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO 98142 ANKRD57 ENSGOOOOO2SS242 C14orf169 ENSGOOOOO 43970 ASXL2 ENSGOOOOO2SS242 C14orf169 ENSGOOOOO 43970 ASXL2 ENSGOOOOO 39998 RAB15 ENSGOOOOO 35912 TTLL4 ENSGOOOOO 39998 RAB15 ENSGOOOOO 35912 TTLL4 ENSGOOOOO 19669 IRF2BPL ENSGOOOOO 5170 ACVR1 ENSGOOOOO OO823 APEX1 ENSGOOOOO213160 KLHL23 ENSGOOOOO 65617 DACT1 ENSGOOOOO213160 KLHL23 ENSGOOOOOO72042 RDH11 ENSGOOOOOO828.98 XPO1 ENSGOOOOO 97119 SLC25A29 ENSGOOOOO 4948 ADAM23 ENSGOOOOO 97119 SLC25A29 ENSGOOOOO 632S1 FZD5 ENSGOOOOO 57227 MMP14 ENSGOOOOO 97.329 PELI1 ENSGOOOOO 57227 MMP14 ENSGOOOOO S2284 TCF7L1 ENSGOOOOO OO796 SMEK1 ENSGOOOOO 5464 USP34 ENSGOOOOOO 66735 KIF26A ENSGOOOOO 36699 SMPD4 ENSGOOOOOO 89916 C14orf118 ENSGOOOOOO71051 NCK2 ENSGOOOOO 19707 ENSGOOOOO 98.12 FAM98A ENSGOOOOO 19707 ENSGOOOOO 34323 MYCN ENSGOOOOO SS463 ENSGOOOOO 323.13 MRPL3S ENSGOOOOO OO888 ENSGOOOOO S816 CEBPZ. ENSGOOOOO OO603 ENSGOOOOO 38018 EPT1 ENSGOOOOO OO836 ENSGOOOOOO74054 CLASP1 ENSGOOOOO 79933 ENSGOOOOO 6062 MSH6 ENSGOOOOO 65819 ENSGOOOOO 36.720 HS6ST1 ENSGOOOOO 83576 ENSGOOOOO 36.720 HS6ST1 ENSGOOOOO 26803 ENSGOOOOO 70745 KCNS3 ENSGOOOOO 26803 ENSGOOOOO 98522 GPN1 ENSGOOOOO OO941 ENSGOOOOOOO3S09 C2Orfs 6 ENSGOOOOO 6SS88 ENSGOOOOO 72845 SP3 ENSGOOOOO 6SS88 ENSGOOOOO240857 RDH14 ENSGOOOOO 33997 ENSGOOOOO 52518 ZFP36L2 ENSGOOOOO2SO366

ENSGOOOOOO63660 GPC1 ENSGOOOOO 40443 ENSGOOOOO 63166 WS1 ENSGOOOOO 40443 ENSGOOOOO 24OO6 OBSL1 ENSGOOOOO 693.75 ENSGOOOOO 24OO6 OBSL1 ENSGOOOOO 693.75 ENSGOOOOO 44524 COPS7B ENSGOOOOO 28944 ENSGOOOOO S32O1 RANBP2 ENSGOOOOO 821.75 ENSGOOOOO 301.47 SH3BP4 ENSGOOOOO 40521 ENSGOOOOO 15129 TP53I3 ENSGOOOOO 66855 ENSGOOOOO S2291 TGOLN2 ENSGOOOOO 66716 ENSGOOOOO2O4634 TBC1D8 ENSGOOOOO 8SO33 ENSGOOOOO 2S630 POLR1B ENSGOOOOO 73548 ENSGOOOOO 52147 GEMIN6 ENSGOOOOO 36383 ENSGOOOOO 68.758 SEMA4C ENSGOOOOO 36383 ENSGOOOOO 68.758 SEMA4C ENSGOOOOO 79361 ENSGOOOOO 18242 MREG ENSGOOOOO O4081 ENSGOOOOOO914.09 ITGA6 ENSGOOOOO O3994 ENSGOOOOO 15825 PRKD3 ENSGOOOOO 40263 ENSGOOOOO 63795 ZNF513 ENSGOOOOOO21776 ENSGOOOOO 24383 MPHOSPH10 ENSGOOOOO 40464 PML ENSGOOOOO 19862 LGALSL ENSGOOOOO 28.96S CHAC1 ENSGOOOOOO68654 POLR1A ENSGOOOOO 3.1873 CHSY1 ENSGOOOOOO68654 POLR1A ENSGOOOOO 69371 SNUPN ENSGOOOOO 15942 ORC2 ENSGOOOOO 662OO COPS2 ENSGOOOOO 70340 B3GNT2 ENSGOOOOO 82768 NGRN ENSGOOOOO 70340 B3GNT2 ENSGOOOOOO33800 PIAS1 ENSGOOOOO 70340 B3GNT2 ENSGOOOOO225151 AC103965.1.1 ENSGOOOOO 152O7 GTF3C2 ENSGOOOOO 97.299 BLM ENSGOOOOO 44233 AMMECR1L, ENSGOOOOO 69.018 ENSGOOOOO 63812 ZDHHC3 ENSGOOOOO 69.018 ENSGOOOOO 44746 ARL6IPS ENSGOOOOO 69.018 ENSGOOOOO 78.252 WDR6 ENSGOOOOO 69.018 ENSGOOOOO 78.252 WDR6 ENSGOOOOO 671.96 US 2016/0264934 A1 Sep. 15, 2016 44

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) change change Gene Gene Symbol (Yes/No) (Yes/No) Gene >2) >1.5) ENSGOOOOO170266 GLB1 ENSGOOOOO104142 ENSGOOOOO114019 AMOTL2 ENSGOOOOO183060 ENSGOOOOO114019 AMOTL2 ENSGOOOOO104O67 ENSGOOOOO168137 SETD5 ENSGOOOOO104O67 ENSGOOOOO175928 LRRN1 ENSGOOOOOO34053 ENSGOOOOO181SSS SETD2 ENSGOOOOO159322 ENSGOOOOO181SSS SETD2 ENSGOOOOO174498 ENSGOOOOO1739SO XXYLT1 ENSGOOOOO169926 ENSGOOOOOO10322 NISCH ENSGOOOOO140259 ENSGOOOOO154767 XPC ENSGOOOOO1826.36 D N ENSGOOOOO175093 SPSB4 ENSGOOOOO140474 ENSGOOOOO114631 PODXL2 ENSGOOOOO157483 ENSGOOOOO172046 USP19 ENSGOOOOO166233 ENSGOOOOO164091 WDR82 ENSGOOOOO18O357 Z. F60 9 ENSGOOOOO134086 VHL ENSGOOOOO170776 ENSGOOOOO164O45 CDC25A ENSGOOOOO140320 sA. H D ENSGOOOOO163684 RPP14 ENSGOOOOOO90238 ENSGOOOOO163681 SLMAP ENSGOOOOO1684.11 ENSGOOOOO174579 MSL2 ENSGOOOOO1684.11 W D3 ENSGOOOOO2O6557 TRIM71 ENSGOOOOOO66654 ENSGOOOOO114867 EIF4G1 ENSGOOOOO182831 i ENSGOOOOO213672 NCKIPSD ENSGOOOOO140854 ENSGOOOOO170876 TMEM43 ENSGOOOOO149930 ENSGOOOOO11412O SLC25A36 ENSGOOOOO197562 ENSGOOOOO154783 FGD5 ENSGOOOOO180O3S ENSGOOOOO176095 P6K1 ENSGOOOOO 103356 ENSGOOOOO187091 PLCD1 ENSGOOOOO 103356 ENSGOOOOO170837 GPR27 ENSGOOOOOO99381 ENSGOOOOO170837 GPR27 ENSGOOOOO 103549 RNF40 ENSGOOOOO1636O2 RYBP ENSGOOOOO141084 RANBP10 ENSGOOOOO163608 C3orf17 ENSGOOOOO198736 SEPX1 ENSGOOOOO163832 C3orf75 ENSGOOOOO131149 KIAAO182 ENSGOOOOOO73849 ST6GAL1 ENSGOOOOO131149 KIAAO182 ENSGOOOOOO73849 ST6GAL1 ENSGOOOOO162O73 PAQR4 ENSGOOOOOO73849 ST6GAL1 ENSGOOOOO162O73 PAQR4 ENSGOOOOOO73849 ST6GAL1 ENSGOOOOO162O73 PAQR4 ENSGOOOOO134077 THUMPD3 ENSGOOOOO102921 N4BP1 ENSGOOOOO14SO41 VPRBP ENSGOOOOOOSO820 BCAR1 ENSGOOOOO163660 CCNL1 ENSGOOOOOOSO820 BCAR1 ENSGOOOOO144749 LRIG1 ENSGOOOOO131165 CHMP1A ENSGOOOOO144730 L17RD ENSGOOOOO1188.98 C PL ENSGOOOOOO82781 TGB5 ENSGOOOOO1188.98 C PL ENSGOOOOO225733 FGDS-AS1 ENSGOOOOOO77238 L4R ENSGOOOOO225733 FGDS-AS1 ENSGOOOOO 103335 PIEZO1 ENSGOOOOO144711 QSEC1 ENSGOOOOOO83093 PALB2 ENSGOOOOO198585 NUDT16 ENSGOOOOO179889 DXDC1 ENSGOOOOO175455 CCDC14 ENSGOOOOO157350 T3GAL2 ENSGOOOOO1321 SS RAF1 ENSGOOOOO159579 SPRY1 ENSGOOOOOOS1382 PIK3CB ENSGOOOOO189091 F3B3 ENSGOOOOO174738 NR1D2 ENSGOOOOO166454 TMIN ENSGOOOOO174953 DHX36 ENSGOOOOO157106 iMG1 ENSGOOOOOOO4534 RBM6 ENSGOOOOO 103257 LC7AS ENSGOOOOO15589.3 ACPL2 ENSGOOOOO 103257 LC7AS ENSGOOOOO1734O2 DAG1 ENSGOOOOO153815 sMIP ENSGOOOOOO73792 GF2BP2 ENSGOOOOO1407SO RHGAP17 ENSGOOOOOO8O819 CPOX ENSGOOOOO132603 IP7 ENSGOOOOO151276 MAGI1 ENSGOOOOOOO7392 UC7L ENSGOOOOOO16864 GLT8D1 ENSGOOOOO1688O2 HTF8 ENSGOOOOO136603 SKIL ENSGOOOOO1982.11 UBB3 ENSGOOOOO163872 YEATS2 ENSGOOOOO167978 RRM2 ENSGOOOOO162290 DCP1A ENSGOOOOO104731 LHDC4 ENSGOOOOO161217 ENSGOOOOO168286 HAP11 ENSGOOOOO169744 ENSGOOOOO168286 HAP11 ENSGOOOOO138759 ENSGOOOOO167191 PRCSB US 2016/0264934 A1 Sep. 15, 2016 45

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) change change Gene Gene Symbol (Yes/No) (Yes/No) Gene >2) >1.5) ENSGOOOOO1095O1 WFS1 ENSGOOOOO 103326 ENSGOOOOO1095O1 WFS1 ENSGOOOOO140632 ENSGOOOOO145220 LYAR ENSGOOOOO176387 ENSGOOOOO10926S KIAA1211 ENSGOOOOO167526 ENSGOOOOO10926S KIAA1211 ENSGOOOOO 103429 ENSGOOOOO128052 KDR ENSGOOOOO 103423 ENSGOOOOO128052 KDR ENSGOOOOO1406SO ENSGOOOOO128052 KDR ENSGOOOOO 103449 ENSGOOOOO128052 KDR ENSGOOOOO169217 CD2BP2 ENSGOOOOO152990 GPR12S ENSGOOOOO169217 CD2BP2 ENSGOOOOO152990 GPR12S ENSGOOOOO166847 DCTNS ENSGOOOOO168936 TMEM129 ENSGOOOOO122386 ZNF205 ENSGOOOOO118579 MED28 ENSGOOOOOO90905 TNRC6A ENSGOOOOOO83857 ENSGOOOOO102977 ACD ENSGOOOOOO83857 ENSGOOOOO102974 CTCF ENSGOOOOOO83857 ENSGOOOOO1821.49 IST1 ENSGOOOOO121892 ENSGOOOOO168488 ATXN2L ENSGOOOOO185619 ENSGOOOOO122257 RBBP6 ENSGOOOOO168556 ENSGOOOOO162O62 C16orf59 ENSGOOOOOO77684 ENSGOOOOO1O3SSO C16orf&8 ENSGOOOOOO77684 ENSGOOOOOO80603 SRCAP ENSGOOOOO186222 ENSGOOOOO153406 NMRAL.1 ENSGOOOOO152208 ENSGOOOOO1846O2 SNN ENSGOOOOO109814 ENSGOOOOO1846O2 SNN ENSGOOOOO163629 PTPN13 ENSGOOOOO187555 USP7 ENSGOOOOO168924 ETM1 ENSGOOOOOO90857 PDPR ENSGOOOOO163694 RBM47 ENSGOOOOOOO6327 NFRSF12A ENSGOOOOO164040 GRMC2 ENSGOOOOO 103160 SDL1 ENSGOOOOO198589 BA ENSGOOOOOO62O38 DH3 ENSGOOOOO1574.04 KIT ENSGOOOOO179918 :EPHS2 ENSGOOOOO218336 ODZ3 ENSGOOOOO179918 EPHS2 ENSGOOOOO1841.60 ADRA2C ENSGOOOOO12992S MEM8A ENSGOOOOO118762 PKD2 ENSGOOOOO1411.01 B1 ENSGOOOOO132466 ANKRD17 ENSGOOOOOO87258 NAO1 ENSGOOOOOO3S928 RFC1 ENSGOOOOO168872 DX19A ENSGOOOOO1324OS TBC1D14 ENSGOOOOO168872 DX19A ENSGOOOOO179059 ZFP42 ENSGOOOOOO99364 BXL19 ENSGOOOOO179010 MRFAP1 ENSGOOOOO125166 OT2 ENSGOOOOO138771 SHROOM3 ENSGOOOOO197912 PG7 ENSGOOOOO161021 MAML.1 ENSGOOOOO157349 DX19B ENSGOOOOO161021 MAML.1 ENSGOOOOOO95906 NUBP2 ENSGOOOOO174136 RGMB ENSGOOOOO167513 CDT1 ENSGOOOOO113141 IK ENSGOOOOO167513 CDT1 ENSGOOOOO197226 TBC1D9B ENSGOOOOOO90S6S RAB11FIP3 ENSGOOOOO1131 61 HMGCR ENSGOOOOO167693 N ENSGOOOOO12O705 ETF1 ENSGOOOOO167693 N ENSGOOOOO113504 SLC12A7 ENSGOOOOO186566 PATCH8 ENSGOOOOOO48140 TSPAN17 ENSGOOOOO171298 AA ENSGOOOOO1456O4 SKP2 ENSGOOOOO1794.09 EMIN4 ENSGOOOOO153922 CHD1 ENSGOOOOO1794.09 EMIN4 ENSGOOOOO164574 GALNT10 ENSGOOOOO167861 7orf28 ENSGOOOOO113645 WWC1 ENSGOOOOO1671 OS MEM92 ENSGOOOOO176788 BASP1 ENSGOOOOO109062 SLC9A3R1 ENSGOOOOO1222O3 KIAA1191 ENSGOOOOO141736 ERBB2 ENSGOOOOO153395 LPCAT1 ENSGOOOOO121057 AKAP1 ENSGOOOOO153395 LPCAT1 ENSGOOOOO121058 COIL ENSGOOOOO188725 CSOrfa ENSGOOOOO159842 ABR ENSGOOOOOO37474 NSUN2 ENSGOOOOOO2972S RABEP1 ENSGOOOOOO37474 NSUN2 ENSGOOOOO170O37 CNTROB ENSGOOOOO169223 LMAN2 ENSGOOOOO170004 CHD3 ENSGOOOOO14SSSS MYO10 ENSGOOOOO188554 NBR1 ENSGOOOOO131504 DIAPH1 ENSGOOOOO17306S C17orf53 ENSGOOOOO1641.51 KIAAO947 ENSGOOOOO132142 ACACA ENSGOOOOO15SS08 CNT8 ENSGOOOOO136448 NMT1 US 2016/0264934 A1 Sep. 15, 2016 46

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change -2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO2SO337 RP11 Y Y ENSGOOOOO136448 NMT1 N 46C2O.1.1 ENSGOOOOO135083 CCNTL ENSGOOOOO136448 NMT1 ENSGOOOOO164190 NIPBL ENSGOOOOO136444 RSAD1 ENSGOOOOO145882 PCYOX1L ENSGOOOOO213977 TAX1BP3 ENSGOOOOOO82516 GEMINS ENSGOOOOO177370 TIMM22 ENSGOOOOOO67248 DHX29 ENSGOOOOO108.256 NUFIP2 ENSGOOOOO19878O FAM169A ENSGOOOOO108.256 NUFIP2 ENSGOOOOO150712 MTMR12 ENSGOOOOO10827O AATF ENSGOOOOO178913 TAF7 ENSGOOOOO16OSS1 TAOK1 ENSGOOOOO165671 NSD ENSGOOOOO132475 H3F3B ENSGOOOOO165671 NSD ENSGOOOOO161542 PRPSAP1 ENSGOOOOO165671 NSD ENSGOOOOO108840 HDACS ENSGOOOOO165671 NSD ENSGOOOOO108848 LUCTL3 ENSGOOOOO165671 NSD ENSGOOOOO1861.85 KIF18B ENSGOOOOO165671 NSD ENSGOOOOOO72310 SREBF1 ENSGOOOOOO38382 TRIO ENSGOOOOO1974.17 SHPK ENSGOOOOO168246 UBTD2 ENSGOOOOO1974.17 SHPK ENSGOOOOOO70814 TCOF1 ENSGOOOOO175832 ETV4 ENSGOOOOO152684 PELO ENSGOOOOO108312 UBTF ENSGOOOOOO92421 SEMA6A ENSGOOOOO185359 HGS ENSGOOOOOO92421 SEMA6A ENSGOOOOO174282 ZBTB4 ENSGOOOOO112984 KIF2OA ENSGOOOOO141456 ACO91153.1 ENSGOOOOO113583 CSOrf15 ENSGOOOOO141456 ACO91153.1 ENSGOOOOO171604 CXXC5 ENSGOOOOOO72134 EPN2 ENSGOOOOO113657 DPYSL3 ENSGOOOOO133026 MYH10 ENSGOOOOO174705 SH3PXD2B ENSGOOOOO133026 MYH10 ENSGOOOOO164294 GPX8 ENSGOOOOO133026 MYH10 ENSGOOOOO113194 EAF2 ENSGOOOOO133026 MYH10 ENSGOOOOO113739 STC2 ENSGOOOOO1084.24 KPNB1 ENSGOOOOOO7O614 NDST1 ENSGOOOOO18O340 FZD2 ENSGOOOOO171720 HDAC3 ENSGOOOOO178307 TMEM11 ENSGOOOOOO72364 AFF4 ENSGOOOOO1989.09 MAP3K3 ENSGOOOOOO72364 AFF4 ENSGOOOOO12S686 MED1 ENSGOOOOO113758 D BN1 ENSGOOOOO12S686 MED1 ENSGOOOOO145919 B OD1 ENSGOOOOO185298 CCDC137 ENSGOOOOO145911 N4BP3 ENSGOOOOO167193 CRK ENSGOOOOO2S1273 RP11 ENSGOOOOOO67596 DHX8 549K2O.1.1 ENSGOOOOO187678 SPRY4. ENSGOOOOO182473 EXOC7 ENSGOOOOO187678 SPRY4. ENSGOOOOO167699 GLOD4 ENSGOOOOO131711 MAP1B ENSGOOOOO1091.18 PHF12 ENSGOOOOO164615 CAMLG ENSGOOOOO109111 SUPT6H ENSGOOOOO113048 MRPS27 ENSGOOOOO185722 ANKFY1 ENSGOOOOOO38427 WCAN ENSGOOOOO131748 STARD3 ENSGOOOOOO38427 WCAN ENSGOOOOO183O48 MRPL12 ENSGOOOOOO38427 WCAN ENSGOOOOOO91542 ALKBHS ENSGOOOOOO38427 WCAN ENSGOOOOO173821 RNF213 ENSGOOOOO164244 PRRC1 ENSGOOOOO173821 RNF213 ENSGOOOOO119900 OGFRL1 ENSGOOOOO141580 WDR4SL ENSGOOOOO119900 OGFRL1 ENSGOOOOO141720 PIP4K2B ENSGOOOOO247909 ENSGOOOOO141720 PIP4K2B ENSGOOOOO153O46 CDYL ENSGOOOOO133028 SCO1 ENSGOOOOO112739 PRPF4B ENSGOOOOOO4O633 PHF23 ENSGOOOOO213079 SCAF8 ENSGOOOOOO91640 SPAG7 ENSGOOOOO137166 FOXP4 ENSGOOOOOOO6744 ELAC2 ENSGOOOOO180992 MRPL14 ENSGOOOOOOO6744 ELAC2 ENSGOOOOO189241 TSPYL1 ENSGOOOOO187531 SIRT7 ENSGOOOOOO44O90 CULT ENSGOOOOO171634 BPTF ENSGOOOOO151914 DST ENSGOOOOO1793.14 WSCD1 ENSGOOOOO112658 SRF ENSGOOOOOO34152 MAP2K3 ENSGOOOOO236,673 RP11 ENSGOOOOO121067 SPOP 698.2.1 US 2016/0264934 A1 Sep. 15, 2016 47

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) change change Gene Gene Symbol (Yes/No) (Yes/No) Gene >2) >1.5) ENSGOOOOO 24782 RREB1 ENSGOOOOO141564 ENSGOOOOO 24688 MAD2L1BP ENSGOOOOO141569 ENSGOOOOO 81472 ZBTB2 ENSGOOOOO141568 ENSGOOOOO 88.112 C6orf132 ENSGOOOOOO82641 ENSGOOOOO 11817 DSE ENSGOOOOOO82641 ENSGOOOOO 11817 DSE ENSGOOOOO121083 ENSGOOOOO 96586 MYO6 ENSGOOOOO108528 SLC25A11 ENSGOOOOO 97081 IGF2R ENSGOOOOO141504 SAT2 ENSGOOOOO 18482 PHF3 ENSGOOOOO172057 ORMDL3 ENSGOOOOO 18482 PHF3 ENSGOOOOOOO2919 SNX11 ENSGOOOOOO85511 MAP3K4 ENSGOOOOO108262 GIT1 ENSGOOOOO 12033 PPARD ENSGOOOOOO871S2 ATXN7L3 ENSGOOOOO 12033 PPARD ENSGOOOOOO871S2 ATXN7L3 ENSGOOOOO 52661 GA1 ENSGOOOOO188522 FAM83G ENSGOOOOO 52661 GA1 ENSGOOOOO167258 CDK12 ENSGOOOOO 52661 GA1 ENSGOOOOO186834 HEXIM1 ENSGOOOOO 884.28 MUTED ENSGOOOOOO68489 PRR11 ENSGOOOOO 46426 TLAM2 ENSGOOOOOOO72O2 KIAAO1OO ENSGOOOOOO49618 ARID1B ENSGOOOOO177469 PTRF ENSGOOOOO 46O72 TNFRSF21 ENSGOOOOO177469 PTRF ENSGOOOOO 56639 ZFAND3 ENSGOOOOO141295 SCRN2 ENSGOOOOO 3O396 MLLT4 ENSGOOOOO125445 MRPS7 ENSGOOOOO 3O396 MLLT4 ENSGOOOOO141378 PTRH2 ENSGOOOOO 644.42 CITED2 ENSGOOOOO173894 CBX2 ENSGOOOOOO85377 PREP ENSGOOOOO173894 CBX2 ENSGOOOOO 96821 C6orf106 ENSGOOOOO1088.19 PPP1R9B ENSGOOOOO 96821 C6orf106 ENSGOOOOO176658 MYO1D ENSGOOOOOOO8O83 JARID2 ENSGOOOOO 41219 C17orf&O ENSGOOOOO111961 SASH1 ENSGOOOOOOO4142 POLDIP2 ENSGOOOOOO96070 BRPF3 ENSGOOOOO 33O3O MPRIP ENSGOOOOOO966.96 DSP ENSGOOOOO 20063 ENSGOOOOO135316 SYNCRIP ENSGOOOOO 69727 ENSGOOOOOOS7663 ATGS ENSGOOOOOO60069 ENSGOOOOO146457 WTAP ENSGOOOOO S4845 ENSGOOOOO146457 WTAP ENSGOOOOO 70677 ENSGOOOOO146457 WTAP ENSGOOOOO 70677 ENSGOOOOO112029 FBXO5 ENSGOOOOOO81913 ENSGOOOOO112249 ASCC3 ENSGOOOOO2S6463 ENSGOOOOO182952 HMGN4 ENSGOOOOO 76014 ENSGOOOOO106443 PHF14 ENSGOOOOO 8461 ENSGOOOOO136231 IGF2BP3 ENSGOOOOO 644 ENSGOOOOO106636 YKT6 ENSGOOOOO 424 ENSGOOOOOO65883 CDK13 ENSGOOOOO 544 ENSGOOOOO106263 EIF3B ENSGOOOOO 703 ENSGOOOOO166526 ZNF3 ENSGOOOOO 4193 ENSGOOOOO16453S DAGLB ENSGOOOOO2 O849 ENSGOOOOOOO6453 BAIAP2L1.1 ENSGOOOOO 407 ENSGOOOOO160963 EMID2 ENSGOOOOO 407 ENSGOOOOO160963 EMID2 ENSGOOOOO 407 TTI1 ENSGOOOOO243335 KCTD7 ENSGOOOOO 447 FAM83D ENSGOOOOO158321 AUTS2 ENSGOOOOO 552 BCL2L1 ENSGOOOOO158321 AUTS2 ENSGOOOOO 940 ZNF217 ENSGOOOOO158321 AUTS2 ENSGOOOOO 940 ZNF217 ENSGOOOOO1291.03 SUMF2 ENSGOOOOO 337 TM9SF4 ENSGOOOOO185274 WBSCR17 ENSGOOOOO O1337 TM9SF4 ENSGOOOOO185274 WBSCR17 ENSGOOOOO 26003 PLAGL2 ENSGOOOOO1881.91 PRKAR1B ENSGOOOOO 328.23 C20orf111 ENSGOOOOO154978 WOPP1 ENSGOOOOO 49658 YTHDF1 ENSGOOOOO154978 WOPP1 ENSGOOOOO 97122 SRC ENSGOOOOO154978 WOPP1 ENSGOOOOOOS3438 NNAT ENSGOOOOO154978 WOPP1 ENSGOOOOO O1189 C20orf2O ENSGOOOOOO75624 ACTB ENSGOOOOO 32640 BTBD3 ENSGOOOOOOO2822 MAD1L1 ENSGOOOOO 32640 BTBD3 ENSGOOOOO146776 ATXN7L1 ENSGOOOOO 25844 RRBP1 ENSGOOOOO106624 AEBP1 ENSGOOOOO O1040 ZMYND8 US 2016/0264934 A1 Sep. 15, 2016 48

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) change change Gene Gene Symbol (Yes/No) (Yes/No) Gene >2) >1.5) ENSGOOOOO128567 PODXL ENSGOOOOO124222 ENSGOOOOO128567 PODXL ENSGOOOOOO8832S ENSGOOOOO106459 NRF1 ENSGOOOOO177732 ENSGOOOOOO7S213 EMA3A ENSGOOOOO196227 ENSGOOOOO198742 MURF1 ENSGOOOOO101158 ENSGOOOOO1286O2 MO ENSGOOOOO1011SO ENSGOOOOO10666S LIP2 ENSGOOOOO1011SO ENSGOOOOO10666S LIP2 ENSGOOOOO158470 ENSGOOOOO10666S LIP2 ENSGOOOOO124181 ENSGOOOOO158457 SPAN33 ENSGOOOOO1328.19 ENSGOOOOO16488O INTS1 ENSGOOOOO124164 ENSGOOOOO1468.30 GYF1 ENSGOOOOO2444.62 ENSGOOOOO1468.30 GYF1 ENSGOOOOO2444.62 ENSGOOOOO146834 EPCE ENSGOOOOO2444.62 ENSGOOOOO157224 LDN12 ENSGOOOOOO2S293 ENSGOOOOOO91732 C3HC1 ENSGOOOOO101115 4 ENSGOOOOO18O233 NRF2 ENSGOOOOO1241.45 sc f ENSGOOOOO165215 LDN3 ENSGOOOOOO92758 L9A3 ENSGOOOOO164889 LC4A2 ENSGOOOOOO92758 L9A3 ENSGOOOOO146535 s NA 12 ENSGOOOOO118707 : F2 ENSGOOOOO242265 O ENSGOOOOO149600 COMMD7 ENSGOOOOO242265 O ENSGOOOOO1O1246 ARFRP1 ENSGOOOOO174469 NAP2 ENSGOOOOO1O1412 E2F1 ENSGOOOOO128595 U ENSGOOOOO1011.93 C20orf11 ENSGOOOOO14715S ENSGOOOOO1967OO ZNFS12B ENSGOOOOO186462 ENSGOOOOO101019 UQCC ENSGOOOOO147OSO ENSGOOOOOO8919S TRMT6 ENSGOOOOO147OSO ENSGOOOOO165246 NLGN4Y ENSGOOOOO169084 ENSGOOOOO11.4374 USP9Y ENSGOOOOO188021 ENSGOOOOO 105127 AKAP8 ENSGOOOOO2O39SO ENSGOOOOO142449 FBN3 ENSGOOOOO123S 62 MS ENSGOOOOOOOSOO7 UPF1 ENSGOOOOO102081 ENSGOOOOO160888 IER2 ENSGOOOOO147274 S. X ENSGOOOOO142252 GEMINT ENSGOOOOO172S34 ENSGOOOOO167470 MIDN ENSGOOOOO172S34 ENSGOOOOO108.107 RPL28 ENSGOOOOOO67.445 RO ENSGOOOOO119559 C19Crf25 ENSGOOOOOO67.445 ENSGOOOOO 105429 MEGF8 ENSGOOOOO196368 ENSGOOOOO 105186 ANKRD27 ENSGOOOOO182195 ENSGOOOOO 1054O1 CDC37 ENSGOOOOO184481 ENSGOOOOO117877 CD3EAP ENSGOOOOO12S352 ENSGOOOOO187867 PALM3 ENSGOOOOO196998 WDR45 ENSGOOOOO213753 ACO16629.2.1 ENSGOOOOO197021 CXorf24OB ENSGOOOOO1676OO CYP2S1 ENSGOOOOO147162 OGT ENSGOOOOO1676OO CYP2S1 ENSGOOOOO1876O1 MAGEH1 ENSGOOOOOO11243 AKAP8L, ENSGOOOOO131263 RLIM ENSGOOOOOO72071 LPHN1 ENSGOOOOO126O12 KDMSC ENSGOOOOO1275.27 EPS15L1 ENSGOOOOOO71859 FAMSOA ENSGOOOOO130382 MLLT1 ENSGOOOOO169093 ASMTL ENSGOOOOOO646O7 SUGP2 ENSGOOOOO182378 PLCXD1 ENSGOOOOOO646O7 SUGP2 ENSGOOOOO101849 TBL1X ENSGOOOOO10488O ARHGEF18 ENSGOOOOOO71889 FAM3A ENSGOOOOO104.885 DOT1L, ENSGOOOOO214717 ZBED1 ENSGOOOOO 105270 CLIP3 ENSGOOOOO146938 NLGN4X ENSGOOOOO153879 CEBPG ENSGOOOOO124486 USP9X ENSGOOOOO133275 CSNK1 G2 ENSGOOOOO186871 ERCC6L ENSGOOOOO133275 CSNK1 G2 ENSGOOOOO183943 PRKX ENSGOOOOO 105732 ZNF574 ENSGOOOOO169188 APEX2 ENSGOOOOOO7S702 WDR62 ENSGOOOOO134590 FAM127A ENSGOOOOO2S4858 MPV17L2 ENSGOOOOO180964 TCEAL8 ENSGOOOOO181896 ZNF101 ENSGOOOOOO112O1 KAL1 ENSGOOOOO184635 ZNF93 ENSGOOOOOOS 6998 GYG2 ENSGOOOOO 105085 MED26 ENSGOOOOO155959 VBP1 ENSGOOOOO129951 LPPR3.1 US 2016/0264934 A1 Sep. 15, 2016 49

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change -2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO173273 TNKS ENSGOOOOO141867 BRD4. ENSGOOOOO158669 AGPAT6 ENSGOOOOO129932 DOHH ENSGOOOOO168575 LC20A2 ENSGOOOOO 105323 HNRNPUL1 ENSGOOOOO1838O8 BM12B ENSGOOOOO 105323 HNRNPUL1 ENSGOOOOO179041 s RS1 ENSGOOOOO 10532S FZR1 ENSGOOOOO153317 SAP1 ENSGOOOOOO71S64 TCF3 ENSGOOOOO171316 HD7 ENSGOOOOO127663 KDM4B ENSGOOOOO171316 HD7 ENSGOOOOOOO7047 MARK4 ENSGOOOOO136986 ERL1 ENSGOOOOO141994 DUS3L ENSGOOOOO185728 THDF3 ENSGOOOOO131116 ZNF428 ENSGOOOOO185728 THDF3 ENSGOOOOO213024 NUP62 ENSGOOOOO2O5268 DE7A ENSGOOOOO213024 NUP62 ENSGOOOOO173281 PP1R3B ENSGOOOOO 105281 SLC1AS ENSGOOOOO170619 COMMDS ENSGOOOOO 1051.31 EPHX3 ENSGOOOOO104331 MPAD1 ENSGOOOOO246181 ENSGOOOOO104312 RIPK2 ENSGOOOOO12SSOS MBOAT7 ENSGOOOOO182319 PRAGMIN.1 ENSGOOOOO167658 EEF2 ENSGOOOOO178764 ZHX2 ENSGOOOOO 105173 CCNE1 ENSGOOOOO133874 RNF122 ENSGOOOOO115255 REEP6 ENSGOOOOO147596 PRDM14 ENSGOOOOO167460 TPM4 ENSGOOOOO160957 RECQL4 ENSGOOOOO130312 MRPL34 ENSGOOOOO180900 SCRIB ENSGOOOOO167674 ACO11498.1 ENSGOOOOO180900 SCRIB ENSGOOOOO130311 DDA1 ENSGOOOOO1571 10 RBPMS ENSGOOOOO160570 DEDD2 ENSGOOOOOO12232 EXTL3 ENSGOOOOO 105197 TIMMSO ENSGOOOOO180921 FAM83H ENSGOOOOO187266 EPOR ENSGOOOOO182372 LN8 ENSGOOOOO182087 C190rf6 ENSGOOOOO147457 HMP7 ENSGOOOOO130669 PAK4 ENSGOOOOO147454 LC25A37 ENSGOOOOO125755 SYMPK ENSGOOOOO183309 NF623 ENSGOOOOO16763S ZNF146 ENSGOOOOO12O885 LU ENSGOOOOO125912 NCLN ENSGOOOOO136997 MYC ENSGOOOOOO31823 RANBP3 ENSGOOOOO181090 EHMT1 ENSGOOOOO227SOO SCAMP4 ENSGOOOOO130S60 UBAC1 ENSGOOOOO198683 ACO12615.1 ENSGOOOOO165661 QSOX2 ENSGOOOOO 105245 NUMBL ENSGOOOOO1598.84 CCDC107 ENSGOOOOO 105245 NUMBL ENSGOOOOO148143 ZNF462 ENSGOOOOO198093 ZNF649 ENSGOOOOO107130 NCS1 ENSGOOOOO198093 ZNF649 ENSGOOOOO137124 ALDH1B1 ENSGOOOOOO79999 KEAP1 ENSGOOOOO147869 CER1 ENSGOOOOO17911S FARSA ENSGOOOOO238,227 C9Crf69 ENSGOOOOO12S651 F2F1 ENSGOOOOO238,227 C9Crf69 ENSGOOOOO12S651 F2F1 ENSGOOOOOO78725 BC1 ENSGOOOOO16OOO7 RHGAP35 ENSGOOOOO12719.1 RAF2 ENSGOOOOO142549 LONS ENSGOOOOO107341 BE2R2 ENSGOOOOOO85872 ENSGOOOOO107341 BE2R2 ENSGOOOOO129347 ENSGOOOOO169925 RD3 ENSGOOOOO129347 ENSGOOOOO1483OO EXO4 ENSGOOOOO134815 ENSGOOOOO233137 P1 ENSGOOOOOO74181 H 3 2OI1.1.1 TCI ENSGOOOOO11933S s ENSGOOOOO131941 H N2 ENSGOOOOO155827 RNF2O ENSGOOOOO218891 5. ENSGOOOOO137OSS PLAA ENSGOOOOOO6SOOO ENSGOOOOO196730 DAPK1 ENSGOOOOOO6SOOO ENSGOOOOO130723 PRRC2B ENSGOOOOO132O24 C 2 D1 A. ENSGOOOOO14.8296 SURF6 ENSGOOOOO13O881 ENSGOOOOO148297 MED22 ENSGOOOOOO99942 ENSGOOOOO221829 EANCG ENSGOOOOOO99942 KL ENSGOOOOO137038 C9Crf123 ENSGOOOOOO99942 ENSGOOOOO1369.08 DPM2 ENSGOOOOO183864 ENSGOOOOO197579 TOPORS ENSGOOOOO100116 C AT ENSGOOOOO197579 TOPORS ENSGOOOOOO40608 ENSGOOOOOO97OO7 ABL ENSGOOOOO183579 ENSGOOOOOO97OO7 ABL ENSGOOOOO182541 US 2016/0264934 A1 Sep. 15, 2016 50

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change -2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO168795 ZBTBS ENSGOOOOO182541 LIMK2 ENSGOOOOOO44574 HSPAS ENSGOOOOO185651 UBE2IL3 ENSGOOOOO197724 PHF2 ENSGOOOOO185651 UBE2IL3 ENSGOOOOO107362 FAM108B1 ENSGOOOOO10O379 KCTD17 ENSGOOOOO136943 CTSL2 ENSGOOOOO10O393 EP300 ENSGOOOOO107104 KANK1 ENSGOOOOO100401 NGAP1 ENSGOOOOO167106 FAM102A ENSGOOOOO100403 C3H7B ENSGOOOOOO99810 MTAP ENSGOOOOO100403 C3H7B ENSGOOOOO176248 ANAPC2 ENSGOOOOO170638 2 RABD ENSGOOOOO147874 HAUS6 ENSGOOOOO196588 KL1 ENSGOOOOO198722 UNC13B ENSGOOOOO1 OO139 ICALL1 ENSGOOOOO1483.58 GPR107 ENSGOOOOO138867 22orf13 ENSGOOOOO107290 SETX ENSGOOOOO1OOOS8 RYBB2P1 ENSGOOOOO13883S RGS3 ENSGOOOOO1OOO14 PECC1L, ENSGOOOOO167110 GOLGA2 ENSGOOOOO185721 RG1 ENSGOOOOO1986.42 KLHL9 ENSGOOOOO10O226 PBP1 ENSGOOOOO187713 TMEM2O3 ENSGOOOOOO999S4 CR2 ENSGOOOOO1861.93 C9orf140 ENSGOOOOOO999S4 CR2 ENSGOOOOO155876 RRAGA ENSGOOOOOO999S4 CR2 ENSGOOOOO12S484 GTF3C4 ENSGOOOOOO99991 ABIN1 ENSGOOOOO12S484 GTF3C4 ENSGOOOOO128294 ENSGOOOOOO66697 C9Crf30 ENSGOOOOO10O32S ENSGOOOOO157657 ZNF618 ENSGOOOOO159873 ENSGOOOOO241978 AKAP2 ENSGOOOOO10O345 ENSGOOOOO241978 AKAP2 ENSGOOOOO10O345 ENSGOOOOO241978 AKAP2 ENSGOOOOO10O345 ENSGOOOOO165138 ANKS6 ENSGOOOOO133424 ENSGOOOOO148248 SURF4 ENSGOOOOO133424 ENSGOOOOO188986 COBRA1 ENSGOOOOOO93OOO ENSGOOOOO198917 C9orf114 ENSGOOOOOO93OOO ENSGOOOOO13OSS8 OLFM1 ENSGOOOOOO93OOO ENSGOOOOO13OSS8 OLFM1 ENSGOOOOO10O297 ENSGOOOOO130559 CAMSAP1 ENSGOOOOO1 OO1 OS ENSGOOOOO148468 FAM171A1 ENSGOOOOO1 OO1 OS ENSGOOOOO107719 KIAA1274 ENSGOOOOO1 OO1 OS ENSGOOOOO156374 PCGF6 ENSGOOOOO12824.5 ENSGOOOOO107816 LZTS2 ENSGOOOOO2S3352 ENSGOOOOO107815 C10orf2 ENSGOOOOOO99904 ENSGOOOOOO95637 SORBS1 ENSGOOOOOO999.68 BCL2L13 ENSGOOOOO1486OO CDHR1 ENSGOOOOOO999.68 BCL2L13 ENSGOOOOO156521 TYSND1 ENSGOOOOOO999.68 BCL2L13 ENSGOOOOO151893 C10orf246 ENSGOOOOOO999.68 BCL2L13 ENSGOOOOO107651 SEC23IP ENSGOOOOO159140 SON ENSGOOOOOO65809 FAM107B ENSGOOOOO159140 SON ENSGOOOOOO992O4 ABLIM1 ENSGOOOOO159140 SON ENSGOOOOO14868O HTR7 ENSGOOOOO1591.28 FNGR2 ENSGOOOOO107949 BCCIP ENSGOOOOO184787 UBE2G2 ENSGOOOOO107949 BCCIP ENSGOOOOO233393 APOOO688.29.1 ENSGOOOOO148840 PPRC1 ENSGOOOOO1832SS PTTG1IP ENSGOOOOO1552S6 FYVE27 ENSGOOOOO160298 C21orf58 ENSGOOOOO138166 USPS ENSGOOOOO16O299 PCNT ENSGOOOOO168209 DIT4 ENSGOOOOO16O299 PCNT ENSGOOOOOO3S403 ENSGOOOOO182871 COL18A1 ENSGOOOOO151208 LGS ENSGOOOOO107872 FBXL15 ENSGOOOOO197444 GDHL ENSGOOOOOO95739 BAMBI ENSGOOOOO1989S4 AA1279 ENSGOOOOO176986 SEC24C ENSGOOOOOO95787 WAC ENSGOOOOOO77147 TM9SF3 ENSGOOOOO148429 USP6NL ENSGOOOOO107779 BMPR1A ENSGOOOOO148429 USP6NL ENSGOOOOO110514 MADD ENSGOOOOO17SO29 CTBP2 ENSGOOOOO166833 NAV2 ENSGOOOOO165886 UBTD1 ENSGOOOOOO14216 CAPN1 ENSGOOOOOOS2749 RRP12 ENSGOOOOO162337 LRP5 ENSGOOOOO1712O6 TRIM8 ENSGOOOOOO48649 RSF1 US 2016/0264934 A1 Sep. 15, 2016 51

TABLE 2-continued DPMI between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48) Table 2: List of genes (mRNA transcripts) with Differential Peak Intensities (DPMI) between undifferentiated (resting)human H1-ESC (TO) and 48 hours after Activin A induction towards endoderm (mesoendoderm) lineage (T48). Coordinates of méA peaks in human genome (mm9), type of transcript, and gene symbols are shown. Each row indicates if DPMI is over 1.5- or 2-fold. Related to FIG. 5).

DMPI DMPI DMPI DMPI (fold (fold (fold (fold change 2) change -1.5) Gene change change Gene Gene Symbol (Yes/No) (Yes/No) Gene Symbol >2) >1.5) ENSGOOOOO107957 SH3PXD2A N Y ENSGOOOOO2S6591 RP11- N Y 286N228.1 ENSGOOOOO134463 ECHDC3 Y Y ENSGOOOOO171067 C11orf24 N Y ENSGOOOOO107937 GTPBP4 Y Y ENSGOOOOO171067 C11orf24 N Y ENSGOOOOO122378 FAM213A N Y ENSGOOOOO14926O CAPNS N Y ENSGOOOOO1821.80 MRPS16 N Y ENSGOOOOO17SS7S PAAF1 Y Y ENSGOOOOO148773 MKI67 N Y ENSGOOOOO132749 MTLS N Y ENSGOOOOOO626SO WAPAL Y Y ENSGOOOOO149503 INCENP N Y ENSGOOOOOO626SO WAPAL Y Y ENSGOOOOO149503 INCENP N Y ENSGOOOOOO626SO WAPAL Y Y ENSGOOOOO118058 MLL N Y ENSGOOOOO171307 ZDHHC16 Y Y ENSGOOOOO137710 RDX N Y

0239. In some embodiments, the assays, arrays and kits for transcripts, or to at least 10 3'UTR or other untranslated assessing mA levels in the RNA obtained from a population regions of at least 10 genes selected from any of those listed of stem cells, e.g., human stem cell can comprises measuring in Table 1 or Table 2, or any from Tables S1-S3 or S5, and (ii) the mA levels 10 or more mRNA transcripts selected from contacting the array with at least one reagent which binds to any of those listed in Tables S1, S2, S3, S4, S5 and S6, m6A in the RNA, such as an anti-mA antibody, or fragment disclosed in Batista et al., Cell Stem Cell, 2014, 15(6), 707 thereof, such as an anti-mA antibody which is fluorescently 719, entitled “méA RNA Modification Controls Cell Fate labeled or otherwise has a detectable label, therefore allowing Transition in Mammalian Embryonic StemCells’, (available the measurements of the levels of méA in the at least selected online at the world-wide web address: “//dx.doi.org/10.1016/ 10 mRNA transcripts, or to at least 103'UTR or other untrans j. stem.2014.09.019), which is incorporated herein in its lated regions of at least 10 genes selected from any of those entirety by reference. listed in Table 1 or Table 2 or any from Tables S1-S3 or S5. 0240 More specifically, Table S1 in Batista et al., dis 0242 A further aspect of the technology described herein closes all Mouse High-Confidence Peaks (and relates to FIG. relates to methods, compositions, assays, arrays and kits for 1 and FIG. 4 herein) and shows the coordinates of m6A peaks use in a method for determining the cell state of a stem cell in mouse genome (mm.9), position of the m6A peak in the population comprising performing the assay of claim 10, and transcript, type of transcript, and gene symbol are displayed. comparing the levels of méA (i.e., peak intensities) of at least For the Difference in Mettl3, the ratio between the IP and the 10 genes selected from any of Table 1 or Table 2, or any from Input is represented. Table S2 in Batista et al., discloses Tables S1-S3 or S5, in the RNA from the stem cell population nanostring Counts after méA-IP and is related to FIG. 1 with the levels of méA (i.e., peak intensities) in a reference disclosed herein. Gene symbols with counts for Input, méA stem cell population, and based on this comparison, deter IP, and IgG are shown. The ratios of the Input and Fold mining the cell state of the stem cell population. enrichment over the gene body of Actb are represented. Table 0243 Another aspect of the present invention relates to a S3 in Batista et al., discloses all Human High-Confidence kit comprising: (i) an array composition for characterizing the Peaks and is related to FIG. 5 herein. Coordinates of méA cell State of a population of stem cells, comprising at least 10 peaks in human genome (mm9), type of transcript, and gene oligonucleotides that hybridize to the RNA (i.e. mRNA tran symbols are shown. Table S4 in Batista et al., is reproduced as scripts, 3'UTR or other untranslated RNAs) of at least 10 Table 2 herein and shows DPMI between T0 (undifferenti genes selected from any of those in Table 1 or Table 2 or any ated) and T48 (endoderm differentiated) human stem cell from Tables S1-S3 or S5, as disclosed herein; and (ii) at least populations. Table 2 is related to FIG. 5 herein and shows one regent to detect the m6A in RNA, such as, for example, an coordinates of méA peaks in human genome (mm.9), type of anti-mA antibody, or fragment thereof, for example an anti transcript, and gene symbols are shown. Each row indicates if m6A antibody or fragment thereof which is detectably DPMI is over 1.5- or 2-fold. Table S5 in Batista et al., dis closes human and Mouse Methylated Gene Comparison, and labeled (e.g., with a florescent label, colorimetric marker is related to FIG. 6 herein and lists the Gene ID in human and etc.). mouse and type of homology are shown. Table S6 in Batista 0244 A. Methods of méA Analysis et al., is reproduced as Table 1 herein, and lists 632 gene 0245 B. Arrays transcripts that have common peaks between hESC and 0246 Methods of measure m6A are known by one of mESCs, and lists the Gene ID in human and mouse and ordinary skill in the art. For example, as disclosed herein, one chromosome coordinates of common peaks. can use anti-méA antibodies. Commercial móA RNA methy 0241. In some embodiments, the array comprises 10 or lation quantification kits are commercially available and more oligonucleotides that hybridize to at least 10 mRNA encompassed for use in the methods, kits and assays as dis US 2016/0264934 A1 Sep. 15, 2016 52 closed herein, e.g., such as those from AbCam (Cat No: GADD45A, PUM1 YWHAZ, UBC, TFRC, TBP, RPLPO, ab185912) or Epigentek (Cat No:P-9005-96). PPIA, POLR2A, PGK1, IPO8, HMBS, GUSB, B2M, HPRT1 0247 Accordingly, an array as disclosed herein encom or 18S passes an array of oligonucleotides which hybridize to the 0253) In some embodiments, the array comprises no more target RNA species (e.g., 10 or more genes selected from any than 100, or no more than 90, or no more than 50 nucleic acid listed in Table 1, Table 2, Table S1-S3 or Table S5), and sequences, e.g., oligonucleotides or primers. In some contacting the array with RNA obtained from the stem cell embodiments, the nucleic acid sequences present on the array population (e.g., human stem cell population) and allowing are sets of primers. In some embodiments, the nucleic acid the RNA to hybridize to the oligonucleotides, washing the sequences, e.g., oligonucleotides or primers are immobilized array to remove any unbound (non-hybridized) RNA, then on, or within a solid Support. Nucleic acid sequences can be adding an anti-méA antibody. After removal of the unbound immobilized on the solid support by the 5' end of said oligo anti-méA antibody, the bound anti-méA antibody can be nucleotides. In some embodiments, the Solid Support is detected by methods commonly known in the art, e.g., where selected from a group of materials comprising silicon, metal, the anti-méA antibody is fluorescently labeled, using flurse and glass. In some embodiments, the Solid Support comprises cent detection, or using a different colormetic method known oligonucleotides at assigned positions defined by X and y in the art. coordinates. 0248. In some embodiments, the oligonucleotides on the 0254 Accordingly, the present invention contemplates a array are at least 90% identical to, or specifically hybridize to method of generating an array, comprising providing a solid the RNA or mRNA of the genes selected from any listed in Support comprising a plurality of positions for oligonucle Table 1, Table 2, Table S1-S3 or Table S5). In some embodi otides, the positions defined by X and y coordinates; a plural ments, the array comprises oligonucleotides (e.g., probes or ity of different oligonucleotides (or primer pairs), each com primers) which specifically hybridize to the mRNA expressed prising a sequence which is complementary to at least a by the genes selected from any listed in Table 1, Table 2, Table portion of the sequence of an gene being measured, where S1-S3 or Table S5). each oligonucleotide (or primer pair) is placed in a known 0249. In some embodiments, the array comprises at least position on the Solid Support to create an ordered array. 10, or at least about 20, or at least about 30, or 30-60, or 60-90 0255. In one embodiment of the present invention, oligo or more than 90 nucleic acid sequences (e.g. oligonucle nucleotides that are immobilized by the 5' end on a solid otides), or at least 10, or at least about 20, or at least about 30, Surface by a chemical linkage are contemplated. In some or 30-60, or 60-90 or more than 90 pairs of nucleic acid embodiments, the oligonucleotides are primers, and can be sequences (e.g., primers), that can be used to measure m6A approximately 17 bases in length, although other lengths are levels of a combination of 10 or more genes selected from any also contemplated. listed in Table 1, Table 2, Table S1-S3 or Table S5). 0256 In another embodiment of the present invention, a 0250 In some embodiments, any of the genes listed in method of hybridizing target nucleic acid fragments is con Table 1, Table 2, Table S1-S3 or Table S5 can be substituted templated which comprises providing an ordered array of for alternative genes. For example, in some embodiments, in immobilized oligonucleotides representing sequences in addition to comprising probes (e.g., oligonucleotides and/or selected from any listed in Table 1, Table 2, Table S1-S3 or primers) which specifically hybridize to the mRNA of at least Table S5 and providing a plurality of fragments of a target 10, or at least 20 genes selected from any listed in Table 1, nucleic acid; and bringing the fragments of the target nucleic Table 2, Table S1-S3 or Table S5), the array can comprise acid into contact with the array under conditions such that at additional reagents (e.g., probes, e.g., oligonucleotides and/ least one of the fragments hybridizes to one of the immobi or primers) which specifically hybridize to the mRNA of lized oligonucleotides on the array. other genes for measuring the m6A levels of genes not listed in Table 1, Table 2, Table S1-S3 or Table S5). Such genes are 0257. In some embodiments, when RNA from the stem known by persons of ordinary skill in the art and are envi cell population hybridizes to an oligonucleotide attached on Sioned for use in the assays, kits, methods, systems as dis the Surface of the array, it is detected with an antibody, e.g., closed herein. anti-méA antibody that is detectably labeled or has a detect able moiety, which may be fluorescent, luminescent, radio 0251. In some embodiments, the array further comprises active, enzymatically active, etc., particularly a molecule spe nucleic acid sequences (e.g., oligonucleotides and/or prim cific for binding to the parameter with high affinity. ers) which specifically hybridize to the mRNA of at least 1, or Fluorescent moieties are readily available for labeling virtu at least 2, or at least 3, or at least 4 or least 5 control genes. ally any biomolecule, structure, or cell type. Immunofluores Control genes include those listed in Table 3, but are not cent moieties can be directed to bind not only to specific limited to ACTB, JARID2, CTCF, SMAD1, B-actin, GAPDH proteins but also specific conformations, cleavage products, and the like. In some embodiments, nucleic acid sequences or site modifications like phosphorylation. Individual pep that amplify a control gene can be present at multiple loca tides and proteins can be engineered to autofluoresce, e.g. by tions in the same array. expressing them as green fluorescent protein chimeras inside 0252. In some embodiments, the array comprises nucleic cells (for a review see Jones et al. (1999) Trends Biotechnol. acid sequences, e.g., oligonucleotides or primers, that 17(12):477-81). Thus, antibodies can be genetically modified amplify the mRNA of at least sequences corresponding to to provide a fluorescent dye as part of their structure. Depend 1-10 control genes, such as, but not limited to the control ing upon the label chosen, parameters may be measured using genes selected from the group consisting of ACTB, JARID2, other than fluorescent labels, using Such immunoassay tech CTCF, SMAD1, GAPDH, B-actin, EIF2B, RPL37A, niques as radioimmunoassay (RIA) or enzyme linked immu CDKN1B, ABL1, ELF1, POP4, PSMC4, RPL30, CASC3, nosorbance assay (ELISA), homogeneous enzyme immu PES1, RPS17, RPSL17L, CDKN1A, MRPL19, MT-ATP6, noassays, and related non-enzymatic techniques. US 2016/0264934 A1 Sep. 15, 2016

0258 Hybridization to arrays may be performed, where referred to in this disclosure are available from commercial the arrays can be produced according to any suitable methods Vendors such as BioRad, Stratagene, Invitrogen, Sigma-Ald known in the art. For example, methods of producing large rich, and ClonTech. arrays of oligonucleotides are described in U.S. Pat. No. 0259. In some embodiments, the detection agent, e.g., 5,134,854, and U.S. Pat. No. 5,445,934 using light-directed anti-méA antibody is further labeled with a detectable synthesis techniques. Using a computer controlled system, a marker, for example a fluorescent marker. Such detectable heterogeneous array of monomers is converted, through labels include, but are not limited to, for example but not simultaneous coupling at a number of reaction sites, into a limited to metallic beads and streptavidin. heterogeneous array of polymers. Alternatively, microarrays 0260 RNA can be isolated from eukaryotic cells by pro are generated by deposition of pre-synthesized oligonucle cedures that involve lysis of the cells and denaturation of the otides onto a solid substrate, for example as described in PCT proteins contained therein. Stem cells of interest include published application no. WO95/35505. Methods for collec pluripotent stem cells, including but not limited to ES cells, adult stem cells and iPSC cells, from mammals including tion of data from hybridization of samples with an array are human species. Additional steps can be employed to remove also well known in the art. For example, the polynucleotides DNA. Cell lysis can be accomplished with a nonionic deter of the cell samples can be generated using a detectable fluo gent, followed by microcentrifugation to remove the nuclei rescent label, and hybridization of the polynucleotides in the and hence the bulk of the cellular DNA. In one embodiment, samples detected by Scanning the microarrays for the pres RNA is extracted from cells of the various types of interest ence of the detectable label. Methods and devices for detect using guanidinium thiocyanate lysis followed by CsCl cen ing fluorescently marked targets on devices are known in the trifugation to separate the RNA from DNA (Chirgwin et al., art. Generally, Such detection devices include a microscope Biochemistry 18:5294-5299 (1979)). Poly(A)+ RNA is iso and light source for directing light at a Substrate. A photon lated by selection with oligo-dT cellulose (see Sambrook et counter detects fluorescence from the substrate, while an X-y al, MOLECULAR CLONING A LABORATORY translation stage varies the location of the Substrate. A con MANUAL (2ND ED.), Vols. 1-3, Cold Spring Harbor Labo focal detection device that can be used in the subject methods ratory, Cold Spring Harbor, N.Y. (1989). Alternatively, sepa is described in U.S. Pat. No. 5,631,734. A scanning laser ration of RNA from DNA can be accomplished by organic microscope is described in Shalon et al., Genome Res. (1996) extraction, for example, with hot phenol or phenol/chloro 6:639. A scan, using the appropriate excitation line, is per form/isoamyl alcohol. If desired, RNase inhibitors can be formed for each fluorophore used. The digital images gener added to the lysis buffer. Likewise, for certain cell types, it can ated from the scan are then combined for Subsequent analysis. be desirable to add a protein denaturation/digestion step to the For any particular array element, the ratio of the fluorescent protocol. signal from one sample is compared to the fluorescent signal 0261 Nucleic acid and ribonucleic acid (RNA) molecules from another sample, and the relative signal intensity deter can be isolated from a particular biological sample using any mined. Methods for analyzing the data collected from hybrid of a number of procedures, which are well-known in the art, ization to arrays are well known in the art. For example, where the particular isolation procedure chosen being appropriate detection of hybridization involves a fluorescent label, data for the particular biological sample. For example, freeze analysis can include the steps of determining fluorescent thaw and alkaline lysis procedures can be useful for obtaining nucleic acid molecules from Solid materials; heat and alkaline intensity as a function of Substrate position from the data lysis procedures can be useful for obtaining nucleic acid collected, removing outliers, i.e. data deviating from a prede molecules from urine; and proteinase K extraction can be termined statistical distribution, and calculating the relative used to obtain nucleic acid from blood (Roiff, A et al. PCR: binding affinity of the targets from the remaining data. The Clinical Diagnostics and Research, Springer (1994)). resulting data can be displayed as an image with the intensity 0262 For many applications, it is desirable to preferen in each region varying according to the binding affinity tially enrich mRNA with respect to other cellular RNAs, such between targets and probes. Pattern matching can be per as transfer RNA (tRNA) and ribosomal RNA (rRNA). Most formed manually, or can be performed using a computer mRNAs contain a poly(A)tail at their 3' end. This allows them program. Methods for preparation of substrate matrices (e.g., to be enriched by affinity chromatography, for example, using arrays), design of oligonucleotides for use with Such matri oligo(dT) or poly(U) coupled to a solid Support, such as ces, labeling of probes, hybridization conditions, Scanning of cellulose or Sephadex. (see Ausubel et al., CURRENT PRO hybridized matrices, and analysis of patterns generated, TOCOLS IN MOLECULARBIOLOGY, vol. 2, Current Pro including comparison analysis, are described in, for example, tocols Publishing, New York (1994). Once bound, poly(A)-- U.S. Pat. No. 5,800,992. General methods in molecular and mRNA is eluted from the affinity column using 2 mM EDTA/ cellular biochemistry can also be found in such standard O.1% SDS. textbooks as Molecular Cloning: A Laboratory Manual, 3rd 0263. The sample of RNA can comprise a plurality of Ed. (Sambrook et al., Harbor Laboratory Press 2001); Short different mRNA molecules, each different mRNA molecule Protocols in Molecular Biology, 4th Ed. (Ausubel et al. eds., having a different nucleotide sequence. In a specific embodi John Wiley & Sons 1999); Protein Methods (Bollaget al., ment, the mRNA molecules in the RNA sample comprise at John Wiley & Sons 1996): Nonviral Vectors for Gene Therapy least 100 different nucleotide sequences. In another specific (Wagner et al. eds. Academic Press 1999); Viral Vectors embodiment, the RNA sample is a mammalian RNA sample. (Kaplift & Loewy eds. Academic Press 1995); Immunology 0264. In a specific embodiment, total RNA or mRNA from Methods Manual (I. Lefkovits ed., Academic Press 1997); the pluripotent stem cell population is used in the assays and and Cell and Tissue Culture: Laboratory Procedures in Bio methods as disclosed herein. The source of the RNA can be technology (Doyle & Griffiths, John Wiley & Sons 1998). pluripotent cells or stem cells of an animal, human, mammal, Reagents, cloning vectors, and kits for genetic manipulation primate, non-human animal, dog, cat, mouse, rat, bird, etc. In US 2016/0264934 A1 Sep. 15, 2016 54 specific embodiments, the methods of the invention are used rodent animal model or a human Subject. Such as for regen with a sample containing mRNA or total RNA from 1x10 erative medicine and cell replacement/enhancement therapy. cells or less. In another embodiment, proteins can be isolated In some embodiments, a Subject Suffers from or is diagnosed from the foregoing Sources, by methods known in the art, for with a disease or condition selected from the group consisting use in expression analysis at the protein level. of cancer, diabetes, cardiac failure, muscle damage, Celiac 0265 Probes to the homologs of the target gene sequences Disease, neurological disorder, neurodegenerative disorder, disclosed herein in Tables 1, 2 or S1-S3 or S5 can be lysosomal storage disease, and any combinations thereof. In employed preferably wherein non-human nucleic acid is Some embodiments, the pluripotent stem cell is administered being assayed. locally, or alternatively, administration is transplantation of the pluripotent stem cell into the subject. Assays to Determine the Differentiation Potential of 0272. In some embodiments, the stem cell populations for Pluripotent StemCells use in the methods, assays, arrays and kits as disclosed herein 0266. In some embodiments, the present invention pro can be a pluripotent human stem cell population, e.g., a stem vides a method for selecting a stem cell line, e.g., a pluripotent cell population that has the ability to differentiate along a stem cell line, comprising measuring the m6A RNA modifi lineage selected from the group consisting of mesoderm, cation (or méA peak intensities) of target genes (e.g., selected endoderm, ectoderm, neuronal, hematopoietic lineages, and from any listed in Table 1, Table 2, Table S1-S3 or Table S5) any combinations thereof, or differentiated into an insulin in a stem cell line; and comparing the m6A peak intensity producing cell (pancreatic cell, beta-cell, etc.), neuronal cell, with a reference level of the same genes. muscle cell, skin cell, cardiac muscle cell, hepatocyte, blood 0267 In some embodiments, a stem cell line, e.g., a pluri cell, adaptive immunity cell, innate immunity cell and the potent stem cell line is a mammalian pluripotent stem cell like. line. Such as a human pluripotent stem cell line. 0273. In some embodiments, the methods, assays, arrays 0268. In some embodiments, the assay is a high-through and systems as disclosed herein can be performed by a service put assay for assaying a plurality of different stem cell lines, provider, for example, where an investigator can have one or for example, but not limited to permitting one to assess a more samples (e.g., an array of samples) each sample com plurality of different induced pluripotent stem cells derived prising a stem cell line, or a different population of stem cells, from reprogramming a Somatic cell obtained from the same for assessment using the methods, differentiation assays, kits or a different Subject, e.g., a mammalian Subject or a human and systems as disclosed herein in a diagnostic laboratory subject. In some embodiments, the assay is a 96-well format, operated by the service provider. In such an embodiment, and in some embodiments, the assay is in a 384-well format, after performing the assays of the invention as disclosed, the permitting multiple pluripotent stem cell lines to be assayed service provider performs the analysis and provide the inves at the same time. In some embodiments, the assay is an tigator a report, e.g., levels of méA of the target genes, or list automated format, enabling high-throughput analysis of 96 of méA peak intensities of each stem cell line analyzed. In and/or 384-well plates. alternative embodiments, the service provider can provide the 0269. In additional aspects, the stem cell line, e.g., pluri investigator with the raw data of the assays and leave the potent stem cells are cultured under different conditions and analysis to be performed by the investigator. In some embodi in different culture media and analyzed for méA peak inten ments, the report is communicated or sent to the investigator sities in target genes, e.g. genes selected from any listed in via electronic means, e.g., uploaded on a secure web-site, or Table 1, Table 2, Table S1-S3 or Table S5. This allows for sent via e-mail or other electronic communication means. In differences in analysis of stem cells in different maintenance Some embodiments, the investigator can send the samples to culture conditions, such as the cultivation to high density the service provider via any means, e.g., via mail, express which can influence stem cells transitioning from an undif mail, etc., or alternatively, the service provider can provide a ferentiated to differentiated phenotype. service to collect the samples from the investigator and trans 0270. In some embodiments, the differentiation assay can port them to the diagnostic laboratories of the service pro be configured to be automated e.g., to be run by a robot. In vider. In some embodiments, the investigator can deposit the some embodiments, a robot can also perform RNA extraction samples to be analyzed at the location of the service provider of an entire multiwell plate, and pipettes the RNA from each diagnostic laboratories. In alternative embodiments, the Ser well into separate assay plates (e.g., when using 96-well vice provider provides a stop-by service, where the service qPCR plates) or into /4 of a plate (e.g., when using 384-well provider send personnel to the laboratories of the investigator qPCR plates). For example, where one stem cell line is to be and also provides the kits, apparatus, and reagents for per analyzed, the RNA from the stem cell line can be pipetted into forming the assays on the investigators stem cell lines in the each well of a 96-well plate, and each well of the 96-well plate investigators laboratories, and analyze the results and pro used to measure the m6A levels of different genes and/or vides a report to the investigator of the characteristics of each control. In some embodiments, were multiple stem cell lines stem cell line analyzed, or plurality of stem cell lines ana are to be analyzed, the RNA from each stem cell line can be lyzed. plated into 4 of the individual wells of a 384-well plate, where a 384-well plate can be used for the analysis of 4 stem Kits cell lines at the same time. 0274. Another aspect of the present invention relates to 0271 Another aspect of the present invention relates to the kits for characterizing the cell State of a population of stem use of a stem cell line, e.g., a pluripotent stem cell line, which cells, e.g., human stem cells, comprising an array as disclosed has been validated and characterized using the methods and herein. In some embodiments, a kit comprises an array as arrays and assays disclosed herein, for treatment of a subject disclosed herein and reagents for measuring the levels of by administering to a subject a stem cell population, for m6A RNA modification, including méA peak intensities of a example a treatment of a mammalian Subject, e.g., a mouse or set of genes selected from any listed in Table 1 or Table 2, or US 2016/0264934 A1 Sep. 15, 2016

any listed in Tables S1-S3 or S5 in Batista et al., which is of target genes selected from any of those listed in Table 1 or incorporated herein in its entirety by reference. The kit can Table 2, or any from Tables S1-S3 or S5. In some embodi further comprise instructions for use. ments, the kit can be configured to be automated e.g., to be run 0275. In some embodiments, the kit for carrying out the by a robot. For example, samples can be added to the array of methods as disclosed herein comprises probes (e.g., oligo the kit using a robot etc., and the robot can perform the nucleotides and/or primers) which specifically hybridize to hybridization method, wash the array to remove non-hybrid the mRNA of at least about 20, or at least about 30, or at least ized RNA, add the detection reagent (e.g., an anti-móA anti about 40, or at least about 50, or at least about 60, or at least body, such as a detectably labeled anti-méA antibody), wash about 70, or at least about 80, or at least about 90 or more than the array to remove non-bound detection agent, and detection 90 genes selected from any of those listed in Table 1 or Table of méA levels using an anti-méA antibody (e.g., a detectably 2, or any from Tables S1-S3 or S5. In some embodiments, the labeled anti-méA antibody) and readout of the levels of méA kit comprises probes (e.g., oligonucleotides and/or primers) levels of the measured target genes. In some embodiments, which specifically hybridize to the mRNA of at least about 3 the robot can perform computer or comparative analysis of or more genes selected from Table 1 or Table 2. the detected méA levels to provide peak intensities of the 0276 Another aspect of the present invention relates to a m6A levels for each target gene assessed. kit for carrying out a methods and assays as disclosed herein, 0280. In some embodiments, a kit as disclosed herein also where the kit comprises: reagents for measuring the m6A comprises at least one reagent for selecting a desired stem cell levels of a set of genes selected from any of at least 20 or at line, e.g., a stem cell line among many cell lines, e.g., reagents least 30 from the genes listed in Table 1 or Table 2, or any from to select one or more appropriate stem cell lines for the Tables S1-S3 or S5. In some embodiments, the reagents are intended use of the stem cell line. Such agents are well known antibodies to móA RNA, or antibody fragments or epitope in the art, and include without limitation, labeled antibodies binding portions thereof. In some embodiments, the reagents, to select for cell-specific lineage markers and the like. In some e.g., antibodies or fragments thereofare detectably labeled. In embodiments, the labeled antibodies are fluorescently Some embodiments, the probes, e.g., oligonucleotides can be labeled, or labeled with magnetic beads and the like. In some immobilized on a Solid Support. In some embodiments, in embodiments, a kit as disclosed herein can further comprise addition to comprising oligonucleotides that hybridize to at at least one or more reagents for profiling and annotating an least 20 genes selected from Table 1 or Table 2, or any from existing ES cell and/or iPS cell bank in high throughput, Tables S1-S3 or S5., the kit can comprise additional reagents according to the methods as disclosed herein. for measuring the m6A levels of different genes not listed in (0281. In one aspect the invention provides a kit compris Table 1. In some embodiments, the kit comprises an array ing one or more control stem cell populations, e.g., a control which also comprises oligos for at least 1, or at least 2, or at undifferentiated human stem cell population, and/or a control least 3, or at least 4 or least 5 control genes. Control genes differentiated human cell cell population, which can be used include, but are not limited to any of combination of ACTB, for comparative analysis with a test human stem cell popula JARID2, CTCF, SMAD1, B-actin, GAPDH, EIF2B, tion being assessed using the methods, arrays and assays as RPL37A, CDKN1B, ABL1, ELF1, POP4, PSMC4, RPL30, disclosed herein. In addition to the above mentioned compo CASC3, PES1, RPS17, RPSL17L, CDKN1A, MRPL19, nent(s), the kit can also include informational material. The MT-ATP6, GADD45A, PUM1 YWHAZ, UBC, TFRC, TBP, informational material can be descriptive, instructional, mar RPLPO, PPIA, POLR2A, PGK1, IPO8, HMBS, GUSB, keting or other material that relates to the methods described B2M, HPRT1 or 18S and the like. In some embodiments, a herein and/or the use of the components for the assays, meth probe for a control gene can be present multiple times in the ods and systems described herein. For example, the informa same assay or kit. tional material can describe methods for selecting a stem cell 0277. In some embodiments, the kit further comprises population, for measuring méA levels, etc. instructions for use. In some embodiments, the kit comprises a computer readable medium comprising instructions Uses encoded thereupon for running a software program on a com 0282. In some embodiments, the methods, arrays, assays puter to compare the levels of méA modification on the RNA and kits as disclosed herein can be used in a variety of ways of a set of gene targets in a test stem cell population with clinically and in research applications. For instance, methods, reference m6A levels of the same genes. In some embodi arrays, assays and kits as disclosed herein are useful for ments, the kit comprises instructions to access a Software identifying the cell State of a stem cell population (e.g., a program available online (e.g., on a cloud) to compare the human stem cell population), e.g., if it is in an undifferenti measured méA levels of the genes from the test stem cell ated (i.e., resting) pluripotent state, or if it has started or population (e.g., human stem cell population) with reference undergone lineage differentiation. In some embodiments, the m6A levels from a control stem cell population. fingerprinting of méA levels or peak intensities as disclosed 0278. In some embodiments, the array include probes e.g., herein is useful for assessing the phenotype or differentiation hybridization probes that specifically hybridize to a set of of a stem cell population in response to a drug, and therefore target genes selected from a Subset of at least 20 genes from can be used for drug screening purposes. Additionally, the any listed in Table 1 or Table 2, or any from Tables S1-S3 or methods, arrays and assays as disclosed herein are useful to S5. In some embodiments, the probes, e.g., oligos can be ensure stem cell populations used in a drug screening assay immobilized on a solid Support. In some embodiments, the kit are consistant and are in the same cell state, and do not differ and/or assay as disclosed herein comprises probes (e.g., oli from each other, thus enabling the drug screening to identify gos) for at least about 10, or at least about 20, or at least about potential hits/drugs are the effect of the drug rather than due 30, or more than 30 genes listed in Table 1 or 2. to variations in the different stem cell lines. 0279. In some embodiments, the kit is in a 96-well or 0283. In some embodiments, the methods, arrays, assays 384-well format and comprises probes to hybridize with a set and kits as disclosed herein are useful for identifying and US 2016/0264934 A1 Sep. 15, 2016 56 selecting a stem cell line, e.g., a pluripotent stem cell line use. In some embodiments, the use of such a database can be which would be suitable for therapeutic use, e.g., stem cell easily extended Such that a user can upload the data from the therapy or other regenerative medicine. In some embodi array or assays as disclosed herein (e.g., méA levels, and/or ments, the methods, arrays, assays and kits as disclosed m6A peak intensities for selected target genes) for aparticular herein can be used in clinics to determine clinical safety and stem cell population of interest. In a simple analogy, the utility of a particular pluripotent stem cell line. database could function similar to Google's “search for simi 0284. In some embodiments, the methods, arrays, assays lar sites’, whereby the database could be used as an efficient and kits as disclosed herein can be used as a quality control to way to select useful cell lines for novel and/or mixed tissue monitor the characteristics of a stem cell population, e.g., a types, or to identify stem cell lines in a cell bank that can have human stem cell line, over multiple passages and/or before are in the undifferentiated (i.e. resting) cell state or are dif and after cryopreservation procedures, for example, to ensure ferentiated along a specific lineage. that the cell remains in an undifferentiated (e.g., resting) state 0288. In some embodiments, the methods, arrays, assays and no significant epigenetic or functional genomic changes and kits as disclosed herein can be used for identification and have occurred over time (e.g., over passages and after cryo selection of a desired stem cell line, e.g., a pluripotent stem preservation). For example, the methods, arrays, assays and cell line for mass production. For example, methods to inhibit kits as disclosed herein can be used to characterize stem cell MEETTL3 and/or METTL4 can be used to maintain the cells populations before, and during storage, e.g., in a stem cell in an undifferentiated State of culturing and expanding a stem bank, to catalogue each stem cell line (e.g., human stem cell cell population efficiently in large quantities, e.g., large batch line) which is placed in the bank, and to ensure that the stem cultures or in bioreactors, and the fingerprinting methods, and cells have the same properties after thawing as they did prior uses of the assays and arrays as disclosed herein can be used to cryopreservation. In some embodiments, a stem cell popu as a quality control to ensure the expanded stem cell popula lation can be contacted with a METTL3 and/or METTL4 tion remained in an undifferentiated cell state during expan inhibitor as disclosed herein, before, after or during crypo sion in a bulk culture. preservation, e.g., a METTL3 and/or METTL4 inhibitor can 0289. In another embodiment, the methods, arrays, assays be present in a cryopreservation media. and kits as disclosed herein can be used for assessing drug 0285. In some embodiments, the raw data of méA levels responsiveness of a stem cell population, for example, a stem and/or méA peak intensities for target genes for each stem cell line can be assessed using the methods, arrays, assays and cell line can be stored in a centralized database, where the data kits as disclosed herein prior to, during, and after contacting can be used to select a pluripotent stem cell line for a particu with a drug or other agent or stimulus (e.g., electric stimuli for lar use or utility, e.g., for selection of a stem cell line in a stem cardiac pluripotent progenitors) to generate m6A signature of cell bank. the stem cell line in the presence or absence of the drug. 0286. In some embodiments, the methods, arrays, assays 0290. In another embodiment, the methods, arrays, assays and kits as disclosed hereincan be used in research to monitor and kits as disclosed herein can be used for selection of a stem functional genomic changes as a stem cell line, e.g., a pluri cell line, e.g., a pluripotent stem cell line, based on its safety potent stem cell line, differentiates along different lineages. profile. For example, a stem cell population can be selected In some embodiments, aspects as disclosed herein can be that has a móA signature indicating it is in an undifferentiated used to monitor and determine the characteristics of stem cell State etc. lines from Subjects with particular diseases, e.g., one can 0291. In another embodiment, the methods, arrays, assays monitor stem cell lines, e.g., a stem cell line from Subjects and kits as disclosed herein can be used for selection and/or with genetic defects or particular genetic polymorphisms, quality control, and/or validation of a stem cell population in and/or having a particular disease. For example, one can different or new states of pluripotency or multipotency, for monitor and determine the m6A levels between an iPSC cell example to provide information regarding which stem cell derived from a subject with a neurodegenerative disease, Such lines are in an undifferentiated State (i.e., pluripotent state) but as ALS, as compared to a normal iPSC cell from a healthy do not fall under the usual definition of human ES cell lines Subject (or a non-ALS Subject). Such as a healthy sibling. (e.g., human ground-state ES cell and partially repro Similarly, one can determine ifiPS cells has comparable m6A grammed cell lines, e.g., partially induced pluripotent stem levels (or peak intensities) of selected target genes as com (piPS) cells, which are capable of being reprogrammed fur pared to human ES cells or other pluripotent stem cells. ther to a pluripotent stem cell). Additionally, the aspects as disclosed herein can fully char 0292. It has been shown that continued in vitro culture and acterize the cell State of a stem cell population, e.g. human passaging improves the quality of iPS cell lines (see Polo et stem cell population without the need for teratoma assays al., Nat Biotechnol. 2010 August; 28(8):848-55, and Nat Rev and/or generation of chimera mice, therefore significantly Mol Cell Biol. 2010 September; 11 (9):601, and Nat Rev increasing the high-throughputability of characterizing pluri Genet. 2010 September; 11 (9): 593). On the other hand, potent stem cell lines. continued passaging is expensive. Accordingly, in some 0287. In some embodiments, the methods, arrays, assays embodiments, the methods, arrays, assays and kits as dis and kits as disclosed herein can be used in creating a database, closed herein can be used for measuring how much passaging where such a database would be useful in organizing and is Sufficient for improving the quality of the stem cell line, cataloging a human stem cell repository, e.g., a central reposi e.g., the pluripotent stem cell line. tory (e.g., a tissue and/or cell bank) containing a large number 0293. In further embodiments, the methods, arrays, assays of quality-controlled and utility-predicted pluripotent cell and kits as disclosed herein can be used in a variety of differ lines, such that one can use a database comprising the m6A ent research and clinical uses to characterize, monitor and levels (or méA peak intensities) of specific target genes for assess if a stem cell line is in an undifferentiated state. For each stem cell line in the bank to specifically select a particu example, typical application includes in areas such as, but not lar pluripotent stem cell line for the investigators’ intended limited to, (i) labs and/or companies interested in disease US 2016/0264934 A1 Sep. 15, 2016 57 mechanisms (e.g., using the kits or services as disclosed potent stem cell is a human stem cell line known in the art. In herein to reduce the complexity of generating iPS cell lines, Some embodiments, the pluripotent stem cell is an induced as well as differentiated cells for disease modeling and small pluripotent stem (iPS) cell, or a stably reprogrammed cell scale drug screening, (ii) labs and/or companies trying to which is an intermediate pluripotent stem cell and can be identify Small molecules and/or biologicals for a given dis further reprogrammed into an iPS cell, e.g., partial induced ease target (e.g., using the kits and/or services as disclose pluripotent stem cells (also referred to as “piPS cells”). In herein to enable the production of large numbers of highly some embodiments, the pluripotent stem cell, iPSC or piPSC standardized cells for drug screening), (iii) clinical and pre is a genetically modified pluripotent stem cell. clinical research groups for quality control and validating 0298. In some embodiments, the pluripotent state of a stem cell lines where they are interested in producing cells for pluripotent stem cell used in the present invention can be implantation into humans or animals (e.g., using a kit and/or confirmed by various methods. For example, the pluripotent service as disclosed hereinto permits quality control at a level stem cells can be tested for the presence or absence of char of accuracy that will be sufficient for regulatory approval, acteristic ES cell markers. In the case of human ES cells, e.g., FDA approval), (iv) tissue banks that desire to give their examples of such markers include SSEA-4, SSEA-3, TRA customers information, including advice and data about the 1-60, TRA-1-81 and OCT4, and are known in the art. undifferentiated State of the stem cell population, and quality 0299 While the methods of the present invention allow the and utility of the stem cell lines, e.g., pluripotent stem cell pluripotency (or lack thereof) to be assessed by measuring lines on offer (e.g., using a kit and/or service as disclosed m6A levels (or peak intensities) of a Subset of genes listed in herein to provide unbiased assessment of the quality and/or Table 1 and/or 2, the pluripotency of a stem cell line can also utility of a large number of pluripotent cell lines, in an inex be confirmed by injecting the cells into a Suitable animal, e.g., pensive high throughput manner, —it is contemplated that the a SCID mouse, and observing the production of differentiated assays can ultimately be performed on 1,000-100,000s of cells and tissues. Still another method of confirming pluripo pluripotent stem cell lines to cover the whole population of tency is using the Subject pluripotent cells to generate chi cell lines stored in the cell bank), (v) private consumers who meric animals and observing the contribution of the intro desire to generate, and optionally, bank at least one or more duced cells to different cell types. Methods for producing stem cell lines, e.g., pluripotent stem cell lines, e.g., iPS cell chimeric animals are well known in the art and are described lines (or piPS cell lines) generated from their somatic differ in U.S. Pat. No. 6,642.433, which is incorporated by refer entiated cells, either for themselves and/or their children or ence herein. other offspring, for example, as a type of health insurance 0300 Yet another method of confirming pluripotency is to policy for future regenerative medicine purposes. observe ES cell differentiation into embryoid bodies and other differentiated cell types when cultured under conditions StemCell Populations for Analysis of méA Levels (or méA that favor differentiation (e.g., removal of fibroblast feeder Peak Intensities) layers). This method has been utilized and it has been con 0294 As disclosed herein, méA levels (e.g., méA peak firmed that the subject pluripotent cells give rise to embryoid intensities) of target genes can be used to assess if the cell bodies and different differentiated cell types in tissue culture. state of any stem cell line or population, from any species, e.g. 0301 In this regard, it is known that some mouse embry a mammalian species, such as a human. In some embodi onic stem (ES) cells have a propensity of differentiating into ments, the present invention specifically contemplates using Some cell types at a greater efficiency as compared to other the methods, arrays, assays and kits as disclosed herein to cell types. Similarly, human pluripotent (ES) cells can pos determine if a stem cell is pluripotent. Any type of stem cell sess selective differentiation capacity. Accordingly, the can be assessed. For simplicity, when referring to analysis of present invention can be used to identify and select a pluri a pluripotent stem cell herein, this encompasses analysis of potent stem cell with desired characteristics and differentia both pluripotent and non-pluripotent stem cells. tion propensity for the desired use of the pluripotent stem cell. 0295. In some embodiments, the stem cell is a pluripotent For example, where the pluripotent cell line has been stem cell. Generally, a pluripotent stem cell to be analyzed screened according to the methods of the invention, a pluri according to the methods described herein can be obtained or potent stem cell can be selected due to its increased efficiency derived from any available source. Accordingly, a pluripotent of differentiating along a particular cell line, and can be cell can be obtained or derived from a vertebrate or inverte induced to differentiate to obtain the desired cell types brate. In some embodiments, the pluripotent stem cell is according to known methods. For example, a human pluripo mammalian pluripotent stem cell. In all aspects as disclosed tent stem cell, e.g., a ES cell or iPS cell can be induced to herein, pluripotent stem cells for use in the methods, arrays, differentiate into hematopoietic stem cells, muscle cells, car assays and kits as disclosed herein can be any pluripotent diac muscle cells, liver cells, islet cells, retinal cells, cartilage stem cell. cells, epithelial cells, urinary tract cells, etc., by culturing 0296. In some embodiments, the pluripotent stem cell is a Such cells in differentiation medium and under conditions primate or rodent pluripotent stem cell. In some embodi which provide for cell differentiation, according to methods ments, the pluripotent stem cell is selected from the group known to persons of ordinary skill in the art. Medium and consisting of chimpanzee, cynomologous monkey, Spider methods which result in the differentiation of ES cells are monkey, macaques (e.g. Rhesus monkey), mouse, rat, wood known in the art as are suitable culturing conditions. chuck, ferret, rabbit, hamster, cow, horse, pig, deer, bison, 0302) In some embodiments, the stem cell population is a buffalo, feline (e.g., domestic cat), canine (e.g. dog, foX and iPS cell, e.g., a hiPSC. One can use any method for repro wolf), avian (e.g. chicken, emu, and ostrich), and fish (e.g., gramming a somatic cell to an iPS cell or an piPS cell, for trout, catfish and salmon) pluripotent stem cell. example, as disclosed in International patent applications; 0297. In some embodiments, the pluripotent stem cell is a WO2007/069666; WO2008/118820; WO2008/124133; human pluripotent stem cell. In some embodiments, the pluri WO2008/151058; WO2009/006997; and U.S. Patent Appli US 2016/0264934 A1 Sep. 15, 2016

cations US2010/0062533; US2009/0227032; US2009/ the NIH Human Embryonic Stem Cell Registry, e.g. hES 0068742; US2009/0047263; US2010/0015705; US2009/ BGN-01, hESBGN-02, hESBGN-03, hESBGN-04 0081784; US2008/0233610; U.S. Pat. No. 7,615,374; U.S. (BresaGen, Inc.); HES-1, HES-2, HES-3, HES-4, HES-5, patent application Ser. No. 12/595,041, EP2145000, HES-6 (ES Cell International); Miz-hES1 (MizMedi Hospi CA2683056, AU8236629, Ser. No. 12/602,184, EP2164951, tal-Seoul National University); HSF-1, HSF-6 (University of CA2688539, US2010/0105100; US2009/0324559, US2009/ California at San Francisco); and H1, H7, H9, H13, H14 0304646, US2009/0299763, US2009/0191159, the contents (Wisconsin Alumni Research Foundation (WiCell Research of which are incorporated herein in their entirety by refer Institute)). In some embodiments, an embryo has not been ence. In some embodiments, an iPS cell foruse in the methods destroyed in obtaining a pluripotent stem cell for use in the as described herein can be produced by any method known in methods, assays, systems as disclosed herein. the art for reprogramming a cell, for example virally-induced 0306 In another embodiment, the stem cells, e.g., adult or or chemically induced generation of reprogrammed cells, as embryonic stem cells can be isolated from tissue including disclosed in EP 1970446, US2009/0047263, US2009/ solid tissues (the exception to solid tissue is whole blood, 0068742, and 2009/0227032, which are incorporated herein including blood, plasma and bone marrow) which were pre in their entirety by reference. In some embodiments, iPS cells viously unidentified in the literature as sources of stem cells. can be reprogrammed using modified RNA (mod-RNA) as In some embodiments, the tissue is heart or cardiac tissue. In disclosed in US2012/0046346, which is incorporated herein other embodiments, the tissue is for example but not limited in its entirety by reference. to, umbilical cord blood, placenta, bone marrow, or chondral 0303. In some embodiments, an iPS cell for use in the villi. methods, arrays, assays and kits as disclosed herein can be 0307 Stem cells of interest for use in the methods, arrays, produced from the incomplete reprogramming of a Somatic assays and kits as disclosed herein also include embryonic cell by chemical reprogramming. Such as by the methods as cells of various types, exemplified by human embryonic stem disclosed in WO2010/033906, the content of which is incor (hES) cells, described by Thomson et al. (1998) Science porated herein in its entirety by reference. In alternative 282:1145; embryonic stem cells from other primates, such as embodiments, the stable reprogrammed cells disclosed Rhesus stem cells (Thomson et al. (1995) Proc. Natl. Acad. herein can be produced from the incomplete reprogramming Sci USA 92:7844); marmoset stem cells (Thomson et al. of a Somatic cell by non-viral means, such as by the methods (1996) Biol. Reprod. 55:254); and human embryonic germ as disclosed in WO2010/048567 the contents of which is (hEG) cells (Shambloft et al., Proc. Natl. Acad. Sci. USA incorporated herein in its entirety by reference. 95:13726, 1998). Also of interest are lineage committed stem 0304. Other stem cells for use in the methods as disclosed cells. Such as mesodermal stem cells and other early cardio herein can be any stem cell known to persons of ordinary skill genic cells (see Reyes et al. (2001) Blood 98:2615-2625; in the art. Exemplary stem cells include embryonic stem cells, Eisenberg & Bader (1996) Circ Res. 78(2):205-16; etc.). adult stem cells, pluripotent stem cells, neural stem cells, liver stem cells, muscle stem cells, muscle precursor Stem cells, Drug Screening and Other Uses endothelial progenitor cells, bone marrow stem cells, chon 0308 Existing assays for drug screening/testing and toxi drogenic stem cells, lymphoid stem cells, mesenchymal stem cology studies have several shortcomings because they can cells, hematopoietic stem cells, central nervous system stem include pluripotent stem cells which are poorly characterized cells, peripheral nervous system stem cells, and the like. and/or pluripotent stem cell lines which are abnormal or Descriptions of stem cells, including methods for isolating deviate from a typical pluripotent stem cell line in terms of its and culturing them, can be found in, among other places, differentiation capacity and potential. Accordingly, by mea Embryonic StemCells, Methods and Protocols, Turksen, ed., Suring méA levels of a set of target genes as disclosed herein, Humana Press, 2002; Weisman et al., Annu. Rev. Cell. Dev. one can identify and choose a stem cell line which is in an Biol. 17:387 403; Pittinger et al., Science, 284:14347, 1999; undifferentiated State which Suitable for use in drug screening Animal Cell Culture, Masters, ed., Oxford University Press, assay. Such identified stem cells then can be chosen for use in 2000; Jackson et al., PNAS 96(25): 1448286, 1999: Zuket al., screening assays to Screen a test compound and or in disease Tissue Engineering, 7:211228, 2001 (“Zuk et al.'); particu modeling assays. larly Chapters 33 41; and U.S. Pat. Nos. 5,559,022, 5,672,346 and 5,827,735. Descriptions of stromal cells, including meth 0309 Furthermore, the methods, arrays, assays and kits as ods for isolating them, can be found in, among other places, disclosed herein are useful to determine the cell state of Prockop, Science, 276:71 74, 1997: Theise et al., Hepatology, specific cell types from all developmental stages and even 31:23540, 2000; Current Protocols in Cell Biology, Bonifa from blastocysts etc. cino et al., eds. John Wiley & Sons, 2000 (including updates through March, 2002); and U.S. Pat. No. 4,963,489. Uses to Optimize StemCell Maintenance Media 0305 Additional pluripotent stem cells for use in the 0310. In some embodiments, the methods, arrays, assays methods, arrays, assays and kits as disclosed herein can be and kits as disclosed herein can be used to optimize culture any cells derived from any kind of tissue (for example embry media for maintaince and/or passage of stem cell populations onic tissue such as fetal or pre-fetal tissue, or adult tissue), in an undifferentiated State. For example, one can measure which stem cells have the characteristic of being capable m6A levels (or peak intensities) of selected target genes under appropriate conditions of producing progeny of differ selected from any listed in Table 1 and/or Table 2 in a stem cell ent cell types that are derivatives of all of the 3 germinal layers population in the presence of different culture media and/or (endoderm, mesoderm, and ectoderm). These cell types can culture conditions, and using the m6A levels measured to be provided in the form of an established cell line, or they can assist in selecting the culture media and/or culture conditions be obtained directly from primary embryonic tissue and used which maintains the stem cell population in an undifferenti immediately for differentiation. Included are cells listed in ated State. US 2016/0264934 A1 Sep. 15, 2016 59

0311. Accordingly, aspects of the present invention relate 0318 mESC Cell Culture and Differentiation to culture media, e.g., culture media comprising a METTL3 0319 J-1 murine embryonic stem cells were grown under and/or METTL4 inhibitor as disclosed herein for maintaining typical feeder free ES cell culture conditions. Cells were a stem cell population in an undifferentiated State. In some grown in gelatinized (0.2% Gelatin) tissue culture plates in embodiments, the culture media is a cryopreservation culture mESC media (KnockCut DMEM (Gibco, Life Technologies: media. By way of an example only, in some embodiments, the 10829-018) supplemented with 1000 U/ml leukemia inhibi methods, arrays, assays and kits as disclosed herein can be tory factor (Millipore; ESG1107), lx non-essential amino used to confirm that a stem cell media, e.g., a pluripotent stem acids (Gibco, Life Technologies: 11140-050), lx Glutamax cell media maintains a stem cell in a pluripotent state and does (Gibco, Life Technologies; 35050-061), 10% Pen Strep not result in méA modification which indicates that the stem (Gibco, Life Technologies; 151140-122) and 15% Fetal cell lines is in an undifferentiated state. Bovine Serum (HyClone, SH30071.03)). 0312 Another aspect of the present invention relates to a 0320 For cardiomyocite differentiation, mESCs were container comprising a stem cell population, e.g. a human plated at a density of 2x10 cells/mL in ultra-low attachment stem cell population in the presence of culture media com plates in cardiomyocyte differentiation media (CMD) prising a METTL3 and/or METTL4 inhibitor as disclosed (DMEMIGIBCO, 15% FBSHyclonel, 1% penicillin/strep herein. tomycin, 1% GlutaMax and 1 mM Ascorbic Acid Sigma) to induce EB formation. Media was changed on day 3 and on day 6, EBs were re-suspended in fresh CMD media and EXAMPLES replated on 0.2% gelatin coated dishes. Media was changed 0313. Throughout this application, various publications on day 9 and on day 12 the number of contracting patches of are referenced. The disclosures of all of the publications and cells was quantified in triplicate for each cell line. those references cited within those publications in their 0321 For Neuron differentiation, Mouse embryonic stem entireties are hereby incorporated by reference into this appli cells were grown in mESC medium (DMEM (Invitrogen), cation in order to more fully describe the state of the art to 12% knockout replacement serum (Invitrogen), 3% cosmic which this invention pertains. The following examples are not calf serum (Thermo Scientific) supplemented with non-es intended to limit the scope of the claims to the invention, but sential amino acids (Invitrogen), penicillin-streptomycin (In are rather intended to be exemplary of certain embodiments. vitrogen), sodium pyruvate (Invitrogen), 2-mercaptoethanol Any variations in the exemplified methods which occur to the (Invitrogen) and LIF). Cells were dissociated in 2.5% trypsin skilled artisan are intended to fall within the scope of the for 5 minutes, pelleted, and resuspended on a gelatinized present invention. plate in MEF medium (DMEM, 10% cosmic calf serum, 0314. The developmental potential of human pluripotent non-essential amino acids, penicillin-streptomycin, sodium stem cells suggests that they can produce disease-relevant cell pyruvate, 2-mercaptoethanol) for 30 minutes to remove feed types for biomedical research as well as cells for transplan ers. 5x106 mESCs were then replated onto 10 cm bacterial tation to address a disease. However, Substantial variation has plates in MEF medium and cultured for 4 days. On day 4, cells been reported among pluripotent cell lines, which could were replated under adherent culture conditions. Medium affect their utility and clinical safety. Disclosed herein are was replaced with ITSFn medium (DMEM:F12 (Invitrogen), methods to maintain a stem cell line, e.g., human stem cell insulin 5ug/ml, apotransferrin 50 ug/ml. Sodium selenate population in an undifferentiated State, and assays and arrays 30 nM), fibronectin 250 ng/ml) the following day and to assess the cell State of a stem cell population, e.g., if it is an replaced every other day. Cells were cultured for 10 days in undifferentiated State, and/or progressed along a lineage dif ITSFn before fixation. ferentiation pathway. 0322 For the cell proliferation assay (MTT) 5 thousand cells where cultured in 24 well dish and the assay performed 0315. In summary, the inventors have developed methods according to the manufacturer's protocol (Roche; for maintaining human stem cell in an undifferentiated State, 11465007001). For the single colony assays and Nanog stain and assays and arrays to assess the cell state of a stem cell ing, 1 thousands cells where cultured, per well, on a six well population in a rapid, cost effective, high-throughput method dish. that is independent of gene expression levels. 0323 For Alkaline Phosphatase Staining, at day 6 cells were fixed (50% Methanol, 50% Acetone) and stained for Methods and Materials Alkaline Phosphatase with Vector Blue Alkaline Phosphatase Substracte Kit (Vector; 5300), according to manufacturers 0316 Mouse Cell Culture and Differentiation protocol. 0317 J-1 murine embryonic stem cells were grown under 0324 For Nanog and Oct4 staining cells where fixed with typical feeder free ES cell culture conditions. For cardiomyo 4% paraformaldehyde (PFA) (Thermo Scientific, 28909). cyte formation, mESCs were differentiated in cardiomyocyte Cardiomyocites were cultured in chamber slides and fixed on differentiation media and scored on day 12. For neuron for day 12 with 4% PFA and N cells where fixed for 20 minutes mation, mESCs were differentiated in MEF and ITSFn in 4% PFA. Cells where washed 3 times with PBS and medium and scored after 10 days in ITSFn medium. For the blocked in PBS with 0.1% Triton and 5% FBS (for N cells, cell proliferation assay 5000 cells where cultured in 24 well CCS was used instead of FBS) for 20 minutes. Cells where plates and the assay performed according to the manufactur then incubated with primary antibody Rabbit anti-Nanog er's protocol (MTT assay, Roche). For the single colony Antibody, Bethyl: mouse anti-Oct-3/4, Santa cruz, mMF20, assays and Nanog staining, 1000 cells where cultured per Developmental studies Hybridoma bank; anti-Tuj1, Covance well, on a six well plate. For alkaline phosphatase staining, (1:1000), rabbit anti-Nanog, ReproCell (1:200) for 30 min cells were stained according to the manufacturer's protocol utes in blocking medium. After 3 PBS washes, cells where (Vector Blue Alkaline Phosphatase Substrate Kit). incubated with secondary antibody (Alexa 488 Goat anti US 2016/0264934 A1 Sep. 15, 2016 60 mouse, Alexa Goat anti-Rabbit, donkey Alexa-555 anti (Promega) and 2.5 volume of 100% ethanol at -20°C. either mouse, donkey Alexa-488 anti-Rabbit (1:1000; Invitrogen)) for 2 hours or overnight. The precipitated RNA was then in blocking medium. Cells where washed 3 times and Nuclei centrifuged using a refrigerated table-top at maximum speed were counterstained with DAPI. Images where collected on a (>13,000 g) at 4°C. for 20 minutes. The precipitated RNA Zeiss Observer.Z1 using AxioVision software. was then washed with 70° C. ethanol and centrifuged at 0325 hESCs Cell Culture, Transfection and Differentia maximum speed for an additional 10 minutes. The final pellet tion was then re-suspended in ultra pure H.O. PolyA RNA selec 0326 H1 (WAO1) cells were cultured in feeder-free con tion was performed twice using Dynabeads mRNA Purifica ditions as described (Sigova et al., 2013). Stable hESC lines tion Kit (Invitrogen Cat. #610.06) according to the manufac were created that expressed shMETTL3 RNA or scrambled turer's protocol. The second polyA RNA selection was shRNA by transfecting hESCs with plasmids encoding performed using the eluate of the first polyaA RNA selection shMETTL3 or scrambled shRNA and a puromycin resistance as starting material according to the manufactures instruc gene. Cells were treated with puromycin for six days begin tion. For all RNA samples, the concentration, purity and ning two days after transfection. For each shRNA, two inde integrity of the RNA were verified using a NanoDrop and pendent puromycin-resistant colonies were picked and Bioanalyzer. expanded. Endodermal differentiation was then induced by 0331 Immunofluorescence Staining Activin A, as described (Sigova et al., 2013). Day 2 and Day 0332 Cells were fixed with 4% paraformaldehyde 4 of differentiation were measured from the time that Activin (Thermo Scientific). Washes were performed with PBS. After was added. Puromycin was removed from the media one day blocking, cells were incubated with primary antibody in prior to endodermal differentiation. Neuronal induction was blocking medium. Cells were washed and incubated with induced through treated with potent and specific inhibitors of secondary antibody in blocking medium. Nuclei were coun SMAD signaling. terstained with DAPI. 0327 H1 (WAO1) cells were cultured in feeder-free con 0333 RNA mA IP dition using mTESR1 media (Stem Cell Technologies Cat. 0334. The detailed anti-mA RIP and library preparation #05850) on 6-well plates coated with matrigel (BD Bio protocols are described in detail in the Extended Experimen sciences, Cat.#354603), as described (Sigova et al., 2013). tal Procedures. RNA was extracted with TRIZol (Ambion) Transfection of shMETTL3 RNA (DF/HCC DNA Resource according to manufacturer's protocol. After polyA RNA Core Cat.#HsSH00253093) and scrambled shRNA (DF/ selection, RNA was fragmented in fragmentation buffer (10 HCC DNA Resource Core, pIKO-scramble, Cat. nM ZnCl2, 10 mM Tris HCl, pH7.0). Fragmented RNA was #EvNO00438085) was performed using Lipofectamine LTX incubated with anti-mA polyclonal antibody (Synaptic Sys (Life Technologies Cat.#25338100). Two days after transfec tems) and after extensive washing, bound RNA eluted. Input tion, cells were treated with 0.5 microgram per milliliter of and anti-mA polyclonal antibody enriched RNA were used puromycin (Life Technologies Cat.fi A113802) for 6 days. to construct RNA libraries. For each shRNA, two independent puromycin-resistant colo 0335 Mouse ESC Protocol 1– nies were picked from independent wells and expanded and 0336 Poly A+ RNA was purified with one round of selec Maintained under puromycin for analysis. Before Endoder tion with MicroPoly(A)Purist Kit (Ambion; AM1919). The mal differentiation puromycin was withdrawn. Endodermal PolyA+ RNA was fragmented to ~100 nucleotide fragments differentiation was then induced by resting cells in RPMI by incubation with Zinc Chloride buffer (10 mM ZnCl2, 10 (Life Technologies Cat.#11875-093) with B27 supplement mM Tris-HCl, pH 7.0). After the RNA was incubated at 94° C. (Life Technologies Cat.# 17504-044) for 24 hours followed for 30 seconds, Zinc Chloride buffer, previously warmed to by addition of Activin (R&D Systems), as described (Sigova 94°C., was added and incubated for 2 minutes. The reaction et al., 2013). Day 2 and Day 4 of differentiation were mea was stopped with 0.2M EDTA, and the RNA precipitated with sured from the time that Activin was added. standard ethanol precipitation. 15ug of anti-móA polyclonal 0328 RNA Extraction, DNASE I Treatment and Poly a antibody (Synaptic Systems) were pretreated with agarose Selection beads coated with ssDNA to reduced background (PMID: 0329 mESC total RNA was isolated from cells according 21472695). Antibody was conjugated to Dynabeads Protein to manufacturers instructions using TRIZol reagent (Am G (Life Technologies: 10003D) overnight at 4°C. 200 ug of bion). The RNA was re-suspended in ultrapure HO, treated fragmented RNA were incubated with the antibody in with DNAse I (Ambion) for 30 minat37° C. and subjected to 1xDamIP buffer (10 mM sodium phosphate buffer, pH 7.0, RNA clean up reaction with RNeasy Midi Kit (Qiagen), 0.3 M NaCl, 0.05% (w/v) Triton X-100) supplemented with according to manufacture’s protocol. RNA was eluted in 1% SuperRNAse Inhibitor (Ambion), for 3 hours at 4° C. ultrapure H.O. PolyA RNA selection was performed using After incubation, the antibody was washed 5 times with MicroPoly(A) Purist (Life Technologies) according to the DamIP buffer and the RNA eluted with 0.5 mg ml-1 N6-me manufacturer's protocol. The second polyA RNA selection thyladenosine (Sigma-Aldrich) in DamIP buffer (Xiao and was performed using the eluate of the first polyA RNA selec Moore, 2011). 1 volume of Ethanol was added to the eluted tion as starting material according to the manufacture's RNA, and the RNA recovered an RNeasy mini column. instruction. 0337 Library Construction: 0330 hESC total RNA was isolated from cells according 0338. The imunoprecipitated RNA, and an equivalent to manufacturers instructions using TRIZol LS reagent (Am amount of input RNA where used for library generation with bion). Total RNA was treated using DNAse I (Promega) for the dUTP protocol, as described (Levin et al., 2010) except 20 minutes at 37°C. The treated RNA was then acid phenol/ libraries were size selected by gel purification after ligation chloroform extracted and chloroform extracted. The RNA and after PCR amplification. Libraries where sequenced was precipitated using 300 mM final concentration of NaCl, using an Illumina HiSeq at the Stanford Center for Genomics spiked with 1 ul of 50 mg/ml of Ultra Pure Glycogen and Personalized Medicine. US 2016/0264934 A1 Sep. 15, 2016

0339 Mouse ESC Protocol 2 GGA AGA GCG TCG TGA (SEQ ID NO: 664) T/iSp18/ 0340 Second set of libraries was generated as described in GGATCC/iSp18/TACTGAACCGC (SEQID NO: 665). (Schwartz et al., 2013). Total RNA was subjected to two (0345 Human ESC Protocol: rounds of selection with MicroPoly(A)Purist Kit (Ambion; (0346) Of note for each biological replicate for mA-seq. AM1919).5ug of RNA were fragmented as described above. we started with 400 g of total RNA yielding approximately After fragmentation RNA was incubated with 30 units of 10 ug of double polyA selected RNA which was re-suspended Polynucleotide Kinase in 50 mM Tris-HCl pH 7.6, 8 mM in a final volume of 50 ul using UltraPure HO (Life Tech EDTA and 2 mM DTT. RNA was purified on a quiagen nologies). 250 ul of digestion/fragmentation buffer (10 nM RNeasy column, and 10% was saved to be used as input. RNA ZnCl2, 10 mM Tris HCl, pH7.0) was added to the 50 ul of 2x was denatured and incubated with 25ul of protein G beads polyA RNA. The 300 ul of PolyA RNA/fragmentation buffer (previously bound to 3 ug of anti-méA polyclonal antibody was heated at 94° C. for exactly 5 minutes. 50 ul of 0.5M (Synaptic Systems) in 1xIPP buffer (150 mM. NaCl, 10 mM EDTA was added to stop the fragmentation reaction and TRIS-HCL and 0.1% NP-40). After 3 hours, beads where immediately put on ice. washed 2 times with IPP buffer, 2 times with low salt buffer, 0347 The 2x polyA fragmented RNA was then heated at 2 times with high saltbuffer and 1 time with IPP buffer. RNA 65° C. for 5 minutes and immediately put on ice. 50 ul of was eluted from the beads with 30 ul of RLT buffer, for 5 mA-DynaBeads (The mA antibody-Synaptic Systems was minutes. The RNA eluate was added to 20 ul of myone Silane coupled to Dynabeads using the Life Technologies coupling beads re-suspended in 30 ul of RLT. 60 ul of Ethanol where kit catil 1431 1D) were equilibrated by washing twice for 5 added to the beads and incubated for 2 minutes. The beads minutes in 500 ul of mA-Binding Buffer (50 mM Tris-HCl, where then washed 2 times with 70% Ethanol and the RNA 150 mM NaCl2, 1% NP-40, 0.05% EDTA). The RNA was eluted in 160 ul of IPP buffer. The eluted RNA was added to then added to the equilibrated mA-DynaBeads. The RNA 25ul of Protein Abeads previously bound to 3 ug of anti-méA was allowed to bind to the mA-Dynabeads (in 500 ul volume polyclonal antibody (Synaptic Systems). After 3 hour incu of mA-Dynabeads/m"A-Binding Buffer at room tempera bation beads where washed and RNA eluted as described ture while rotating (tail-over-head) at 7 rotations per minutes above. RNA was eluted in 100 ul of RNAse free water. for 1 hour. The tubes containing the samples were placed on 0341 Library Construction: a magnet allowing the beads complexes to cluster for one 0342. After isolating fragmented méA enriched RNA we minute or until the solution become clear. The liquid phase constructed deep sequencing libraries as Rouskin et al. with was carefully collected and placed on ice as this 500 ul frac the following modifications. RNA was first ligated to 25 pmol tion represents the “Supernatant” of the mA IP. Following of pre-adenylated L3 (IDT) adaptor overnight at 16°C. The the collection of the Supernatant fraction, series of washes ligated samples were subjected to 8% PAGE separation, were performed using various buffers (see as follow). For all stained and imaged with SybrGold (Life Technologies) and wash steps to the exception of the elution step, the beads were ligated material was excised. The resulting gel slices were washed 3 minutes then place on a magnet and the wash crushed and the RNA was eluted in 400 uL of Crush Soak buffers were discarded. Following the supernatant collection. Buffer (500 mM. NaCl and 1 mM EDTA) and 5 uL of Wash step 1: The reminding fractions bound to the beads were SUPERasen (Life Technologies) overnight at 4°C. Eluted washed twice in 500 ul of mA-Binding Buffer (Tris-HCl 50 RNA was purified with SpinX columns (Corning), precipi mM, NaCl, 150 mM, NP-40 1%, EDTA 0.05%). Wash Step 2: tated, and reverse transcribed (RT) with RT oligos modified The RNA/beads complexes were washed once in 500 ul of from the iCLIP method ((Konig et al., 2010), sequences Low Salt Buffer (SSPE 0.25x, EDTA 0.001M, Tween-20 below). cDNAs size selected on a 6% PAGE and eluted in 400 0.05%, NaCl 37.5 mM). Wash Step 3: The RNA/beads com uL of Crush Soak Buffer at 50° C. overnight. Eluted cDNA plexes were washed once in 500 ul of High Salt Buffer (SSPE was purified with Spinx columns, precipitated, and circular 0.25x, EDTA, 0.001M, Tween-20 0.05%, NaCl 137.5 mM). ized using CircIligasell (Epicentre) for 2 hours at 60° C. in a Wash Step 4: The RNA/beads complexes were washed twice 20ul reaction. Circular cDNAs were purified with MiniElute in 500 ul of in TET (T.E.+0.05% Tween-20). Elution Step: columns and Buffer PNI (Qiagen) and eluted in 20 uL of EB The mA-RNA was eluted from the beads by repeating four Buffer. PCR amplification was performed in 50 uL reactions times the following: 125 ul of Elution Buffer (DTT 0.02M, with 25uL 2x Phusion High Fidelity Master Mix, 2.5uL of 10 NaCl 0.150M, Tris-HC1 pH7.5 0.05M, EDTA 0.001M, SDS uM P3/P5 PCR primers (Ule, NSMB 2009/2010), and 22.5 0.10%) was added to the beads and incubated at 42°C. for 5 uL of circularized cDNA. Samples required between 15-25 minutes. At the end of the 5 minutes the beads were gently cycles of PCR. PCR reactions were purified using AMPure Vortexed and placed on the magnet. The liquid phase was XP beads (Beckman) and final library DNA was eluted in 20 collected and transferred to a fresh tube as this will represent uL of water. Quantification was performed by BioAnalyzer the eluate fraction containing the mA "enriched RNA'. An analysis of the DNA, which was then sent for deep sequenc additional 125ulofelution buffer was then added to the beads ing on an Illumina HiSeq2500 machine (Elim Biopharm, and the processed was repeated. The liquid phase obtained at Hayward, Calif.). each step was added to the “fresh tube' containing the 125ul of eluate from the previous step so the total final eluate vol 0343 Oligo and Adapater Sequences: ume was 500 ul. (0344 preA L3/SrApp/AGA TCG GAA GAG CGGTTC (0348 All RNA fractions were extracted as follow. 500 ul AG (SEQID NO: 661)/3ddC/; P5AATGATACGGCG ACC of acid phenol-chloroform (acid-phenol:chloroform, pH 4.5 ACC GAG ATC TAC ACT CTTTCC CTA CAC GAC GCT (with IAA, 125:24:1) Ambion) were added to the 500 ul CTTCCG ATCT (SEQID NO: 662); P3 CAA GCA GAA sample. The sample was centrifuged at 4°C. at 10,000 g for GAC GGCATA CGAGAT CGG TCT CGG CATTCCTGC 7.5 minutes. The upper phase was carefully collected making TGAACC GCTCTTCCG ATCT (SEQID NO: 663); RTo Sure not to touch the inter-phase and transfer to a clean 1.5 ml ligol (Barcode) /5phos/NNN NNA ACC NNN NAG ATC tube. 500 ml of chloroform was added to the fresh tube US 2016/0264934 A1 Sep. 15, 2016 62 vortexed briefy and centrifuged at 4°C. at 10,000 g for 7.5 culated using the formula: Amount of target=2^^ (Livak minutes. The upper phase was transferred to a fresh 1.5 ml and Schmittgen, 2001). The qPCR using Taqman reagents tube and NaCl ethanol precipitated overnight at -20°C. in was done in a 10 ul volume made of 5 ul of Universal PCR presence 1 ul of (20 mg/ml) Ultra Pure Glycogen. The fol Master Mix (Applied Bosystems Cat.#4304437), 0.5 ul of lowing day the sample was centrifuged at 4°C. for 20 minutes TaqMan probe mix (each), 2 ul of cDNA template at 50 ng/ul at 16,000 g. The pellet was then washed in 70% ethanol and 2.5ul of HO. The PCR was carried on using a standard centrifuged and additional 10 minutes at 4°C. at 16,000 g. protocol with melting curve. The amount of target were cal The pellet was then let to dry at room temperature for 10 culated as above. The TaqMan probes were purchased from minutes prior to be re-suspended in the desired volume of Applied Biosystems: 18s (AB Hs99999901 s1), FOXA2 Ultra-Pure HO (Invitrogen Catil 10977-015). (AB Hs00232764 m1), SOX17 (AB Hs 00751752 s1), (0349 Library Construction: NANOG (AB Hs 02387400 g1), and SOX2 (AB 0350 100 ng (100 ng of input and 100 ng of post mA-IP 010533049 s1). positive fraction) were used for library construction and 0355 RNA Stability Assay RNAsequsing TrueSeq Stranded mRNA Sample Preparation 0356) Wildtype and Mettl3 KO cells were treated with 0.8 Guide, entering the protocol by adding the Fragment, Prime, uMFlavopiridol for 3 hours. RNA extraction and qRT PCR Finish Mix, skipping the elution step and proceeding imme as described above. diately to the synthesis of the First Strand cDNA. From that 0357 shRNAs Targeting shRNAs point on, the exact steps of the Illumina TruSeq Stranded 0358 Short Hairpin RNAs targeting the mouse Mettl3 mRNA sample Preparation Guide were followed to the end. sequences GCACACTGATGAATCTTTA (SEQ ID NO: RNA Sequencing. Each individual library fragment size was 658) and GCACTTCCTTACAAAGCT (SEQ ID NO: 659) verified on Agilent Bioanalyzer 2100 with High Sensitivity were generated in the pSicoR plasmid backbone (Addgene chip. Final quantification was done by qPCR on PerkinElmer 12084, (Ventura et al., 2004)). The plasmid pSicoR shluc 2500Fast with Kapa library quantification kit (#KK4824). (Addgene 14782, (Konig et al., 2010) was used as a negative Libraries were pooled at equimolar concentrations according control. The plasmids were co-transfected into 293T cells to the manufacturer guidelines (TruSeq Stranded mRNA with pMd2G and psPAX2 with Fugene HD (Promega, Sample Preparation Guide September 2012). After cluster E2311) according to manufacturers instructions. Virus ing on Illumina cBot, samples were run on Illumina HiSeq where collected after 48 hours. The collected media was 2OOO. filtered through a 0.45 um membrane and the virus concen 0351. For méAIP-RT-qPCR, and méAIP-Nanostring, trated with Lenti-X concentrator (Clontech; 631231). J-1 experiment were performed as described above (protocol 1), mESC cells were infected in the presence of 2 ug per ml except2 ug of fragmented RNA, and 1 lug of antibody were polybrene. After 24 hours, cells where selected with puromy used. Rabbit IgG was used as a non-specific antibody control cin. After selection, cells where replated at low density and for immunoprecipitation in parallel to the anti-méA poly single clones where collected. Real time PCR was used to clonal antibody (Synaptic Systems). choose determine efficiency of the Knock Down. 0352 Real Time PCR 0359. The shRNA hairpins targeting human Mettl3 were 0353 For the mouse experiments, RNA was analyzed on a purchased from DF/HCC DNA Resource Core. Multiple sh LightCycler 480 by RT-qPCR with One-Step RT-PCR Master clones were purchased against METTL3 (HsSH00253093, Mix SYBR Green (Stratagene). For gene expression experi HsSH00253439, HsSHO0253446, HsSH00253487, ments, each PCR reaction was performed in 12 Jul with 45 ng HsSH00253494). After testing of their individual knockdown of total RNA, 0.8 ul of RT block/enzyme mixture, 1.2 ul efficiency both by qRT-PCR and anti-METTL3 western blot primers at 1.25 uMeach and 6 ul of MasterMix (final volume in 293T, we identified number HssH00253093 (insert 12 ul). The PCR was carried on using a standard protocol with Sequence: CCGGGCTGC ACTTCA GAC GAATTATCT melting curve. The amount of target were calculated using the CGAGAT AATTCG TCT GAA GTG CAG CTTTTT (SEQ formula: Amount of target=2^' (Livak and Schmittgen, ID NO: 660); Target Sequence: GCTGCACTTCAGAC 2001). Two tailed T test for unequal, unpaired data sets with GAATTAT: SEQID NO:3) as giving optimal knockdown and heteroscedastic variation was used to compare samples. this was used to generate H1-ESCs knockdown cell lines. The Primer sequences available upon request. scrambled shRNA control plKO-Scramble (Catil 0354 For human experiments, a first mixed made of 10 pg Ev000438085) was also obtained from the DF/HCC DNA to 5 lug of RNA in 5 ul volume, 411 of random hexamers Resource Core. (Roche), 1 ul of dNTPmix (10 mM each) and 5ul of ultrapure 0360 CRISPR-Mediated Mettl3 Knockout HO was first generated, heated at 65° C. for 5 minutes and 0361 gRNA sequences where chosen and designed a immediately put on ice. 4 ul of 5x First Strand Buffer was CRISPR design tool (Hsu et al., 2013). Plasmids for guide added along with 1 ul of 0.1M DTT, 1 ul RNAse inhibitor and RNA were co-nucleofected (Lonza; VPH-1001), with a 1 ul of Superscript III reverse transcriptase (Invitrogen). The human codon optimized Cas9 expression plasmid and a plas 20 ul reverse transcription reaction was then incubated 5 mid with a puromycine resistance cassette. Cells were plated minutes at room temperature, then 60 minutes at 50° C. then at low density for single colony isolation and selected single 15 minutes at 70° C. The freshly synthesized cDNA was colonies tested by western blot for loss of protein. More treated with 1 ul of RNAse H at 37° C. for 20 minutes. For specifically, RNA sequences where chosen and designed Sybergreen quantitative real time PCR assays, each PCR from CRISPR design tool (Hsu et al., 2013). DNA blocks reaction was done in a 20 Jul volume made of 10 ul of master containing all of the components necessary for gRNA expres mix (SYBR GreenER qPCR SuperMix for iCycler-Invitro sion (Mali et al., 2013) were synthesized by IDT and cloned gen), 5ul of primer mix at 1.2 LM (each) and 5ul of cDNA in Topo-Blunt plasmid (Invitrogen). Plasmids for guide RNA template at 20 ng/ul. The PCR was carried on using a standard were co-nucleofected (Lonza; VPH-1001), according to protocol with melting curve. The amount of target were cal manufacturers instructions, with a human codon optimized US 2016/0264934 A1 Sep. 15, 2016

Cas9 expression plasmid and a plasmid with a puromicine were exposed on a phosphor Screen and Scanned on a GE resistance cassette. Cells were plated at low density for single typhoon TRIO at the Stanford Functional Genomics Facility. colony isolation. The remaining cells were cultured for Sur 0368 mA Level Dot-Blots veyor assay. After 24 hours, cells were selected with puromi 0369 Amersham Hybond-XL (Cat.# RPN303s) mem cine for 48 hours. DNA extraction and surveyor assay as brane was rehydrated in HO for 3 minutes. The membrane described in (Cong et al., 2013). Single colonies where was then “sandwiched in Bio-Dot Microfiltration Apparatus selected and tested by western blot for loss of Protein. DNA (BioRad, cat. #170-6545). Each well was then filled with HO sequencing of the targeted locus was used to confirm presence and flushed by gentle Suction vacuum until it appeared dry. 5 of mutations that abrogate protein production. ul of HO alone was then applied to the membrane in each 0362 Annexin V Analysis well followed by addition of indicated amount of RNA and 0363 Cells were labeled with Live/Dead Fixable Aqua this was allowed to bind to the membrane by gravity. The (Life Technologies) and fluorochrome conjugated Annexin V. apparatus was disassembled and the membrane was cross Samples were analyzed on a special order FACS Aria II (BD linked in a UV STRATALINKER 1800 using the automatic Biosciences). More specifically, one million cells were col function and then the membrane was placed back into the lected and washed twice with PBS. The cells were incubated apparatus. The membrane was then blocked 10 minutes using with 1 ul of Live/Dead Fixable Aqua (Life Technologies) for sterile RNAse DNase free TBST+5% milk. The mA primary 30 minutes, protected from light. The cells were then washed antibody (Anti-mA, Synaptic Systems, Cat. #202 003) was twice with FACS buffer and re-suspended in 1x Binding then added at a concentration of 1:500 at room temperature buffer followed by an incubation with 5 ul of fluorochrome for 1 hour in TBST+5% milk. The membrane was then conjugated Annexin V for 15 min. The cells were washed washed four times in PBST. The membrane was then incu once with FACS buffer and resuspended in 500 ul of Binding bated with the secondary anti rabbit antibody (1:5000 dilu buffer. Samples were analyzed on a special order FACS Aria tion) for 30 minutes in TBST+5% milk. The membrane was II (BD Biosciences). washed 4 times 5 minutes in TBST and expose on an auto 0364 Western Blot radiographic film using Pierce ECL Western Blotting Sub 0365 Cell extracts where resolved on a NuPAGE 4-12% Strate. Bis-Tris Mini Gel and transferred to Immobilon-FL mem 0370 Mass Spectrometric Quantification of méA brane. Images were collected on a Licor Odyssey imaging 0371. Enzymatic hydrolysis of RNA to ribonucleosides system. More specifically, cells were collected and lysed in was carried out as described previously, (Taghizadeh et al., RIPA buffer (400 mM. NaCl, 1% gepal, 0.5% Sodium 2008) with modifications. Following addition of 100 nM Deoxycholate, 0.1% SDS and 10 mM Tris-Cl pH 8.0) for 30 'N-ethenocytidine and 10 uMN-guanosine as internal min on ice. The lysate was centrifuged for 10 minute and the standards for mA and adenosine respectively (due to similar supernatant collected. Protein was quantified with BCA Pro masses and retention times), RNA (200 ng) was digested with tein Assay Kit (Pierce). Proteins where resolved on a 2 Unuclease P1 (Sigma Aldrich, St. Louis, Mo.) at 37°C. for NuPAGE 4-12% Bis-Tris Midi Gel and transferred to Immo 3 h in 55 ul in buffer containing 16 mM sodium acetate (pH bilon-FL membrane. Primary antibodies used are: (Rabbit 6.8), 1.8 mM zinc chloride, 9 g/mL coformycin, 45 lug/mL anti-METTL3/MT-A70, Bethyl A301-568; Mouse anti-beta tetrahydrouridine, 2.3 mM desferroxamine, 0.45 mMbuty actin, mabcam 8224 and Rabbit anti-PARP, Cell Signaling, lated hydroxytoluene, followed by addition of 45ul of 27 mM 9542). Secondary antibodies used: IRDye 680RD Goat anti of sodium acetate (pH 7.8), 17 U calf thymus alkaline phos Mouse IgG (H+L) (Licor) and IRDye 800CW Goat anti phatase (New England Biolabs, Ipswich, Mass.) and 0.1 U Rabbit IgG (H+L) (Licor). Images where collected on a Licor Snake Venom phosphodiesterase (Sigma Aldrich) with incu Odyssey imaging system. bation overnight at 37° C. The digestion mixture was later 0366) Determination of mA Levels deproteinized by centrifugal filtration (Nanosep 10K; Pall 0367 2D-TLC was performed as described by (Jia et al., Corporation, Port Washington, N.Y.), and 10ul of the mixture 2011). For dot-blots, the indicated amounts of RNA were was analyzed by a liquid chromatography-coupled triple qua applied to the membrane and cross-linked by UV. The mA drupole mass spectrometry (LC-QQQ). HPLC was per primary antibody was then added to the blocked membrane at formed on an Agilent series 1200 instrument (Agilent Tech a concentration of 1:500. The membrane was incubated with nologies, Santa Clara, Calif.) consisting of a binary pump, a the secondary antibody and exposed to an auto-radiographic Solvent degasser, a thermostatted column compartment and film. mA RNA mass-spectrometry was performed as an autosampler. The nucleosides were resolved on a Dionex described in the Extended Experimental Procedures. More Acclaim Polar Advantage C16 column (3 um particles, 120 A specifically, 2D-TLC was performed as described by (Jia et pores, 2.1 x 150 mm; 30°C.) at 300 uL/min using a solvent al., 2011). 100 to 200 ng of polyA+ RNA, selected for two system consisting of 0.1% acetic acid in HO (A) and 0.1% rounds, was digested with 2000 units of RNAse T1 (Ambion) acetic acid in acetonitrile (B), with the elution performed in a final volume of 25ul, with 1xPNK buffer and incubated isocratically at 0% B for 29 min, followed by a column at 37° C. for 1 hour. The RNA was labeled with 10 units of washing at 70% B and column equilibration. Mass spectrom PNK (NEB) and 1 ul T-32PATP (6000 Ci/mmol; Perkin etry detection was achieved using an Agilent 6410 QQQ mass Elmer). The reaction was cleaned with a G25 column and spectrometer in positive electrospray ionization mode with precipitated with Standard Ethanol precipitation. The RNA the following parameters: ESI capillary voltage, 3000 V: gas was re-suspended in 10 ul of 50 mM sodium acetate (pH 5.5) temperature, 340°C.; drying gas flow, 10 L/min: nebulizer and digested with 1 Unit of nuclease P1 (USBiological; pressure, 20 psi; fragmentor voltage, 150 V. The nucleosides N7000). 1 ul was loaded on a Cellulose TLC glass plate were quantified using the nucleoside->base ion mass transi (EMD chemicals; 571 6-7). The first dimension was resolved tions of 282.1->150.1 (mA), and 268.1->136.1 (A). Abso in isobutyric acid:0.5 M NH4OH (5:3, v/v) and the second lute quantities of mA and A were determined from calibra dimension resolved in isopropanol:HCl: water. The plates tion curves prepared daily. US 2016/0264934 A1 Sep. 15, 2016 64

0372 Microarray Data Acquisition and Data Analysis. 0379 m6A Peak Calling and Intensity Calling and Analy sis 0373) RNA was extracted as described above and submit 0380 Search for enriched peaks was performed by scan ted for Hybridization on GeneChip Mouse Exon 1.0ST Array ning each gene using 100-nucleotide sliding windows, and at the Protein and Nucleic Acid Facility of the Stanford calculate an enrichment score for each sliding window (Do School of Medicine. For gene expression analysis, arrays minissini et al., 2012). Windows with RPKM-5 in the eluate, were RMA normalized using justRMA package in R. After enrichment score-2 in genes with RPKM in the input normalization, probes with average expression of all arrays sample> 1 were defined as enriched in mGA pull down. less than 100 were filtered out as not expressed probes. For Enriched windows with score greater than neighboring win each expressed probe, its expressions were log2ed, and the dows where selected as méA peaks. To determine “high confidence', we first intersected the peaks in biological rep gene expression was defined as the average expression of all licates, requiring at least 0.5 overlap using the BedTools the expressed probes that attached to this gene. Student T-test package (Quinlan and Hall, 2010). Peaks that did not intersect comparing wide-type Versus knockout signals in the arrays where merged, and peaks that merged end to end where also were used to calculate the significance of the expression kept for downstream analysis. The peaks where re-defined as changes, and false discovery rate (FDR) was estimated using 100 nt windows centered at the middle of the intersected/ p.adjust package in R. Differential expression was defined merged peaks. For Human méA peak detection, eluate win using the following filters: significance analysis of microar dow RPKM-10 instead of 5 were used. Common peaks were rays 3.0 (Tusher et al., 2001) with a false discovery rate less determined in the same way as described in mouse. For each than 5%, an average fold changes2 in any group, and an time point, the common peaks of the two replicates were average raw expression intensity's 100 in any group. referred to as “high-confidence' peaks. 0381 To study the peak distributions on transcripts, the 0374 mA Methylation IP RNA-Sequencing Analysis inventors assigned each "high-confidence peak (using 0375 Libraries generated with iCLIP adaptors where middle point) to the collapsed transcript (mouse) or to the separated by barcode, and perfectly matching reads were longest isoform of each Ensembl gene. 100 bins of equal collapsed. Sequencing reads were mapped using Top Hat length were made for 5' UTR, CDS and 3'UTR respectively and the average number of peaks for each bin was calculated. (Trapnell et al., 2009). A non-redundant mm 9 transcriptome The peak intensity was calculated as the ratio of window was assembled from UCSC RefSeq genes, UCSC genes, and RPKM between eluate and input for each peak. To compare predictions from (Ulitsky et al., 2011) and (Guttman et al., the peak intensities between two samples, we used sample 2011). For human datasets, the Ensembl genes (release 64) specific peaks as well as common peaks and required input was used. Search for enriched peaks was performed by scan window RPKM-20 to obtain reliable peak intensity values. ning each gene using 100-nucleotide sliding windows, and 0382 More specifically, the inventors searched for mGA calculating an enrichment score for each sliding window peaks by scanning each gene using 100-nucleotide sliding (Dominissini et al., 2012). HOMER software package (Heinz windows, and calculate an enrichment score for each sliding et al., 2010) was used for de novo discovery of the methyla window (Dominissini et al., 2012). Windows with RPKM-5 tion motif. More specifically, libraries generated with iCLIP and RPKM-10 for mouse and human respectively were used. adaptors (mouse, protocol 2) where separated by barcode, A enrichment score-2 in genes with RPKM in the input and perfectly matching reads were collapsed and barcodes sample> 1 were defined as enriched in mGA pull down. removed. For all libraries, single-end RNA-Seq reads were Enriched windows with score greater than neighboring win mapped to the mouse (mm.9 assembly) of human genome dows where selected as méA peaks. To determine “high con (hg19 assembly) using TopHat (version 1.1.3) (Trapnell et al., fidence', we first intersected the peaks in biological repli 2009). Only uniquely mapped reads were subjected to down cates, requiring at least 0.5 overlap using the BedTools stream analyses. package (Quinlan and Hall, 2010). Peaks that did not intersect 0376. The mouse RNA-seq reads, recorded in BAM/SAM where merged, and peaks that merged end to end where also format were transformed to bedCraph format, indicating the kept for downstream analysis. The peaks where re-defined as number of reads on each genomic position. A non-redundant 100 nt windows centered at the middle of the intersected/ mm 9 transcriptome was assembled from UCSC RefSeq, merged peaks. For each time point, the common peaks of the genes, UCSC genes, and predictions from (Ulitsky et al., two replicates were referred to as “high-confidence' peaks. 2011) and (Guttman et al., 2011). Gene expression in the form The peak intensity was calculated as the ratio of window of RPKM was calculated using a self-developed script. RPKM between eluate and input for each peak. To compare the peak intensities between two samples, the inventors used 0377 For human RNA-seq reads, FPKMs of Ensembl sample specific peaks as well as common peaks and required genes (release 64) were calculated using Cufflinks (version input window RPKM-20 to obtain reliable peak intensity 2.0.2) (Trapnell et al., 2010) and differentially expressed values. genes between input RNAs of TO and T48 were determined 0383 Comparing Mouse and Human Peaks. by Cuffdiff (version v2.0.2) (Trapnell et al., 2013). 0384 The inventors common peaks of 3 mESC samples 0378. To make UCSC read coverage tracks, the read cov and common peaks of 2 hESC samples for mouse and human erage at each single nucleotide was normalized to library size ESC méA comparison. To compare the methylated genes for input and eluate (mA RIP) respectively. For human between mESC and hESC at gene level, only Ensembl genes samples, we normalized the read densities by adjusting the with the annotated one to one ortholog between human and library sizes (total uniquely mapped reads) to be the same mouse were considered in the comparison, and the genes (average total uniquely mapped reads of initial sequencing must have gene expression value (RPKM or FPKM) greater runs of 4 samples) for input and eluate (mA RIP) respec than 1 in all samples of both hESC and mESC. To compare the tively. The average normalized read densities of replicates A m6A peak intensities between human and mouse ESCs, the and B were shown in the Figures. inventors aligned all the mESC peaks to human genome US 2016/0264934 A1 Sep. 15, 2016

based on the UCSC pairwise genome alignment (http://hg score was calculated as the maximum window scores of all download.Soe.ucsc.edu/), the orthologous mouse-human windows of each gene including unmethylated genes, the regions of merged peaks (at least 1 bp overlap) and species windows with input window RPKM-1 were removed from specific peaks were used for the comparison. For merged the calculation. peaks, the inventors took the center 100 bp regions and only 0392 Gene Set Enrichment Analysis used those had window. 0393 Genes were ranked by their enrichment score, and 0385) A gene's enrichment score was defined as the maxi equally divided into 10 groups. For each group, a multi mum enriched window in this gene. HOMER software pack dimensional gene set enrichment analysis over DAVID Gene age (Heinz et al., 2010) was used for de novo discovery of the Ontology terms and stem cell gene sets methylation motif, using the high confidence peaks. Random 0394 (Wong et al., 2008) was performed using Genomica windows for control where obtained using the BedTools (Segal et al., 2005: Segal et al., 2004; Segal et al., 2003). A package (Quinlan and Hall, 2010). P-value of <0.01 from hyper geometric test between a gene 0386 GO () analyses for methylated genes group and gene set was defined as significant. were conducted using DAVID (Huangda et al., 2009) with 0395. Determination of Differentially Methylated Peaks genes with RPKM-1 (mouse) or FPKM-1 (human) as back 0396 To determine effects of Mettl3 loss of function on ground. m6A peaks, we calculated the peak intensity for the high 0387 Fingerprinting méA During Endoderm Differentia confidence peaks identified in wild type cells. Peaks with tion (Similar Strategy for any Comparison in Same Organism significant changes in peak intensity (p.values 0.05) where would Apply) considered for further analysis. To determine the effect of 0388 To determine the amount of dynamic regulation or differentiation in hESC, the union of mA peaks of T0 and extent of differential móA peaks during differentiation in T48 (initial sequencing run, with comparable sequencing hESC, the m6A peaks of undifferentiated ESCs (TO) and after depth for both time points) were analyzed to determine the 48 hours of differentiation (T48) that that meet the following differentially methylated peaks between TO and T48 that criteria between TO and T48 were identified: 1) Input gene meet the following criteria: 1) Input gene FPKM-1 in all 4 FPKM-1 in all 4 samples; 2) Input window RPKM10 in all samples; 2) Input window RPKM-10 in all 4 samples; 3) At 4 samples; 3) At least 1.5 fold (or 2 fold) change of peak least 1.5 fold (or 2 fold) change of peak intensities in both intensities in both replicates in the same direction; 4) The replicates in the same direction; 4) The maximum peak inten maximum peak intensity of all sampless2; 5) In each repli sity of all sampless2; 5) In each replicate, the sample with cate, the sample with higher peak intensity must be called as higher peak intensity must be called as having peak. To deter having peak. To determine the union of méA peaks of TO and mine the union of méA peaks of TO and T48, we pooled all the T48, the inventors pooled all the peaks of the samples and peaks of 4 samples and merged the same peaks and peaks with merged the same peaks and peaks with 50 bp overlapped, the 50 bp overlapped, the unmerged peaks were then merged if unmerged peaks were then merged if they were end-to-end they were end-to-end peaks spanning 200 bp. We took the peaks spanning 200 bp. The inventors took the center 100 bp center 100 bp of merged peaks as union peaks if they meet the of merged peaks as union peaks if they meet the following following criteria in either TO or T48: 1) both replicates had criteria in either TO or T48: 1) both replicates had the peaks; the peaks; 2) The center 100 bp had window score-2 in both 2) The center 100 bp had window score>2 in both replicates. replicates. Subsequently a heatmap and clustering analysis was per 0397 Heatmap and Clustering Analysis formed. The heatmaps of all samples were made based on Z 0398. Heatmaps of all 4 samples were made based on Z score scaled log 2 values for peak intensities. For peak inten score scaled log 2 values for peak intensities or gene expres sity analysis, the peaks and samples were clustered using sion levels (FPKMs) respectively. For analysis of the differ 1-Pearson correlation coefficient of log2(peak intensity) as entially expressed genes, the genes and samples were clus the distance metric. tered by average linkage hierarchical clustering using 0389 Dataset Comparison 1-Pearson correlation coefficient of log2(FPKM) as the dis 0390 Mouse Pol II occupancy data, mRNA half life and tance metric. For peak intensity analysis, the peaks and Protein translation efficiency were obtained from (Ingolia et samples were clustered in the same way using 1-Pearson al., 2011; Rahl et al., 2010; Sharova et al., 2009) Plotting and correlation coefficient of log2(peak intensity) as the distance statistical tests were performed in R. Multi-dimensional gene metric. set enrichment analysis over DAVID Gene Ontology terms 0399 Analysis of mGA Sites in Non-Coding RNAs and stem cell gene sets (Wong et al., 2008) were performed 0400. The longest isoforms of Ensembl genes were used to using Genomica (Segal et al., 2005; Segal et al., 2004; Segal study the distribution of méA peaks on coding and noncoding et al., 2003). A P-value of <0.01 from a hyper geometric test transcripts. Noncoding transcripts overlapping with any iso between a gene group and gene set was defined as significant. forms of coding genes were removed, and transcripts with 0391 More specifically, Pol II occupancy, obtained from less than 3 exons were also removed. The analysis used the (Rahl et al., 2010), at transcriptional start sites was deter peaks found wild type mESC cells or the union of H1 TO (all mined using an in-house developed script based on annota data), H1 T48, 293T, HepG2 (including stimulated samples) tions downloaded from the UCSC table browser. Mouse and human brain (Dominissini et al., 2012; Meyer et al., mRNA half life and Protein translation efficiency was 2012). To study the m6A peak distributions on transcripts, in extracted from (Ingolia et al., 2011; Sharova et al., 2009) for each transcript we made 10 bins of equal length for the first genes with RPKMD=1 in the input. Plotting and statistical test exon, internal exons and the last exon respectively, and the performed in R. For genes with multiple Half life values percentage of peaks in each bin was calculated for coding and reported, the average value was used. We obtained human noncoding transcripts. Additionally, the peak coverage mRNA half-life of induced pluripotent stem (IPS) cells from around the last exon-exon splice junction was also analyzed published thesis (Neff et al., 2012). The m6A enrichment for coding and noncoding transcripts. The peaks used in this US 2016/0264934 A1 Sep. 15, 2016 66 analysis included the wild type mESC or H1 TO (all data), H1 each gene by dividing the density of the promoter proximal T48, 293T, HepG2 (including stimulated samples) and region by the density of the gene body region. human brain (Dominissini et al., 2012; Meyer et al., 2012). (0409 Analysis of the Relationship Between mA and The peak coverage (number of peaks covering the site) nor RNA Polymerase II Travelling Ratio malized by the total number of overlapped peaks was calcu 0410. To compare the mA peak intensity and RNA poly lated for the 750 bp regions flanking the last splice junction. merase II travelling ratio, the m6A enrichment score was Therefore, the transcripts with less than 750 bp on either side calculated as the maximum window scores of all windows of were also removed from the analysis. each gene including unmethylated genes, the windows with 04.01 Exon Length Analysis input window RPKM-1 were removed from the calculation. 0402 Middle points of all high-confidence peaks in the 0411 Teratoma Generation and Histopathology two time points were assigned to exons of the longest iso 0412 Mettl3 wild type and mutant cells (2.5x106) were forms of Ensembl coding genes. Only internal exons were subcutaneously injected into 8-week-old female SCID/Beige used in the Subsequent analysis. Exon length and number of mice (Charles River). In the fourth week after injection, the m6A motifs were used to normalize the number of peaks in mice were euthanized and the tumors were harvested, each exon. Error bar indicates variations estimated via 1000 weighed, measured and processed for histological analysis. times of bootstrapping for each bin of exon length. All animal studies were approved by Stanford University Single Exon Gene Analysis IACUC guidelines. For histological analysis, slides were 0403 stained with hematoxylin and eosin (H&E); or stained by 04.04 Ensembl genes without any multi-exon isoforms immunohistochemistry (IHC) with VECTASTAIN ABC Kit were considered as single exon genes. The peak distribution (PK-4000, Vector laboratories) and DAB Peroxidase Sub of the longest isoform of single exon protein-coding genes strate Kit (SK-4 100, Vector laboratories) following the manu was analyzed in the same way as for multi-exon protein facturers instructions. Analyses were performed by a coding genes, except that 10 bins were made for each 5' UTR, boarded veterinarypathologist (DMB). CDS and 3'UTR. 0413 Mettl3 wild type and mutant cells were trypsinized 04.05 Comparison of méA Peaks Between Mouse and and 2.5x106 cells were subcutaneously injected into Human ESCs 8-week-old female SCID/Beige mice (Charles River). Ter 0406 We used common peaks of 3 mESC and common atoma progression was monitored by Volume measurement peaks of 2 hESC for mouse and human ESC méA compari every other day after a visible tumor mass formed. In the son. To compare the methylated genes between mESC and fourth week after injection, the mice were euthanized and the hESC at gene level, only Ensembl genes with the annotated tumors were harvested, weighed, measured and then were one to one ortholog between human and mouse were consid processed for histological analysis. All the animal studies ered in the comparison, and the genes must have gene expres were approved by Stanford University IACUC guidelines. sion value (RPKM or FPKM) greater than 1 in all samples of 0414 For histological analysis, teratomas were fixed with both hESC and mESC. To compare the m6A peak intensities 4% paraformaldehyde, processed for routine histopathology, between human and mouse ESCs, we aligned all the mESC embedded in paraffin and 4 micron sections were stained with peaks to human genome based on the UCSC pairwise genome hematoxylin and eosin (H&E); or stained by immunohis alignment (http://hgdownload.Soe.ucsc.edu/), the ortholo tochemistry (IHC) with VECTASTAIN ABC Kit (PK-4000, gous mouse-human regions of merged peaks (at least 1 bp Vector laboratories) and DAB Peroxidase Substrate Kit (SK overlap) and species specific peaks were used for the com 4100, Vector laboratories) following the manufacturers parison. For merged peaks, we took the center 100 bp regions instructions. Antibodies used for IHC were: anti-Nanog and only used those had window scorese2 in all samples of (1:500; A300-397A, Bethyl) and anti-Ki67 (1:100: both species. Only Ensembl genes with the annotated one to RM-9106, Thermo). Tumors were evaluated and images one orthologs between human and mouse were considered. where captured using a Zeiss Axioskop 2 microscope with a To obtain reliable peak intensity values, we required gene DS-Ri1 camera and NIS-Elements D image software. RPKM or FPKM-1 and input window RPKM-5 in all 0415) Antibodies Used in this Study. samples of both species. 0416 Rabbit polyclonal anti-mA (Synaptic Systems, 202 04.07 GRO-Seq Analyses and RNA Polymerase II Trav 003); Rabbit polyclonal anti-METTL3 (Proteintech, 15073 eling Ratio Calculation 1-AP); Rabbit polyclonal anti-METTL3 (Bethyl, A301-568); 0408 GRO-seq data for hESCs (replicate 1-3) and GRO Rabbit pre-immune serum (Sigma, R9133); Mouse mono seq data for 48 hours of endodermal differentiation (replicate clonal anti-beta actin (mAbcam, 8224); Rabbit polyclonal 1) (Sigova et al., 2013) (GSE 41009) were analyzed. FASTQ anti-PARP (Cell Signaling, 9542); Rabbit polyclonal anti files were mapped to hg19 using Bowtie2 with the parameters Nanog (Bethyl, A300-397A); Rabbit polyclonal anti-Nanog -k2-L24-N1—local. Calculation of the traveling ratio was (ReproCell); Mouse monoclonal anti-Oct-3/4 (Santa cruz, adapted from (Rahl et al., 2010). Briefly, each gene was sc-5279); Mouse monoclonal anti-Tuj1 (MMS-435P); divided into the proximal promoter and gene body. The proxi mMF20 (Developmental studies Hybridoma bank); Rabbit mal promoter was defined as the region from 30 bp upstream monoclonal anti-Ki67 (Thermo, RM-9106); Donkey anti to 300 bp downstream of the transcription start site. The gene Rabbit antibody (Amersham, NA934); Goat anti-Mouse IgG body was defined as 300 bp downstream of the TSS to the end (H+L) IRDye 680RD (Licor); Goat anti-Rabbit IgG (H+L) of the annotated gene. The number of GRO-seq reads that IRDye 800CW (Licor); Goat anti-mouse Alexa-488; Goat mapped to the promoter proximal region and gene body was anti-Rabbit Alexa-555; Donkey anti-mouse Alexa-555; Don determined for each gene in each experimental condition. The key anti-rabbit Alexa-488. total number of reads mapped to each region was divided by 0417 mA Antibody Titration the length of the region to determine the read density. The 0418 We generated an mA antibody titration curve to RNA polymerase II traveling ratio (TR) was calculated for identify the point of saturation of the anti-mA antibody in the US 2016/0264934 A1 Sep. 15, 2016 67 context of performing mA RIPs (FIG. S1). To do so, we 0423 mA in mRNAs of mESC Core Pluripotency Fac utilized an in vitrogenerated transcript from a plasmid con tOrs taining full length GAPDH transcript. The plasmid was first 0424 The inventors herein discovered that mRNAs linearized by restriction digest using SalI just downstream of encoding the core pluripotency regulators in mESCs are the GAPDH clNA cloning site. The linearized plasmid was modified with mA. Nanog, Klf4, and Myc mRNAs all gel purified and in vitro T7 mediated transcription was per showed regions of mA enrichment, whereas Pou5f1 (also formed using the Ambion MEGAScript Kit (AM1334) as known as Oct4) lacked mA modification (FIG. 1A). Further described in the user manual. The incorporation of mA to the more, the mA-seq results were confirmed with independent mA transcripts was done by adding TriLink N-Methylad mA IP-qRT-PCR. (FIG.9A). A medium throughput valida enosine-5'-Triphosphate (cathi N1013) at the indicated con tion assay was deployed using mA-IP followed by Nanos centration to unmodified ATP of the kit (ex a 2% mA tran tring nCounter analysis (mA-string), which again validated script was made by mixing 98% ATP with 2% mA mA enrichment of Nanog, Sox2, Myc mRNAs and select nucleotide) according to the manufacturer instructions. The mESC IncRNAs over the gene body of beta-actin (Table S2. anti-mA RIP was performed as described in the mA-seq as disclosed in Batista et al., Cell Stem Cell, 2014, 15(6), section, with the exception that intact full length GAPDH 707-719). These validation results suggest that the mA-seq transcript was utilized as input for the RIP step. data are accurate and robust. Extending downstream of the ESC master regulators, it was discovered that mA marks the Example 1 mRNAs of 9 of 14 second-tier regulators important for ESC 0419 N6-methyl-adenosine (m6A) is the most abundant self-renewal and repression of lineage-specific transcription covalent modification on messenger RNAS in Somatic cells (Young, 2011), including Myc, Lin28, Med1, Jarid2, and Eed and is linked to human diseases, but its functions in mamma (FIG. 1B). The mRNAs of eight out of twelve key regulatory lian development are poorly understood. Here, the inventors proteins recently reported to account for a majority of ESC demonstrate an evolutionary conservation and function of cell fate decisions are m6A modified (Dunn et al., 2014). m6A by mapping the m6A methylome in mouse and human Dividing the modified genes into five groups based on the embryonic stem cells (ESCs). Thousands of messenger and degree of modification revealed that the top group (corre long noncoding RNAS show conserved móA modification, sponding to the top 20% modified genes) was enriched for including transcripts encoding core pluripotency transcrip several functional groups, including: chordate embryonic tion factors Nanog and Sox2. méA was discovered to be development, embryonic development, gastrulation and cell enriched over 3' untranslated regions at defined sequence cycle (FIG. 1C). Thus, mA extensively marks mRNAs motifs, and marks unstable transcripts, including transcripts encoding the ESC core pluripotency network, many of which that need to be turned over upon differentiation. Genetic are dynamically controlled at the level of transcription during inactivation or depletion of mouse and human Mettl3, one of differentiation. the known moA methylases, led to móA erasure on select 0425 mA. Location and Motif in mESCs Suggest a Com target genes, prolonged Nanog expression upon differentia mon Mechanism Shared with Somatic Cells tion, and impaired ESC's exit from self-renewal towards dif 0426) Denovo motif analysis of mESC mA sites revealed ferentiation into several lineages in vitro and in vivo. Thus, a motif that recapitulates the previously described mA the inventors have discovered that mGA is a mark of transcrip sequence motif (FIG. 1D) (Canaani et al., 1979; Csepany et tome flexibility required for stem cells to differentiate to al., 1990; Dominissini et al., 2012: Harper et al., 1990; specific lineages. Horowitz et al., 1984; Meyer et al., 2012: Rana and Tuck, 0420) Thousands of mESC Transcripts Bear mA 1990; Rottman et al., 1994: Wei and Moss, 1977). The fre 0421) To understand the role of the mA RNA modifica quency of motif occurrence peaks near the center of experi tion in early development, the inventors mapped the locations mentally mapped mA sites. Control motif analysis on a of mA modification across the transcriptome of mouse random group of windows of the same size, extracted from (mESC) and human (hESC) embryonic stem cells. Polyade genes with comparable level of expression, failed to identify nylated RNA was subjected to fragmentation, and mA-bear the methylation motif, demonstrating specificity (FIG. 9B). ing fragments were enriched by immunoprecipitation with an mA sites in mESC are significantly enriched near the stop mA-specific antibody, followed by high throughput codon and beginning of the 3' UTR of protein coding genes sequencing (Methods). For each experiment, libraries were (FIGS. 1E and 1F), as previously described for somatic cells. built for multiple biological replicates and concordant peaks Although the largest fraction of mA sites was within the for each experiment were used for subsequent bioinformatic coding sequence (CDS, 35%), the stop codon neighborhood analyses. showed the strongest enrichment, as a 400 nt window around 0422. In mESCs, mA-seq revealed a total of 9754 peaks stop codons contained 33% of mA sites in the mESC tran in 5578 transcripts (-2 peaks per transcript) with RPKMD1. scriptome but represented just 12% of the motif occurrence. The majority of mA peaks are found in protein coding genes, In genes with only one modification site, the bias for modifi with 9588 mA peaks found in 5461 protein coding tran cation at the neighborhood of the STOP codon is even more Scripts (out of 9923 protein coding transcripts). Considering pronounced (FIG.1F). Comparison of transcript read cover the lower expression levels of linckNA as a class, it is likely age between input and wild type revealed no bias for read that the fraction of modified noncoding transcripts is under accumulation around the STOP codon in the input sample estimated. 166 mA peaks are found in 117 noncoding tran (FIG.9C). scripts (out of 485 long noncoding RNA transcripts) (Table 0427 Next, the relationship between exon length of the S1, as disclosed in Batista et al., Cell StemCell, 2014, 15(6), coding sequence (CDS) and mA modification of mRNAs 707-719). Thus, thousands of mESC transcripts, including was analysed, purposefully excluding the last exon, fre mRNAs andlincPNAS, are mA-modified (Dominissini et al., quently the longest exon in a coding gene, and often including 2012; Meyer et al., 2012). part of the CDS along with the stop codon and 3'-UTR. The US 2016/0264934 A1 Sep. 15, 2016 inventors discovered that methylated internal exons were sig fied and mA-marked RNAs is similar (FIG.9J). In contrast, nificantly longer than non-methylated control internal exons mA-marked transcripts had significantly shorter RNA half (median exon length of 737 bp vs 124 bp: P-2.2x10': life 2.5 hours shorter on average (p=<2.2, FIG. 14 and two-sided Wilcoxon test). The strong bias for mA modifica increased rate of mRNA decay (average decay rate of 9 min vs tion occurring in long internal exons remained even when the 5.4 min for mA vs. unmodified, p=<2.2). mA modified number of peaks per exon was normalized by exon length transcripts have slightly lower translational efficiency than (FIGS. 9D and 9E). Alternatively, this enrichment in long unmodified transcripts (1.32 vs. 1.51, respectively) (Ingolia et internal exons of mRNAs could be the result of higher prob ability of finding RRACU motif in longer sequence space. al., 2011) (FIG.9K). These results demonstrated that mA is Analysis of number of peaks per exon after normalizing by a chemical mark associated with transcript turnover. the number of motifs in Such exons revealed a strong enrich 0431 Mettl3 Knockout Decreases mA and Promotes ment of mA modification(s) in long exons, independent of ESC Self-Renewal the number of potential motifs (FIG.9F). These results dem onstrate the possibility that processing of long exons is 0432) To understand the role of mA methylation in ESC coupled mechanistically to mA targeting through as yet biology, the inventors inactivated Mettl3, which is one of the unclear systems and/or that mA modification itself may play components of the mA methylase complex. No genetic study a role in controlling long exon processing. The topological of Mettl3 has been performed in human stem cell populations enrichment of mA peaks surrounding stop codons in to rigorously define its requirement for mA modification, as mRNAs is a poorly understood aspect of themA methylation all previously reported studies have relied on knock down. system. Therefore, to understand if there was a topological Herein, the inventors targeted Mettl3 by CRISPR-mediated enrichment or constraint on mA modification in non-coding gene editing (see Methods section), and generated several RNAs (ncRNAs), which by definition have no stop codons, homozygous Mettl3 KO ESC lines. DNA sequencing con the inventors parsed both incRNAs and protein coding RNAs firmed homozygous stop codons that terminate translation with three or more exons into three normalized bins includ within the first 75 amino acids, and immunoblot analysis ing: the 1st exon, all internal exons and last exon. The inven confirmed the seabsence of Mettl3 protein (FIG. 2A, FIG. tors determined that there was an enrichment of mA near the 10A). Two dimensional thin layer chromatography (2D last exon-exon splice junction for both coding and incRNAS TLC) of single nucleotides digested from purified poly(A) (FIG. 1G), demonstrating that the enrichment of mA peaks RNA showed a significant (-60%) but incomplete reduction around the STOP codon is independent of the Stop codon of mA in Mettl3 KO ESC (FIG. 2B and FIG. 10B). Interest itself. Furthermore, the inventors also discovered mA ingly and contrary to a recent publication (Wang et al., enrichment in mRNAs and non-coding RNAs as the last splice junction is crossed (FIG.9G). Interestingly, the inven 2014b), the inventors suprizingly discovered that Mettl3 KO tors also identified increasing frequency of mA approaching reduced but did not prevent the stable accumulation of the 3' end of single-exon genes (FIG. 9H), consistent with Mettl14 (FIG. 10C). Thus, these experiments demonstrated high mA at the 3'end/last codon-3'UTR of multi-exonic that Mettl3 is a major, but not the sole, mA methylase in genes. mESC. 0428 Together, the location and sequence features identi 0433. Furthermore, in contrast to prior reports, the inven fied in mESCs demonstrate a mechanism for mA deposition tors demonstrated herein that Mettl3 KO ESCs are viable and that is similar if not identical in somatic cells. Thus, the Surprisingly demonstrated improved self-renewal. In fact, inventors have discovered that that the m6A methylome is Mettl3 KO in mESCs were unexpectedly viable and could be hardwired into transcripts based on their primary sequence, maintained indefinitely over months, and Mettl3 KO ESCs and is present in pluripotent cells that are a model of early exhibited low levels of apoptosis, similar to wildtype mESCs, embryonic life. as judged by PARP cleavage and Annexin V flow cytometry Example 2 (FIG. 2A, FIG. 10D). The inventors next assessed whether Mettl3 KO affected the ability of stem cells to remain pluri mA is a Mark for RNA Turnover potent. Mettl3 KOESC colonies were consistently larger than WT ESCs, and still retained the round and compact ESC 0429 Next, the inventors assessed if transcript levels are colony morphology with intense alkaline phosphatase stain correlated with the presence of mA modification. Compari ing comparable to wild type colonies as well as uniform son of mA enrichment level versus the absolute abundance expression of Nanog and Octá (FIG. 2C, 2D, 2E, FIG. 10E of RNAs revealed no correlation between level of enrichment and data not shown). Quantitative cell proliferation assay and gene expression (FIG. 1H). A separate, quartile based confirmed the increased proliferation rate of KO over WT analysis found a higher percentage of mA-modified tran ESCs (FIG.2F). These observations demonstrate that Mettl3 scripts in the middle quartiles of transcript abundance (FIG. KO enables enhanced ESC self-renewal. To rule out potential S1I). Thus, the methylome analysis demonstrates that mA off-target effects from CRISPR-mediated gene targeting, an modification is not simply a random modification that occurs orthogonal approach to knockdown Mettl3 in ESCs was used. on abundant cellular transcripts; rather, mA preferentially In particular, the inventors used two independent short hair marks transcripts expressed at a medium level. pin RNAs (shRNAs) knocked down Mettl3 to ~20% (FIG. 0430. To further define potential mechanisms of mA 10F). 2D-TLC showed a ~40% loss of mA in poly(A) RNAs function, the inventors assessed whether mA-marked tran (FIG. 10G), and apoptosis assays confirmed lack of cell death scripts differ from unmodified transcripts at the level of tran induction. Importantly, Mettl3 depletion also increased ESC Scription, RNA decay, or translation by leveraging published proliferation compared to control shRNA for one hairpin genome-wide datasets in mESCs (Methods). RNA poly (FIG. 10H). Thus, two independent approaches confirm that merase II occupancy at the promoter region of both unmodi Mettl3 inactivation enhanced self-renewal of ESCs. US 2016/0264934 A1 Sep. 15, 2016 69

0434 Mettl3 KO Blocks Directed Differentiation. In Vitro tified in wild type (FIG. 4A). The inventors detected changes and Teratoma Differentiation. In Vivo in 3739 sites (in 3122 genes), including modification sites in 0435 These findings, coupled with the discovery that Nanog mRNA. Thus, this unbiased analysis suggested a set of modified genes tend to have a shorter half-life, demonstrate targets that rely more exclusively on Mettl3, including Nanog that Mettl3, and by extension mA, is needed to fine-tune and and other pluripotency mRNAs (FIGS. 4B and 4C) (Table S1, limit the level of many ESC genes, including pluripotency as disclosed in Batista et al., Cell Stem Cell, 2014, 15(6), regulators. Since Mettl3 KO cells are capable of self-renewal, 707-719). Gene Set Enrichment Analysis confirmed that their capacity for directed differentiation in vitro toward two Mettl3-target genes significantly overlap functional gene sets lineages: cardiomyocytes (CM) or the neural lineage was important for pluripotency, including targets of Ctnnb1 (8.8x assessed. While the wild type control cells were able to gen 10'), targets of Smad3 or Smad3 (1.6x10°), targets of erate beating CM (-50% of colonies), only -3% of Mettl3 KO Myc (2.7x10'), targets of Sox2 (6.5x10'), and targets of colonies of two independent clones produced beating CMs. Nanog (8.5x10') (FIG. 4C). Five of eleven core ESC regu Furthermore, differentiated colonies of Mettl3 KO cells lators lost méA modification in Mettl3 KO, including Nanog, retained high levels of Nanog expression but lacked expres Rlf1, Jarid2, and Lin28 (FIG. 4D). Independent validation by sion of the CM structural protein Myhó, reflecting a larger mA RIP followed by Nanostring detection confirmed loss of number of cells that failed to exit the mESC program in the mA in Nanog, and other mRNAs in KO vs. wild type ESCs mutant cells. (FIG. 3A and data not shown). Similarly, upon (FIG. 4E). Following transcription arrest by flavopiridol directed differentiation to the neural lineage, a marked differ treatment, Nanog mRNA showed delayed turnover in Mettl3 ence between the ability of the two cells types to differentiate KO cells compared to wild type, consistent with a require was detected. To assay for neural differentiation, the cells ment formA in Nanog mRNA turnover (FIG.4F). However, were stained for Tuj1, a beta-3 tubulin which is expressed in RNA-seq analysis of Mettl3 KO cells revealed modest per mature and immature neurons. While -53% of wild type turbations in mRNA steady state levels with only ~300 genes colonies had Tuj1+ projections, less than 6% of Mettl3 KO demonstrating significant changes over 1.5 fold. Collectively, colonies had Tuj1+ projections in both knock-out clones these results suggest that Mettl3 plays a selective role in (FIG. 3B). Additionally, differentiated Mettl3 KO cells regulating the dynamics of ESC gene expression. showed an impaired ability to repress Nanog and activate 0437. Wide Spread mA Modification of Human ESCs Tuj1 mRNA (FIG.3B). To confirm the role of Mettl3 in ESC 0438. The identification of thousands of mA sites raises differentiation in vivo, Mettl3 KO or wild type cells were the challenge of defining the functional importance of each injected subcutaneously into the right or left flank respec and every one of the sites. To this end, the inventors mapped tively, of SCID/Beige mice (n=5). Both wild type and mA sites in hESCs and during endoderm differentiation to Mettl3KO cells formed tumors consistent in morphology elucidate the patterns and potential conservation of mA with teratomas. Mutant tumors tended to be larger, in accor methylome (FIG. 5A). In basal (undifferentiated or resting) dance with mutant cell growth curves observed in vitro (FIG. state hESCs (T=0), mA-seq identified 16,943 peaks in 7,871 3C). Histological analysis of H&E stained tumor sections genes representing 7530 coding and 341 non-coding RNAS. revealed consistent differences between the two populations: Upon differentiation towards endoderm (T=48, “endoderm While both groups of cells formed teratomas that contained differentiation” thereafter), mA-seq identified 15,613 mA differentiation to Some degree, into all three germ layers, the peaks in 7,195 genes representing 6909 coding and 286 non teratomas derived from KO cells were predominantly com coding RNAs (Table S3, as disclosed in Batista et al., Cell posed of poorly differentiated cells with very high mitotic Stem Cell, 2014, 15(6), 707-719). As shown in FIG. 5B, indices and numerous apoptotic bodies, whereas wild type 11322 peaks (6004 genes) were common between the undif cells differentiated predominantly into neuroectoderm (FIG. ferentiated (T=0) and differentiated hESCs (T=48), while 3D). Analysis of adjacent sections revealed that the mutant 5348 (3979 genes) vs. 4087 peaks (3024 genes) were unique teratomas have markedly higher staining of proliferation respectively. marker Ki67 and ESC protein Nanog, which highlight the 0439. Many Master Regulators of hESC Maintenance and poorly differentiated cells (FIG. 3D and FIG. 11A). RNA Differentiation are Modified with móA analysis confirmed that Mettl3 KO tumors had higher levels 0440 Interestingly, similar to mESC, transcripts encoding of Nanog, Oct4 and Ki67 and lower levels of Tuj1, Myhé and many hESC master regulators, including human NANOG, Sox 17 (FIG. 11B). Thus, the inventors discovered that inhi SOX2, and NR5A2, were mA modified. Like mESC, the bition of Mettl3 leads to insufficient mA, which in turn leads transcripts for OCT4 (POUF51) in hESC did not harbor an to a block in ESC differentiation and persistence of a stem mA modification (FIG.5D). These results show a high level like, highly proliferative state (i.e., mettl3 inhibition leads to of specificity and conservation of mA targets among core self-renewal and proliferation of ESCs). pluripotency/maintenance factors in mouse and human Example 3 ESCs. The inventors also identified human specific lincRNAs with known roles in hESC maintenance such as LINC-ROR Mettl3 Target Genes in mESCs and MEGAMIND/TUNA to contain mA modification(s) (FIG.5D: FIG. 13A) (Lin et al., 2014; Loewer et al., 2010). 0436 The incomplete loss of bulk mA in Mettl3 KO may Upon induction of differentiation, the inventors identified result either because Mettl3 is soley responsible for the transcripts encoded by several key regulators of endodermal methylation of a subset of genes or sites and/or Mettl3 func differentiation also to have m'A modifications including tions in a redundant fashion with another methylase on all EOMES and FOXA2 (FIG.5D). Gene ontology (GO) analy mA-modified genes. To distinguish these possibilities, the ses of methylated genes in undifferentiated hESC (T=0) were mA methylome was mapped in Mettl3 KO cells. Compari significantly enriched in biological functions such as regula son of the methylomes of wild type vs. Mett3 KO ESCs tion of transcription (FDR=1.2x10'), chordate embryonic revealed a global loss of methylation across méA sites iden development (FDR=1.1x10'), and regulation of cell mor US 2016/0264934 A1 Sep. 15, 2016 70 phogenesis (FDR=0.01). The same analysis after endodermal 2, also disclosed as Table S4 in Batista et al., Cell StemCell, differentiation retained enrichment in the similar GO terms. 2014, 15(6), 707-719). In terms of commonly methylated Upon differentiation toward endoderm, 1356 peaks in 1137 genes, regulators of ESC pluripotency demonstrate mA genes showed quantitative differences of at least 1.5 fold in modification sites at nearly equivalent locations such as m6A intensity, after normalization for input transcript abun SOX2 (FIG.6F), but not identical sites based on our analyses. dance (FIGS. 5E and 5F. Table 2, as disclosed as Table S6 in While other genes, such as GLI1 had methylation at identical Batista et al., Cell Stem Cell, 2014, 15(6), 707-719). The site(s). Yet, other genes such as CHD6 were found to have a majority of these differential mA sites represented quantita conserved mA site, along with a mouse or human-specific tive differences at existing sites (i.e. 59.1% of the peaks were mA peaks at different exons (FIG. 6F). Thus, while the called in both time points), rather then state-specific de novo inventors data reveals a Substantial overlap at the gene level. appearance or erasure of modification (FIG. 5G) (see meth demonstrating broad functional significance of mA modifi ods). This is consistent with the discovery that 74.9% of sites cation in ESCs in both species, the inventors also discovered overlapped observed in 293T data (Meyer et al., 2012) and the numerous species-specific mA patterns that may contribute little change seen in méA sites in a recent Survey of cell types to specific aspects of human ESC biology (Schnerch et al., (Schwartz et al., 2014), demonstrating that transcripts exhibit 2010). dynamic differential peak méA methylation intensity largely at “hard wired sites’ during differentiation under the condi Example 4 tions examined and when compared to other tissue types. 0441 Conserved Features of mA Modifications Span METTL3 is Required for hESC Differentiation ning Different Species 0445. To address the function of mA in hESCs, hESC 0442. The inventors determined that three salient features colonies were generated with stable knockdown of METTL3, of the mA methylome are conserved in hESCs. First, mA shRNA control, or wild-type cells (FIG. 7A). Knockdown of sites in hESCs are also dominated by the identical RRACU METTL3 inhESCs resulted in reduction in METTL3 mRNA motif seen in mESC and somatic cells (Dominissini et al., levels and reduction in mA level based on serial dilution 2012; Meyer et al., 2012) (FIG.5C). There was also a strong preference of targeting long-internal exons at the RRACU analysis of polyA+RNA (FIGS. 7B and 7C and FIGS. 13B motif even after normalizing for exon length and number of and 13C). METTL3-depleted hESCs could be stably main mA motifs (FIG.5H). Second, there was a significant enrich tained, demonstrating the dispensability of METTL3 for ment in mA peaks at 3' end of transcripts, near the stop hESC self-renewal. Furthermore there was no difference in codons of coding genes or the last exon in non-coding RNAS viability between control and knockdown hESCs (data not (FIG.5I, FIG. 13B, 13C). Furthermore, the topology of mA shown). Strikingly, differentiation of METTL3-depleted modification is preserved upon endodermal differentiation hESCs into neural stem cells (NSCs) by dual inhibition of (FIG.5I). As in mESCs, moderate to lowly expressed genes SMAD signaling, using Dorsomorphin and SB-431542 have higher probability of becoming methylated (FIG. 13E). revealed a block in neuronal differentiation (Methods). While Lastly, hESC mA is not correlated with transcription rate as 44% (+3.5% S.d.) of the control cells were Sox1 positive, only judged by GRO-seq (Sigova et al., 2013), but is strongly 10% (+3.1% s.d) of the METTL3-depleted were Sox1 posi anti-correlated with measured mRNA half-life in human tive (FIG. 13A). 0446. Similarly, knockdown of METTL3, in three inde pluripotent cells (Neffet al., 2012), strongly suggesting that pendently generated ES colony clones selected for METTL3 mA modification also marks RNA turnover in hESCs, as knockdown, led to a profound block in endodermal differen observed for mESCs (FIG.5J, FIGS. 13F and 13G). tiation at day 2 and day 4 based on failure to express the 0443) Evolutionary Conservation and Divergence of the endoderm markers EOMES and FOXA2 compared to either mA Epi-Transcriptomes of Human and Mouse ESCs two shRNA control colony clones (FIG. 7D) or wildtype 0444 Previous studies report conservation of mA modi hESCs (FIG. 13D). Consistently, METTL3-depleted ESCs fied genes between mouse and human in Somatic cell types retain high levels of expression of the master regulators (-51%-45%), but the comparisons are limited by non NANOG and SOX2 throughout the differentiation time matched tissue types and transformed VS. untransformed cell course in contrast to their diminishing expression in wildtype types (Dominissini et al., 2012; Meyer et al., 2012). Herein, cells (FIG. 7E and FIG. 13E). These results indicate that the inventors assessed the evolutionary conservation of METTL3 and mA control differentiation of hESCs. human and mouse ESC mA methylomes. At the gene level, 69.4% (3609 of 5204) of hESC genes are also mA modified in the orthologus mouse gene (p-value-8.3x10'7: Fisher Example 5 exact test) (FIG. 6A: Table S5, as disclosed in Batista et al., 0447. In previous reports of mA sites in transformed Cell Stem Cell, 2014, 15(6), 707-719). Furthermore, the HepG2 cells under a variety of conditions showed the major inventors identified 632 conserved mA peak sites (46.1%) ity of mA sites were invariant, a subset of dynamically between hESCs and mESCs (Table 1, which is a modified regulated mA sites was also reported (Dominissini et al., version of Table S6 disclosed in Batista et al., Cell StemCell, 2012). However, the Dominissini and colleagues study lacked 2014, 15(6), 707-719). Notably, conserved sites tended to Sufficient replicates of stimulated samples to allow for accu have higher mA peak intensities compared to mA peak sites rate assessment of méA sites. Chen et al., (Chen, Cell Stem that are not conserved (FIGS. 6B and 6C, p-values=1.3x10' Cell Mar. 5 2015; FIG. 1D) also report that among 3,880 and 8.7x10° for hESC or mESC, respectively; Wilcoxon commonly expressed transcripts in four different mouse cell/ test). The species specificity of gene methylation in mouse tissue types, 89% of 3,880 genes had variable or dynamically and human showed multiple patterns as shown through the regulated mA peaks in at least two cell types, however, as indicated examples, starting with genes found exclusively there were was insufficient replicates, the results cannot be methylated in one species or another (FIGS. 6D and 6E. Table accurately assessed, in addition, Chen and colleges fail to US 2016/0264934 A1 Sep. 15, 2016 specify the criteria for identifying differential peaks. Herein, the complexity of 5-methyl-cytosine in DNA and histone the inventors rely upon replicates is critical for the concor lysine methylations that undergo extensive reprogramming dance of peak calling. In contrast, previous published studies with distinct rules in pluripotent vs. Somatic cells. hover at ~70-80%, making it a challenge to call differential 0451 Importantly, the inventors discovered a general and mA peaks in single replicates, due to the inherent noise in conserved topological enrichment of mA sites at the 3' end of mA-seq. In addition, it was unclear from the previous reports genes among single-exon and multi-exon mRNAS as well as whether differential peaks truly represent novel and unique incRNAs. Thus, neither the stop codon nor the last exon-exon sites vs "latent sites that can be found in other cells/condi splice junction can alone explain the observed mA topology tions or tissue/cell types. Lastly, before the present invention, in RNA. However, all species examined to date including it was not clear how human méA peak intensity compared to Saccharomyces cerevisae and Arabidopsis thalania exhibit a mouse tissues. strong 3' bias in mA localization, suggest an evolutionary 0448. In contrast to the previous reports, herein, the inven constraint that may target the mA modification to the 3' ends tors analyzed the degree of dynamic modulation of mA of genes regardless of gene structure or coding potential peaks across at least two replicates during human ESC endo (Bodi et al., 2012; Schwartz et al., 2013). This bias may be derm differentiation. Only genes that showed an FPKM of achieved by preferential mA methylases recruitment to 3' >=1 in their input at both time points were analysed and used sites or preferential action of demethylases in upstream to calculate the intensity of mA peaks identified by Pirhana. regions of the transcript. Although the role of de-methylases Peaks were then identified as exhibiting differential mA cannot be excluded in the patterning of the mA methylome, peaks intensities (DMPIs) between t=0 and t=48. The inven the observation of 3' end mA bias in S. cerevisiae, which tors detected 5.3% (n=194/3674; 156 genes) and 18.8% lacks known mA demethylases argues against the latter (n=691/3674; 481 genes) of mA sites exhibited DMPIs over mechanism (Jia et al., 2011; Schwartz et al., 2013; Zheng et a threshold of 2 fold or 1.5 fold, respectively (Table S3, as al., 2013). The functional importance of mA location vs. its disclosed in Batista et al., Cell Stem Cell, 2014, 15(6), 707 specific molecular outcome need to be addressed in future 719). studies. 0449. Of these 691 DMPIs using 1.5 fold threshold, 77.1% 0452 Mettl3 Selectively Targets mRNAs Including Pluri occurred in genes that showed no differential gene expression potency Regulators (FIG. 4A). Furthermore, 44.4% of these DMPIs represent 0453 While previous reports had approached Mettl3 mA peaks called in both time points (T-0 vs T-48). function by RNAi knock down (Dominissini et al., 2012: Examples of genes showing DMPI during differentiation Fustinet al., 2013; Liu et al., 2014; Wanget al., 2014b), herein include LRRC47 and C-MYC, which show an increase in the inventors used genetic ablation of Mettl3 KO (using mA peak intensities following differentiation. By contrast, CRISPR) to examine the true loss-of-function phenotypes. genes such as RBMX show a decrease in mA peak intensities The importance of using definitive genetic models is high following differentiation. In addition, genes such as RAN lighted by recent studies in the DNA methylation field where GAP1, which have two methylation sites, only exhibit shRNA experiments led to mis-assigned functions of Tet dynamic regulation of one site (FIG. 4B). A gene ontology proteins that were later recognized in genetic knockouts (GO) analyses did not yield a significant recognizable pattern. (Dawlaty et al., 2013; Dawlaty et al., 2011). We found that As shown in FIG.4C supervised hierarchical clustering of the both Mettl3 KO and depletion led to incomplete reduction of DMPI set was able to distinguish the hESC samples. Accord the global levels mA in both mESCs and hESCs, demon ingly, the present technology demonstrates the utility and the strating redundancy in mA methylases. However, mA pro power of using mA methylation status to distinguish hESC filing in Mettl3 KO cells revealed a subset of targets, approxi in their basal (undifferentiated or resting) state (t=0) from the mately 33% of mA peaks, that are preferentially dependent differentiated cells (t=48). To perform an unbiased assess on Mettl3, and these included Nanog, Sox2, and additional ment, the inventors carried out unsupervised clustering of the pluripotency genes. A second mA methylase, Mettl14, could log(2) peak intensities for high confidence peaks in genes also regulate mA on some of the identified target genes. with FPKMD-1 and large coefficient of variation in peak inten 0454 RNAi knockdown of Mettl3 in somatic cancer cells sities across all samples. Importantly, this unsupervised clus led to apoptosis (Dominissini et al., 2012), and Wang and tering analysis was able to distinguish differentiated from colleagues reported ectopic differentiation of mESC with undifferentiated cells (FIG. 12). Importantly, the inventors Mettl3 depletion (Wang et al., 2014b). In contrast, herein the demonstrate herein the potential of mA site peak intensity as inventors suprizingly discovered that Mettl3 KO does not novel cellular classifiers. Biologically, this analysis eluci affect ESC cell viability or self-renewal, and in fact mESC dates a restricted but dynamic mA modification program renewed at an improved rate. triggered by hESC endoderm differentiation. 0455 Conservation of méA Methylome in Mammalian Example 6 ESCS 0456. The conserved methylation patterns of many ESC master regulators and the shared phenotype observed upon mA Methylome in ES Cells inactivation of METTL3 suggest that METTL3 operates to 0450. The inventors demonstrate herein that the ESC mA control stem cell differentiation. It is known that human and methylome in mouse and human cells reveals extensive m'A mouse ESCs are not equivalent (Schnerch et al., 2010), and modification of ESC genes, including most key regulators of are cultured in different conditions. By focusing in on ESC pluripotency and lineage control. The pattern and orthologous genes, the inventors were able to catalog both sequence motifassociated with ESC mA are similar to those shared and species-specific methylation sites. The observa previously reported in Somatic cells, indicating a single tion that certain methylation sites are modified whenever a mechanism that deposits mA modification in early embry target transcript is expressed in both species, despite cell State onic life. This conserved mechanism for mA contrasts with or culture differences, demonstrates that these modification US 2016/0264934 A1 Sep. 15, 2016 72 events have been preserved under strong purifying selection 0462. Agarwala, S. D., Blitzblau, H. G., Hochwagen, A., during evolution. Herein, the inventors genomic analyses also and Fink, G. R. (2012). RNA methylation by the MIS pave the way to further understand potential biological dif complex regulates a cell fate decision in yeast. PLoS Genet ferences between mouse and human ESCs at the level of méA 8, e1002732. epitranscriptome, given the unique patterns of some methy 0463 Bodi, Z., Zhong, S., Mehra, S., Song, J., Graham, lation sites between the species. N., Li, H., May, S., and Fray, R. G. (2012). Adenosine 0457 RNA “Anti-Epigenetics': mA as a Mark of Tran Methylation in Arabidopsis mRNA is Associated with the scriptome Flexibility 3' End and Reduced Levels Cause Developmental Defects. 0458 Stem cell gene expression programs need to balance Front Plant Sci 3, 48. fidelity and flexibility. On one hand, stem cell genes need 0464 Bokar, J. A., Shambaugh, M. E., Polayes, D., Mat Sufficient stability to maintain self-renewal and pluripotency era, A. G., and Rottman, F. M. (1997). Purification and over multiple cell generations, but on the other hand, gene cDNA cloning of the AdoMet-binding subunit of the expression needs to change dynamically and rapidly in human mRNA (N6-adenosine)-methyltransferase. Rina 3, response to differentiation cues. It has been proposed that 1233-1247. ESC gene expression programs are in constant flux between 0465 Canaani. D., Kahana, C., Lavi, S., and Groner, Y. competing fates, and pluripotency is a statistical average (Loh (1979). Identification and mapping of N6-methyladenos and Lim, 2011; Montserrat et al., 2013; Shu et al., 2013). ine containing sequences in simian virus 40 RNA. Nucleic Herein, the inventors have demonstrated that mRNAs with Acids Res 6,2879-2899. mA tend to have a shorter half-life, and Nanog and Sox2 0466 Clancy, M.J., Shambaugh, M. E., Timpte, C.S., and mRNAs could not be properly down-regulated on differen Bokar, J. A. (2002). Induction of sporulation in Saccharo tiation in Mettl3-deficient mESC and hESC. However, Mettl3 myces cerevisiae leads to the formation of N6-methylad deficiency has only modest effects on steady state gene enosine in mRNA: a potential mechanism for the activity expression, which could arise from the non-stoichiometric of the IME4 gene. Nucleic Acids Res 30, 4509-4518. nature of the mA modification. The application of methods 0467 Csepany, T., Lin, A., Baldick, C.J., Jr., and Beemon, and assays disclosed herein are useful to determine level of K. (1990). Sequence specificity of mRNA N6-adenosine modification of each RNA species are useful for determining methyltransferase. J Biol Chem 265, 20117-20122. the state of the stem cell population (Harcourtet al., 2013: Liu 0468 Dawlaty, M. M., Breiling, A., Le, T., Raddatz, G., et al., 2013). Herein and in contrast to prior reports, the Barrasa, M.I., Cheng, A.W., Gao, Q. Powell, B. E., Li, Z. inventors demonstrate that Mettl3 KO ESCs suprizingly Xu, M., et al. (2013). Combined deficiency of Tea and Tet2 results in enhanced self-renewal but hindered differentiation, causes epigenetic abnormalities but is compatible with concomitant with decreased ability to down regulate ESC postnatal development. Dev Cell 24, 310-323. mRNAs. WTAP, a conserved Mettl3 interacting partner from 0469 Dawlaty, M. M., Ganz, K., Powell, B. E., Hu, Y.C., yeast to human cells (Horiuchi et al., 2013; Schwartz et al., Markoulaki, S., Cheng, A. W., Gao, Q. Kim, J., Choi, S. 2014), is also required for endodermal and mesodermal dif W., Page, D.C., et al. (2011). Tea is dispensable for main ferentiation (Fukusumi et al., 2008). The observed pheno taining pluripotency and its loss is compatible with embry types in ESC and teratomas are all the more notable because onic and postnatal development. Cell Stem Cell 9, 166 we have significantly reduced but not eliminated mA. 175. 0459. Accordingly, the inventors have demonstrated a 0470 Dominissini, D., Moshitch-Moshkovitz, S., model where mA serves as the necessary flexibility factor to Schwartz, S., Salmon-Divon, M., Ungar, L., Osenberg, S., counter balance epigenetic fidelity—a RNA “anti-epigenet Cesarkas, K., Jacob-Hirsch, J., Amariglio, N. Kupiec, M., ics” (FIG. 7F). mA marks ESC fate determinants to limit et al. (2012). Topology of the human and mouse m6A RNA their level of expression, and also ensures their continual methylomes revealed by méA-seq Nature 485, 201-206. degradation so that ESC can rapidly exit the pluripotent state 0471 Dunn, S.J., Martello, G., Yordanov, B., Emmott, S., upon differentiation. The inability of stem cell populations, and Smith, A. G. (2014). Defining an essential transcrip e.g., human stem cells to exit the stem cell state (i.e., undif tion factor program for naive pluripotency. Science 344, ferentiated State) and continue proliferation upon insufficient 1156-1160. mA correlates with the association of FTO with human 0472. Fu, Y., and He, C. (2012). Nucleic acid modifica cancers (Loos and Yeo, 2013). METTL3 depletion also leads tions with epigenetic significance. Curr Opin Chem Biol to elongation of the circadian clock (Fustin et al., 2013), also 16, 516-524. suggesting a role for mA in resetting the transcriptome. In 0473. Fukusumi, Y., Naruse, C., and Asano, M. (2008). yeast, mA is active during meiosis (Clancy et al., 2002; Wtap is required for differentiation of endodermand meso Schwartz et al., 2013), where diploid gene expression pro derm in the mouse embryo. Dev Dyn 237, 618-629. grams are reset to generate haploid offspring. 0474 Fustin, J. M., Doi, M., Yamaguchi, Y., Hida, H., 0460 Herein, the inventors have demonstrated that mA is Nishimura, S., Yoshida, M., Isagawa, T., Morioka, M. S., important for the transition between cell states, by facilitating Kakeya, H., Manabe, I., et al. (2013). RNA-Methylation a reset mechanism between stages in both mouse and human Dependent RNA Processing Controls the Speed of the cells. In contrast to epigenetic mechanisms that provide cel Circadian Clock. Cell 155, 793-806. lular memory of gene expression states, mA enforces the 0475 Gulati, P. Cheung, M. K., Antrobus, R., Church, C. transience of genetic formation helping cells to forget the D., Harding, H. P., Tung, Y.C., Rimmington, D., Ma, M., past and thereby embrace the future. Ron, D., Lehner, P. J., et al. (2013). Role for the obesity related FTO gene in the cellular sensing of amino acids. REFERENCES Proc Natl Acad Sci USA 110, 2557-2562. 0461 The references are incorporated herein in their 0476 Guttman, M., Donaghey, J., Carey, B. W., Garber, entirety by reference. M. Grenier, J. K. Munson, G., Young, G., Lucas, A. B., US 2016/0264934 A1 Sep. 15, 2016

Ach, R., Bruhn, L., et al. (2011). linckNAS act in the 0490 Liu, N., Parisien, M., Dai, Q., Zheng, G., He, C., and circuitry controlling pluripotency and differentiation. Pan, T. (2013). Probing N6-methyladenosine RNA modi Nature 477, 295-300. fication status at single nucleotide resolution in mRNA and 0477 Harcourt, E. M., Ehrenschwender, T., Batista, P. J., long noncoding RNA. Rina. Chang, H. Y., and Kool, E. T. (2013). Identification of a 0491 Loewer, S., Cabili, M. N., Guttman, M., Loh, Y. H., selective polymerase enables detection of N(6)-methylad Thomas, K., Park, I. H., Garber, M., Curran, M., Onder, T., enosine in RNA. JAm ChemSoc. 135, 19079-19082. Agarwal, S., et al. (2010). Large intergenic non-coding 0478 Harper, J. E., Miceli, S. M., Roberts, R. J., and RNA-RoR modulates reprogramming of human induced Manley, J. L. (1990). Sequence specificity of the human pluripotent stem cells. Nat Genet 42, 1113-1117. mRNA N6-adenosine methylase in vitro. Nucleic Acids 0492 Loh, K. M., and Lim, B. (2011). A precarious bal Res 18,5735-5741. ance: pluripotency factors as lineage specifiers. Cell Stem 0479 Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Cell 8, 363-369. Y. C., Laslo, P., Cheng, J. X., Murre, C., Singh, H., and 0493 Loos, R.J., andYeo, G. S. (2013). The bigger picture Glass, C. K. (2010). Simple combinations of lineage-de of FTO the first GWAS-identified obesity gene. Nat Rev termining transcription factors prime cis-regulatory ele Endocrinol. ments required for macrophage and B cell identities. Mol 0494. Meyer, K. D., Saletore.Y., Zumbo, P., Elemento, O., Cell 38,576-589. Mason, C. E., and Jaffrey, S. R. (2012). Comprehensive 0480 Hess, M.E., Hess, S., Meyer, K. D., Verhagen, L.A., analysis of mRNA methylation reveals enrichment in 3' Koch, L., Bronneke, H. S., Dietrich, M. O., Jordan, S. D., UTRs and near stop codons. Cell 149, 1635-1646. Saletore. Y., Elemento, O., et al. (2013). The fat mass and 0495 Montserrat, N., Nivet, E., Sancho-Martinez, I., obesity associated gene (Fto) regulates activity of the Hishida, T., Kumar, S., Miguel, L., Cortina, C. Hishida.Y., dopaminergic midbrain circuitry. Nat Neurosci 16, 1042 Xia, Y., Esteban, C. R. et al. (2013). Reprogramming of 1048. human fibroblasts to pluripotency with lineage specifiers. 0481 Hongay, C. F., and Orr-Weaver, T. L. (2011). Droso Cell Stem Cell 13,341-350. phila Inducer of MEiosis 4 (IME4) is required for Notch 0496 Neff, A. T., Lee, J. Y., Wilusz, J., Tian, B., and signaling during oogenesis. Proc Natl AcadSci USA 108, Wilusz. C.J. (2012). Global analysis reveals multiple path 14855-14860. ways for unique regulation of mRNA decay in induced 0482 Horiuchi, K. Kawamura, T., Iwanari, H., Ohashi, pluripotent stem cells. Genome Res 22, 1457-1467. R., Naito, M., Kodama, T., and Hamakubo, T. (2013). 0497 Niu, Y., Zhao, X. Wu, Y. S. Li, M. M. Wang, X.J., Identification of Wilms tumor 1-associating protein com and Yang, Y. G. (2013). N6-methyl-adenosine (m6A) in plex and its role in alternative splicing and the cell cycle. J RNA: an old modification with a novel epigenetic function. Biol Chem. Genomics Proteomics Bioinformatics 11, 8-17. 0483 Horowitz, S., Horowitz, A., Nilsen, T.W., Munns, T. 0498 Rahl, P. B., Lin, C. Y., Seila, A. C., Flynn, R. A., W., and Rottman, F. M. (1984). Mapping of N6-methylad McCuine, S., Burge, C. B., Sharp, P. A., and Young, R. A. enosine residues in bovine prolactin mRNA. Proc Natl (2010). c-Myc regulates transcriptional pause release. Cell Acad Sci USA 81, 5667-5671. 141, 432-445. 0499 Rana, A. P., and Tuck, M.T. (1990). Analysis and in 0484 Hsu, P. D., Scott, D. A., Weinstein, J. A., Ran, F.A., vitro localization of internal methylated adenine residues Konermann, S., Agarwala, V., Li, Y., Fine, E. J. Wu. X., in dihydrofolate reductase mRNA. Nucleic Acids Res 18, Shalem, O., et al. (2013). DNA targeting specificity of 48O3-4808. RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827-832. 0500 Rottman, F. M., Bokar, J. A., Narayan, P., Sham 0485 Ingolia, N. T., Lareau, L. F., and Weissman, J. S. baugh, M. E., and Ludwiczak, R. (1994). N6-adenosine (2011). Ribosome profiling of mouse embryonic stem cells methylation in mRNA: substrate specificity and enzyme reveals the complexity and dynamics of mammalian pro complexity. Biochimie 76, 1109-1114. teomes. Cell 147,789-802. (0501 Schnerch, A., Cerdan, C., and Bhatia, M. (2010). 0486 Jia, G., Fu, Y., Zhao, X., Dai, Q., Zheng, G., Yang.Y., Distinguishing between mouse and human pluripotent Yi, C., Lindahl, T., Pan, T., Yang, Y. G., et al. (2011). stem cell regulation: the best laid plans of mice and men. N6-methyladenosine in nuclear RNA is a major substrate Stem Cells 28, 419-430. of the obesity-associated FTO. Nat Chem Biol 7, 885-887. 0502 Schwartz, S. Agarwala, S. D., Mumbach, M. R., 0487 Kang, H. J., Jeong, S.J., Kim, K. N. Baek, I.J., Jovanovic, M., Mertins, P., Shishkin, A., Tabach, Y., Chang, M., Kang, C.M., Park, Y. S., and Yun, C.W. (2014). Mikkelsen, T. S., Satija, R., Ruvkun, G., et al. (2013). A novel protein, Pho92, has a conserved YTH domain and High-resolution mapping reveals a conserved, widespread, regulates phosphate metabolism by decreasing the mRNA dynamic mRNA methylation program in yeast meiosis. stability of PHO4 in Saccharomyces cerevisiae. Biochem J Cell 155, 1409-1421. 457, 391-400. 0503 Schwartz, S., Mumbach, M. R., Jovanovic, M., 0488 Lin, N., Chang, K.Y., Li, Z. Gates, K. Rana, Z.A., Wang, T., Maciag, K., Bushkin, G. G., Mertins, P., Ter Dang, J., Zhang, D., Han, T., Yang, C. S., Cunningham, T. Ovanesyan, D., Habib, N., Cacchiarelli, D., et al. (2014). J., et al. (2014). An Evolutionarily Conserved Long Non Perturbation of méA Writers RevealsTwo Distinct Classes coding RNA TUNA Controls Pluripotency and Neural Lin of mRNA Methylation at Internal and 5' Sites. Cell Rep 8, eage Commitment. Mol Cell 53, 1005-1019. 284-296. 0489 Liu, J., Yue.Y., Han, D., Wang, X. Fu, Y., Zhang, L., 0504 Segal, E., Friedman, N., Kaminski, N., Regev, A., Jia, G., Yu, M., Lu, Z., Deng. X., et al. (2014). AMETTL3 and Koller, D. (2005). From signatures to models: under METTL14 complex mediates mammalian nuclear RNA standing cancer using microarrayS. Nat Genet 37 Suppl. N6-adenosine methylation. Nat Chem Biol 10, 93-95. S38-45. US 2016/0264934 A1 Sep. 15, 2016 74

0505 Segal, E., Friedman, N. Koller, D., and Regev, A. et al. (2012). Topology of the human and mouse m6A RNA (2004). A module map showing conditional activity of methylomes revealed by méA-seq Nature 485, 201-206. expression modules in cancer. Nat Genet 36, 1090-1098. 0522 Guttman, M., Donaghey, J., Carey, B. W., Garber, 0506 Segal, E., Shapira, M., Regev, A., Pe'er, D., Bot M. Grenier, J. K. Munson, G., Young, G., Lucas, A. B., stein, D., Koller, D., and Friedman, N. (2003). Module Ach, R., Bruhn, L., et al. (2011). lincRNAs act in the networks: identifying regulatory modules and their condi circuitry controlling pluripotency and differentiation. tion-specific regulators from gene expression data. Nat Nature 477,295-300. Genet 34, 166-176. 0523 Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, 0507 Shah, J. C., and Clancy, M.J. (1992). IME4, a gene Y. C., Laslo, P. Cheng, J. X., Murre, C. Singh, H., and that mediates MAT and nutritional control of meiosis in Glass, C. K. (2010). Simple combinations of lineage-de Saccharomyces cerevisiae. Mol Cell Biol 12, 1078-1086. termining transcription factors prime cis-regulatory ele 0508 Sharova, L. V., Sharov, A. A., Nedorezov, T., Piao, ments required for macrophage and B cell identities. Mol Y., Shaik, N., and Ko, M. S. (2009). Database for mRNA Cell 38,576-589. half-life of 19977 genes obtained by DNA microarray analysis of pluripotent and differentiating mouse embry 0524 Hsu, P. D., Scott, D.A., Weinstein, J. A., Ran, F.A., onic stem cells. DNA Res 16, 45-58. Konermann, S., Agarwala, V., Li, Y., Fine, E. J. Wu. X., 0509 Shu, J., Wu, C., Wu, Y., Li, Z., Shao, S., Zhao, W., Shalem, O., et al. (2013). DNA targeting specificity of Tang, X., Yang, H. Shen, L., Zuo, X., et al. (2013). Induc RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827-832. tion of pluripotency in mouse Somatic cells with lineage 0525 Huangda, W., Sherman, B.T., Zheng, X., Yang, J., specifiers. Cell 153,963-975. Imamichi, T., Stephens, R., and Lempicki, R. A. (2009). 0510 Sibbritt, T., Patel, H. R., and Preiss, T. (2013). Map Extracting biological meaning from large gene lists with ping and significance of the mRNA methylome. Wiley DAVID. Curr Protoc Bioinformatics Chapter 13, Unit 13 Interdiscip Rev RNA 4, 397-422. 11. 0511 Trapnell, C., Pachter, L., and Salzberg, S. L. (2009). 0526 Ingolia, N. T., Lareau, L. F., and Weissman, J. S. Tophat: discovering splice junctions with RNA-Seq. Bio (2011). Ribosome profiling of mouse embryonic stem cells informatics 25, 1105-1111. reveals the complexity and dynamics of mammalian pro 0512 Ulitsky, I., Shkumatava, A., Jan, C. H., Sive, H., and teomes. Cell 147,789-802. Bartel, D. P. (2011). Conserved function of lincRNAs in 0527 Jia, G., Fu, Y., Zhao, X., Dai, Q., Zheng, G., Yang.Y., vertebrate embryonic development despite rapid sequence Yi, C., Lindahl, T., Pan, T., Yang, Y. G. et al. (2011). evolution. Cell 147, 1537-1550. N6-methyladenosine in nuclear RNA is a major substrate 0513 Wang, X., Lu, Z. Gomez, A., Hon. G. C. Yue. Y., of the obesity-associated FTO. Nat Chem Biol 7, 885-887. Han, D., Fu, Y. Parisien, M., Dai, Q., Jia, G., et al. (2014a). 0528 Konig, J., Zarnack, K., Rot, G., Curk, T., Kayikci, N6-methyladenosine-dependent regulation of messenger M. Zupan, B., Turner, D.J., Luscombe, N. M., and Ule, J. RNA stability. Nature 505, 117-120. (2010). iCLIP reveals the function of hnRNP particles in 0514 Wang, Y., Li Y., Toth, J. I. Petroski, M.D., Zhang, splicing at individual nucleotide resolution. Nat Struct Mol Z., and Zhao, J. C. (2014b). N6-methyladenosine modifi Biol 17,909-915. cation destabilizes developmental regulators in embryonic 0529) Levin, J. Z., Yassour, M., Adiconis. X., Nusbaum, stem cells. Nat Cell Biol 16, 191-198. C., Thompson, D.A., Friedman, N., Gnirke, A., and Regev, 0515 Wei, C. M., and Moss, B. (1977). Nucleotide A. (2010). Comprehensive comparative analysis of strand sequences at the N6-methyladenosine sites of HeLa cell specific RNA sequencing methods. Nat Methods 7, 709 messenger ribonucleic acid. Biochemistry 16, 1672-1676. 715. 0516 Wong, D. J., Liu, H., Ridky, T. W., Cassarino, D., 0530 Livak, K.J., and Schmittgen, T.D. (2001). Analysis Segal, E., and Chang, H. Y. (2008). Module map of stem of relative gene expression data using real-time quantita cell genes guides creation of epithelial cancer stem cells. tive PCR and the 20-Delta Delta C(T)) Method. Methods Cell Stem Cell 2, 333-344. 25, 402-408. 0517. Young, R. A. (2011). Control of the embryonic stem 0531 Mali, P., Yang, L., Esvelt, K. M., Aach, J., Guell, M., cell state. Cell 144, 940-954. DiCarlo, J. E., Norville, J. E., and Church, G. M. (2013). 0518. Zheng, G., Dahl, J. A., Niu, Y., Fedorcsak, P., RNA-guided human genome engineering via Cas9. Sci Huang, C.M., Li, C.J., Vagbo, C. B., Shi, Y. Wang, W. L., ence 339, 823-826. Song, S. H., et al. (2013). ALKBH5 is a mammalian RNA demethylase that impacts RNA metabolism and mouse 0532 Neff, A. T., Lee, J. Y., Wilusz, J., Tian, B., and fertility. Mol Cell 49, 18-29. Wilusz. C.J. (2012). Global analysis reveals multiple path 0519 Zhong, S. Li, H., Bodi, Z. Button, J., Vespa, L., ways for unique regulation of mRNA decay in induced Herzog, M., and Fray, R. G. (2008). MTA is an Arabidopsis pluripotent stem cells. Genome research 22, 1457-1467. messenger RNA adenosine methylase and interacts with a 0533. Quinlan, A. R., and Hall, I. M. (2010). BEDTools: a homolog of a sex-specific splicing factor. Plant Cell 20, flexible Suite of utilities for comparing genomic features. 1278-1288. Bioinformatics 26, 841-842. 0520 Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., 0534) Rahl, P. B., Lin, C. Y., Seila, A. C., Flynn, R. A., Habib, N., Hsu, P. D. Wu. X., Jiang, W., Marraffini, L.A., McCuine, S., Burge, C. B., Sharp, P. A., and Young, R. A. et al. (2013). Multiplex genome engineering using (2010). c-Myc regulates transcriptional pause release. Cell CRISPR/Cas systems. Science 339, 819-823 141, 432-445. 0521. Dominissini, D., Moshitch-Moshkovitz, S., 0535 Schwartz, S., Agarwala, S. D., Mumbach, M. R., Schwartz, S., Salmon-Divon, M., Ungar, L., Osenberg, S., Jovanovic, M., Mertins, P., Shishkin, A., Tabach, Y., Cesarkas, K., Jacob-Hirsch, J., Amariglio, N. Kupiec, M., Mikkelsen, T. S., Satija, R., Ruvkun, G., et al. (2013). US 2016/0264934 A1 Sep. 15, 2016

High-resolution mapping reveals a conserved, widespread, (0549 Xiao, R., and Moore, D. D. (2011). DamIP: using dynamic mRNA methylation program in yeast meiosis. mutant DNA adenine methyltransferase to study DNA Cell 155, 1409-1421. protein interactions in vivo. Curr Protoc Mol Biol Chapter 0536 Segal, E., Friedman, N., Kaminski, N., Regev, A., 21, Unit 21 21. and Koller, D. (2005). From signatures to models: under 1. A method for maintaining a stem cell population in an standing cancer using microarrays. Nat Genet 37 Suppl. undifferentiated State, comprising contacting the stem cell S38-45. population with an inhibitor of METTL3 or METTL4. 0537 Segal, E., Friedman, N. Koller, D., and Regev, A. 2. The method of claim 1, wherein the stem cell population (2004). A module map showing conditional activity of is a human stem cell population. expression modules in cancer. Nat Genet 36, 1090-1098. 3. The method of claim 1, wherein the human stem cell 0538 Segal, E., Shapira, M., Regev, A., Pe'er, D., Bot population is a population of hESCs. stein, D., Koller, D., and Friedman, N. (2003). Module 4. The method of claim 1, wherein the stem cell population networks: identifying regulatory modules and their condi is prevented from differentiating along an endoderm lineage. tion-specific regulators from gene expression data. Nat 5. The method of claim 1, wherein the inhibitor of Genet 34, 166-176. METTL3 or METTL4 is a RNAi inhibitor or miRNA. 0539 Sharova, L. V., Sharov, A. A., Nedorezov, T., Piao, 6. A method of promoting a stem cell population to differ Y., Shaik, N., and Ko, M. S. (2009). Database for mRNA entiate along an endoderm lineage comprising contacting the half-life of 19977 genes obtained by DNA microarray stem cell population with an agent which increases méA of analysis of pluripotent and differentiating mouse embry mRNA in the stem cell population. onic stem cells. DNA Res 16, 45-58. 7. The method of claim 6, wherein the agent is a méA 0540 Sigova, A. A., Mullen, A. C., Molinie, B., Gupta, S., methyltransferase. Orlando, D. A., Guenther, M. G., Almada, A. E., Lin, C., 8. The method of claim 7, wherein the m6A methyltrans Sharp, P. A., Giallourakis, C. C., et al. (2013). Divergent ferase is METTL3 or METTL4. The method of claim 6, transcription of long noncoding RNA/mRNA gene pairs in wherein the stem cell population is a human stem cell popu embryonic stem cells. Proc Natl AcadSci USA 110,2876 lation. 2881. 9. A method to characterize a stem cell population, com 0541 Taghizadeh, K., McFaline, J. L., Pang, B., Sullivan, prising performing méA sequencing on the population of M. Dong, M., Plummer, E., and Dedon, P. C. (2008). stem cells, and assessing the intensity of the m6A levels of the Quantification of DNA damage products resulting from mRNA of at least 10 genes selected from any of those in Table deamination, oxidation and reaction with products of lipid 1 or Table 2. peroxidation by liquid chromatography isotope dilution 10. An assay for assessing méA levels in the RNA of at tandem mass spectrometry. Nat Protoc 3, 1287-1298. least 10 genes selected from any of those listed in Table 1, 0542 Trapnell, C., Hendrickson, D. G., Sauvageau, M., comprising contacting an array comprising at oligonucle Goff, L., Rinn, J. L., and Pachter, L. (2013). Differential otides that hybridize to at least 10 genes selected from any of analysis of gene regulation at transcript resolution with Table 1 or Table 2 with RNA isolated from a cell population, RNA-seq Nat Biotechnol 31, 46-53. and contacting the array with at least one reagent which binds (0543. Trapnell, C., Pachter, L., and Salzberg, S. L. (2009). to m6A in the RNA. Tophat: discovering splice junctions with RNA-Seq. Bio 11. The assay of claim 10, wherein the reagent which binds informatics 25, 1105-1111. to móA is an anti-méA antibody, or fragment thereof. 0544 Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, 12. The assay of claim 11, wherein the anti-méA antibody A., Kwan, G., van Baren, M.J., Salzberg, S. L., Wold, B.J., or fragment thereof is detectably labeled. and Pachter, L. (2010). Transcript assembly and quantifi 13. A method for determining the cell state of a stem cell cation by RNA-Seq reveals unannotated transcripts and population comprising performing the assay of claim 10, and isoform switching during cell differentiation. Nat Biotech comparing the levels of méA of at least 10 genes selected nol 28, 511-515. from any of Table 1 in the RNA from the stem cell population (0545 Tusher, V. G., Tibshirani, R., and Chu, G. (2001). with the levels of méA in a reference stem cell population, Significance analysis of microarrays applied to the ioniz and based on this comparison, determining the cell state of the ing radiation response. Proc Natl AcadSci USA 98, 5116 stem cell population. 5121. 14. The method of claim 13, wherein the levels of méA are 0546) Ulitsky, I., Shkumatava, A., Jan, C. H., Sive, H., and peak intensity levels. Bartel, D. P. (2011). Conserved function of lincRNAs in 15. A kit comprising: Vertebrate embryonic development despite rapid sequence a. an array composition for characterizing the cell State of evolution. Cell 147, 1537-1550. a population of stem cells, comprising at least 10 oligo (0547 Ventura, A., Meissner, A., Dillon, C. P. McManus, nucleotides that hybridize to the RNA of at least 10 M., Sharp, P.A., Van Paris, L., Jaenisch, R., and Jacks, T. genes selected from any of those in Table 1; and (2004). Cre-lox-regulated conditional RNA interference b. at least one regent to detect the m6A in RNA. from transgenes. Proc Natl Acad Sci USA 101, 10380 16. The kit of claim 15, wherein the regent is an anti-méA 10385. antibody, or fragment thereof. 0548 Wong, D. J., Liu, H., Ridky, T. W., Cassarino, D., 17. The kit of claim 16, wherein the anti-méA antibody or Segal, E., and Chang, H. Y. (2008). Module map of stem fragment thereof is detectably labeled. cell genes guides creation of epithelial cancer stem cells. 18. A culture media comprising an inhibitor of METTL3 or Cell Stem Cell 2, 333-344. METTL4. US 2016/0264934 A1 Sep. 15, 2016 76

19. The culture media of claim 18, wherein the culture media is a cryopreservation media. 20. The culture media of claim 18, further comprising a population of human stem cells. k k k k k