2012/037456 A1
(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(19) World Intellectual Property Organization, International Bureau
(10) International Publication Number: 2012/037456 A1
(43) International Publication Date: 22 March 2012 (22.03.2012)
(51) International Patent Classification: C12Q 1/68 (2006.01)
(21) International Application Number: PCT/US2011/051931
(22) International Filing Date: 16 September 2011 (16.09.2011)
(25) Filing Language: English
(26) Publication Language: English
(30) Priority Data:
61/384,030, 17 September 2010 (17.09.2010), US
61/429,965, 5 January 2011 (05.01.2011), US
(71) Applicant (for all designated States except US): PRESIDENT AND FELLOWS OF HARVARD COLLEGE [US/US]; 17 Quincy Street, Cambridge, MA 02138 (US).
(72) Inventors; and
(75) Inventors/Applicants (for US only): EGGAN, Kevin, C. [US/US]; 181 Essex Street #E402, Boston, MA 02111 (US). MEISSNER, Alexander [DE/US]; 6 Fourth Street Place, Apt. 3, Cambridge, MA 02141 (US). BOCK, Christoph [DE/US]; 515 Green Street, Cambridge, MA 02139 (US). KISKINIS, Evangelos [GR/US]; 504 Beacon Street #56, Boston, MA 02115 (US). VERSTAPPEN, Griet, Annie, Frans [BE/BE]; 16/1 Ridder van Ranstlei, B-2640 Mortsel (BE).
(74) Agents: RESNICK, David, S. et al.; Nixon Peabody LLP, 100 Summer Street, Boston, MA 02110 (US).
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).
Published: with international search report (Art. 21(3))
(54) Title: FUNCTIONAL GENOMICS ASSAY FOR CHARACTERIZING PLURIPOTENT STEM CELL UTILITY AND SAFETY

FIG. 22B

(57) Abstract: The present invention generally relates to a set of reference data or "scorecard" for a pluripotent stem cell, and to methods, systems and kits to generate a scorecard for predicting the functionality and suitability of a pluripotent stem cell line for a desired use. In some aspects, a method for generating a scorecard comprises using at least 2 stem cell assays selected from: epigenetic profiling, differentiation assay and gene expression assay to predict the functionality and suitability of a pluripotent stem cell line for a desired use. In some embodiments, the scorecard reference data can be compared with the pluripotent stem cell data to effectively and accurately predict the utility of the pluripotent stem cell for a given application, as well as to identify specific characteristics of the pluripotent stem cell line to determine its suitability for downstream applications, such as, for example, its suitability for therapeutic use, drug screening and toxicity assays, differentiation into a desired cell lineage, and the like.

FUNCTIONAL GENOMICS ASSAY FOR CHARACTERIZING PLURIPOTENT STEM CELL UTILITY AND SAFETY

CROSS REFERENCE TO RELATED APPLICATIONS

[001] This application claims priority under 35 U.S.C. 119(e) of U.S.
Provisional Patent Application Serial No. 61/384,030, filed on September 17, 2010, and U.S. Provisional Patent Application Serial No. 61/429,965, filed on January 5, 2011, the contents of which are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[002] The present invention relates to methods for characterizing stem cells, such as characterizing by high-throughput methods, and to methods and compositions for standardizing and optimizing the selection of pluripotent cell lines for disease modeling, for studying stem cell populations, and for their use in the therapeutic treatment of diseases.

GOVERNMENT SUPPORT

[003] This invention was made, in part, with government support under the NIH Roadmap Initiative on Epigenomics, Grant Number U01ES017155, awarded by the National Institutes of Health. The Government of the U.S. has certain rights in the invention.

REFERENCES TO TABLES

[004] This application includes as part of the originally filed subject matter three compact discs, labeled "Copy 1", "Copy 2" and "Copy 3", each disc containing eleven (11) text files. Each of the compact discs ("Copy 1", "Copy 2" and "Copy 3") includes eleven (11) text files for eleven separate lengthy tables, which are named "002806-067741-P2_TABLE 3.txt" (9,919 KB, created 1/7/2011), "002806-067741-P2_TABLE 4.txt" (19,381 KB, created 1/7/2011), "002806-067741-P2_TABLE 5.txt" (10,006 KB, created 1/7/2011), "002806-067741-P2_TABLE 8.txt" (98 KB, created 1/7/2011), "002806-067741-P2_TABLE 10.txt" (180 KB, created 1/7/2011), "002806-067741-P2_TABLE 12A.txt" (160 KB, created 1/7/2011), "002806-067741-P2_TABLE 12B.txt" (160 KB, created 1/7/2011), "002806-067741-P2_TABLE 12C.txt" (31 KB, created 1/7/2011), "002806-067741-P2_TABLE 13A.txt" (25 KB, created 1/7/2011), "002806-067741-P2_TABLE 13B.txt" (28 KB, created 1/7/2011) and "002806-067741-P2_TABLE 14.txt" (10 KB, created 1/7/2011).
The machine format of each compact disc ("Copy 1", "Copy 2" and "Copy 3") is IBM-PC and the operating system of each compact disc is MS-Windows. The contents of the compact discs labeled "Copy 1", "Copy 2" and "Copy 3" are hereby incorporated by reference herein in their entireties.

LENGTHY TABLES

[005] The specification includes eleven (11) lengthy Tables: Table 3, Table 4, Table 5, Table 8, Table 10, Table 12A, Table 12B, Table 12C, Table 13A, Table 13B and Table 14.
Lengthy Table 3 is the integrated DNA methylation and gene expression data for Ensembl genes and promoter regions (defined as -5 kb to +1 kb surrounding the Ensembl-annotated transcription start site) and is provided herein in an electronic format on a CD, as file "002806-067741-P2_TABLE 3.txt".
Lengthy Table 4 is the DNA methylation data for 35 cell lines and 31,929 Ensembl gene promoter regions, sorted in descending order of epigenetic variation among all ES cell lines (column BF), and is provided herein in an electronic format on a CD, as file "002806-067741-P2_TABLE 4.txt".
Lengthy Table 5 is the gene expression data for 35 cell lines and 15,079 Ensembl genes, sorted in descending order of transcription variation among all ES cell lines (column BG), and is provided herein in an electronic format on a CD, as file "002806-067741-P2_TABLE 5.txt".
Lengthy Table 8 is a table of the details of the individual measurements contributing to the lineage scorecard prediction and is provided herein in an electronic format on a CD, as file "002806-067741-P2_TABLE 8.txt".
Lengthy Table 10 is a table of the gene expression data used for construction and validation of the lineage scorecard and is provided herein in an electronic format on a CD, as file "002806-067741-P2_TABLE 10.txt".
Lengthy Tables 12A, 12B and 12C are lists of target genes for use in the scorecard, or in the assays and methods. Table 12A shows genes, listed in descending order of priority, that have been identified based on the variability in the reference set of DNA methylation variation among human pluripotent cell lines. Table 12B shows genes, listed in descending order of priority, that have been identified based on the variability in the reference set of gene expression variation among human pluripotent cell lines. Table 12C shows genes, listed in descending order of priority, that have been retrieved from the literature using a statistical ranking and information retrieval scheme. Genes from Table 12A and/or Table 12B and/or Table 12C can be used for determining the scorecard. These tables are provided herein in an electronic format on a CD, as files "002806-067741-P2_TABLE 12A.txt", "002806-067741-P2_TABLE 12B.txt" and "002806-067741-P2_TABLE 12C.txt", respectively.
Lengthy Tables 13A and 13B are tables of an alternative list of target genes, listed as "included genes", which can be used for DNA methylation and gene expression measurement for determining the scorecard and lineage scorecard, and are provided herein in an electronic format on a CD, as files "002806-067741-P2_TABLE 13A.txt" and "002806-067741-P2_TABLE 13B.txt", respectively.
Lengthy Table 14 is a table of an alternative list of target genes, which are a subgroup of the genes of Table 13A, which can be used for DNA methylation and gene expression measurement for determining the scorecard and lineage scorecard, and is provided herein in an electronic format on a CD, as file "002806-067741-P2_TABLE 14.txt".
Table 3, Table 4, Table 5, Table 8, Table 10, Tables 12A-12C, Tables 13A-13B and Table 14, provided herein in an electronic format on a CD, as files "002806-067741-P2_TABLE 3.txt", "002806-067741-P2_TABLE 4.txt", "002806-067741-P2_TABLE 5.txt", "002806-067741-P2_TABLE 8.txt", "002806-067741-P2_TABLE 10.txt", "002806-067741-P2_TABLE 12A.txt", "002806-067741-P2_TABLE 12B.txt", "002806-067741-P2_TABLE 12C.txt", "002806-067741-P2_TABLE 13A.txt", "002806-067741-P2_TABLE 13B.txt" and "002806-067741-P2_TABLE 14.txt", respectively, are incorporated herein by reference in their entirety.
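
By way of illustration only, the promoter region definition given for Table 3 (-5 kb to +1 kb surrounding the transcription start site) and the descending-variation ordering described for Tables 4 and 5 can be sketched in Python. The function names, the 1-based inclusive coordinates, the strand handling and the use of population variance as the measure of variation are all assumptions made for this sketch; the text above does not state how variation was computed.

```python
from statistics import pvariance

UPSTREAM_BP = 5_000    # the "-5 kb" bound from the promoter definition
DOWNSTREAM_BP = 1_000  # the "+1 kb" bound from the promoter definition


def promoter_window(tss, strand):
    """Return (start, end) of the -5 kb..+1 kb window around a TSS.

    Assumption: coordinates are 1-based and inclusive. On the minus strand
    "upstream" lies at higher coordinates, so the window is mirrored.
    """
    if strand == "+":
        start, end = tss - UPSTREAM_BP, tss + DOWNSTREAM_BP
    elif strand == "-":
        start, end = tss - DOWNSTREAM_BP, tss + UPSTREAM_BP
    else:
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    # Clip at the first base of the chromosome.
    return max(1, start), end


def rank_by_variation(values_by_gene):
    """Return gene IDs sorted by descending variance across cell lines.

    Assumption: population variance stands in for the unspecified
    "variation among all ES cell lines" measure.
    """
    return sorted(
        values_by_gene,
        key=lambda gene: pvariance(values_by_gene[gene]),
        reverse=True,
    )


if __name__ == "__main__":
    print(promoter_window(10_000, "+"))
    print(promoter_window(10_000, "-"))
    # Hypothetical per-gene values across three cell lines.
    toy = {
        "geneA": [0.1, 0.1, 0.1],
        "geneB": [0.0, 0.5, 1.0],
        "geneC": [0.2, 0.3, 0.4],
    }
    print(rank_by_variation(toy))
```

In this sketch a gene whose values are identical in every cell line sorts last, which matches the idea that the most variable promoters or transcripts appear at the top of a table ordered in this way.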