
The KMT1A-GATA3-STAT3 circuit is a novel self-renewal signaling of human bladder cancer stem cells

Zhao Yang1, Luyun He2,3, Kaisu Lin4, Yun Zhang1, Aihua Deng1, Yong Liang1, Chong Li2,5 & Tingyi Wen1,6

1CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China

2Core Facility for Research, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China

3CAS Key Laboratory of Infection and Immunity, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China

4Department of Oncology, the Second Affiliated Hospital of Soochow University, Suzhou 215000, China

5Beijing Jianlan Institute of Medicine, Beijing 100190, China

6Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China

Corresponding authors: Tingyi Wen, e-mail: [email protected]; Chong Li, e-mail: [email protected]

Supplementary Figure S1. Isolation of human bladder cancer stem cells. BCMab1 and CD44 were used to isolate bladder cancer stem cells (BCSCs: BCMab1+CD44+) and bladder cancer non-stem cells (BCNSCs: BCMab1-CD44-) from EJ cells and from samples #1 and #2 by flow cytometry.

Supplementary Figure S2. Gene ontology analysis of the downregulated genes of human BCSCs. (A) Pathway enrichment of the 103 downregulated genes in BCSCs. (B) Seven downregulated genes in BCSCs that participate in the centromeric DNA binding, mRNA 3’-UTR binding and translation regulator activity pathways were validated by qRT-PCR. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S3. The expression of KMT1A is higher in human BC than in peri-tumor tissues. (A) The expression of KMT1A was higher in BC samples than in peri-tumor tissues, as assessed by immunohistochemistry. Scale bar = 50 μm. (B-C) The expression of KMT1A was analyzed using data from Bae’s cohort (GSE13507) and Kim’s cohort (GSE37815).

Supplementary Figure S4. KMT1A is highly expressed in the basal subtype of bladder cancer. (A and C) Clustering analysis of GSE48075 and GSE48276 according to the basal and luminal biomarkers. (B and D) The expression of KMT1A was analyzed in the basal and luminal subtypes (GSE48075 and GSE48276).

Supplementary Figure S5. The expression of KMT1A and NSD1 in normal and tumor cell lines of different origin and in the TCGA database. (A and B) The mRNA and protein expression levels of KMT1A in normal and tumor cell lines of different origin. (C and D) The expression of KMT1A (C) and NSD1 (D) in tumor samples from the TCGA database. KMT1A was highly expressed in ESCA, STES, STAD, LUSC, HNSC, BLCA and LIHC. NSD1 was highly expressed only in LIHC. BLCA, Bladder Urothelial Carcinoma; BRCA, Breast invasive carcinoma; COAD, Colon adenocarcinoma; COADREAD, Colorectal adenocarcinoma; ESCA, Esophageal carcinoma; HNSC, Head and Neck squamous cell carcinoma; KIPAN, Pan-kidney cohort (KICH+KIRC+KIRP); KIRC, Kidney renal clear cell carcinoma; KIRP, Kidney renal papillary cell carcinoma; LAML, Acute Myeloid Leukemia; LIHC, Liver hepatocellular carcinoma; LUAD, Lung adenocarcinoma; LUSC, Lung squamous cell carcinoma; OV, Ovarian serous cystadenocarcinoma; READ, Rectum adenocarcinoma; STAD, Stomach adenocarcinoma; STES, Stomach and Esophageal carcinoma; THCA, Thyroid carcinoma; UCEC, Uterine Corpus Endometrial Carcinoma. Data are presented as mean ± SD. *P < 0.05.

Supplementary Figure S6. The expression of KMT1A in human bladder. (A) The expression of KMT1A in bladder cancer stem cells (BCSCs: BCMab1+CD44+), bladder cancer non-stem cells (BCNSCs: BCMab1-CD44-), normal bladder stem cells (NBSCs: pan-CK+CD44+) and normal bladder non-stem cells (NBNSCs: pan-CK+CD44-) from primary bladder cancer samples, n = 5. (B) KMT1A was more highly expressed in BCSCs and tumorspheres derived from bladder cancer cell lines than in BCNSCs and non-sphere tumor cells, as assessed by qRT-PCR. Non-sphere: bladder cancer cell lines or primary bladder cancer cells that failed to form tumorspheres. (C) The expression levels of CD44 and KMT1A in BC cell lines and primary BC samples were examined by qRT-PCR and then subjected to a correlation analysis, n = 10. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S7. Depletion of KMT1A abrogates the self-renewal and tumorigenicity of human BCSCs. (A and B) The qRT-PCR (A) and WB (B) analysis of KMT1A in shCtrl and shKMT1A BCSCs. β-actin served as a loading control. (C) Representative photographs of tumorspheres formed by shCtrl and shKMT1A BCSCs. The number of tumorspheres was counted in five independent fields/well after two weeks of cultivation. Scale bar = 100 μm. (D) shKMT1A BCSCs contained fewer CD44+ cells than shCtrl BCSCs. (E) Kaplan-Meier curves comparing the overall survival of bladder carcinoma patients expressing high or low levels of KMT1A, log-rank test. n, patient number. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.
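The survival comparison in panel (E) rests on Kaplan-Meier estimates. As a rough illustration of how one such curve is computed for a single patient group, here is a minimal pure-Python sketch. The function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the authors' analysis; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed death.

    times  : follow-up durations (any consistent unit)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) for one expression group.
example_times = [5, 8, 12, 12, 20]
example_events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(example_times, example_events):
    print(f"t = {t}: S(t) = {s:.3f}")
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at times with an observed death.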

Supplementary Figure S8. STAT3 is highly expressed and activated in human BCSCs. (A) The qRT-PCR analysis of the expression levels of GLI1, STAT3, BMI1, HES1, CTNNB1, NANOG, POU5F1, SOX2 and CD44 in shCtrl and shKMT1A BCSCs. (B) The WB analysis of pY-STAT3, STAT3, GLI1 and SOX2 in shCtrl and shKMT1A BCSCs. β-actin served as a loading control. (C) The expression of STAT3 was higher in KMT1Ahigh samples than in KMT1Alow samples based on an analysis of McConkey’s cohort (GSE48276). (D) The expression of STAT3 along with KMT1A was analyzed using data from McConkey’s cohort (GSE48276). (E) STAT3 was more highly expressed in BCSCs and tumorspheres derived from bladder cancer cell lines than in BCNSCs and non-sphere tumor cells, as assessed by qRT-PCR. Non-sphere: bladder cancer cell lines or primary bladder cancer cells that failed to form tumorspheres. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S9. STAT3 activation is indispensable for the self-renewal maintenance of human BCSCs. (A) The qRT-PCR analysis of STAT3 in shCtrl and shSTAT3 BCSCs. (B) The WB analysis of pY-STAT3 and STAT3 in shCtrl and shSTAT3 BCSCs. β-actin served as a loading control. (C) Representative photographs of tumorspheres formed by shCtrl and shSTAT3 BCSCs. The number of tumorspheres was counted in five independent fields/well after two weeks of cultivation. Scale bar = 100 μm. (D) shSTAT3 BCSCs contained fewer CD44+ cells than shCtrl BCSCs. (E) Kaplan-Meier curves comparing the overall survival of bladder carcinoma patients expressing high or low levels of STAT3. n, patient number. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S10. KMT1A is not recruited to the promoters of GATA1, GATA2, NF-κB, c-Myc, SOCS3, P53 and STAT3 in human BCSCs. (A-F) ChIP analysis of the GATA1, GATA2, NF-κB, c-Myc, SOCS3, P53 and STAT3 promoters with IgG and KMT1A antibodies in BCSCs. The enrichment of different regions of each promoter was detected by qRT-PCR. Data are presented as mean ± SD.
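The legend does not state how enrichment was quantified from the PCR readout. One common convention, assumed here purely for illustration, expresses ChIP signal as fold enrichment over the IgG control, 2^(Ct_IgG - Ct_IP), which presumes roughly 100% amplification efficiency. The function name and all Ct values below are hypothetical.

```python
def fold_enrichment(ct_ip: float, ct_igg: float) -> float:
    """Fold enrichment of the specific antibody over the IgG control.

    Assumes ideal doubling per PCR cycle, so a Ct that is lower by one
    cycle corresponds to twice as much immunoprecipitated template.
    """
    return 2.0 ** (ct_igg - ct_ip)


# Hypothetical Ct values per promoter region: (KMT1A antibody, IgG control).
regions = {
    "region_1": (30.0, 30.0),  # no enrichment over IgG
    "region_2": (27.0, 30.0),  # three cycles earlier than IgG
}
for name, (ct_ip, ct_igg) in regions.items():
    print(f"{name}: {fold_enrichment(ct_ip, ct_igg):.1f}-fold over IgG")
```

Under this convention, "not recruited" corresponds to values close to 1 across all tested promoter regions.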

Supplementary Figure S11. GATA3 was downregulated in BCSCs compared with that in BCNSCs.

Supplementary Figure S12. The expression of GATA3 was downregulated in BCSCs. (A) GATA3 was downregulated in CD44high samples compared with CD44low samples based on an analysis of McConkey’s cohort (GSE48276) and Michor’s cohort (GSE31684). (B) The expression of GATA3 along with CD44 was analyzed using data from McConkey’s cohort (GSE48276) and Michor’s cohort (GSE31684). (C) Kaplan-Meier curves comparing the overall survival of bladder carcinoma patients expressing high or low levels of GATA3. n, patient number. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S13. The destruction of the STAT3 promoter by Cas9 in human BCNSCs. (A) Recognition sequence of GATA3 in the STAT3 promoter. (B) Sanger sequencing of the PCR product including the STAT3 promoter in WT and Mut BCNSCs. Red bases indicate the abrogation of the binding motif of GATA3.

Supplementary Figure S14. Depletion of GATA3 promotes the transcription of STAT3 in BCNSCs. (A) The qRT-PCR analysis of GATA3 in shCtrl and shGATA3 BCNSCs, Student’s t test. (B) The WB analysis of pY-STAT3, STAT3 and GATA3 in shCtrl and shGATA3 BCNSCs. β-actin served as a loading control. (C) Representative photographs of tumorspheres formed by shCtrl and shGATA3 BCNSCs. The number of tumorspheres was counted in five independent fields/well after two weeks of cultivation, Student’s t test. Scale bar = 100 μm. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S15. The KMT1A-GATA3-STAT3 circuit triggers the self-renewal and tumorigenicity of human BCSCs. (A) The qRT-PCR analysis of GATA3 and STAT3 in vec, oeGATA3 and oeGATA3/STAT3 BCSCs. (B) The WB analysis of pY-STAT3, STAT3 and GATA3 in vec, oeGATA3 and oeGATA3/STAT3 BCSCs. β-actin served as a loading control. (C) oeGATA3 BCSCs contained fewer CD44+ cells than vec BCSCs. The overexpression of STAT3 in oeGATA3 BCSCs rescued the percentage of CD44+ cells, Student’s t test. Data are presented as mean ± SD. *P < 0.05; **P < 0.01.

Supplementary Figure S16. The mutation of the STAT3 promoter abolished the negative regulation of STAT3 expression by GATA3 in BCSCs. (A) Sanger sequencing of the PCR product including the STAT3 promoter in vec and oeGATA3 Mut BCSCs. Red bases indicate the abrogation of the binding motif of GATA3. (B) The qRT-PCR analysis of GATA3 in vec and oeGATA3 Mut BCSCs (#3, T2a, high grade and #6, T2a, high grade), Student’s t test. (C) The WB analysis of pY-STAT3, STAT3 and GATA3 in vec and oeGATA3 Mut BCSCs. β-actin served as a loading control. (D) The percentage of tumor-free mice four months after the subcutaneous injection of different dilutions of vec or oeGATA3 Mut BCSCs into immunodeficient mice, n=6. The estimated percentage of tumorigenic cells was calculated using an extreme limiting dilution analysis (ELDA). (E) Results of the tumor formation assays of vec and oeGATA3 Mut BCSCs, n=5, Student’s t test. Data are presented as mean ± SD. *P < 0.05.

Supplementary Table S1. Clinical characteristics of the bladder carcinoma patients.

| Patient ID# | Patient age (years) | Sex | Stage (TNM classification*) | Grade | Surgery | Primary/Recurrent |
|---|---|---|---|---|---|---|
| 1 | 75 | M | T1N0M0 | high | TURBT | Primary |
| 2 | 60 | M | T2bN0M0 | high | RC | Primary |
| 3 | 58 | M | T2aN0M0 | low | RC | Primary |
| 5 | 65 | F | T1N0M0 | high | TURBT | Primary |
| 6 | 45 | M | T2aN0M0 | low | RC | Primary |
| 7 | 55 | F | T1N0M0 | high | TURBT | Primary |
| 12 | 66 | M | T2N0M0 | low | RC | Primary |
| 17 | 67 | M | T1N0M0 | high | TURBT | Primary |
| 20 | 72 | M | T1N0M0 | high | TURBT | Primary |
| 21 | 56 | M | T2N0M0 | high | RC | Primary |
| 34 | 61 | M | T1N0M0 | high | TURBT | Primary |
| 37 | 59 | M | T1N0M0 | high | TURBT | Primary |
| 38 | 65 | F | T2N0M0 | high | RC | Primary |
| 39 | 72 | M | T3N0M0 | high | RC | Primary |
| 40 | 58 | M | T3N0M0 | high | RC | Primary |

#The samples were collected randomly and objectively. Only samples that contained a sufficient number of BCSCs for the quantitative real-time PCR (qRT-PCR), western blot (WB), immunofluorescence staining (IF), knockdown, chromatin immunoprecipitation (ChIP) and overexpression experiments were recorded.

*The TNM cancer staging system was designed to gauge the extent of cancer in a patient's body. T describes the size of the tumor and whether it has invaded nearby tissue, N describes regional lymph nodes that are involved, and M describes distant metastasis (spread of cancer from one body part to another). NA, not available. TURBT: transurethral resection of bladder tumor, RC: radical cystectomy.

Supplementary Table S2. Experimental information of bladder cancer samples.

| Figure | Experiments | Samples |
|---|---|---|
| Figure 1A-B | transcriptome microarray assay | #1 & #2 |
| Figure 1C, E & G | qRT-PCR | #3, #5, #6 & #7 |
| Figure 1F | IHC | #3, #5, #6, #7, #12, #17, #20, #21, #34, #37, #38, #39 ( |
| Figure 1H | WB | #3, #6, #17 & #20 |
| Figure 1I | IF | #12, #17, #20, #21 & #34 |
| Figures 2A-G, 3A-B & 4C-G | shKMT1A | #17 & #21 |
| Figure 3C | qRT-PCR | #3, #5, #6 & #7 |
| Figure 3D | WB | #5, #12, #17 & #34 |
| Figure 3E | IF | #3, #5, #6 & #7 |
| Figure 3F-L | shSTAT3 | #5 & #12 |
| Figure 4A-B | CHIP | #17 & #21 |
| Figure 4H | oeKMT1A-ΔSET, oeKMT1A & WB | #3 & #5 |
| Figure 5A-F | CHIP & mutation of the promoter of STAT3 | #17 & #21 |
| Figure 6A-I | oeGATA3 & oeGATA3/STAT3 | #3 & #6 |

qRT-PCR: quantitative real-time PCR, IHC: immunohistochemistry, WB: western blot, IF: immunofluorescence, CHIP: chromatin immunoprecipitation.

Supplementary Table S3. PCR primer sequences for selected genes.

| Gene | Forward primer | Reverse primer | Application |
|---|---|---|---|
| GAPDH | AAGGTGAAGGTCGGAGTCAA | GGAAGATGGTGATGGGATTT | RT-PCR |
| KMT1A | ATGCCGCCTACTATGGCAAC | AAAGTTGGAGTCCATGCGGG | RT-PCR |
| GLI1 | TCCTTTATTATCAGGAAACAG | GAGTAGGGAATCTCATCCAT | RT-PCR |
| STAT3 | ACCAGCAGTATAGCCGCTTC | GCCACAATCCGGGCAATCT | RT-PCR |
| BMI1 | CCACCTGATGTGTGTGCTTTG | TTCAGTAGTGGTCTGGTCTTGT | RT-PCR |
| HES1 | TCAACACGACACCGGATAAAC | GCCGCGAGCTATCTTTCTTCA | RT-PCR |
| CTNNB1 | ACAACTGTTTTGAAAATCCA | CGAGTCATTGCATACTGTCC | RT-PCR |
| NANOG | TACCTCAGCCTCCAGCAGA | CCTCCAAGTCACTGGCAG | RT-PCR |
| POU5F1 | GACAACAATGAAAATCTTCAGGAGA | TTCTGGCGCCGGTTACAGAACCA | RT-PCR |
| SOX2 | ATGCACCGCTACGACGTGA | CTTTTGCACCCCTCCCATTT | RT-PCR |
| CD44 | CTGCCGCTTTGCAGGTGTA | CATTGTGGGCAAGGTGCTATT | RT-PCR |
| GATA3 | ACTTCCCCAAGAACAGCTCG | GTGGTGTGGTCCAAAGGACA | RT-PCR |
| GAPDH | TACTAGCGGTTTTACGGGCG | TCGAACAGGAGGAGCAGAGAGCGA | CHIP |
| GATA1 (-2001~-1852) | TAGGTACTCAATAAATAAATAGGG | AAAATGGGGTAATAATATACCCC | CHIP |
| GATA1 (-1801~-1671) | CCTGATCCATTAGTGGTTAGGGCA | GAACCCTCATCAGTAATAAGGAGA | CHIP |
| GATA1 (-1601~-1452) | AAATCTCCCATTGCATATGAGGAC | TGGGGAGACTGAGACCTACAAAGG | CHIP |
| GATA1 (-1351~-1202) | TGAGGTCTATGAGACACTGTGGTT | GGATTGATGTAGCCTGTGGCTCTG | CHIP |
| GATA1 (-1201~-1052) | ACATCCTCCCATCCTACCTGCATG | TGAAAGGGAAAGAGGCCAAAGACA | CHIP |
| GATA1 (-1001~-865) | TCTCTCTATATGTCTTTAATGGTC | AGAAAGAGAGTGACAGAGATGAAT | CHIP |
| GATA1 (-801~-658) | ACTCCACCCCTTTCCTTTCCTACC | GTGAGGGTGTGGGGGTAGGAACAC | CHIP |
| GATA1 (-601~-452) | ACTTATCTGCTGCCCCAGGGCAGG | CTGGATTTGAACTAGAGCCTGTGG | CHIP |
| GATA2 (-1971~-1819) | GCGCCTAGCCTGTGTGTGTCTTTA | ACGCCCCAACAGCCATCAATTAAT | CHIP |
| GATA2 (-1742~-1619) | CTCAAGCCATCCTCCCACCTCAGC | AGTCTGGGCAACATAGTGAGACTG | CHIP |
| GATA2 (-1561~-1419) | TGGCATGTCTATGAATGGGTGTGT | AACAGATCAAGACCCTGTCTCAAG | CHIP |
| GATA2 (-1142~-1019) | GTTTCTGTCCCCTGGTCTTTGTGA | CTGGGCACACAGGACATAGACTCT | CHIP |
| GATA2 (-942~-819) | CAGTCCAGTCCTTGTGCTGCCAAA | TCAAGGTCCCCAGGCCACTTTCAC | CHIP |
| GATA2 (-742~-619) | TGGGGGACCCCCTGAGCACTTTAG | CTTCTGTGAACAAGCCCTCTGCAC | CHIP |
| GATA2 (-542~-419) | ACATCCCTGCACATGTGCACACAG | TCCTGAGTGTGAGCCCGAGTGCCG | CHIP |
| GATA2 (-342~-219) | CTCCCACCCTCTCCGGAGCGCACA | TGCGTGTGCTCACAGCTCCTGAGG | CHIP |
| GATA3 (-2072~-1891) | GAGGGTGCTGTTGCGTGCTG | AACTGTACCTCAACTCGCCG | CHIP |
| GATA3 (-1891~-1709) | AATGCCCGGGCCAAGGAGAC | TGTCCCAGGAAGGCCCCTGC | CHIP |
| GATA3 (-1711~-1529) | ACAGGCCAGGTAGACCCCAC | CGTCACGAATGCAGCCCCTC | CHIP |
| GATA3 (-1521~-1352) | CGACGGGGGAGCAGAGAGGG | CCAGCCTCCACCCAGGCGCA | CHIP |
| GATA3 (-1351~-1172) | TTTGGGGTTCTGGCGTCTGG | TGGACAGAGGTGGAGGTGGT | CHIP |
| GATA3 (-1171~-1022) | CACCCCAGTTCTCCCGACTC | GGCTGGGCCAGGAAGTTAAG | CHIP |
| GATA3 (-1009~-812) | GTTTTATTTCTCCCCAAAGA | GATCAAAATAATCCACGGAG | CHIP |
| GATA3 (-809~-636) | TCACCAGCAGGGGTAGGGAT | GGGAAAATCTGCCTTCTAGC | CHIP |
| GATA3 (-611~-452) | GCTAGAAGGCAGATTTTCCC | CTCTCCTGAGCCGTAGTTGC | CHIP |
| GATA3 (-452~-292) | ATGAAAGTAGAATGGGGGCG | TTCCACTTTTGGGAGCATTG | CHIP |
| GATA3 (-271~-92) | GGAAAATGCCATGCATGGGT | AGGGGTGTGTGAAACTGACA | CHIP |
| GATA3 (92~83) | GATGTGCGGTTTGCCTGTTT | GAGGTAAAACCTGGGCGATG | CHIP |
| GATA3 (69~229) | CCCAGGTTTTACCTCTCGCT | CCGAGGGCCAGCTGTTTTTT | CHIP |
| NF-κB (-1999~-1712) | ATCATGGCCTTAATGGCACCTTGA | TTCAGGGTAGGGTGTTGAGCAGGC | CHIP |
| NF-κB (-1699~-1554) | CCTTGCAGAATGAAAAGTAGAGTG | TTGCAGTGAGCCAAGATTGCGCCA | CHIP |
| NF-κB (-1499~-1346) | GATTACAGGCGCCCGCCACCACGC | GGCTCACACCTGTAATCCCAGCAC | CHIP |
| NF-κB (-1299~-1150) | CTTTATGGATCCTCTAAATTCCAG | AAAACCCACACACACAAAAACCCA | CHIP |
| NF-κB (-1099~-950) | TAACCTACTGGAGGAGGAGGATGG | ACTGGGTGGAGTGACAACTGAAAG | CHIP |
| NF-κB (-899~-750) | TGGGGAGGTAATCCACCCGAAGGT | GGGGAGTTTTCTTTTTCTCTGAAG | CHIP |
| NF-κB (-699~-557) | CATCACTATAATTCTATCCACAGT | CTTGTGCCCAGTAAAGTATAACCC | CHIP |
| NF-κB (-399~-250) | TGGACCGCATGACTCTATCAGCGG | CTTCCTAGCAGGGCGCTCCCGAAT | CHIP |
| NF-κB (-199~-50) | CGCCCGGCGCCCCGAAGCGGCCCC | GACACACGCGCGCACGCAGCGGGG | CHIP |
| c-Myc (-1999~-1844) | CCCGCCCTCGTTGACATCCAGGCG | GGGGAGCAACCAATCGCTATGCTG | CHIP |
| c-Myc (-1799~-1650) | TAAAGCTGAATTGTGCAGTGCATC | AAAGAGAAAACAATTCGGGGGAAA | CHIP |
| c-Myc (-1599~-1450) | GGGTGAGGGACCAAGGATGAGAAG | CGACTTAGCTAGTTGCCCAGCCCC | CHIP |
| c-Myc (-1399~-1250) | CAGTGCACTTTCACTAGTATTCAG | ATCGATTCTGATCAAAGAAGAGG | CHIP |
| c-Myc (-1199~-1140) | GTCCGGTTTGTCCGGGGAGGAAAG | TATGGGAGGGGCAGGGGGTACCCG | CHIP |
| c-Myc (-999~-1050) | GCCCGAGACTGTTGCAAACCGGCG | GCGAGAGGGAGGTTGCCTGCTCTC | CHIP |
| c-Myc (-799~-600) | TAAACAGACGCCTCCCGCACGGGG | ACCTTCCACCCAGACTGAGTCCCC | CHIP |
| c-Myc (-599~-444) | GCAATGCGTTGCTGGGTTATTTTA | TTTGATCAAGAGTCCCAGGGAGAG | CHIP |
| SOCS3 (-1931~-1758) | CAAGGCTGCGGTGAGCCATG | ATCTCCTGACCTCGTGATCCA | CHIP |
| SOCS3 (-1543~-1381) | CCAGAACCTTTTCCTTCCCT | GACTACAGGCACAGGCCACC | CHIP |
| SOCS3 (-1331~-1181) | CAGGGACAGGGAGCTTGGGA | AACTGTGTGTCTTGTCCAAA | CHIP |
| SOCS3 (-1131~-981) | CGGGGCCAGCCTTATGAAGG | CTTTCAGCACCTCATTATCCTGG | CHIP |
| SOCS3 (-931~-781) | AAAGCAGTTTTCCGGGGTTC | AACCAATCTTTCACCTCAAT | CHIP |
| SOCS3 (-735~-555) | CCGCGTCCGCCAGAGGGCGC | CTCCTGCTAGAATTTTATTCCC | CHIP |
| SOCS3 (-531~-348) | TTCGTGGCAGCTACCCAGCC | ATCGTTCCACCTCGAGCTCT | CHIP |
| SOCS3 (-599~-449) | CACCCCCTCGCCGCCACCCG | TGACTGTCGCACGTCTCCAA | CHIP |
| P53 | GCGGATTACTTGCCCTTACTTG | CCAATCCAGGGAAGCGTGTC | CHIP |
| STAT3 (-365~-517) | GCTCACGCAGAAACTGAAGTT | TTGAGAGCCTCTTACCACG | CHIP |
| STAT3 (-1195~-1360) | CTCTGTCCCTCATTGGATATG | ACCTTTGAAGGCAGTCACA | CHIP |
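The ChIP entries in Table S3 label each primer pair with a promoter region such as (-2001~-1852). A small sketch like the following can turn those labels into coordinates and an implied amplicon length, which makes irregular labels easy to spot. It assumes the two numbers are inclusive positions relative to the transcription start site, which the table does not state explicitly; the helper name `amplicon_span` is hypothetical.

```python
import re

# Matches region labels of the form "(start~end)", e.g. "(-2001~-1852)".
REGION = re.compile(r"\((-?\d+)~(-?\d+)\)")


def amplicon_span(label):
    """Return (start, end, length) for a labelled region, or None if unlabelled.

    The two coordinates are ordered so that start <= end, and the length
    counts both end positions (inclusive), which is an assumption.
    """
    match = REGION.search(label)
    if match is None:
        return None
    a, b = int(match.group(1)), int(match.group(2))
    start, end = min(a, b), max(a, b)
    return start, end, end - start + 1


for label in ["GATA1 (-2001~-1852)", "STAT3 (-365~-517)", "P53"]:
    print(label, "->", amplicon_span(label))
```

Labels whose implied length is far shorter or longer than their neighbours are worth checking against the original table.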

Supplementary Table S4. PCR primer sequences for vector construction.

| Gene | Forward primer | Reverse primer | Application |
|---|---|---|---|
| KMT1A | T GCACAAGTTTGCCTACAAT TTCAAGAGA ATTGTAGGCAAACTTGTGC TTTTTTC | TCGAGAAAAAA GCACAAGTTTGCCTACAAT TCTCTTGAA ATTGTAGGCAAACTTGTGC A | shRNA |
| STAT3 | T GCAGCAGCTGAACAACATG TTCAAGAGA CATGTTGTTCAGCTGCTGC TTTTTTC | TCGAGAAAAAA GCAGCAGCTGAACAACATG TCTCTTGAA CATGTTGTTCAGCTGCTGC A | shRNA |
| GATA3 | T GCTCTACTACAAGCTTCAC TTCAAGAGA GTGAAGCTTGTAGTAGAGC TTTTTTC | TCGAGAAAAAA GCTCTACTACAAGCTTCAC TCTCTTGAA GTGAAGCTTGTAGTAGAGC A | shRNA |
| Scramble RNA | T TTCTCCGAACGTGTCACGT TTCAAGAGA ACGTGACACGTTCGGAGAA TTTTTTC | TCGAGAAAAAA TTCTCCGAACGTGTCACGT TCTCTTGAA ACGTGACACGTTCGGAGAA A | shRNA |
| KMT1A | CCGGAATTC ATGGTGGGGATGAGTCGCCTGAGA | CGCGGATCC CTAGAAGAGGTATTTGCGGCAGGA | Overexpression |
| KMT1A-ΔSET | CCGGAATTC ATGGTGGGGATGAGTCGCCTGAGA | CGCGGATCC ATCGGATACCCTTCTGTACCACAC | Overexpression |
| STAT3 | CCGCTCGAG ATGGCCCAATGGAATCAGC | CCCAAGCTT TCACATGGGGGAGGTAGCGCACTC | Overexpression |
| GATA3 | AAGGAAAAAAGCGGCCGCAA ATGGAGGTGACGGCGGACCAGCCG | CCGGAATTC CTAACCCATGGCGGTGACCATGCTG | Overexpression |

Red bases indicate the shRNAs against KMT1A, STAT3, GATA3 and scramble RNA.
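The shRNA oligonucleotides in Table S4 share one layout: a 19-nt target sequence, a loop, the reverse complement of the target, and fixed flanking bases. The sketch below rebuilds both oligos from a target sequence using only the fixed segments that can be read off the table. The function names are hypothetical, and nothing here is meant to describe the vector or the cloning sites used by the authors.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def shrna_oligos(target: str) -> tuple:
    """Build the (forward, reverse) oligo pair in the layout seen in Table S4.

    forward: T + target + TTCAAGAGA + antisense + TTTTTTC
    reverse: TCGAGAAAAAA + target + TCTCTTGAA + antisense + A
    """
    antisense = reverse_complement(target)
    forward = "T" + target + "TTCAAGAGA" + antisense + "TTTTTTC"
    reverse = "TCGAGAAAAAA" + target + "TCTCTTGAA" + antisense + "A"
    return forward, reverse


# Target sequences as listed in Table S4.
targets = {
    "KMT1A": "GCACAAGTTTGCCTACAAT",
    "STAT3": "GCAGCAGCTGAACAACATG",
    "GATA3": "GCTCTACTACAAGCTTCAC",
    "Scramble": "TTCTCCGAACGTGTCACGT",
}
for gene, target in targets.items():
    forward, reverse = shrna_oligos(target)
    print(gene, forward, reverse, sep="\n  ")
```

Writing the pair this way also makes the internal consistency visible: apart from its leading TCGA, the reverse oligo is the reverse complement of the forward oligo.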

Supplementary Table S5. The 56 upregulated genes of BCSCs.

| GeneSymbol | seqname | fold change | type | chrom | strand | RNAlength | product | ProbeName | probeSeq |
|---|---|---|---|---|---|---|---|---|---|
| ADAM15 | NM_003815 | 2.7106 | coding | chr1 | + | 2823 | ADAM metallopeptidase domain 15 | ASHG19A3A004191 | GGCCCTGCTGGCACGAGGCACTAAGTCTCAGGGGCCAGCCAAGCCCCCACCCCCAAGGAA |
| AURKB | NM_004217 | 2.1477 | coding | chr17 | - | 1253 | Aurora kinase B | ASHG19A3A007875 | AATTGATGACTTTGAGATTGGGCGTCCTCTGGGCAAAGGCAAGTTTGGAAACGTGTACTT |
| BACE2 | NM_138991 | 10.3771 | coding | chr21 | + | 2843 | Beta-site APP-cleaving enzyme 2 | ASHG19A3A005906 | ATGCGGTGGTGGAAGCTGTGGCCCGCGCATCTCTGCTTTACATTCAGCCCATGATGGGGG |
| BCL2 | NM_000633 | 19.8903 | coding | chr18 | - | 6492 | B-cell CLL/lymphoma 2 | ASHG19A3A010357 | CTTTGTTTGTGTTGTTGGAAAAAGTCACATTGCCATTAAACTTTCCTTGTCTGTCTAGTT |
| BCL2L1 | NM_138578 | 2.3617 | coding | chr20 | - | 2575 | BCL2-like 1 | ASHG19A3A017538 | TCGCAGCTTGGATGGCCACTTACCTGAATGACCACCTAGAGCCTTGGATCCAGGAGAACG |
| CD44 | NM_001001 | 2.2336 | coding | chr11 | + | 5619 | CD44 molecule (Indian blood group) | ASHG19A3A000650 | AGCACAGACAGAATCCCTGCTACCAGTACGTCTTCAAATACCATCTCAGCAGGCTGGGAG |
| CENPM | NM_024053 | 2.4085 | coding | chr22 | - | 947 | Centromere protein M | ASHG19A3A019866 | CACACCGTGGTGAAGCTGGCCCACACCTATCAAAGCCCCCTGCTCTACTGTGACCTGGAG |
| CFLAR | NM_003879 | 2.0562 | coding | chr2 | + | 2215 | CASP8 and FADD-like apoptosis regulator | ASHG19A3A004203 | ATTCAGGAATCAGAAGCTTTTTTGCCTCAGAGCATACCTGAAGAGAGATACAAGATGAAG |
| CGREF1 | NM_006569 | 3.4914 | coding | chr2 | - | 1934 | Cell growth regulator with EF-hand domain 1 | ASHG19A3A013413 | TAAGTTAAGGGGCAGATTACCAATAAAGAACTGAATGAATTCATCCCCCCGGCCACCTCT |
| COMT | NM_000754 | 4.5423 | coding | chr22 | + | 2304 | Catechol-O-methyltransferase | ASHG19A3A020100 | CCAGACCTGAGTGGCAGAAAGCAAAAAGTTCCTTTGCTGCTTTAATTTTTAAATTTTCTT |
| CSNK1D | NM_001331 | 2.0004 | coding | chr11 | + | 6282 | Catenin (cadherin-associated protein), delta 1 | ASHG19A3A001630 | AAAGGAGGATAAGCAAGTCGAATTTTTGTCTTACGCTCTCTCCTTCCTGCTTCCTCCTTG |
| CSNK2A1 | NM_177560 | 2.1277 | coding | chr20 | - | 2522 | Casein kinase 2, alpha 1 polypeptide | ASHG19A3A006463 | CGGCCGCCGCCGCTGCCGCTTCCACCACAGAAATCAAGATGACTACCAGCTGGTTCGAAA |
| CTNNB1 | NM_001098 | 11.1300 | coding | chr3 | + | 3256 | Catenin (cadherin-associated protein), beta 1, 88kDa | ASHG19A3A001661 | TACTGACCTGTAAATCATCCTTTAGGAGTAACAATACAAATGGATTTTGGGAGTGACTCA |
| ESPL1 | NM_012291 | 2.8985 | coding | chr12 | + | 6641 | Extra spindle pole bodies homolog 1 (S. cerevisiae) | ASHG19A3A048641 | ATGTCCTGAACCCTCACAATAACCTGTCAAGCACAGAGGAGCAATTTCGAGCCAATTTCA |
| FHIT | NM_002012 | 3.0335 | coding | chr3 | - | 1103 | Fragile histidine triad gene | ASHG19A3A003975 | AGCTCTGCGGGTCTACTTTCAGTGACACAGatgtttttcagatcctgaattccagcaaaa |
| GALNT1 | NM_020474 | 3.2625 | coding | chr18 | + | 3852 | UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-ac | ASHG19A3A010598 | AAACTTAATGGCCCAGTTACAATGCTCAAATGCCACCACCTAAAAGGCAACCAACTCTGG |
| GLI1 | NM_001160 | 2.1913 | coding | chr12 | + | 3414 | GLI family zinc finger 1 | ASHG19A3A003031 | TCCCGAGCCCAGCGCCCAGACAGAGGCCCACTCTTTTCTTCTCCCCGGAGTGCAGTCAAG |
| GPX1 | NM_201397 | 3.1254 | coding | chr3 | - | 1200 | Glutathione peroxidase 1 | ASHG19A3A021066 | TGACTCATAGAAAATCTCCCTTGTTTGTGGTTAGAACGTTTCTCTCCTCCTCTTGACCCC |
| GSK3B | NM_001146 | 2.0365 | coding | chr3 | - | 7095 | Glycogen synthase kinase 3 beta | ASHG19A3A002922 | AGTATTGCAGGACAAGAGATTTAAGAATCGAGAGCTCCAGATCATGAGAAAGCTAGATCA |
| GSTM4 | NM_147148 | 3.8872 | coding | chr1 | + | 1372 | Glutathione S-transferase mu 4 | ASHG19A3A006061 | GAAGGACTTCATCTCCCGCTTTGAGGTTTCCTGTGGCATAATGTGATGGTCAATTTTCTG |
| HDAC1 | NM_004964 | 2.5345 | coding | chr1 | + | 2091 | Histone deacetylase 1 | ASHG19A3A019578 | TACATTAAATTCTTGCGCTCCATCCGTCCAGATAACATGTCGGAGTACAGCAAGCAGATG |
| HDAC4 | NM_006037 | 2.4345 | coding | chr2 | - | 8980 | Histone deacetylase 4 | ASHG19A3A004457 | TGGGCGTGAAGCCCGCCGAAAAGAGACCAGATGAGGAGCCCATGGAAGAGGAGCCGCCCC |
| KCNAB1 | NM_172160 | 4.9758 | coding | chr3 | + | 3715 | Potassium voltage-gated channel, shaker-related subfam | ASHG19A3A023432 | AAAGACAAATCTCCCAAGAAAGCCTCAGAAAACGCTAAAGACAGCAGCCTTAGTCCCTCA |
| KLK5 | NM_001077 | 4.1063 | coding | chr19 | - | 1367 | Kallikrein-related peptidase 5 | ASHG19A3A001521 | CCCCACCCCCTACCTGGGGGACAGGGTGCAGCGGCCATGGCTACAGCAAGACCCCCCTGG |
| KMT1A | NM_003173 | 3.1256 | coding | chrX | + | 2745 | Suppressor of variegation 3-9 homolog 1 (Drosophila) | ASHG19A3A040957 | TGGTTCCACTGGTCTCAAAAGTCACCTGCCTACAAATGTACAAAAGGCGAAGGTTCTGAT |
| LGALS14 | NM_203471 | 2.4811 | coding | chr19 | + | 1099 | Lectin, galactoside-binding, soluble, 14 | ASHG19A3A012679 | AAATCAATCATTCCCTCCAGTTATGTCCCTGACCCACAGGCTTCATTTGTGCAAGTACTG |
| MAPKAP1 | NM_001006 | 2.5944 | coding | chr9 | - | 3254 | Mitogen-activated protein kinase associated protein 1 | ASHG19A3A000822 | GAGTTCTGCCTGGTCCGCGAGAACAGTATCTCTGGAGACAAAGTAGAGATAGACCCTGTT |
| MSI2 | NM_170721 | 2.8679 | coding | chr17 | + | 2151 | Musashi homolog 2 (Drosophila) | ASHG19A3A009761 | TACACGTCAAAATGGCCGATCTGACATCGGTGCTCACTTCTGTTATGTTTTCTCCCTCTA |
| MUC1 | NM_001044 | 2.3652 | coding | chr1 | - | 1005 | Mucin 1, cell surface associated | ASHG19A3A001460 | GTTTTCTGGGCCTCTCCAATATTAAGTTCAGTGAGTGATGTGCCATTTCCTTTCTCTGCC |
| NDRG2 | NM_201541 | 2.5684 | coding | chr14 | - | 2077 | NDRG family member 2 | ASHG19A3A006919 | TCGGGACGCAGCAAAGAGAGGAGAGACCCCAGAGTCAGAAGGAGTGAGAACCCTGACCCC |
| NSD1 | NM_022455 | 2.1564 | coding | chr5 | + | 12998 | Nuclear receptor binding SET domain protein 1 | ASHG19A3A029223 | CGCCTGCGGCCGCGTCTGCTCGGGGCCTGAGGCCTCGAAGACCCCAGCCCAAGCCCCCAG |
| PAX5 | NM_016734 | 2.5364 | coding | chr9 | - | 3650 | Paired box 5 | ASHG19A3A037335 | GAAAAAATCGCTGAATATAAACGCCAAAATCCCACCATGTTTGCCTGGGAGATCAGGGAC |
| PDLIM7 | NM_203352 | 4.4257 | coding | chr5 | - | 1607 | PDZ and LIM domain 7 (enigma) | ASHG19A3A027748 | ATGTTTGGCACGAAATGCCATGGCTGTGACTTCAAGATCGACGCTGGGGACCGCTTCCTG |
| PHF12 | NM_001033 | 2.0681 | coding | chr17 | - | 4475 | PHD finger protein 12 | ASHG19A3A008122 | ATATGCTGGACGAGAAGCTGATCAAGTTTCTGGCCTTGCAGAGAATACATCAGCTTTTCC |
| PLAUR | NM_002659 | 2.2740 | coding | chr19 | - | 1548 | Plasminogen activator, urokinase receptor | ASHG19A3A011539 | AATGACTGCCAGACTGTGGGGAGGCACTCTCCTCTGGACCTAAACCTGAAATCCCCCTCT |
| POC1A | NM_001161 | 2.1137 | coding | chr3 | - | 2284 | POC1 centriolar protein homolog A (Chlamydomonas) | ASHG19A3A021132 | GGCCCTGGGGAAGCCCCTTCCTCACCCAGGCTTCACACATTTGACCCTGGCTCTCTCTCA |
| PSMC3 | NM_013290 | 2.1456 | coding | chr17 | - | 1318 | PSMC3 interacting protein | ASHG19A3A004670 | CAGAGCTGCCGCTACATGGAGGCTGAGATGCAGAAAGAAATCCAGGAGTTAAAGAAGGAA |
| SEC61A2 | NM_001142 | 4.1663 | coding | chr10 | + | 2462 | Sec61 alpha 2 subunit (S. cerevisiae) | ASHG19A3A002594 | CAGAAATTCAGAAACCGGAAAGGAAAATCCCACTGTTTGGAATCATGTCATCAGATTCTG |
| SET | NM_001122 | 2.1943 | coding | chr9 | + | 2863 | SET nuclear oncogene | ASHG19A3A039290 | TCATGTAAATATCCCTTCCAGGCAGGGCGGCTCCAGTGCAGATTTAAGCCGCTGGCACCT |
| SHBG | NM_001146 | 9.8969 | coding | chr17 | + | 932 | Sex hormone-binding globulin | ASHG19A3A002955 | ACTCAGGCAGAATTCAATCTCCGAGGAGAAGACTCTTCCACCTCTTTTTGCCTGAATGGC |
| SKP2 | NM_005983 | 3.4860 | coding | chr5 | + | 1600 | S-phase kinase-associated protein 2 (p45) | ASHG19A3A028110 | TTTTATTCTTGGTTTTCCCTTTGCCTTCATTCTGCAAGTATACTAGGGAGCCATTTGAGA |
| SLC3A2 | NM_001012 | 2.8574 | coding | chr11 | + | 2129 | Solute carrier family 3 (activators of dibasic and neutral | ASHG19A3A000965 | GGTCTCAGCGCGGGGGACGACTCAGGCACCATGAGCCAGGACACCGAGGTGGATATGAAG |
| SLC41A3 | NM_001008 | 2.9617 | coding | chr3 | - | 1698 | Solute carrier family 41, member 3 | ASHG19A3A000876 | CGGCATGCTTCTGGACTATTTCCAGGCCAACACTGGACAAATTGATGACCCCCAGGAGCA |
| SOX2 | NM_003106 | 3.2564 | coding | chr3 | + | 2518 | SRY (sex determining region Y)-box 2 | ASHG19A3A023598 | CGGCTCTGTATTATTTGAATCAGTCTGCCGAGAATCCATGTATATATTTGAACTAATATC |
| SRPR | NM_003139 | 2.2654 | coding | chr11 | - | 3100 | Signal recognition particle receptor (docking protein) | ASHG19A3A045854 | TTCACCCATGAGGCACTCACACTCAAGTATAAACTGGACAACCAGTTTGAGCTGGTGTTT |
| STAT3 | NM_003150 | 2.2456 | coding | chr17 | - | 4953 | Signal transducer and activator of transcription 3 (acute- | ASHG19A3A008368 | TCCGACGTCGCAGCCGAGGGAACAAGCCCCAACCGGATCCTGGACAGGCACCCCGGCTTG |
| TBCK | NM_001163 | 2.9572 | coding | chr4 | - | 3541 | TBC1 domain containing kinase | ASHG19A3A024582 | CCTTCTGGCCCCAAATCAGATGTATGGTCTCTTGGAATCATTTTATTTGAGCTTTGTGTG |
| TRIM25 | NM_005082 | 2.4991 | coding | chr17 | - | 5744 | Tripartite motif-containing 25 | ASHG19A3A008606 | CCTTGTTTCAATTGGGGTAGGAGTTCCAGGAGGTGTGGATTGGAGGTTGTCTAAAGAATT |
| TSPAN3 | NM_198902 | 3.2282 | coding | chr15 | - | 3796 | Tetraspanin 3 | ASHG19A3A006826 | GGAAAGTCGCTGTGGACTTGCCACGGTGGAAAATGAGGTTGATCGCAGCATTCAGAAAGT |
| TTLL6 | NM_173623 | 2.0146 | coding | chr17 | - | 2542 | Tubulin tyrosine ligase-like family, member 6 | ASHG19A3A008521 | AAGGTTCTCATTTTCCTCCTCCAATGAGAATTGGCACCCCAACAGCAGGCCCTGGGCCCT |
| UBE2C | NM_007019 | 2.0145 | coding | chr20 | + | 823 | Ubiquitin-conjugating enzyme E2C | ASHG19A3A004573 | GACCATCCATGGAGCAGCTGGAACAGTATATGAAGACCTGAGGTATAAGCTCTCGCTAGA |
| UBE2K | NM_001111 | 2.3127 | coding | chr4 | + | 5106 | Ubiquitin-conjugating enzyme E2K (UBC1 homolog, ye | ASHG19A3A001881 | CAAGGAGGTGCTGAAGAGCGAGGAGGTCCGGTTTATCACTAAAATATGGCATCCTAATAT |
| UGDH | NM_001184 | 3.4843 | coding | chr4 | - | 3013 | UDP-glucose 6-dehydrogenase | ASHG19A3A003863 | GCCATCAAAGAAGCTGATCTTGTATTTATTTCTGTGCTGTCCAACCCTGAGTTTCTGGCA |
| USP19 | NM_006677 | 2.1256 | coding | chr3 | - | 4401 | Ubiquitin specific peptidase 19 | ASHG19A3A021056 | TCTTTATCTGCCGGTGCCCTTGCCACAAAAGCAAAAGGTTCTCCCTGTCTTTTATTTTGC |
| WNT3 | NM_030753 | 2.5784 | coding | chr17 | - | 1506 | Wingless-type MMTV integration site family, member 3 | ASHG19A3A008472 | TTCTGACAAGCCCGAAAGTCATTTCCAATCTCAAGTGGACTTTGTTCCAACTATTGGGGG |
| WNT5A | NM_003392 | 2.4359 | coding | chr3 | - | 5855 | Wingless-type MMTV integration site family, member 5 | ASHG19A3A021175 | AGTGGCTTTGGCCATATTTTTCTCCTTCGCCCAGGTTGTAATTGAAGCCAATTCTTGGTG |

Supplementary Table S6. The 103 downregulated genes of BCSCs.

| GeneSymbol | seqname | fold change | type | chrom | strand | RNAlength | product | ProbeName | probeSeq |
|---|---|---|---|---|---|---|---|---|---|
| ABCD3 | NM_002858 | 3.0911 | coding | chr1 | + | 3616 | ATP-binding cassette, sub-family D (ALD), member 3 | ASHG19A3A026158 | TTTGCGAATGTCTCAAGCTCTGGGTCGAATAGTTTTGGCTGGGCGTGAAATGACTAGATT |
| ACSL3 | NM_203372 | 3.2110 | coding | chr2 | + | 4262 | Acyl-CoA synthetase long-chain family member 3 | ASHG19A3A006964 | TAGTACCTCCTACCATTGTCAACTGATTCTCGCTGAAGTCTGTTAATTCTACTTTTTGAG |
| ADD3 | NM_016824 | 2.8893 | coding | chr10 | + | 4454 | Adducin 3 (gamma) | ASHG19A3A044478 | CTGTTTCACAAATTCAGTCTCAAACTCAGTCACCGCAAAATGTCCCTGAAAAATTAGAAG |
| AHCYL1 | NM_006621 | 2.5457 | coding | chr1 | + | 4024 | Adenosylhomocysteinase-like 1 | ASHG19A3A027735 | AATTTCTGTGTGAAGAACATCAAGCAGGCAGAATTTGGACGCCGGGAGATTGAGATTGCA |
| ALDH1A3 | NM_000693 | 7.4165 | coding | chr15 | + | 3510 | Aldehyde dehydrogenase 1 family, member A3 | ASHG19A3A053304 | AACTGCTACAACGCCCTCTATGCACAGGCTCCATTTGGTGGCTTTAAAATGTCAGGAAAT |
| ARGLU1 | NM_018011 | 15.2430 | coding | chr13 | - | 1772 | Arginine and glutamate rich 1 | ASHG19A3A049821 | GAGCTAGAGCGAATACTGGAAGAGAATAACCGAAAAATTGCAGAAGCACAAGCCAAACTG |
| ARHGAP12 | NM_018287 | 3.3935 | coding | chr10 | - | 4172 | Rho GTPase activating protein 12 | ASHG19A3A042592 | CAGACAGTTGCCAAAGCCAAACCAAGACACAATGCAGATTCTTTTCCGACATCTCAGAAG |
| ARHGAP29 | NM_004815 | 3.7156 | coding | chr1 | - | 9121 | Rho GTPase activating protein 29 | ASHG19A3A048459 | CCCACGAAATGTAGGGATTGTGAAGGCATTGTAGTGTTCCAAGGTGTTGAATGTGAAGAG |
| ARID4A | NM_002892 | 2.4323 | coding | chr14 | + | 5792 | AT rich interactive domain 4A (RBP1-like) | ASHG19A3A051471 | AACACAGAGAAAAACATCCGAATTCATCCCCTAGGACATATAAATGGAGCTTTCAGCTCA |
| ARMC1 | NM_018120 | 4.4423 | coding | chr8 | - | 2657 | Armadillo repeat containing 1 | ASHG19A3A035823 | ACTTTTCAAATGGCTGTTCAAAGGTGTGTGGTGCGAATCCGTTCAGATTTGAAAGCTGAG |
| BAZ1A | NM_182648 | 2.8464 | coding | chr14 | - | 5935 | Bromodomain adjacent to zinc finger domain, 1A | ASHG19A3A006615 | AAGAGCAACTAACTGATGCTGACACCAAAGGCTGCAGTTTGAAAAGTTTGGATCTTGATA |
| BAZ1B | NM_032408 | 3.3387 | coding | chr7 | - | 6133 | Bromodomain adjacent to zinc finger domain, 1B | ASHG19A3A032995 | ATGGAACAGCAACAGAAGTTGCTGTAGAGACAACCACACCCAAACAAGGACAGAACCTAT |
| BIRC2 | NM_001166 | 2.5920 | coding | chr11 | + | 3753 | Baculoviral IAP repeat-containing 2 | ASHG19A3A046899 | TACTGGCCATCTAGTGTTCCAGTTCAGCCTGAGCAGCTTGCAAGTGCTGGTTTTTATTAT |
| BNIP2 | NM_004330 | 2.6419 | coding | chr15 | - | 2533 | BCL2/adenovirus E1B 19kDa interacting protein 2 | ASHG19A3A052211 | GGCAGAACTAGCAGAACTTGTCCCCATGGAATACGTTGGCATACCAGAATGCATAAAACA |
| BTAF1 | NM_003972 | 3.9808 | coding | chr10 | + | 7054 | BTAF1 RNA polymerase II, B-TFIID transcription fac | ASHG19A3A004212 | TAAAGCTACAGGCCACGTATTCCAGGCATTACAGTACTTACGTAAACTGTGCAACCATCC |
| C11orf57 | NM_001082 | 2.8468 | coding | chr11 | + | 3199 | Chromosome 11 open reading frame 57 | ASHG19A3A001603 | ACCCTGAAGAATTTGAAACAGACAGTAGTGATCAGCAAGATATTACCAACGGGAAGAAAA |
| C14orf129 | NM_016472 | 2.8846 | coding | chr14 | + | 2110 | Chromosome 14 open reading frame 129 | ASHG19A3A051750 | TGACCTGTGAGGATTCCTTCCCTTCAGGTACTGGATTCTTGATCTTTCTGCATCATCAAG |
| C14orf135 | NM_022495 | 4.8195 | coding | chr14 | + | 3952 | Chromosome 14 open reading frame 135 | ASHG19A3A051487 | TGCTTTGGACTGGCTCACAGAAAAGCCAGAACTGTTTCAACTAGCACTGAAAGCATTCAG |
| C5orf28 | NM_022483 | 3.3813 | coding | chr5 | - | 2761 | Chromosome 5 open reading frame 28 | ASHG19A3A026756 | TGGATTTTTAGCCTCTGTTATTGATGTAGACCACTTTTTTCTAGCTGGATCCATGTCTTT |
| C6orf192 | NM_052831 | 3.5066 | coding | chr6 | - | 2440 | Chromosome 6 open reading frame 192 | ASHG19A3A030546 | TTTACCCAAAGTTGGCCTTATAGCCTTCGTCATCAACTCACTCAGCTCGTGTTTTGGCTT |
| C7orf68 | NM_001098 | 3.0048 | coding | chr7 | + | 1318 | Chromosome 7 open reading frame 68 | ASHG19A3A001698 | TGCGCTCTGCGGCTGACGGCGCTTTTGTCTCCGGGTCCAGAGGCCTTTCAGAAGGAGAAG |
| CAPZA2 | NM_006136 | 3.5798 | coding | chr7 | + | 2373 | Capping protein (actin filament) muscle Z-line, alpha 2 | ASHG19A3A034943 | TCAGTAGAAACTGCTCTGAGAGCTTACGTAAAAGAACATTACCCGAATGGAGTCTGCACT |
| CASD1 | NM_022900 | 3.9874 | coding | chr7 | + | 3898 | CAS1 domain containing 1 | ASHG19A3A034721 | AGCAGTTGTAAAAACAAAGCAGAGTGCAATGAACTCCATCCGTCTGTTTCTGTGGTACAG |
| CASZ1 | NM_001079 | 2.6832 | coding | chr1 | - | 7938 | Castor zinc finger 1 | ASHG19A3A054819 | CTCCTCGGCACCTGGTTATTAAGAACTGAATATTTTTCCACTTGAATTTAGTGCTATTAG |
| CCBL2 | NM_001008 | 3.4667 | coding | chr1 | - | 2065 | Cysteine conjugate-beta lyase 2 | ASHG19A3A000883 | GAGGCGAGGTTCCCGCACCGGATAGAAAATGTCACTGAAATTCACAAATGCAAAACGGAT |
| CCDC82 | NM_024725 | 3.1440 | coding | chr11 | - | 2746 | Coiled-coil domain containing 82 | ASHG19A3A045616 | TCGATGACTTTGTAGTGCAAGATGAGGAGGGTGATGAAGAGAATAAAAACCAACAAGGAG |
| CENPC1 | NM_001812 | 2.7977 | coding | chr4 | - | 3349 | Centromere protein C 1 | ASHG19A3A024249 | TGTCACGAAAAGTCGAAGAATTTCCAGGCGTCCATCTGATTGGTGGGTGGTAAAATCAGA |
| CEP57 | NM_014679 | 3.5139 | coding | chr11 | + | 3158 | Centrosomal protein 57kDa | ASHG19A3A004752 | TGCGGCTTCTGGTTCTCACTTGTCGAACAGCTTTGCTGAGCCATCAAGGTCTAATGGAAG |
| CEP70 | NM_024491 | 2.4397 | coding | chr3 | - | 2678 | Centrosomal protein 70kDa | ASHG19A3A021652 | TCAAAACAGAGTGTTTGCCTATCTGTGCAAAAGAGTTCCTCATACCGTCTTGGATAGACA |
| CHORDC1 | NM_012124 | 4.2062 | coding | chr11 | - | 3400 | Cysteine and histidine-rich domain (CHORD)-contain | ASHG19A3A004624 | TCCGGTCTTTCACGATGCATTAAAGGGTTGGTCTTGCTGTAAGAGAAGAACAACTGATTT |
| CLK1 | NM_004071 | 2.9306 | coding | chr2 | - | 1933 | CDC-like kinase 1 | ASHG19A3A004232 | TACTTCACATCGTCGTTCACATGGGAAGAGTCACCGAAGGAAAAGAACCAGGAGTGTAGA |
| CMTM6 | NM_017801 | 3.1282 | coding | chr3 | - | 3384 | CKLF-like MARVEL transmembrane domain contain | ASHG19A3A020851 | TTGGCATCCATCATTTTTGTTTCCACACATGACAGGACTTCAGCTGAGATTGCTGCAATT |
| COPG | NM_016128 | 4.4473 | coding | chr3 | + | 3114 | Coatomer protein complex, subunit gamma | ASHG19A3A023177 | TATTGTGAAGTTCTTGGGAATGCACCCTTGTGAGAGGTCAGACAAAGTGCCGGATAACAA |
| CYB5R4 | NM_016230 | 5.2796 | coding | chr6 | + | 2259 | Cytochrome b5 reductase 4 | ASHG19A3A031779 | AATCCATGCTGAAAGAATGCCTGGTTGGCAGAATGGCCATTAAACCTGCTGTTCTGAAAG |
| DCK | NM_000788 | 4.4441 | coding | chr4 | + | 2618 | Deoxycytidine kinase | ASHG19A3A025656 | AACCAATTTGGCCAAAGCCTTGAATTGGATGGAATCATTTATCTTCAAGCCACTCCAGAG |
| DEK | NM_003472 | 5.2720 | coding | chr6 | - | 2879 | DEK oncogene | ASHG19A3A029455 | TAGAGAGGTTGACAATGCAAGTCTCTTCCTTACAGAGAGAGCCATTTACAATTGCACAAG |
| DENND1B | NM_001142 | 4.0736 | coding | chr1 | - | 2689 | DENN/MADD domain containing 1B | ASHG19A3A009300 | GCACTGTTCAACACAGCAATGACCAAAGCAACCCCTGCTGTACGGACAGCATATAAATTT |
| DEPDC1 | NM_001114 | 2.7512 | coding | chr1 | - | 5356 | DEP domain containing 1 | ASHG19A3A047015 | CAATAGAACTTTCAGAAAATTCTTTACTTCCAGCTTCTTCTATGTTGACTGGCACACAAA |
| DLG1 | NM_004087 | 4.7847 | coding | chr3 | - | 5034 | Discs, large homolog 1 (Drosophila) | ASHG19A3A022160 | TTTCCCGAAAATTCCCCTTCTACAAGAACAAGGACCAGAGTGAGCAGGAAACAAGTGATG |
| DNAJC13 | NM_015268 | 3.2460 | coding | chr3 | + | 7551 | DnaJ (Hsp40) homolog, subfamily C, member 13 | ASHG19A3A023225 | TAAGGAAAAGCTTAGCTGGCATGCTGACACCCTATGTTGCTAGAAAACTTGCTGTGGCTA |
| DNAJC7 | NM_003315 | 3.0262 | coding | chr17 | - | 2096 | DnaJ (Hsp40) homolog, subfamily C, member 7 | ASHG19A3A004132 | TGCTCGACGACCAAGAGGCGAAGAGGGAAGCAGAGACTTTCAAGGAACAAGGAAATGCAT |
| DNMT1 | NM_001130 | 2.3450 | coding | chr19 | - | 5425 | DNA (cytosine-5-)-methyltransferase 1 | ASHG19A3A011027 | TAGCCCCAGGATTACAAGGAAAAGCACCAGGCAAACCACCATCACATCTCATTTTGCAAA |
| EED | NM_152991 | 3.2554 | coding | chr11 | + | 2413 | Embryonic ectoderm development | ASHG19A3A046823 | TAGGGTAGACACTGACAACGTTATGTGTGGTCTTTAACCTGTTGTCATGTTTTTTCCCTA |
| EIF3E | NM_001568 | 3.5682 | coding | chr8 | - | 1516 | Eukaryotic translation initiation factor 3, subunit E | ASHG19A3A036045 | GCGCACTTTTTGGATCGGCATCTAGTCTTTCCGCTTCTTGAATTTCTCTCTGTAAAGGAG |
| ERH | NM_004450 | 9.4280 | coding | chr14 | - | 801 | Enhancer of rudimentary homolog (Drosophila) | ASHG19A3A050914 | ACCAAGAGGCCAGAAGGCAGAACTTATGCTGACTACGAATCTGTGAATGAATGCATGGAA |
| ERRFI1 | NM_018948 | 3.4708 | coding | chr1 | - | 3144 | ERBB receptor feedback inhibitor 1 | ASHG19A3A052429 | TTTTTAAATATTGACCCGATAACCATGGCCTACAGTCTGAACTCTTCTGCTCAGGAGCGC |
| ESCO1 | NM_052911 | 3.1848 | coding | chr18 | - | 4499 | Establishment of cohesion 1 homolog 1 (S. cerevisiae) | ASHG19A3A010189 | ATGGGTATTCAGCATGATGCGTCGGAAGAAAATTGCTTCTCGCATGATTGAATGCCTAAG |
| ETAA1 | NM_019002 | 3.1598 | coding | chr2 | + | 3298 | Ewing tumor-associated antigen 1 | ASHG19A3A015760 | TATTCCTTGTACTCCCAGTGTAGCAAAAGGAAAATCAAGAGCAAAAATCAGCTGCACAAA |
| FAM18B | NM_016078 | 7.1574 | coding | chr17 | + | 1821 | Family with sequence similarity 18, member B1 | ASHG19A3A009221 | TGTTTCACTGTTTGATGCGGAAGAGGAGACGACTAATAGACCAAGAAAAGCCAAAATCAG |
| FAM18B2 | NM_001135 | 3.7760 | coding | chr17 | - | 4244 | Family with sequence similarity 18, member B2 | ASHG19A3A007970 | CCAGTTAATATAAGTGGAATCATCATAGTTTAAGGAATACCCAGAGATTGCTGCTATTCT |
| GATA3 | NM_001002 | 2.7486 | coding | chr10 | + | 3070 | GATA binding protein 3 | ASHG19A3A043584 | TGCTAAACGACCCCTCCAAGATAATTTTTAAAAAACCTTCTCCTTTGCTCACCTTTGCTT |
| GLMN | NM_053274 | 2.4587 | coding | chr1 | - | 2042 | Glomulin, FKBP associated protein | ASHG19A3A048189 | CATCAAGAATATGGGCTGGAATCTCGTTGGTCCTGTTGTTCGATGCCTTTTGTGTAAAGA |
| GPR160 | NM_014373 | 2.9604 | coding | chr3 | + | 2021 | G protein-coupled receptor 160 | ASHG19A3A023513 | TGAAGGCAGTAAAAGTGAAATTAAATAGGAAGATCATCAGTCAAGGAAGACCCACTGGAG |
| GPR89A | NM_001097 | 2.7756 | coding | chr1 | - | 1984 | G protein-coupled receptor 89A | ASHG19A3A001649 | GCTATTTTATTGTGAGCAATATCCGACTACTGCATAAACAACGACTGCTTTTTTCCTGTC |
| HERC4 | NM_015601 | 2.4441 | coding | chr10 | - | 4507 | Hect domain and RLD 4 | ASHG19A3A042877 | TGTTTCATGTGGAGAAGCTCATACGTTAGCGCTAAATGACAAAGGCCAGGTGTATGCTTG |
| HPS3 | NM_032383 | 3.3739 | coding | chr3 | + | 4451 | Hermansky-Pudlak syndrome 3 | ASHG19A3A023368 | TTTTCAAACTCACATCACAGTACATCTGGAGATTGTCTAAGAGGCAGCCTCCTGACACCA |
| IGF2BP1 | NM_006546 | 2.1578 | coding | chr17 | + | 8769 | Insulin-like growth factor 2 mRNA binding protein 1 | ASHG19A3A009669 | CTCCTCCGCTTGTAAGATGATCTTGGAGATTATGCATAAAGAGGCTAAGGACACCAAAAC |
| IGF2BP3 | NM_006547 | 2.5890 | coding | chr7 | - | 4168 | Insulin-like growth factor
2 mRNA binding protein 3 ASHG19A3A032548 ATTTGTTGGAGCCATCATAGGAAAAGAAGGTGCCACCATTCGGAACATCACCAAACAGAC IQCB1 NM_001023 3.0558 coding chr3 - 2594 IQ motif containing B1 ASHG19A3A021476 GAACTTAGACAGCTTGTTGGCCTTTTAAGCCCAATGGTCTATCAGGAAGTAGAAGAGCAG ITGB1 NM_033667 2.8207 coding chr10 - 3774 Integrin subunit beta 1 ASHG19A3A005658 cctcagcctcccgagtacctgggattacagGGTGAAAATCCTATTTATAAGAGTGCCGTA LMBRD1 NM_018368 2.4330 coding chr6 - 2308 LMBR1 domain containing 1 ASHG19A3A030117 TTAGAATTCATTGAAAACAGCTGGTGGACAAAATTTTGTGGCGCTCTGCGTCCCCTGAAG LRRC40 NM_017768 3.0481 coding chr1 - 2958 Leucine rich repeat containing 40 ASHG19A3A047048 AATAAACTTCAGTCACTTACAGATGACCTGCGACTCTTGCCTGCACTGACTGTTCTTGAT LRRFIP1 NM_001137 3.4124 coding chr2 + 3599 Leucine rich repeat (in FLII) interacting protein 1 ASHG19A3A017167 TTCGATCTGAAGATGATGTCTTGGAAAACGGGACAGACATGCATGTAATGGACCTACAAA MDM2 NM_002392 3.2410 coding chr12 + 7369 Mdm2 p53 binding protein homolog (mouse) ASHG19A3A048832 ATGTGCAATACCAACATGTCTGTACCTACTGATGGTGCTGTAACCACCTCACAGATTCCA MED17 NM_004268 2.6970 coding chr11 + 3497 Mediator complex subunit 17 ASHG19A3A046868 ACAAAAACAGGCTCCAGATATAGGTGACCTCGGCACAGTTAACCTCTTCAAACGACCTTT MEX3B NM_032246 3.0079 coding chr15 - 3398 Mex-3 homolog B (C. 
elegans) ASHG19A3A052468 GTTCAGACAAATCTTCTAGATCTGCTTCACCCAGCATATTTTCTATTCAGTGATATAAAG NDUFB4 NM_001168 2.6467 coding chr3 + 1560 NADH dehydrogenase (ubiquinone) 1 beta subcompleASHG19A3A023070 GGAGTGTTTCATTCTGTTTGTCAGTTGTACGGTGGGTTGTGCCAAAATGCAGTTTTTCTT NRD1 NM_001101 3.4578 coding chr1 - 3836 Nardilysin (N-arginine dibasic convertase) ASHG19A3A001804 TAGAAAAAAAACTACTGAAAAACAGTCTGCAGCGGCTCTTTGTGTTGGAGTTGGGAGTTT OXCT1 NM_000436 3.2929 coding chr5 - 3572 3-oxoacid CoA transferase 1 ASHG19A3A026740 GAAAAGTGCAAGGAATTTCAACTTGCCAATGTGCAAAGCTGCAGAAACCACAGTGGTAGA PGK1 NM_000291 8.0515 coding chrX + 2439 1 CUST_19_PI418767722 ATATTGCTGAATGCAAGAAGTGGGGCAGCAGCAGTGGAGAGATGGGACAATTAGATAAAT PHF10 NM_018288 3.9060 coding chr6 - 1692 PHD finger protein 10 ASHG19A3A030855 AGGAGGCGAATGGGCTCAGGAGATAGTTCTAGGAGTTGTGAAACTTCAAGTCAAGATCTT PIGK NM_005482 2.8915 coding chr1 - 4626 Phosphatidylinositol glycan anchor biosynthesis, class ASHG19A3A047313 TGAAGAAATTACCAACATAGAACTCGCGGATGCTTTTGAACAAATGTGGCAGAAAAGACG PLSCR1 NM_021105 2.8628 coding chr3 - 2228 Phospholipid scramblase 1 ASHG19A3A021720 TTCCTGTCCCAAATCAGCCAGTGTATAATCAGCCAGTATATAATCAGCCAGTTGGAGCTG PNPLA8 NM_015723 5.0748 coding chr7 - 3548 Patatin-like phospholipase domain containing 8 ASHG19A3A033368 GTTGAAGAACTGACTTTTCATCTTCTAGAATTTCCTGAAGGAAAAGGAGTGGCTGTCAAG PNRC2 NM_017761 2.9771 coding chr1 + 2428 Proline-rich nuclear receptor coactivator 2 ASHG19A3A017927 CTTCTGTATTGAGACAAAGGAAGGGATCTGTCAGAAAGCAACACTTGTTATCTTGGGCTT PPIL4 NM_139126 4.9717 coding chr6 - 2481 Peptidylprolyl isomerase (cyclophilin)-like 4 ASHG19A3A030684 CACAAGAAGAAAGGCACAGTGTCCATGGTGAATAATGGCAGTGATCAACATGGATCTCAG PPP6C NM_002721 5.7342 coding chr9 - 4256 Protein phosphatase 6, catalytic subunit ASHG19A3A038035 GACCTTTGTGAACTGTTCAGAACTGGAGGTCAGGTTCCTGACACAAACTACATATTTATG PRCP NM_005040 2.7665 coding chr11 - 2161 Prolylcarboxypeptidase (angiotensinase C) ASHG19A3A004360 CTATTCGGTTCTCTACTTCCAACAGAAGGTTGATCATTTTGGATTTAATACTGTGAAAAC PSIP1 NM_001128 3.0552 coding chr9 - 
3393 PC4 and SFRS1 interacting protein 1 ASHG19A3A037138 ATGGTAATCAGCCACAACATAACGGGGAGAGCAATGAAGACAGCAAAGACAACCATGAAG RAB22A NM_020673 2.6133 coding chr20 + 8702 RAB22A, member RAS oncogene family ASHG19A3A018497 GAGTATTGTGTGGCGGTTTGTGGAAGACAGTTTTGATCCAAACATCAACCCAACAATAGG RNF13 NM_183381 3.3741 coding chr3 + 2863 Ring finger protein 13 ASHG19A3A006685 TGTGTTGCGGGGGCCGGACTTCAAGGTGATTTTACAACGAGATGCTGCTCTCCATAGGGA RRM1 NM_001033 3.3601 coding chr11 + 3234 M1 ASHG19A3A046025 ATCCTGGCAGCCAGGATCGCTGTCTCTAACTTGCACAAAGAAACAAAGAAAGTGTTCAGT SCAMP1 NM_004866 2.7407 coding chr5 + 6245 Secretory carrier membrane protein 1 ASHG19A3A028420 AAAGAAAAGCCGCAGAATTAGATCGTCGGGAACGAGAAATGCAAAACCTCAGTCAACATG SENP6 NM_001100 3.3278 coding chr6 + 6624 SUMO1/sentrin specific peptidase 6 ASHG19A3A031742 TGGTAGACGTTTTCATCATGCTCATGCACAGATACCAGTAGTAAAAACAGCAGCCCAAAG SERINC1 NM_020755 9.1177 coding chr6 - 3144 Serine incorporator 1 ASHG19A3A030470 AGTGTCAACATGCTCCTCTGCGTTGGTGCTTCTGTAATGTCTATACTGCCAAAAATCCAA SKAP2 NM_003930 2.6278 coding chr7 - 3984 Src kinase associated phosphoprotein 2 ASHG19A3A032577 TTGCAGCACAAGACCTTCCTTTTGTTCTAAAGGCTGGCTACCTTGAAAAACGCAGAAAAG SLC25A46 NM_138773 3.1787 coding chr5 + 2358 Solute carrier family 25, member 46 ASHG19A3A028605 TGTCCAGGGAGTCACACTTGGAGCAGAAGGCATAATTAGTGAATTTACACCTTTGCCAAG SLC35A5 NM_017945 3.0514 coding chr3 + 2958 Solute carrier family 35, member A5 ASHG19A3A023001 TTATTTATAATGCCAGCAAGCCTCAAGTTCCGGAATACGCACCTAGGCAAGAAAGGATCC SLC38A2 NM_018976 4.3497 coding chr12 - 4961 Solute carrier family 38, member 2 ASHG19A3A047506 TCTCATTGTCCGTCTGGCTGTGTTAATGGCTGTGACCCTGACAGTACCAGTAGTTATTTT SLC38A9 NM_173514 2.7366 coding chr5 - 2554 Solute carrier family 38, member 9 ASHG19A3A026803 TTTTCTATGCCAATGACACAGGAGCCCAACAGTTTGAAAAGTGGTGGGATAAGTCCAGGA SNX14 NM_020468 5.0106 coding chr6 - 3346 Sorting nexin 14 ASHG19A3A005156 TCACCAACACGCAATTCAAAATTGAACAGGAACACACAGAAAAGGGGAGAATCATTTGGA TANK NM_004180 3.5037 coding chr2 + 2089 TRAF family member-associated NFKB activator 
ASHG19A3A016517 TTCTTCTCCTAGAAAAGAAACTTCAGCAAGGAGTCTTGGCAGTCCTTTGCTCCATGAAAG TBK1 NM_013254 2.7941 coding chr12 + 2982 TANK-binding kinase 1 ASHG19A3A048798 AAGCAGAAAATGGACCAATTGACTGGAGTGGAGACATGCCTGTTTCTTGCAGTCTTTCTC THOC2 NM_001081 2.5553 coding chrX - 5625 THO complex 2 ASHG19A3A040358 TCCAATTGATCTTGCTGGTCTTCTTCAGTATGTTGCCAATCAGCTAAAGGCGGGCAAAAG TMEM30A NM_018247 4.0197 coding chr6 - 4544 Transmembrane protein 30A ASHG19A3A030160 TCTCCGGATGTGACACCTTGCTTTTGTACCATTAACTTCACACTGGAAAAGTCATTTGAG TMEM38B NM_018112 3.0087 coding chr9 + 2076 Transmembrane protein 38B ASHG19A3A039078 TGAGAGGTTGGTAAAAGGAGATTGGAAACCAGAAGGTGATGAATGGCTGAAGATGTCATA TMX3 NM_019022 2.6468 coding chr18 - 4767 Thioredoxin-related transmembrane protein 3 ASHG19A3A010372 AAAGGTTTCAGAATTACCTTGCTATGGATGGCTTCCTCTTGTATGAACTTGGAGACACAG TRNT1 NM_182916 3.1728 coding chr3 + 2291 TRNA nucleotidyl transferase, CCA-adding, 1 ASHG19A3A022197 TTCAGTCGGCTGGGATTCGGATGATAAACAACAGAGGAGAAAAGCACGGAACAATTACTG USP10 NM_005153 2.6644 coding chr16 + 3399 Ubiquitin specific peptidase 10 ASHG19A3A054853 AGAATGTAACCCTAATCCATAAACCAGTGTCGTTGCAACCCCGTGGGCTGATCAATAAAG USP16 NM_001001 3.3027 coding chr21 + 2986 Ubiquitin specific peptidase 16 ASHG19A3A000694 CAAGCCAGCATTACAACTCCAAAGCCAGAGAAAGATAATGGAAATATTGAACTTGAAAAT VPS35 NM_018206 7.3001 coding chr16 - 3298 Vacuolar protein sorting 35 homolog (S. cerevisiae) ASHG19A3A053724 GGAACAAATTTGGTGCGCCTCAGTCAGTTGGAAGGTGTAAATGTGGAACGTTACAAACAG YARS NM_003680 2.5756 coding chr1 - 3117 Tyrosyl-tRNA synthetase ASHG19A3A042871 ACATGTGGCTTACTTTGTGCCCATGTCAAAGATTGCAGACTTCTTAAAGGCAGGGTGTGA ZFAND1 NM_001170 2.4545 coding chr8 - 731 Zinc finger, AN1-type domain 1 ASHG19A3A003539 TGTGATGATTGTTCAGGAATATTTTGCCTTGAACACAGAAGCAGGGAGTCTCATGGTTGT Supplementary Table S7. Datasets of bladder cancer samples. 
Dataset | Accession Number | Year | Country | Platform | # Samples | # Tumors (# invasive) | # Normal tissues | Median Follow-Up Time/months (max) | % Radical Cystectomy | Median Age/yr (range) | % Males | PMID
Ekaterini Blaveri | GSE1827 | 2005 | USA | cDNA microarrays | 80 | 80 (53) | 0 | 13 (145) | 62.50% | 66 (28–113) | 70.00% | 15479860
Wun-Jae Kim | GSE13507 | 2010 | South Korea | Illumina human-6 v2.0 | 255 | 188 (104) | 67 | 37 (137) | n/a | 66 (24–88) | 81.80% | 20421545
David Lindgren | GSE19915 | 2010 | Sweden | cDNA array | 285 | 144 (47) | 15 | 46 (180) | 32.60% | n/a | n/a | 20406976
Virginia Urquidi | GSE31189 | 2012 | USA | Affymetrix U133 Plus 2.0 arrays | 92 | 52 (22) | 40 | n/a | n/a | 68 (36–90) | 80.80% | 23097579
Markus Riester | GSE31684 | 2012 | USA | Affymetrix U133 Plus 2.0 arrays | 93 | 93 (78) | 0 | 32 (175) | 16.13% | n/a | 73% | 24486590
Gottfrid Sjödahl | GSE32894 | 2012 | Sweden | Illumina HumanHT-12 V3.0 expression beadchip | 308 | 308 (95) | 0 | 35 (110) | n/a | 72 (20–96) | 83.44% | 22553347
Yong-June Kim | GSE37815 | 2013 | South Korea | Illumina human-6 v2.0 expression beadchip | 24 | 18 (0) | 6 | n/a | 0% | n/a | n/a | 23436614
Hecker N | GSE40355 | 2013 | Germany | Agilent-026652 Whole Human Microarray 4x44K v2 | 24 | 16 (n/a) | 8 | n/a | n/a | n/a | n/a | n/a
Woonyoung Choi | GSE48276 | 2014 | USA | Illumina HumanHT-12 WG-DASL V4.0 R2 expression beadchip | 116 | 116 (116) | 0 | n/a | n/a | n/a | n/a | 24525232
Jakob Hedegaard | E-MTAB-4321 | 2016 | European | Illumina HiSeq 2000 | 460 | 460 (16) | 0 | 32 (75) | n/a | 69 (24–96) | 79.78% | 27321955
n/a: not available

Supplementary Table S8. KMT1A, GATA3, STAT3, CD44 and NSD1 analysis in bladder cancer datasets.
Column groups, in order: Expression: KMT1A | Survival: KMT1A (months) | Expression: GATA3 | Survival: GATA3 (months) | Expression: STAT3 | Survival: STAT3 (months) | Expression: CD44 | Survival: CD44 (months) | Expression: NSD1
Each Expression group lists Tumor, Normal tissue/Peri-tumor, P value; each Survival group lists High, Low, P value. Rows list only the groups reported for that dataset, in source order.
E-MTAB-4321: 38.2, 40.5, 0.0552 | 43.1, 36.5, 0.0002 | 36.7, 40.9, 0.0321 | 41, 37.2, 0.0404
GSE13507: 9.655 (N=187), 8.975 (N=68), < 0.0001 | 32.87, 36.57, 0.0086 | 11.05 (N=187), 10.77 (N=68), 0.1251 | 50.3, 23.715, 0.0017 | 9.341 (N=187), 9.733 (N=68), < 0.0001 | 42.95, 28.98, 0.0071 | 11.40 (N=187), 12.15 (N=68), < 0.0001 | 39.77, 28.53, 0.0963 | 8.269 (N=187), 8.260 (N=68), 0.8197
GSE1827: 19.25, 12.6, 0.7647 | 14.4, 10.85, 0.4685 | 11.65, 18.63, 0.1339
GSE19915: 48.6, 37.3, 0.0169 | 48, 43.2, 0.5512 | 48.45, 43.2, 0.7086
GSE31189: 514.9 (N=52), 542.0 (N=40), 0.6874 | 1319 (N=52), 1816 (N=40), 0.3416 | 1879 (N=52), 1379 (N=40), 0.1766 | 1418 (N=52), 734.2 (N=40), 0.0663 | 307.4 (N=52), 281.1 (N=40), 0.5759
GSE31684: 38.1602, 19.5811, 0.7546 | 42.7926, 19.5811, 0.2242 | 13.6673, 44.6817, 0.0044 | 7.32649, 33.232, 0.035
GSE32894: 36.36, 47.211, 0.0449 | 46.7507, 34.7836, 0.0254 | 33.8795, 46.7507, 0.0522 | 44.0548, 42.1151, 0.5903
GSE37815: 9.526 (N=18), 8.802 (N=6), 0.0033 | 11.49 (N=18), 9.832 (N=6), 0.0051 | 9.328 (N=18), 9.908 (N=6), 0.0078 | 11.55 (N=18), 12.46 (N=6), 0.0825 | 8.265 (N=18), 8.120 (N=6), 0.1708
GSE40355: 0.1044 (N=16), -1.036 (N=8), 0.0843 | 0.3521 (N=16), -3.518 (N=8), < 0.0001 | -0.2643 (N=16), 0.7662 (N=8), 0.0073 | -0.6894 (N=16), 2.158 (N=8), < 0.0001 | 0.1569 (N=16), -0.1017 (N=8), 0.3117

Supplementary Materials and Methods

Reagent and antibody. RPMI 1640 (Gibco) and KnockOut™ DMEM/F-12 (Gibco) media were used for cell culture. EGF (Gibco), bFGF (Gibco), N2 (Gibco), and B27 (Gibco) were used for tumorsphere formation. Collagenase type IV (Gibco) was used to digest tumor tissues from human patients or tumors formed in mouse models.

The following antibodies were used in the study: anti-KMT1A (Abcam, ab12405), anti-β-actin (Sigma, A1978), PE-conjugated anti-CD44 (BD, 550989), mouse anti-human CD44 (BD, 550988), mIgG (Sigma, M8770), anti-GATA3 (Abcam, ab199428), anti-STAT3 (Cell Signaling Technology, #12640), anti-p-STAT3, which recognizes the active form of STAT3 phosphorylated at amino acid residue Y705 (Cell Signaling Technology, #9145), anti-GLI1 (Cell Signaling Technology, #3538), anti-SOX2 (Cell Signaling Technology, #3579), anti-H3K9me3 (Abcam, ab8898), anti-pan Cytokeratin (Abcam, ab7753), and the mouse mAb BCMab1, which was purified from ascites with protein A-Sepharose.

Corresponding species-specific HRP/FITC/PE/APC-conjugated secondary antibodies were purchased from Sigma or Abcam (A0168, A0545, F5387, P9670, ab130782 and ab72465).

Gene expression analysis. The expression of different genes was analyzed in the TCGA database (http://firebrowse.org/). Cohort datasets were downloaded from NCBI. R and Bioconductor were used for background correction, normalization, calculation of gene expression, and annotation. Gene and expression lists generated with R 3.1.0 were used for further analysis.
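The normalization step can be illustrated with a short sketch. The text does not name the normalization method, so quantile normalization is used here purely as an assumed, commonly used choice for array data; the analysis itself was done in R and Bioconductor, and the function name `quantile_normalize` and the toy matrix below are hypothetical.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix (list of rows).

    Assumption: illustrative only; the source does not state which
    normalization method was applied. Ties are broken by row order.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])

    # Collect each sample (column) and sort its values.
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]

    # Reference distribution: mean across samples at each rank.
    rank_means = [
        sum(col[i] for col in sorted_cols) / n_samples for i in range(n_genes)
    ]

    # Replace every value with the reference value of its rank.
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, gene_index in enumerate(order):
            result[gene_index][j] = rank_means[rank]
    return result


# Toy example: 4 genes (rows) measured in 3 samples (columns).
toy = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 6.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(toy)
```

After normalization every sample shares the same value distribution, so differences between samples reflect rank changes of individual genes rather than array-wide intensity shifts.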

Immunohistochemistry. Sections (4 μm) were deparaffinized and rehydrated. After antigen retrieval, the sections were treated with 3% H2O2 solution, incubated with 10% bovine serum albumin for 30 minutes and then with primary antibody at 4 °C overnight, incubated with the corresponding secondary antibody, and subsequently stained with a DAB kit (ZSGB Bio). Nuclei were counterstained with hematoxylin. KMT1A staining was scored by multiplying the numerical score of the staining intensity (none = 1, weak = 2, moderate = 3, strong = 4) by the staining percentage (0%–100%), giving an overall product score.
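The product score described above can be written as a small worked example. This is a minimal sketch: the intensity scale is taken from the text, while the function name `ihc_product_score` and the choice to express the staining percentage as a number from 0 to 100 (rather than a fraction from 0 to 1) are assumptions made for illustration.

```python
# Intensity scale as given in the text.
INTENSITY_SCORES = {"none": 1, "weak": 2, "moderate": 3, "strong": 4}


def ihc_product_score(intensity, percent_positive):
    """Return intensity score x staining percentage.

    Assumption: percent_positive is given on a 0-100 scale.
    """
    if intensity not in INTENSITY_SCORES:
        raise ValueError(f"unknown intensity: {intensity!r}")
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    return INTENSITY_SCORES[intensity] * percent_positive


# A section with moderate staining in 60% of cells.
example_score = ihc_product_score("moderate", 60)  # 3 * 60 = 180
```

Under this assumed scale the score ranges from 0 (no stained cells) to 400 (strong staining in all cells).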

Overexpression of KMT1A, GATA3 and STAT3. KMT1A (Gene ID: 6839), KMT1A-ΔSET, GATA3 (Gene ID: 2625) and STAT3 (Gene ID: 6774) were amplified by PCR using the primers in Supplementary Table S4. The PCR products were purified and cloned into the pcDNA3.1 vector. BCSCs were cultured in DMEM/F-12 medium supplemented with 20 ng/mL EGF, 20 ng/mL bFGF, 1% N2 and 2% B27 to maintain an undifferentiated state. One week later, BCSCs were transfected with pcDNA3.1-KMT1A, pcDNA3.1-KMT1A-ΔSET, pcDNA3.1-GATA3 and/or pcDNA3.1-STAT3. Stable clones were obtained by selection with G418. All constructs were confirmed by DNA sequencing.