Non-Coding Variants Connect Enhancer Dysregulation with Nuclear Signaling in Hematopoietic Malignancies

Kailong Li1,2,5, Yuannyu Zhang1,2,5, Xin Liu1,2,5, Yuxuan Liu1,2,5, Zhimin Gu1,2, Hui Cao1,2, Kathryn E. Dickerson1,2, Mingyi Chen3, Weina Chen3, Zhen Shao4, Min Ni1, Jian Xu1,2,*

1Children’s Medical Center Research Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA

2Department of Pediatrics, Harold C. Simmons Comprehensive Cancer Center, and Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA

3Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA

4Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China

5These authors contributed equally

*Corresponding Author: [email protected] (J.X.)

Running Title: Non-Coding Variants in Hematopoietic Malignancies

Figure S1. Workflow for the identification of non-coding variants in human (A) Schematic of the major steps in targeted resequencing and functional analysis of non-coding alterations. Detailed information about the samples, identified mutations and variant-associated CREs is shown on the right. (B) Flowchart for the annotation of blood CRE repertoires. (C) Flowchart for the mutation discovery pipeline.

Figure S2. Identification of non-coding variants and recurrently mutated CREs (A) Schematic of the mutation discovery pipeline for identifying high-confidence (Tier I) somatic SNVs and INDELs in tumor/normal paired AML samples. (B) Flowchart is shown for the identification of Tier II somatic SNVs and INDELs in other leukemia and lymphoma samples and cell lines. (C) Variant allele frequency (VAF) of the identified SNVs and INDELs in targeted sequencing. (D) Comparisons of mutations in this study and recently published WGS studies of AML samples from TCGA (1), ICGC (http://icgc.org), and Beat AML (3 batches of samples) (http://www.vizome.org) (2) cohorts. Both Tier I and II somatic gene mutations identified in targeted CRE sequencing are shown. (E) Flowchart is shown for the identification of recurrently mutated CREs.
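As a reading aid for panel (C), the variant allele frequency can be sketched as the fraction of reads at a site that support the variant allele. This is a minimal illustration under that standard definition; the function name and the read counts are hypothetical and are not taken from the study's pipeline.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Return the fraction of reads at a site that support the variant allele.

    Both arguments are hypothetical inputs for illustration; the actual
    pipeline and its read filters are not described in these legends.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


# Hypothetical site: 18 of 60 reads carry the variant allele.
vaf = variant_allele_frequency(alt_reads=18, total_reads=60)
```

A heterozygous somatic variant present in every cell of a diploid sample would be expected near 0.5, so lower values are often read as subclonal or diluted by normal cells.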

Figure S3. Analysis of coding and non-coding variants in human leukemia (A) Heatmap shows the recurrent protein-coding gene mutations identified by targeted resequencing in human leukemia. The gene symbols are shown on the left. The cumulative numbers of mutated samples for each gene are shown on the right. The numbers of mutated genes in each sample are shown on the top. Samples are categorized by disease types (AML or MDS, lymphoma, ALL and normal) and sources (patient and cell line) with each sample ID shown on the bottom. (B) Comparison of coding and non-coding mutation frequency between patient samples and cell lines. (C) Comparison of coding and non-coding mutation frequency between different leukemia types (AML, lymphoma and ALL) and normal controls. Results are means ± SD and analyzed by a one-way ANOVA. ***P < 0.001, n.s. not significant. (D) Comparison of coding and non-coding mutation frequency between different sites (BM vs PB), genders (male vs female), and FAB classification of AML subtypes (M0 to M4). Results are means ± SD and analyzed by a Student’s t-test or one-way ANOVA. n.s. not significant. (E) Lack of correlation between mutation frequency and age. All leukemia samples (purple) or AML samples (blue) are shown. Pearson correlation coefficient (R) and P values are shown.

Figure S4. Development of the CRE perturbation screening systems (A) Flowchart is shown for the design of the sgRNA pool for CRE perturbation screening. (B) Validation of CRE perturbation screens by independent replicate experiments. CRE-specific sgRNAs (black) and non-targeting control sgRNAs (red) are shown. The x- and y-axes denote the log2 fold changes of sgRNA enrichment or dropout in T28 relative to T0 samples. The Pearson correlation coefficient (R) values are shown. (C) Comparisons of replicate experiments of enCRISPRa screening at different time points. The x- and y-axes of each graph represent the normalized sgRNA counts in enCRISPRa screens at day 0 (T0) and day 28 (T28). The Spearman correlation value is shown for each comparison. (D) Comparisons of replicate experiments of enCRISPRi screening at different time points. (E) enCRISPRa-mediated activation of candidate oncogenic or tumor suppressive CREs resulted in significant enrichment or dropout of CRE-targeting sgRNAs, respectively. Box plots are shown for sgRNAs targeting oncogenic, tumor suppressive or other CREs, or non-targeting controls (x-axis) by the log2 fold changes of sgRNA enrichment or dropout in T28 relative to T0 samples (y-axis). Boxes show median of the data and quantiles, and whiskers extend to 1.5x of the interquartile range. P values were calculated by the Welch’s t-test. (F) enCRISPRi-mediated repression of candidate oncogenic or tumor suppressive CREs resulted in significant enrichment or dropout of CRE-targeting sgRNAs, respectively.
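The log2 fold change of sgRNA enrichment or dropout in T28 relative to T0 samples can be illustrated with a short sketch. The normalization to reads per million and the pseudocount of 1 are assumptions made here for illustration; the legends do not state how counts were normalized, and the function and sgRNA names are hypothetical.

```python
import math


def log2_fold_changes(t0_counts, t28_counts, pseudocount=1.0):
    """Compute per-sgRNA log2(T28 / T0) from raw read counts.

    Assumptions (not from the source): counts are scaled to reads per
    million within each sample, and a pseudocount is added so that
    sgRNAs with zero reads do not produce log2(0).
    """
    total_t0 = sum(t0_counts.values())
    total_t28 = sum(t28_counts.values())
    if total_t0 <= 0 or total_t28 <= 0:
        raise ValueError("each sample needs at least one read")

    result = {}
    for sgrna, c0 in t0_counts.items():
        c28 = t28_counts.get(sgrna, 0)  # absent at T28 counts as zero reads
        n0 = c0 / total_t0 * 1e6 + pseudocount
        n28 = c28 / total_t28 * 1e6 + pseudocount
        result[sgrna] = math.log2(n28 / n0)
    return result


# Hypothetical counts: sgA gains reads by T28 (enrichment), sgB loses them (dropout).
t0 = {"sgA": 100, "sgB": 100, "sgCtrl": 100}
t28 = {"sgA": 200, "sgB": 25, "sgCtrl": 75}
lfc = log2_fold_changes(t0, t28)
```

Positive values correspond to enrichment and negative values to dropout, which is the scale used on the axes of panels (B), (E) and (F).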

[Figure S5 image. Panel A: Tumor Suppressive Promoters and Oncogenic Promoters. Panel B: Tumor Suppressive Enhancers and Oncogenic Enhancers. Each panel pairs a scatter plot (axes: enCRISPRa, log2 Enrichment (T28/T0) and enCRISPRi, log2 Enrichment (T28/T0); legend: sgRNAs targeting all other promoters or enhancers, non-targeting controls, tumor suppressive promoters or enhancers, and oncogenic promoters or enhancers) with two tables listing CRE ID, chromosome coordinates, nearest genes and E/P annotation.]

Figure S5. Identification of candidate tumor suppressive and oncogenic promoters or enhancers (A) CRE perturbation screens by enCRISPRa and enCRISPRi identified candidate tumor suppressive and oncogenic promoters in human leukemia. Scatter plot is shown as in Fig. 2D except that only promoter regions are shown.
Dots represent sgRNA-targeted top candidate tumor suppressive promoters (green), top candidate oncogenic promoters (blue), all other promoters (grey) or non-targeting controls (red). The CRE ID, chromosome coordinates, promoter-associated genes, and annotations (P: promoter) for the top 50 candidate tumor suppressive (left) or oncogenic (right) promoters are shown. The complete lists are shown in Table S7. (B) CRE perturbation screens by enCRISPRa and enCRISPRi identified candidate tumor suppressive and oncogenic enhancers in human leukemia. Scatter plot is shown as in Fig. 2D except that only enhancer regions are shown.

Figure S6. Oncogenic CRE-associated genes display increased AML dependencies Candidate oncogenic CRE-associated genes show increased dependencies in 17 AML cell lines with available CRISPR/Cas9-based dropout screens in DepMap (5). In each cell line, the dependency scores (x-axis) for all CRE-associated genes and oncogenic CRE-associated genes (y-axis) are shown as box plots. Boxes show median of the data and quantiles, and whiskers extend to 1.5x of the interquartile range. P values were calculated by the Welch’s t-test.
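For readers unfamiliar with the Welch’s t-test named above, the following sketch computes its test statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. It is an illustration of the standard formula only, not the authors' analysis code; the function name and the example scores are hypothetical, and the P value is deliberately left out because it requires the t-distribution (for example via `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")

    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")

    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical dependency scores for two gene groups in one cell line.
oncogenic_cre_genes = [-0.9, -0.6, -0.7, -0.4]
all_cre_genes = [-0.2, 0.0, -0.1, 0.1, -0.3]
t_stat, dof = welch_t(oncogenic_cre_genes, all_cre_genes)
```

Unlike the pooled-variance Student’s t-test, this form does not assume that the two groups share a variance, which suits comparisons between gene sets of very different sizes.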

Figure S7. Validation of enCRISPRi and enCRISPRa-mediated CRE perturbations (A) Validation of enCRISPRi and enCRISPRa-mediated perturbations of candidate oncogenic CREs by qRT-PCR analysis of CRE-associated gene expression in AML (MKPL-1 and MOLM-13) and ALL (Jurkat) cells. Bar graphs are shown for the log2 fold changes of mRNA expression of CRE-associated genes in enCRISPRi or enCRISPRa expressing cells with CRE-specific (sgCRE) or non-targeting (sgGal4) sgRNAs. (B) Validation of enCRISPRi and enCRISPRa-mediated perturbations of candidate tumor suppressive CREs by qRT-PCR of CRE-associated gene expression in AML (MKPL-1 and MOLM-13) and ALL (Jurkat) cells. (C) Validation of enCRISPRi and enCRISPRa-mediated perturbations of other CREs by qRT-PCR of CRE-associated gene expression in AML (MKPL-1 and MOLM-13) and ALL (Jurkat) cells. Results are mean ± SEM (N = 4 independent experiments) and analyzed by a two-sided Student’s t-test. *P < 0.05, **P < 0.01, ***P < 0.001, n.s. not significant.

Figure S8. Mutation and expression of KRAS in human cancers (A) Mutation and mRNA expression of KRAS in various human cancers by RNA-seq profiling is shown. The results are illustrated by cBioPortal (3). (B) Increased KRAS expression is associated with poor overall survival in AML. The results are illustrated by GenomicScape (4). (C) Increased long-range chromatin interactions between KRAS promoter and candidate enhancer (CRE4399) based on in situ Hi-C in CD34+ HSPCs, K562 and THP1 leukemia cells. (D) Validation of CRE4399 inhibition by enCRISPRi in AML cells. The relative mRNA expression of KRAS was determined by qRT-PCR in MKPL-1 cells co-expressing Dox-inducible enCRISPRi and non-targeting control (sgGal4) or KRAS enhancer (CRE4399)-specific sgRNAs (sg1 to sg3) 3 days (left) or 15 days (right) after Dox treatment (1 µg/ml). Results are mean ± SEM (N = 4 independent experiments) and analyzed by a one-way ANOVA followed by Dunnett’s test for multiple comparisons. ***P < 0.001. (E) Alignment of DNA sequences harboring non-coding variants that locate in the proximity (+/- 10bp) of the predicted PPARG/RXRA binding sites at the KRAS enhancer in individual samples. The mutation coordinate, sample or mutation ID, and strand information for each sample are also shown. The motif logo for PPARG:RXRA is shown on the top.

Figure S9. Mutation and expression of PER2 in human cancers (A) Mutation and mRNA expression of PER2 in various human cancers by RNA-seq profiling is shown. (B) Decreased PER2 expression is associated with poor overall survival in AML. (C) Increased long-range chromatin interactions between PER2 promoter and candidate enhancer (CRE12661) based on in situ Hi-C in CD34+ HSPCs, K562 and THP1 leukemia cells. (D) Validation of CRE12661 activation by enCRISPRa in AML cells. The relative mRNA expression of PER2 was determined by qRT-PCR in MKPL-1 cells co-expressing Dox-inducible enCRISPRa and non-targeting control sgRNA (sgGal4) or PER2 enhancer (CRE12661)-specific sgRNAs (sg1 to sg4) 3 days (left) or 15 days (right) after Dox treatment (1 µg/ml). Results are mean ± SEM (N = 4 independent experiments) and analyzed by a one-way ANOVA. ***P < 0.001. (E) Alignment of DNA sequences harboring non-coding variants that locate in the proximity (+/- 10bp) of the predicted PPARG/RXRA binding sites at the PER2 enhancer in individual samples. The mutation coordinate, sample or mutation ID, and strand information for each sample are also shown.

Figure S10. Validation of CRISPR/Cas9-mediated CRE KO in AML cells (A) Validation of CRE4399 KO by CRISPR/Cas9 in AML (MKPL-1) cells. The diagram of genotyping strategy and the representative genotyping results are shown. Del: genotyping PCR primers for detection of CRE deletion; WT: genotyping PCR primers for detection of the wild-type (or unmodified) allele; Control: unmodified MKPL-1 cells. (B) DNA sequencing chromatograms of the PCR amplicons from genotyping primers for the indicated single-cell-derived CRE4399 KO clones are shown. The upstream and downstream junctions are shown as red and grey, respectively. (C) Validation of CRE12661 KO by CRISPR/Cas9 in AML (MKPL-1) cells. The diagram of genotyping strategy and the representative genotyping results are shown. (D) DNA sequencing chromatograms of the PCR amplicons from genotyping primers for the indicated single-cell-derived CRE12661 KO clones are shown.

Figure S11. Mutations at KRAS and PER2 enhancers modulated NR binding and NR-mediated transcriptional activity (A) KRAS and PER2 enhancer variants co-localize with NR motifs. The sequences for EMSA probes containing wild-type (WT) and mutant (Mut) KRAS or PER2 enhancer are shown, respectively. The WT (green) and altered (red) sequences are shown at KRAS and PER2 enhancers. The predicted PPARG:RXRA motifs are highlighted in red boxes. (B) KRAS (CRE4399) enhancer variant increased enhancer activity in MKPL-1, K562 and 293T cells. (C) PER2 (CRE12661) enhancer variant decreased enhancer activity in MKPL-1, K562 and 293T cells. (D) KRAS (CRE4399) enhancer variant increased enhancer activity upon activation of RXR and PPARG signaling in 293T cells. Results are mean ± SEM of 3 independent experiments. The differences between control (Ctrl) and WT or MUT enhancer were analyzed by a two-way ANOVA with Tukey correction for multiple comparisons. *P < 0.05, **P < 0.01, ***P < 0.001. The differences between DMSO and NR agonist-treated groups were analyzed by a two-way ANOVA. ###P < 0.001. (E) PER2 (CRE12661) enhancer variant decreased enhancer activity upon activation of RXR and PPARG signaling in 293T cells. (F) Validation of PPARG and RXRA protein expression in MKPL-1, K562 and 293T cells by Western blot. 293T cells transfected with PPARG or RXRA cDNA (PPARG OE or RXRA OE) were analyzed as positive controls. GAPDH was analyzed as a loading control. (G) Validation of the shRNA-mediated depletion of PPARG and RXRA mRNA expression in MKPL-1 cells. Individual shRNAs for PPARG and RXRA were combined and transduced into MKPL-1 cells (shNR). The non-targeting shRNA (shCtrl) was used as a control. Results are mean ± SEM of 4 experiments and analyzed by a two-sided Student’s t-test. **P < 0.01. (H) Depletion of PPARG and RXRA by shRNAs impaired NR-induced activation of KRAS WT and MUT enhancers. Results are mean ± SEM of 3 independent experiments. 
The differences between control (Ctrl) and WT or MUT enhancer were analyzed by a two-way ANOVA. *P < 0.05, **P < 0.01, ***P < 0.001. The differences between shCtrl and shNR groups were analyzed by a two-way ANOVA. ###P < 0.001. (I) Depletion of PPARG and RXRA by shRNAs impaired NR-induced activation of PER2 WT and MUT enhancers. (J) Validation of the binding of PPARG and RXRA to the PPARG:RXRA motif-containing DNA probe. EMSA was performed using nuclear extracts from 293T cells transfected with PPARG or RXRA-expressing cDNA individually or together and the validated PPARG/RXRA-binding probe. Excess unlabeled probes (specific competitor) effectively abolished the gel shift. (K) The quantification of PPARG/RXRA binding intensity is shown for EMSA in Fig. 5H,I. Results are mean ± SD (N = 3 independent experiments) and analyzed by a two-sided Student’s t-test.

Figure S12. Genotyping of site-specific KI of KRAS and PER2 enhancer variants (A) Sanger DNA sequencing chromatograms are shown for KRAS enhancer sequences in the AML (4212T) sample, the unmodified K562 cells (WT), and two independent single-cell-derived KI clones (KI-Mut1 and KI-Mut2). The targeted variant (chr12:25538881:C>T) is indicated by the shaded line. Note that the AML (4212T) sample also contains a high-frequency common SNP (rs4963879; chr12:25538885:C>T). (B) Sanger DNA sequencing chromatograms are shown for PER2 enhancer sequences in the AML (4212T) sample, the unmodified K562 cells (WT), and two independent single-cell-derived KI clones (KI-Mut1 and KI-Mut2). The targeted variant (chr2:239142998:A>AC) is indicated by the shaded line. (C) Frequencies of leukemia cells in the peripheral blood of NSG mice xenografted with KRAS WT, KI-Mut1 or KI-Mut2 cells 6 weeks post-xenotransplantation. Results are mean ± SEM (N = 4 recipients per genotype) and analyzed by a two-sided t-test. (D) Frequencies of leukemia cells in the peripheral blood of NSG mice xenografted with PER2 WT, KI-Mut1 or KI-Mut2 cells 6 weeks post-xenotransplantation. Results are mean ± SEM (N = 4 recipients per genotype) and analyzed by a two-sided t-test.

SUPPLEMENTARY TABLE LEGENDS

Table S1. List of genomic datasets used in this study The name, data type, cell type, GEO accession number and citation are shown.

Table S2. Lists of annotated CREs and probes for targeted resequencing The chromosome coordinates (hg19) of each CRE and corresponding targeted resequencing probes are shown.

Table S3. List of leukemia cell lines and patient samples for targeted resequencing

Table S4. Identification of non-coding variants by targeted CRE resequencing in human leukemia The master table contains all the identified mutations in each tumor sample, together with the annotation of tumor type, mutation type, and chromosome position. The master table contains 3 separate sheets that organize the mutations at the CRE level or at the mutation position level (Tier I or II). At the CRE level, the total and unique mutations identified in each CRE, together with their chromosome coordinates, mutation type and number, and corresponding sample IDs are shown. At the mutation position level, the sample ID and tumor type associated with each mutation are shown.

Table S5. Sequences of primers and sgRNAs The name and sequence of each primer or sgRNA are shown.

Table S6. Lists of CREs and sgRNAs used for CRE perturbation screens The CRE ID, chromosome coordinates (hg19) and corresponding sgRNAs used for CRE perturbation screens are shown.

Table S7. Lists of candidate tumor suppressive and oncogenic CREs identified by enCRISPRa and enCRISPRi screens The CRE ID, log2 fold changes of gRNA enrichment or dropout in T28 relative to T0 samples by enCRISPRa and enCRISPRi screens, and the enhancer nearest neighbor genes within 50kb or promoter-associated genes for putative tumor suppressive and oncogenic CREs are shown.

Table S8. Lists of non-coding variants at KRAS and PER2 enhancers in human cancers The chromosome coordinates (hg19) and cancer types for each non-coding variant at KRAS or PER2 enhancer are shown.

REFERENCES

1. Ding L, Bailey MH, Porta-Pardo E, Thorsson V, Colaprico A, Bertrand D, et al. Perspective on Oncogenic Processes at the End of the Beginning of Cancer Genomics. Cell 2018;173(2):305-20.e10 doi 10.1016/j.cell.2018.03.033.

2. Tyner JW, Tognon CE, Bottomly D, Wilmot B, Kurtz SE, Savage SL, et al. Functional genomic landscape of acute myeloid leukaemia. Nature 2018;562(7728):526-31 doi 10.1038/s41586-018-0623-z.

3. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer discovery 2012;2(5):401-4 doi 10.1158/2159-8290.cd-12-0095.

4. Kassambara A, Reme T, Jourdan M, Fest T, Hose D, Tarte K, et al. GenomicScape: an easy-to-use web tool for gene expression data analysis. Application to investigate the molecular events in the differentiation of B cells into plasma cells. PLoS computational biology 2015;11(1):e1004077 doi 10.1371/journal.pcbi.1004077.

5. Tsherniak A, Vazquez F, Montgomery PG, Weir BA, Kryukov G, Cowley GS, et al. Defining a Cancer Dependency Map. Cell 2017;170(3):564-76.e16 doi 10.1016/j.cell.2017.06.010.