Supplemental Information

Somatic mutations drive specific, but reversible epigenetic heterogeneity states in AML

Sheng Li*,#,1,2,3,4, Xiaowen Chen#,1, Jiahui Wang1, Cem Meydan5, Jacob L. Glass6, Alan H. Shih7, Ruud Delwel8, Ross L. Levine9, Christopher E. Mason5, and Ari M. Melnick*,10

Author Affiliation:

1The Jackson Laboratory for Genomic Medicine, Farmington, CT; 2The Jackson Laboratory Cancer Center, Bar Harbor, ME; 3The Department of Genetics and Genomic Sciences, The University of Connecticut Health Center, Farmington, CT; 4Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, 06269, USA; 5Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY; 6Leukemia Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY; 7Center for Hematologic Malignancies, Memorial Sloan Kettering Cancer Center, New York, NY; 8Department of Hematology, Erasmus University Medical Center and Oncode Institute, Rotterdam, The Netherlands; 9Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY; 10Division of Hematology/Oncology, Weill Cornell Medicine, New York, NY

*Corresponding Authors:

Sheng Li, PhD The Jackson Laboratory for Genomic Medicine Farmington, CT 06032, USA Tel: 860-837-2119 Email: [email protected]

Ari M. Melnick, MD Department of Medicine Division of Hematology and Medical Oncology Weill Cornell Medicine New York, NY 10021, USA Tel: 212-746-7643 Email: [email protected]

# These authors contributed equally.

Supplementary Methods

Supplementary Figures 1-16

Supplementary Table 1

Supplementary Table 4

Supplementary Table 5

Supplementary Table 6

Supplementary Table 7

Supplementary Table 9

Supplementary Methods

Microarray data analysis. Gene expression data of primary AML patients were profiled using Affymetrix Human Genome U133 Plus 2.0 GeneChips and have been published previously(1,2) (GEO accession number: GSE6891). Raw expression microarray data were processed using the HGU133PLUS2.0 BrainArray annotation version 17.0.0(3). Expression levels were transformed to log2 values. Differentially expressed genes were called using limma (FDR-adjusted P value ≤ 0.05 and a fold change of 1.5).

RNA sequencing and analysis. We collected gene expression profiles of LSK cells from wild-type (4 samples), Idh2R140Q (2 samples) and Idh2R140Q;Flt3ITD (5 samples) mice, from Idh2R140Q;Flt3ITD mice treated with AG-221 (4 samples) or vehicle (4 samples), and from Tet2-/-;Flt3ITD mice treated with AzaC (4 samples) or vehicle (4 samples) (Table S9)(4). Briefly, the RNA libraries were sequenced on an Illumina HiSeq. RNA sequence data were aligned to the mouse genome (mm9) using STAR(5), and RNA-seq counts were annotated to the GENCODE reference annotation using the Subread package(6). Gene expression data were normalized using the transcripts per million (TPM) method. We then evaluated transcriptional heterogeneity from these gene expression data. Specifically, the coefficient of variation per gene across samples was calculated from TPM values. Only the top 75% of genes, ranked in descending order by average TPM across all samples, were included when determining the coefficient of variation.
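The coefficient-of-variation step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the toy TPM values and the use of the sample standard deviation are all assumptions.

```python
from statistics import mean, stdev

def coefficient_of_variation(tpm_by_gene, top_fraction=0.75):
    """Return {gene: CV} for the top `top_fraction` of genes ranked by mean TPM.

    CV is taken as (sample standard deviation) / mean across samples; the
    choice of the sample (n - 1) standard deviation is an assumption.
    """
    # Rank genes by their average TPM across all samples, highest first.
    ranked = sorted(tpm_by_gene, key=lambda g: mean(tpm_by_gene[g]), reverse=True)
    n_keep = int(len(ranked) * top_fraction)
    cv = {}
    for gene in ranked[:n_keep]:
        values = tpm_by_gene[gene]
        mu = mean(values)
        if mu > 0:  # CV is undefined for a zero mean
            cv[gene] = stdev(values) / mu
    return cv

# Hypothetical toy data: four genes measured in three samples.
tpm = {
    "geneA": [10.0, 12.0, 8.0],
    "geneB": [100.0, 100.0, 100.0],
    "geneC": [1.0, 3.0, 2.0],
    "geneD": [0.1, 0.0, 0.2],
}
cv = coefficient_of_variation(tpm)
```

With four genes and a 75% threshold, the lowest-expressed gene (geneD) is dropped before the coefficient of variation is computed.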

Whole Genome Bisulfite sequencing (WGBS) data collection and analysis

WGBS data of total bone marrow cells from AML patients with and without DNMT3AR882H/C mutations were obtained from the work of Spencer, D.H. et al.(7). Five normal-karyotype AML samples with DNMT3AR882H/C mutations and five normal-karyotype AML samples with no DNMT3A mutations were used to investigate the contribution of DNMT3A mutations to DNA methylation heterogeneity in AML. WGBS was performed using the EpiGnome/TruSeq DNA Methylation Kit (Illumina). Reads were aligned to the reference genome (hg19) using biscuit, after which methylation values were calculated with MethylDackel version 0.3.0.

For the above WGBS datasets, we estimated intra-sample DNA methylation heterogeneity using the M score approach(8). The M score assesses epigenetic heterogeneity as the percentage of CpG sites harboring intermediate methylation levels (from 0.45 to 0.55). An intermediate methylation level reflects the presence of balanced hypo- and hypermethylated CpGs within the cells of a sample and thus high intra-sample variation.
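As a minimal sketch of the M score definition above, the fraction of CpG sites with intermediate methylation can be computed as shown here. The function name and the toy methylation values are hypothetical, and treating both interval bounds as inclusive is an assumption.

```python
def m_score(methylation_levels, low=0.45, high=0.55):
    """Fraction of CpG sites whose methylation level lies in [low, high].

    Multiply by 100 to express the score as a percentage. Inclusive bounds
    are an assumption; the text only gives the interval 0.45 to 0.55.
    """
    levels = list(methylation_levels)
    if not levels:
        raise ValueError("at least one CpG site is required")
    intermediate = sum(1 for m in levels if low <= m <= high)
    return intermediate / len(levels)

# Hypothetical per-CpG methylation levels for one sample.
example_levels = [0.0, 0.1, 0.5, 0.48, 0.9, 1.0, 0.55, 0.3]
score = m_score(example_levels)  # three of eight sites are intermediate
```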

Calculation of epigenetic allele diversity. In this study, we used three local measurements of epigenetic allele diversity: PDR, Epipolymorphism and Shannon entropy. All three measurements consider the variability of DNA methylation patterns in discrete sets of four adjacent CpGs that can be captured in individual sequencing reads. Any four consecutive CpGs can give rise to up to 16 different allelic patterns. The three methods differ in how the variance of these patterns is calculated. Specifically, the proportion of discordant reads (PDR) is a measure of locally disordered DNA methylation(9). In aligned ERRBS data, a read at a given locus with four adjacent CpG sites was classified as either a concordant read or a discordant read. Here, a concordant read is one that shows either an unmethylated or a methylated state at all CpG sites of a given locus; i.e., four unmethylated cytosines or four methylated cytosines at a given locus. A discordant read is one that shows a mixture of methylated and unmethylated states at a given locus, such as one methylated cytosine followed by three unmethylated cytosines. PDR at each locus is defined as $\mathrm{PDR} = N_{\mathrm{discordant}} / N_{\mathrm{total}}$, i.e. the proportion of discordant reads compared to the total number of reads from that locus. Epipolymorphism of a given locus in the population was defined as the probability that epialleles randomly sampled from the locus differ from each other(10). Epipolymorphism was calculated as $1 - \sum_{i=1}^{16} p_i^2$, where $p_i$ is the fraction of each epiallele $i$ in the cell population. Shannon entropy was defined as the chance that two randomly chosen epialleles of a given locus have different methylated states(11). Shannon entropy of a given locus was calculated as $-\sum_{i=1}^{16} p_i \log p_i$, where $p_i$ is the fraction of each epiallele $i$ in the cell population. Each measurement employs different calculations and equations and is based on slightly different assumptions.
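The three measurements can be illustrated with a short sketch for a single four-CpG locus. Reads are encoded here as four-character strings of "0" (unmethylated) and "1" (methylated); this encoding, the function name and the base-2 logarithm are assumptions, since the text does not state the logarithm base.

```python
from collections import Counter
from math import log2

def epiallele_diversity(reads):
    """Return (PDR, Epipolymorphism, Shannon entropy) for one 4-CpG locus.

    `reads` is an iterable of strings such as "0100", one per sequencing read.
    """
    patterns = Counter(reads)
    total = sum(patterns.values())
    if total == 0:
        raise ValueError("at least one read is required")
    # Concordant reads are fully unmethylated or fully methylated.
    concordant = patterns.get("0000", 0) + patterns.get("1111", 0)
    pdr = (total - concordant) / total
    # p_i: fraction of reads carrying each observed epiallele pattern.
    fractions = [count / total for count in patterns.values()]
    epipolymorphism = 1.0 - sum(p * p for p in fractions)
    # Base-2 logarithm is an assumption made for this sketch.
    entropy = -sum(p * log2(p) for p in fractions)
    return pdr, epipolymorphism, entropy

# Hypothetical locus covered by eight reads.
reads = ["0000"] * 4 + ["1111"] * 2 + ["0100"] + ["1101"]
pdr, epipolymorphism, entropy = epiallele_diversity(reads)
```

In this toy locus two of eight reads are discordant, so PDR is 0.25, while Epipolymorphism and Shannon entropy also register the split between the fully unmethylated and fully methylated alleles, which PDR ignores.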

Each assay has distinct strengths and weaknesses. The strength of the PDR approach is its simplicity in dividing reads into two categories, discordant vs concordant, which allows for simple distinctions that are easy to integrate with other biological or genomic parameters. The weakness is that it does not consider the frequency or types of distinct 4-CpG patterns occurring at each locus and hence misses potentially important information on the type of discordance occurring at each locus. PDR also does not distinguish fully methylated from fully unmethylated alleles. In contrast, Epipolymorphism and Shannon entropy capture these kinds of differences. Both of these measurements capture information on all 16 possible allele variants that can be manifested at an individual locus. However, they differ in that Epipolymorphism is designed to capture the probability of sampling two distinct patterns at a time, whereas Shannon entropy considers the proportion of all sixteen possible patterns together. They are more closely related to each other than either is to PDR. Epipolymorphism is a statistical measurement of variance, whereas Shannon entropy is based on information theory assumptions and examines chaos in organized systems. The value in comparing and contrasting them is to determine whether these highly orthogonal and different types of equation yield similar results, which provides a far more rigorous analysis than using only one assay.

In contrast to the above three methods, which infer epigenetic allele diversity from the basic state of any given population of cells, epiallele shifts per million loci (EPM), described previously(12,13), is a normalized measure of a sample’s global epiallele changes at different time points. Briefly, changes in epiallele composition between AML patient samples and normal bone marrow samples were examined using methclone(12) to calculate the combinatorial entropy change (ΔS) of epialleles at each locus. Then, eloci were defined as those loci that underwent a significant epiallele shift (ΔS < -70). EPM was calculated as the number of eloci per million loci covered by sequencing data. Here, we required that each locus be covered by at least 60 sequencing reads. Only loci shared by all 119 patients were considered in this analysis. No DMR EPM was calculated by excluding differentially methylated regions (DMRs), which were defined as regions with total methylation differences of more than 25%.
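The EPM normalization can be sketched as below, assuming the per-locus ΔS values have already been produced by methclone; the ΔS calculation itself is not reimplemented here. The function name, dictionary layout and toy values are hypothetical.

```python
def epm(delta_s_by_locus, reads_by_locus, delta_s_cutoff=-70.0, min_reads=60):
    """Epiallele shifts per million loci.

    `delta_s_by_locus` maps locus -> combinatorial entropy change (taken as
    given, e.g. from methclone output); `reads_by_locus` maps locus -> read
    coverage. Only loci with at least `min_reads` reads are counted.
    """
    covered = [locus for locus, n in reads_by_locus.items() if n >= min_reads]
    if not covered:
        raise ValueError("no locus passes the coverage filter")
    # An elocus is a covered locus whose entropy change is below the cutoff.
    n_eloci = sum(1 for locus in covered if delta_s_by_locus[locus] < delta_s_cutoff)
    return n_eloci / len(covered) * 1_000_000

# Hypothetical values for five loci.
coverage = {"a": 100, "b": 60, "c": 59, "d": 80, "e": 200}
delta_s = {"a": -90.0, "b": -70.0, "c": -100.0, "d": -10.0, "e": -75.5}
sample_epm = epm(delta_s, coverage)
```

Locus c is excluded for insufficient coverage and locus b sits exactly at the cutoff, so two of the four covered loci count as eloci.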

DNA methylation heterogeneity by HELP assay

Bone marrow was obtained from patients with myelodysplastic syndromes (MDS) and MDS-associated AML at baseline and at days 15 and 29 during the first cycle of treatment with the DNMTi 5’-azacytidine, given at doses of 30, 40, or 50 mg/m² per day subcutaneously for 10 consecutive days. DNA methylation was profiled using the HELP (HpaII tiny fragment enrichment by ligation-mediated PCR) assay in previous work, and the data are stored in the GEO database (accession number: GSE17328)(14). One patient was removed because no significant changes in DNA methylation profile were found for this patient after treatment. In total, our study used HELP assays of six patients at baseline and at day 15 or day 29 of 5’-azacytidine treatment. Because of the nature of the platform, we were unable to perform allele diversity analyses and instead used the M score approach(8). The cutoff for defining the intermediate score, [0.3, 0.85], was transformed from MassArray methylation levels of [0.45, 0.55] using the previously described regression coefficient R = -0.77(14).

Data analysis. Unsupervised hierarchical clustering was performed using Euclidean distance. Specifically, for human AML patients, the top 50% most variable loci across all patients, ranked by the standard deviation of epigenetic allele diversity, were used for clustering of PDR, Epipolymorphism, and Shannon entropy (20,948 loci) and for clustering of epiallele shifts (20,764 loci). For mouse strains, the union of the eloci was used for hierarchical clustering of PDR and Epipolymorphism (2,255 eloci for PDR and 2,145 eloci for Epipolymorphism in Idh2R140Q&Flt3ITD mice; 143 eloci for PDR and 19 eloci for Epipolymorphism in Tet2-/-&Flt3ITD mice). The top 10% most variable loci across all samples, ranked by the standard deviation of Shannon entropy, were used for clustering of Shannon entropy (12,234 loci in Idh2R140Q&Flt3ITD mice; 22,711 loci in Tet2-/-&Flt3ITD mice). Dendrograms were constructed using Ward’s method (ward.D2).
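The locus-selection and distance steps that precede clustering can be sketched as follows. This illustration stops at the Euclidean distance matrix between samples; the Ward linkage itself is not shown. Function names and toy values are hypothetical, and the sample standard deviation is an assumption.

```python
from math import dist
from statistics import stdev

def top_variable_loci(values_by_locus, fraction=0.5):
    """Return the most variable loci, ranked by standard deviation across samples."""
    ranked = sorted(values_by_locus, key=lambda l: stdev(values_by_locus[l]), reverse=True)
    return ranked[: int(len(ranked) * fraction)]

def sample_distance_matrix(values_by_locus, loci):
    """Euclidean distance between every pair of samples over the chosen loci."""
    n_samples = len(values_by_locus[loci[0]])
    # One profile per sample: its diversity value at each selected locus.
    profiles = [[values_by_locus[l][s] for l in loci] for s in range(n_samples)]
    return [[dist(p, q) for q in profiles] for p in profiles]

# Hypothetical PDR values for four loci in three samples.
pdr_values = {
    "L1": [0.2, 0.2, 0.2],
    "L2": [0.0, 0.6, 0.0],
    "L3": [0.1, 0.1, 0.9],
    "L4": [0.5, 0.5, 0.6],
}
selected = top_variable_loci(pdr_values, fraction=0.5)
distances = sample_distance_matrix(pdr_values, selected)
```

The resulting matrix is the kind of input a hierarchical clustering routine would take.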

To visualize the sample diversity of different AML subtypes and mouse model mutations in a 2-dimensional scatter plot, the t-distributed stochastic neighbor embedding (t-SNE) algorithm(15) was run using the Barnes-Hut approximation(16) implemented in the R package Rtsne with 5,000 iterations. We applied this unsupervised technique to the same subset of loci used for hierarchical clustering. After reducing dimensionality, each point on the resulting two-dimensional scatter plot was colored according to its AML subtype for human samples, and according to its mutation status for mouse strains.

Assessment of similarity between two AML subtypes based on eloci was performed using the Jaccard similarity coefficient. For any two AML subtypes $i$ and $j$, the Jaccard similarity coefficient was calculated as $|s_i \cap s_j| / |s_i \cup s_j|$, where $s_i$ and $s_j$ are the sets of eloci in AML subtypes $i$ and $j$, respectively. Here, to avoid the potential bias caused by patients having more than one AML genetic or epigenetic lesion used for subtyping, we focused on eloci that undergo a significant epiallele diversity change between patients unique to a given AML subtype and NBM controls. In total, 10 of 13 AML subtypes were considered in the assessment of the Jaccard similarity coefficient.
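A minimal sketch of the Jaccard similarity coefficient for two sets of eloci follows. The locus identifiers are hypothetical, and returning 0.0 when both sets are empty is a convention chosen for this example.

```python
def jaccard(eloci_i, eloci_j):
    """Jaccard similarity: size of the intersection over size of the union."""
    s_i, s_j = set(eloci_i), set(eloci_j)
    union = s_i | s_j
    if not union:
        return 0.0  # convention for two empty sets (an assumption)
    return len(s_i & s_j) / len(union)

# Hypothetical eloci for two AML subtypes.
subtype_a = {"locus1", "locus2", "locus3", "locus4"}
subtype_b = {"locus3", "locus4", "locus5"}
similarity = jaccard(subtype_a, subtype_b)  # 2 shared loci out of 5 in total
```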

A two-sample Welch’s t-test, an adaptation of the classical Student’s t-test, was performed to compare the means of two independent groups. Welch’s t-test does not assume homogeneity of variance. Specifically, Welch’s t-test is more reliable than Student’s t-test when the two samples have unequal variances and unequal sample sizes(17,18). Furthermore, Welch’s t-test gives the same result as Student’s t-test when sample sizes and variances are equal.
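For illustration, the Welch statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the sketch below. The function name and toy data are hypothetical; converting the statistic to a P value needs the t distribution, which is outside the Python standard library and is therefore omitted.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    nx, ny = len(x), len(y)
    # Per-group squared standard errors; variances are not pooled.
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical groups with unequal sizes and variances.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8])

# With equal sizes and equal variances the degrees of freedom reduce to
# nx + ny - 2, the Student's t-test value.
_, dof_equal = welch_t([1, 2, 3], [2, 3, 4])
```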

Genomic location annotation. The GENCODE gene model was downloaded from the UCSC Genome Browser and was used for loci and eloci annotation with the R package GenomicRanges. Promoters were defined as regions within 1 kb upstream or downstream of transcription start sites in the reference annotation (hg19 for human and mm9 for mouse). Gene neighborhoods were defined as regions up to 50 kb away from the transcription start sites or transcription end sites.
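The promoter rule can be sketched as a simple distance check. This is a toy stand-in for interval overlap with GenomicRanges, not a reproduction of it; the gene names, coordinates and inclusive 1 kb boundary are assumptions.

```python
def promoter_genes(locus, tss_by_gene, flank=1_000):
    """Return genes whose TSS lies within `flank` bp of the locus position.

    `locus` is a (chromosome, position) pair; `tss_by_gene` maps a gene name
    to its (chromosome, TSS position).
    """
    chrom, position = locus
    return [
        gene
        for gene, (gene_chrom, tss) in tss_by_gene.items()
        if gene_chrom == chrom and abs(position - tss) <= flank
    ]

# Hypothetical annotation with three genes.
tss_by_gene = {
    "GENE_A": ("chr1", 10_000),
    "GENE_B": ("chr1", 12_500),
    "GENE_C": ("chr2", 10_200),
}
hits_near_a = promoter_genes(("chr1", 10_800), tss_by_gene)
hits_near_b = promoter_genes(("chr1", 11_500), tss_by_gene)
hits_none = promoter_genes(("chr3", 5), tss_by_gene)
```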

Region set enrichment analysis. The Molecular Signatures Database (MSigDB v6.1, http://software.broadinstitute.org/gsea/msigdb) is a widely used database of gene sets for performing gene set enrichment analysis. We obtained the curated gene sets and the motif gene sets from MSigDB v6.1. For each AML subtype, we performed a hypergeometric gene set enrichment analysis for genes containing eloci within their promoters with FDR adjusted P-value ≤ 0.05. As background, we used the genes containing loci shared by 119 AML patients and 14 NBM controls within their promoters.
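A hypergeometric enrichment test of this kind can be sketched with exact integer arithmetic. The function names and toy counts are hypothetical, and the Benjamini-Hochberg procedure is shown as one common way to obtain FDR-adjusted P values; the text does not name the adjustment method used.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_in_set, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, n_in_set of which belong to the gene set."""
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_background - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_background, n_selected)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: 10 background genes, 4 in the gene set,
# 3 genes with promoter eloci, 2 of which are in the gene set.
p = hypergeom_enrichment_p(10, 4, 3, 2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```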

Additionally, high-quality ChIP-seq data for seven hematopoietic transcription factors (FLI1, ERG, GATA2, RUNX1, SCL, LYL1 and LMO2) in primary human CD34+ blood stem and progenitor cells were obtained from the BloodChIP database(19). For each AML subtype, we performed peak region enrichment analysis for eloci with FDR-adjusted P-value ≤ 0.05. As background, we used all the loci shared by 119 AML patients and 14 NBM controls.

Association with clinical outcome. A multivariable Cox proportional hazards regression model was fitted to investigate the association between event-free survival (EFS; the event denotes failure to achieve complete remission) and epigenetic allele diversity, controlling for age, gender, t(15;17), inv(16), Del(5;7q), t(v;11q23), t(8;21), EVI1, CEBPA-dm, NPM1, IDH1/2, IDH1/2 + DNMT3A, DNMT3A, FLT3ITD, TET2 and CEBPA-sil (R package survival). We divided the primary AML patients into two groups: high epiallele diversity and low epiallele diversity. Here, an optimal cut-off for epiallele diversity was selected based on the maximal log-rank statistic(20,21). Kaplan-Meier survival curves were drawn and compared among subgroups using log-rank tests.
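The idea of choosing a cut-off by the maximal log-rank statistic can be sketched as a scan over candidate thresholds. This is a simplified illustration: it maximizes the ordinary two-group log-rank chi-square and does not apply the standardization or multiplicity-adjusted P values of maximally selected rank statistics. Function names, the minimum group size and the toy data are assumptions.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [i for i in range(len(times)) if times[i] >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        failed = [i for i in at_risk if times[i] == t and events[i]]
        d = len(failed)
        d_high = sum(1 for i in failed if in_high[i])
        # Observed minus expected events in the high group at time t.
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0

def maximal_logrank_cutoff(scores, times, events, min_group=2):
    """Return (cutoff, chi2) maximizing the log-rank statistic; 'high' means score > cutoff."""
    best_cutoff, best_chi2 = None, -1.0
    for cutoff in sorted(set(scores)):
        in_high = [s > cutoff for s in scores]
        n_high = sum(in_high)
        if n_high < min_group or len(scores) - n_high < min_group:
            continue  # skip splits that leave a group too small
        chi2 = logrank_chi2(times, events, in_high)
        if chi2 > best_chi2:
            best_cutoff, best_chi2 = cutoff, chi2
    return best_cutoff, best_chi2

# Hypothetical cohort: higher diversity scores fail earlier; all events observed.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
times = [90, 80, 70, 60, 10, 20, 30, 40]
events = [1] * 8
cutoff, chi2 = maximal_logrank_cutoff(scores, times, events)
```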

Development of an epiallele-based prognostic model for event-free survival in AML

To develop a prognostic biomarker model that is predictive of AML event-free survival, we used the supervised principal components (SuperPC) method(22). We assembled a cohort of 256 AML patients by combining the current 119 patients with the 137 patients from our 2016 Nature Medicine paper cohort(13). Two patients were removed because they lacked information on time to relapse. We then calculated the PDR values of the 635 loci shared by all 254 patients and used their clinical annotation to build an epiallele diversity prognostic classifier. The complete cohort was then randomly divided into a training set (n=151), a test set (n=65), and a validation set (n=38). We applied three steps to develop and validate our epiallele prognostic model.
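The random three-way split of the cohort can be sketched as below. The function name, patient identifiers and fixed seed are hypothetical; the source does not describe how the randomization was seeded or performed.

```python
import random

def split_cohort(patient_ids, n_train=151, n_test=65, n_validation=38, seed=0):
    """Randomly partition patients into training, test and validation sets."""
    ids = list(patient_ids)
    if len(ids) != n_train + n_test + n_validation:
        raise ValueError("set sizes must add up to the cohort size")
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    rng.shuffle(ids)
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]
    return train, test, validation

# Hypothetical identifiers for the 254 patients.
patients = [f"patient_{i:03d}" for i in range(254)]
train, test, validation = split_cohort(patients)
```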

Step 1. Epiallele Prognostic Model Training: We used the supervised principal components (SuperPC) method(22) on the training set of patients to train a Cox proportional hazards regression model for event-free survival (the event denotes relapse). Specifically, univariate Cox model regression coefficients were computed for each locus. Here, the threshold for selecting prognosis-related loci was chosen to maximize performance, using a 10-fold cross-validation procedure on the training set. This procedure yielded a set of 26 loci that segregate patients well (based on the likelihood ratio test statistic) according to their event-free survival times. Then, we fit a SuperPC model to calculate the first principal component of the reduced PDR matrix of the 26 prognosis-related loci. We confirmed the robustness of the selected threshold (set to 1.7381) and of the first principal component by performing 1000 additional random splits of the data set into training sets of 151 patients and test sets of 103 patients and running the SuperPC algorithm for each of them; these were validated with a P value of <0.05 in a Cox proportional hazards regression model. The first principal component was subsequently used to calculate continuous prognostic scores (i.e. SuperPC scores) on the test cohort and validation cohort as follows.

Step 2. Model Testing: The SuperPC model was then applied to our test set of 65 patients. We fit the SuperPC scores to their event-free survival using a Cox proportional hazards model to investigate whether the SuperPC scores could classify the test set patients into a high-risk group and a low-risk group.

Step 3. Model Validation: The above procedure was applied to the 38-patient validation cohort as well as the combined test and validation cohort to confirm the robustness of the prognostic model derived from the training set.

Finally, to determine the significance of this 26-locus prognostic classifier as a putative biomarker, we performed a multivariate analysis considering the 26-locus SuperPC score, age, gender, ELN risk stratification(23), FLT3ITD, CEBPA-dm, t(8;21), NPM1, and inv(16) as covariates (these were the only variables that were available to us in all patients).

Supplementary Figures

Supplementary Figure 1. Epigenetic allele diversity is linked to somatic mutations. (a) The distributions of Epipolymorphism across various AML subtypes which are ranked by median Epipolymorphism; (b) The distributions of Shannon entropy across various AML subtypes which are ranked by median Shannon entropy; (c) The distributions of no DMR EPM (i.e. EPM with less than 25% methylation difference) across various AML subtypes which are ranked by median no DMR EPM. Epipolymorphism, Shannon entropy and no DMR EPM are significantly different among AML subtypes: P = 2.801 × 10^-10 (a), P = 1.996 × 10^-9 (b), and P = 0.0001 (c), Kruskal-Wallis rank sum test.

[Panels a and b: Kaplan-Meier curves; x-axis: survival time (days), 0 to 6000; y-axis: event-free survival (%). Panel a compares Low and High Epipolymorphism (P = 0.023); panel b compares Low and High Shannon entropy (P = 0.045). Number at risk at 0, 1000, 2000, 3000, 4000, 5000 and 6000 days: 72, 22, 11, 6, 1, 0, 0 and 47, 21, 11, 9, 7, 1, 0 (a); 67, 21, 10, 5, 1, 0, 0 and 52, 22, 12, 10, 7, 1, 0 (b).]

Panel c:

Variables                     P value   Hazard Ratio (HR)   95% CI Lower   95% CI Upper
Class:High Epipolymorphism    0.0061    2.4256              1.2871         4.5714
t(v;11q23)                    0.0418    0.1677              0.0301         0.9357
EVI1                          0.0140    4.9593              1.3826         17.7886
CEBPA-dm                      0.0074    0.2403              0.0847         0.6824
NPM1                          0.0280    0.3679              0.1508         0.8975

Panel d:

Variables                     P value   Hazard Ratio (HR)   95% CI Lower   95% CI Upper
Class:High Shannon entropy    0.0150    2.1949              1.1648         4.1362
t(v;11q23)                    0.0498    0.1810              0.0328         0.9985
EVI1                          0.0176    4.7550              1.3124         17.2273
CEBPA-dm                      0.0089    0.2474              0.0869         0.7044
NPM1                          0.0261    0.3616              0.1476         0.8859

Supplementary Figure 2. Epigenetic allele diversity is associated with adverse clinical outcome. (a-b) Kaplan-Meier curves of primary AML patients for achievement of complete remission based on Epipolymorphism in panel a and Shannon entropy in panel b; (c-d) Multivariate Cox proportional hazards regression analysis using age, gender, high/low epiallele diversity, and 13 genetic lesions including t(15;17), inv(16), Del(5;7q), t(v;11q23), t(8;21), EVI1, CEBPA-dm, NPM1, IDH1/2, IDH1/2 + DNMT3A, DNMT3A, FLT3ITD and TET2 and the CEBPA silenced status (CEBPA-sil) as covariates, here the factors with P value < 0.05 were shown. Epipolymorphism in panel c, Shannon entropy in panel d.


Supplementary Figure 3. Unsupervised clustering analysis of epiallele diversity of all 119 AML patients based on the 50% most variable loci across patients. (a-b) Hierarchical clustering using Euclidean distance, based on the 20,948 most variable loci for epiallele diversity as determined by Epipolymorphism in panel a and Shannon entropy in panel b. AML subtypes are shown in the left-most column. AML subtypes whose patients form a unique cluster are shown on an orange background, and those with no unique clusters are shown in gray; (c) t-SNE analysis of PDR of all 119 patients, highlighting five AML subtypes: TET2, NPM1, EVI1, t(v;11q23) and Del(5/7q). Patients with at least two co-occurring mutations are labeled in different colors.

[MA plots, one panel per comparison: DNMT3A vs Normal, t(v;11q23) vs Normal, t(8;21) vs Normal, IDH1/IDH2+DNMT3A vs Normal, inv(16) vs Normal, NPM1 vs Normal, TET2 vs Normal, EVI1 vs Normal, CEBPA-sil vs Normal, CEBPA-dm vs Normal, IDH1/IDH2 vs Normal. x-axis: Mean PDR: (AML+Normal)/2, from 0.0 to 1.0; y-axis: PDR Difference: AML - Normal.]

Supplementary Figure 4. MA plots showing the difference in epigenetic allele diversity between AML subtypes and NBM controls. Each MA plot shows the difference in epigenetic allele diversity between an AML subtype and normal controls (y-axis), relative to the mean epigenetic allele diversity (x-axis). Epigenetic allele diversity was measured by PDR. Red dots represent epigenetic loci (eloci), as determined by t-test. Gray dots indicate non-eloci.

[Heatmap of PFDR. Rows (transcription factors): REL, GATA2, LEF1, NF-kB, SP1, STAT3, STAT5A. Legible column labels (AML subtypes): IDH1/IDH2, EVI1, Del(5/7q), TET2.]

Supplementary Figure 5. Motif enrichment analysis of epialleles unique to AML subtypes. Heatmap showing the FDR adjusted P-value (PFDR) of the motif enrichment analysis performed using a hypergeometric test for genes with eloci in promoter regions (measured by PDR). The rows represent transcription factors enriched by the eloci in some AML subtype, the columns represent AML subtypes.


Supplementary Figure 6. Transcription factors found to be linked with epialleles unique to AML subtypes. FDR adjusted P-values (PFDR) are shown in the heatmap for enrichment analysis of ChIP-seq data of seven hematopoietic transcription factors in human CD34+ HSPCs; only those AML subtypes whose epialleles are significantly enriched for TF peak regions are shown (PFDR ≤ 0.05).

[Panel a: bar plot of -log10(PFDR), axis from 0 to 8, for the gene sets MEISSNER_NPC_HCP_WITH_H3K4ME2, BENPORATH_ES_WITH_H3K27ME3, MIKKELSEN_MEF_HCP_WITH_H3K27ME3, MIKKELSEN_MCV6_HCP_WITH_H3K27ME3, MEISSNER_NPC_HCP_WITH_H3K4ME3_AND_H3K27ME3, MIKKELSEN_NPC_HCP_WITH_H3K4ME3_AND_H3K27ME3, MEISSNER_NPC_HCP_WITH_H3K4ME2_AND_H3K27ME3, MEISSNER_BRAIN_HCP_WITH_H3K4ME3_AND_H3K27ME3, BENPORATH_PRC2_TARGETS, BHAT_ESR1_TARGETS_VIA_AKT1_UP, YOSHIMURA_MAPK8_TARGETS_UP, LIN_NPAS4_TARGETS_DN, BENPORATH_EED_TARGETS, BENPORATH_SUZ12_TARGETS, PEREZ_TP53_TARGETS, KIM_WT1_TARGETS_8HR_UP, KRIEG_HYPOXIA_NOT_VIA_KDM3A, NABA_MATRISOME, NABA_MATRISOME_ASSOCIATED, NIKOLSKY_BREAST_CANCER_20Q12_Q13_AMPLICON, REACTOME_GPCR_LIGAND_BINDING, REACTOME_CLASS_B_2_SECRETIN_FAMILY_RECEPTORS, REACTOME_INHIBITION_OF_INSULIN_SECRETION_BY_ADRENALINE_NORADRENALINE, REACTOME_INTEGRATION_OF_ENERGY_METABOLISM, REACTOME_SIGNALING_BY_GPCR, PID_S1P_META_PATHWAY, PID_S1P_S1P1_PATHWAY, PID_S1P_S1P3_PATHWAY and KEGG_MELANOGENESIS. Panel b: network with TF nodes AP-1, NF-KB and STAT3 and gene nodes WNT4, NOTUM, BCKDK, SPON1, SP6, PPP1R13B, SDC1, WRN, PCBP4, JPH3, ZFYVE9, CPLX2, FOSL1, PURG, GNAO1, SV2B, SLC38A3, ACAP3, SEMA3B, FGF11, BTG3, PDE3B, TUBA4A, HSPB6, RP1L1 and CLDN15.]

Supplementary Figure 7. Functional analysis for epigenetic alleles in CEBPA-sil subtype. (a) Genes from curated gene sets in MSigDB v6.1 significantly enriched for promoter epigenetic alleles in CEBPA silenced AML patients using a hypergeometric test (PFDR ≤ 0.05); (b) Gene regulatory network centered around genes with promoter epigenetic alleles and binding motifs for AP-1, NF-kB and STAT3 shown in Fig. S5. Arrows indicate regulatory interactions inferred by motif enrichment analysis. Gray lines represent interactions from the STRING database (confidence score ≥ 0.4).

[Panels a and b: Epipolymorphism (n=9001); panels c and d: Shannon entropy (n=3301). y-axis: Number of epigenetic loci; x-axis: AML subtypes; legend: Loss of epigenetic alleles, Gain of epigenetic alleles.]

Supplementary Figure 8. Comparison of AML epigenetic allele diversity measured by Epipolymorphism and Shannon entropy with NBM controls. (a) The number of eloci that are shared and unique between all AML subtypes, measured by Epipolymorphism; (b) Absolute numbers of eloci in AML subtypes measured by Epipolymorphism. (a) and (b): AML subtypes were ranked in ascending order from left to right based on the median Epipolymorphism; (c) The number of eloci that are shared and unique between all AML subtypes, measured by Shannon entropy; (d) Absolute numbers of eloci in AML subtypes measured by Shannon entropy. (c) and (d): AML subtypes were ranked in ascending order from left to right based on the median Shannon entropy.

[Box plot; y-axis: Methylation heterogeneity; groups: DNMT3AWT and DNMT3AR882H/C; t-test, P = 0.040.]

Supplementary Figure 9. Methylation heterogeneity for WGBS in human AML patients. Boxplot showing intra-sample DNA methylation heterogeneity measured by M score in five normal karyotype AML samples with no DNMT3A mutations and five normal karyotype AML samples with DNMT3AR882H/C mutations.


Supplementary Figure 10. Unsupervised analysis of epigenetic allele diversity measured by Epipolymorphism and Shannon entropy. (a-b) Hierarchical clustering, and (c-d) t-SNE analysis of Epipolymorphism in wild type (WT), Flt3ITD, Idh2R140Q, Idh2R140Q;Flt3ITD mice (a,c) and in WT, Tet2-/-, Flt3ITD, Tet2-/-;Flt3ITD mice (b,d); (e-f) Hierarchical clustering and (g-h) t-SNE analysis of Shannon entropy in WT, Flt3ITD, Idh2R140Q, Idh2R140Q;Flt3ITD mice (e,g) and in WT, Tet2-/-, Flt3ITD, Tet2-/-;Flt3ITD mice (f,h).

[Box plots. Panel a: Epipolymorphism in Idh2R140Q&Flt3ITD mouse model (significance labels: ***, *, ns); panel b: Epipolymorphism in Tet2-/-&Flt3ITD mouse model (ns, ns, ns); panel c: Shannon entropy in Idh2R140Q&Flt3ITD mouse model (**, *, ns); panel d: Shannon entropy in Tet2-/-&Flt3ITD mouse model (ns, ns, ns).]

Supplementary Figure 11. Somatic mutations cooperatively enhance epigenetic allele diversity measured by Epipolymorphism and Shannon entropy. (a-b) Boxplot of Epipolymorphism in wild type (WT), Flt3ITD, Idh2R140Q, Idh2R140Q;Flt3ITD mice (a) and in WT, Tet2-/-, Flt3ITD, Tet2-/-;Flt3ITD mice (b); (c-d) Boxplot of Shannon entropy in WT, Flt3ITD, Idh2R140Q, Idh2R140Q;Flt3ITD mice (c) and in WT, Tet2-/-, Flt3ITD, Tet2-/-;Flt3ITD mice (d). Here, *, P ≤ 0.05, **, P ≤ 0.01, ***, P ≤ 0.001 by t-test.

[Heatmap of PFDR; transcription factors: PBX1, GFI1, LEF1, MAZ, MYOD1, NFATC.]

Supplementary Figure 12. TFs shared by Idh2R140Q;Flt3ITD double mutant mouse model and AML patients with IDH2;FLT3 ITD double mutant.

[Bar plots for patients MDS#8, MDS#9 and MDS#12; x-axis: baseline, day15, day29; y-axis: Methylation heterogeneity.]

Supplementary Figure 13. The effect of epigenetic targeted therapy on DNA methylation heterogeneity. Barplot of intra-sample methylation heterogeneity measured by M score at baseline and at days 15 and 29 after the first treatment cycle of 5’-azacytidine in secondary AML patients.

[Panel a: box plots of the coefficient of variation for High PDR vs Low PDR (P = 2.6334 × 10^-11), High Epipolymorphism vs Low Epipolymorphism (P = 3.2092 × 10^-7) and High Shannon entropy vs Low Shannon entropy (P = 2.1893 × 10^-11). Panel b: box plots of the coefficient of variation for the Good, Intermediate and Poor groups (P < 2.2 × 10^-16).]

Supplementary Figure 14. The correlation between transcriptional variance and epiallele diversity. (a) Boxplot showing the coefficient of variation in patients with high/low epiallele diversity measured by PDR, Epipolymorphism and Shannon entropy (t-test); (b) Boxplot showing the coefficient of variation in good, intermediate and poor prognosis patient groups based on ELN risk classification (Kruskal-Wallis rank sum test). Here, the patient-to-patient coefficient of variation was calculated using the top 75% of genes ranked by average gene expression level.

[Box plots for the Good, Intermediate and Poor groups. Panel a: PDR (P = 0.0621); panel b: Epipolymorphism (P = 0.0035); panel c: Shannon entropy (P = 0.0058).]

Supplementary Figure 15. Epiallele diversity including PDR (a), Epipolymorphism (b) and Shannon entropy (c) is linked to distinct ELN risk classifications (Kruskal-Wallis rank sum test).

[Panel a: bar plot; y-axis: Number of genes; x-axis: AML subtypes; legend: CV Increase, No Change, CV Decrease. Panel b: heatmap of the Jaccard Similarity Coefficient (color scale 0.2 to 0.6), with the values below.]

             TET2   inv(16)   t(8;21)   IDH1/IDH2   t(15;17)   CEBPA-dm   NPM1   CEBPA-sil   EVI1
inv(16)      0.37
t(8;21)      0.42   0.48
IDH1/IDH2    0.3    0.42      0.36
t(15;17)     0.38   0.46      0.46      0.34
CEBPA-dm     0.4    0.45      0.54      0.38        0.45
NPM1         0.34   0.48      0.44      0.46        0.42       0.46
CEBPA-sil    0.3    0.36      0.34      0.37        0.36       0.36       0.41
EVI1         0.41   0.35      0.43      0.31        0.39       0.45       0.35   0.32
t(v;11q23)   0.29   0.44      0.33      0.35        0.35       0.34       0.4    0.32        0.26

Supplementary Figure 16. Transcriptional program comparison between AML subtypes. (a) Barplot showing the number of genes with increased coefficient of variation (CV, red), no change (green) and decreased CV (blue) among genes with increased eloci in the promoter region. Here, differential transcriptional heterogeneity of genes was defined as an absolute fold change of the coefficient of variation vs controls > 0.2; (b) Jaccard similarity coefficients of one-on-one comparisons among ten AML subtypes, measuring the similarity of differentially expressed genes. Only patients with a unique AML subtype among the thirteen lesions are included. AML subtypes that show higher similarity are colored red.

Supplementary Tables

Supplementary Table 1. Multivariate Cox proportional hazards regression model for PDR, age, gender, t(15;17), inv(16), Del(5;7q), t(v;11q23), t(8;21), EVI1, CEBPA-dm, NPM1, IDH1/2, IDH1/2 + DNMT3A, DNMT3A, FLT3ITD, TET2 and CEBPA-sil.

Variables         P value^1   Hazard Ratio (HR)   95% CI Lower   95% CI Upper
Class:High PDR    0.0058      2.5127              1.3056         4.8361
t(v;11q23)        0.0420      0.1672              0.0298         0.9370
EVI1              0.0136      4.9978              1.3921         17.9433
CEBPA-dm          0.0067      0.2356              0.0828         0.6703
NPM1              0.0194      0.3455              0.1417         0.8425

^1 Here the factors with P value < 0.05 were shown.

Supplementary Table 4. LSC signatures significantly enriched in CEBPA-sil-specific genes (hypergeometric test, FDR-adjusted P value (PFDR) < 0.1).

LSC signatures                             Source               PFDR
LSC+ vs LSC- DEG                           PMID:27926740        0.0355
CD34+CD38- LSC vs CD34+CD38+ LSC UP        PMID:17039238        0.0355
CD34+CD38- LSC vs CD34+CD38+ LSC DN        PMID:17039238        0.0355
HALLMARK INFLAMMATORY RESPONSE             MsigDBv6.1(M5932)    0.0523
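A hypergeometric enrichment test followed by FDR adjustment, as named in the caption of Supplementary Table 4, can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the upper-tail test and the Benjamini-Hochberg procedure are assumed, and the function names and example counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """Upper-tail hypergeometric P value, P(X >= k).

    N: genes in the background, K: genes in the signature,
    n: genes in the query list, k: genes shared by both.
    """
    total = comb(N, n)
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts: (overlap, signature size, query size, background size).
tests = [(12, 200, 150, 20000), (3, 300, 150, 20000), (1, 50, 150, 20000)]
raw = [hypergeom_pvalue(*t) for t in tests]
p_fdr = benjamini_hochberg(raw)
significant = [p < 0.1 for p in p_fdr]
```

Tied adjusted values across several signatures are a normal outcome of the monotonicity step in the Benjamini-Hochberg procedure.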

Supplementary Table 5. TFs enriched for the eloci in the Idh2R140Q;Flt3ITD double mutant mouse model.

TFs             PFDR      TFs        PFDR
GATA3           0.0402    NKX6-2     0.0010
MECOM           0.0250    ONECUT1    0.0034
PBX1            0.0056    TCF4       0.0074
CEBPA           0.0263    SMAD3      0.0003
FOXA2           0.0402    GTF2A1     0.0206
POU2F1          0.0119    CDX2       0.0034
NFKB            0.0325    POU1F1     0.0090
NR3C1           0.0183    EGR        0.0474
NKX2-5          0.0435    FOX        0.0474
PAX4            0.0213    SP1        0.0325
MSX1            0.0183    NR3C1      0.0325
ZEB1            0.0403    ESR1       0.0325
ALX1            0.0402    VSX2       0.0034
MEIS1A/HOXA9    0.0157    GFI1       0.0034
PDX1            0.0221    LEF1       0.0027
NKX3-1          0.0034    MAZ        0.0474
POU3F2          0.0221    MYOD1      0.0221
CDC5L           0.0325    NFATC      0.0474
PITX2           0.0402    OCT        0.0221

Supplementary Table 6. The 26 loci that make up the epiallele predictor for survival. A locus annotated to two genes is listed once per gene.

Loci                                               Genes
chr2-133014316:133014322:133014326:133014330       MIR663B
chr3-75718064:75718068:75718078:75718080           RP11-413E6.7
chr3-75718316:75718319:75718328:75718339           RP11-413E6.7
chr3-75718349:75718368:75718374:75718379           RP11-413E6.7
chr3-75718367:75718373:75718378:75718381           RP11-413E6.7
chr4-190862158:190862170:190862182:190862200       FRG1
chr9-140772529:140772533:140772535:140772540       CACNA1B
chr9-140772533:140772535:140772540:140772542       CACNA1B
chr9-140772535:140772540:140772542:140772544       CACNA1B
chr10-127584621:127584627:127584630:127584635      FANK1
chr10-127584621:127584627:127584630:127584635      DHX32
chr10-127584627:127584630:127584635:127584641      FANK1
chr10-127584627:127584630:127584635:127584641      DHX32
chr10-127584630:127584635:127584641:127584646      FANK1
chr10-127584630:127584635:127584641:127584646      DHX32
chr10-127584670:127584689:127584704:127584707      FANK1
chr10-127584670:127584689:127584704:127584707      DHX32
chr10-127585307:127585309:127585315:127585327      FANK1
chr10-127585307:127585309:127585315:127585327      DHX32
chr10-135491043:135491045:135491055:135491067      DUX4L13
chrY-13470607:13470616:13470620:13470632           DUX4L17
chr3-75718609:75718616:75718622:75718631           -
chr9-66456013:66456016:66456029:66456034           -
chr9-66456016:66456029:66456034:66456040           -
chr16-1235689:1235701:1235705:1235708              -
chr16-1235701:1235705:1235708:1235712              -
chrY-13463228:13463241:13463262:13463264           -
chrY-13476550:13476578:13476584:13476595           -
chr22-21636100:21636102:21636108:21636111          -
chr22-21636102:21636108:21636111:21636116          -
chr22-21636108:21636111:21636116:21636130          -
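The locus identifiers in Supplementary Table 6 appear to encode a chromosome followed by the positions of four CpGs, and a locus annotated to two genes occupies two rows. The sketch below parses such identifiers and counts unique loci. Reading the identifier as "chromosome-position:position:position:position" is an assumption based on its layout, and the names `Epilocus` and `parse_locus` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Epilocus(NamedTuple):
    chrom: str
    cpgs: Tuple[int, ...]  # CpG positions, in the order listed


def parse_locus(locus: str) -> Epilocus:
    """Split 'chrN-p1:p2:p3:p4' into a chromosome name and CpG positions."""
    chrom, sep, coords = locus.partition("-")
    if not sep or not chrom or not coords:
        raise ValueError(f"unexpected locus format: {locus!r}")
    return Epilocus(chrom, tuple(int(x) for x in coords.split(":")))


# Three rows copied from the table; the first two share one locus.
rows = [
    ("chr10-127584621:127584627:127584630:127584635", "FANK1"),
    ("chr10-127584621:127584627:127584630:127584635", "DHX32"),
    ("chr2-133014316:133014322:133014326:133014330", "MIR663B"),
]
unique_loci = {parse_locus(locus) for locus, _gene in rows}
n_unique = len(unique_loci)
```

Counting unique loci rather than rows is what reconciles a table with more rows than loci when some loci carry two gene annotations.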

Supplementary Table 7. Multivariate Cox proportional hazards regression model for the SuperPC score, age, gender, FLT3ITD, CEBPA-dm, t(8;21), NPM1, inv(16) and ELN risk stratification.

Variables                P value         Hazard Ratio (HR)   95% CI for HR (Lower)   95% CI for HR (Upper)
SuperPC score            4.11 × 10^-6    3.0238              1.8883                  4.842
Age                      0.5411          0.9917              0.9656                  1.019
Gender:Male              0.0924          0.5977              0.3282                  1.088
FLT3ITD                  0.9764          1.0127              0.4395                  2.333
CEBPA-dm                 0.8209          0.8543              0.2185                  3.340
t(8;21)                  0.9950          1.0076              0.0947                  10.726
NPM1                     0.7357          1.1910              0.4316                  3.287
inv(16)                  0.9029          0.9100              0.2001                  4.139
ELN risk:Intermediate    0.4330          0.6948              0.2796                  1.727
ELN risk:Good            0.2112          0.4266              0.1122                  1.622
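The columns of a Cox regression table are linked: under the usual Wald approximation, the standard error of log(HR) can be recovered from the 95% confidence interval, and the P value follows from the resulting z statistic. The sketch below applies this to the hazard ratios and interval bounds reported for the SuperPC score and for age. It assumes Wald-type intervals, which is the common default but is not stated in the table, and the function name `wald_p_from_ci` is hypothetical.

```python
from math import erfc, log, sqrt


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the Wald z statistic and two-sided P value from HR and its CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE).
    """
    se = (log(upper) - log(lower)) / (2.0 * z_crit)
    z = log(hr) / se
    # Two-sided normal tail probability: erfc(|z| / sqrt(2)).
    p = erfc(abs(z) / sqrt(2.0))
    return z, p


# Values as reported in the table (rounded, so results are approximate).
z_superpc, p_superpc = wald_p_from_ci(3.0238, 1.8883, 4.842)
z_age, p_age = wald_p_from_ci(0.9917, 0.9656, 1.019)
```

For the SuperPC score this gives a P value of a few times 10^-6, and for age a value close to the tabulated 0.54; small differences are expected because the inputs are rounded.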

Supplementary Table 9. Sample information for the mouse models.

Sample numbers of mouse models profiled with ERRBS (* data were downloaded from GEO, accession number GSE57114)

Mouse model           Number
*Wild type            3
Idh2R140Q             3
*Flt3ITD              3
*Tet2-/-              3
Idh2R140Q;Flt3ITD     4
*Tet2-/-;Flt3ITD      3

Sample numbers of drug-treated mouse models profiled with ERRBS (data were downloaded from GEO, accession number GSE78690)

Idh2R140Q;Flt3ITD mouse model      Tet2-/-;Flt3ITD mouse model
Drug treatment     Number          Drug treatment     Number
Vehicle            4               Vehicle            6
AC220              3               AC220              4
AG-221             4               AzaC               4
AG-221+AC220       2               AzaC+AC220         3

Sample numbers of mouse models profiled with RNA sequencing

Mouse model           Number
Wild type             4
Idh2R140Q             2
Idh2R140Q;Flt3ITD     5

Sample numbers of drug-treated mouse models profiled with RNA sequencing (data were downloaded from GEO, accession number GSE78691)

Idh2R140Q;Flt3ITD mouse model      Tet2-/-;Flt3ITD mouse model
Drug treatment     Number          Drug treatment     Number
Vehicle            4               Vehicle            4
AG-221             4               AzaC               4

Supplementary References

1. Glass JL, Hassane D, Wouters BJ, Kunimoto H, Avellino R, Garrett-Bakelman FE, et al. Epigenetic Identity in AML Depends on Disruption of Nonpromoter Regulatory Elements and Is Affected by Antagonistic Effects of Mutations in Epigenetic Modifiers. Cancer Discov 2017;7(8):868-83.
2. Verhaak RG, Wouters BJ, Erpelinck CA, Abbas S, Beverloo HB, Lugthart S, et al. Prediction of molecular subtypes in acute myeloid leukemia based on gene expression profiling. Haematologica 2009;94(1):131-4.
3. Dai M, Wang P, Boyd AD, Kostov G, Athey B, Jones EG, et al. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res 2005;33(20):e175.
4. Shih AH, Meydan C, Shank K, Garrett-Bakelman FE, Ward PS, Intlekofer AM, et al. Combination Targeted Therapy to Disrupt Aberrant Oncogenic Signaling and Reverse Epigenetic Dysfunction in IDH2- and TET2-Mutant Acute Myeloid Leukemia. Cancer Discov 2017;7(5):494-505.
5. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 2013;29(1):15-21.
6. Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res 2013;41(10):e108.
7. Spencer DH, Russler-Germain DA, Ketkar S, Helton NM, Lamprecht TL, Fulton RS, et al. CpG Island Hypermethylation Mediated by DNMT3A Is a Consequence of AML Progression. Cell 2017;168(5):801-16.e13.
8. De S, Shaknovich R, Riester M, Elemento O, Geng H, Kormaksson M, et al. Aberration in DNA methylation in B-cell lymphomas has a complex origin and increases with disease severity. PLoS Genet 2013;9(1):e1003137.
9. Landau DA, Clement K, Ziller MJ, Boyle P, Fan J, Gu H, et al. Locally disordered methylation forms the basis of intratumor methylome variation in chronic lymphocytic leukemia. Cancer Cell 2014;26(6):813-25.
10. Landan G, Cohen NM, Mukamel Z, Bar A, Molchadsky A, Brosh R, et al. Epigenetic polymorphism and the stochastic formation of differentially methylated regions in normal and cancerous tissues. Nat Genet 2012;44(11):1207-14.
11. Sherwin WB. Entropy and information approaches to genetic diversity and its expression: Genomic geography. Entropy 2010;12(7):1765-98.
12. Li S, Garrett-Bakelman F, Perl AE, Luger SM, Zhang C, To BL, et al. Dynamic evolution of clonal epialleles revealed by methclone. Genome Biol 2014;15(9):472.
13. Li S, Garrett-Bakelman FE, Chung SS, Sanders MA, Hricik T, Rapaport F, et al. Distinct evolution and dynamics of epigenetic and genetic heterogeneity in acute myeloid leukemia. Nat Med 2016;22(7):792-9.
14. Figueroa ME, Skrabanek L, Li Y, Jiemjit A, Fandy TE, Paietta E, et al. MDS and secondary AML display unique patterns and abundance of aberrant DNA methylation. Blood 2009;114(16):3448-58.
15. van der Maaten L, Hinton G. Visualizing Data using t-SNE. J Mach Learn Res 2008;9:2579-605.
16. van der Maaten L. Accelerating t-SNE using Tree-Based Algorithms. J Mach Learn Res 2014;15:3221-45.
17. Ruxton GD. The unequal variance t-test is an underused alternative to Student's t-test and the Mann–Whitney U test. Behavioral Ecology 2006;17(4):688-90.
18. Derrick B, Toher D, White P. Why Welch’s test is Type I error robust. The Quantitative Methods for Psychology 2016;12(1):30-8.
19. Chacon D, Beck D, Perera D, Wong JW, Pimanda JE. BloodChIP: a database of comparative genome-wide transcription factor binding profiles in human blood cells. Nucleic Acids Res 2014;42(Database issue):D172-7.
20. Luskin MR, Gimotty PA, Smith C, Loren AW, Figueroa ME, Harrison J, et al. A clinical measure of DNA methylation predicts outcome in de novo acute myeloid leukemia. JCI Insight 2016;16(1):9.
21. Camp RL, Dolled-Filhart M, Rimm DL. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res 2004;10(21):7252-9.
22. Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol 2004;2(4):E108.
23. Dohner H, Estey E, Grimwade D, Amadori S, Appelbaum FR, Buchner T, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 2017;129(4):424-47.
