SUPPLEMENTARY FIGURE LEGENDS

1. Examples of BET-inhibited gene prediction by ROSE super-enhancers. A. TMC3, a gene that is associated with a super-enhancer (shown in red) but is not down-regulated by JQ1. B. VPREB3, an example of a JQ1 down-regulated gene that is not super-enhancer associated even though the gene is surrounded by highly enriched BRD4 and H3K27ac chromatin. C. RIPK4, a gene that is super-enhancer associated but not JQ1 down-regulated. D. F13A1, a JQ1 down-regulated gene that has a super-enhancer nearby but is not detected as a super-enhancer target gene, because the super-enhancer is located more than 50 kb from the transcription start site. Narrow H3K27ac-enriched regions that are not super-enhancers are located closer to the transcription start site of this gene.

2. Effect of the regulatory potential distance parameter on the prediction of BET inhibition in BET-inhibited cells, including 5 DLBCL cell lines (DHL6, HBL1, LY1, LY4, and TOLEDO), one liver cancer cell line (HepG2) and one malignant peripheral nerve sheath tumor cell line (90-8TL). The area under the curve for the prediction of BET-inhibited genes changes with the decay rate in the regulatory potential function. The optimum for the ½-potential distance lies between 6 kb and 20 kb.
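The role of the distance parameter can be illustrated with a minimal sketch. The legend does not state the functional form of the decay, so the exponential halving below is an assumption, and the names `decay_weight`, `regulatory_potential` and `half_distance_bp` are hypothetical. The 10 kb default is only an illustrative value inside the 6 kb to 20 kb range quoted above.

```python
def decay_weight(distance_bp, half_distance_bp=10_000):
    """Weight of a read at a given distance from the transcription start site.

    Assumed form (not taken from the legend): the weight halves every
    half_distance_bp base pairs, so it equals 1/2 at the half-potential distance.
    """
    return 2.0 ** (-abs(distance_bp) / half_distance_bp)


def regulatory_potential(read_positions, tss, half_distance_bp=10_000):
    """Sum of distance-weighted reads around one transcription start site."""
    return sum(decay_weight(pos - tss, half_distance_bp) for pos in read_positions)


# Three reads: one at the TSS and one 10 kb away on each side.
example = regulatory_potential([50_000, 60_000, 40_000], tss=50_000)
```

A smaller `half_distance_bp` makes the potential depend almost entirely on promoter-proximal signal, while a larger value lets distal H3K27ac signal contribute; the figure reports how prediction performance changes across that range.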

3. Performance comparison of the regulatory potential, relative regulatory potential and ROSE super-enhancer associated genes in the prediction of BET-inhibited genes. Receiver operating characteristic (ROC) curves in 5 DLBCL cell lines (LY1, LY4, DHL6, HBL1, and TOLEDO) and in the malignant peripheral nerve sheath tumor cell line 90-8TL consistently show that the relative regulatory potential performs best and ROSE performs worst. Inclusion of topologically associating domains does not significantly affect prediction performance.
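The area under a ROC curve, which is the number reported in parentheses in the figure panels, can be computed directly from gene scores and binary labels. The sketch below uses the pairwise (Mann-Whitney) formulation; the function name `roc_auc` and the toy scores are hypothetical, and nothing here is taken from the authors' code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive gene (label 1,
    e.g. BET-inhibited) scores higher than a randomly chosen negative gene
    (label 0); ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: four genes scored by a hypothetical predictor.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of genes, which is fine for illustration; a rank-based implementation would be preferable for genome-wide gene lists.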

4. BET-inhibited gene prediction performance for the regulatory potential calculated using all reads and a regulatory potential that includes only reads confined to H3K27ac ChIP-seq peak regions. Receiver operating characteristic (ROC) curves in 5 DLBCL cell lines (LY1, LY4, DHL6, HBL1, and TOLEDO) and the liver cancer cell line HepG2 show no significant difference between the two predictions.

5. Gene Ontology analysis of the 500 genes with the highest median regulatory potentials across the H3K27ac ChIP-seq compendium of 365 samples.

6. Precision-recall performance comparison of the regulatory potential, relative regulatory potential and ROSE super-enhancer associated genes in the prediction of BET-inhibited genes in DLBCL (LY1, LY4, TOLEDO, DHL6, and HBL1) and in the liver cancer cell line HepG2.
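For readers less familiar with precision-recall curves, the following is a minimal sketch of how the points of such a curve are obtained from ranked gene scores. The function name `precision_recall_points` and the toy data are hypothetical; tied scores are not treated specially here.

```python
def precision_recall_points(scores, labels):
    """Return (recall, precision) after each gene in descending score order.

    labels are 1 for genes in the positive set (e.g. BET-inhibited) and 0
    otherwise. Tied scores are kept in their sorted order.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = []
    true_pos = 0
    for rank, i in enumerate(order, start=1):
        true_pos += labels[i]
        points.append((true_pos / total_pos, true_pos / rank))
    return points


# Toy example: four genes, two of them in the positive set.
points = precision_recall_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

Unlike the ROC curve, precision depends on the fraction of positive genes, so curves from cell lines with different numbers of BET-inhibited genes are not directly comparable on the precision axis.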

7. Regulatory potentials and relative regulatory potentials correlate with statistical significance (log q-value) of BET-inhibited genes in DLBCL (LY1, LY4, DHL6, HBL1, and TOLEDO) and in the liver cancer cell line HepG2. Boxplots represent the bin median, the interquartile range and whiskers that extend to the most extreme data point that is not more than the interquartile range from the box.
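The binning and boxplot conventions described in this legend can be sketched as follows. This is an illustration under stated assumptions, not the authors' plotting code: the function names are hypothetical, the quartile method is an assumption, and `whisker_coef=1.0` follows the legend read literally (many plotting tools default to 1.5 times the interquartile range instead).

```python
import statistics


def assign_ventiles(potentials):
    """Return a ventile index (1..20) per gene; 1 holds the lowest potentials."""
    n = len(potentials)
    order = sorted(range(n), key=lambda i: potentials[i])
    ventile = [0] * n
    for rank, i in enumerate(order):
        ventile[i] = rank * 20 // n + 1
    return ventile


def box_stats(values, whisker_coef=1.0):
    """Median, quartiles and whisker ends for one bin.

    Whiskers reach the most extreme data point lying within
    whisker_coef * IQR of the box (assumed reading of the legend).
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low = min(v for v in values if v >= q1 - whisker_coef * iqr)
    high = max(v for v in values if v <= q3 + whisker_coef * iqr)
    return {"median": median, "q1": q1, "q3": q3,
            "whisker_low": low, "whisker_high": high}


# Toy bin with one outlier (100) that falls beyond the upper whisker.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```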

8. Regulatory potentials and relative regulatory potentials correlate with fold-change in gene expression of BET-inhibited genes in DLBCL (LY1, LY4, DHL6, HBL1, and TOLEDO) and in the liver cancer cell line HepG2. Boxplots represent the bin median, the interquartile range and whiskers that extend to the most extreme data point that is not more than the interquartile range from the box.

9. Gene Ontology analysis of the 500 genes with the highest relative regulatory potential or ROSE super-enhancer index in 14 diverse H3K27ac samples. Genes with high relative regulatory potentials are more highly enriched in cell type related GO categories than super-enhancer associated genes.

10. Number of samples selected by MARGE-express. Cross-validation AUCs increase then level off as more samples are added to the MARGE-express logistic regression model.

11. MARGE reweighted H3K27ac profiles in the identification of AR binding sites from testosterone-induced genes in the LNCaP prostate cancer cell line.

SUPPLEMENTARY TABLES

1. Description of human H3K27ac ChIP-seq data sets. Table includes the following information for each H3K27ac ChIP-seq sample: internal dataset identifier; cell lineage; whether the sample is derived from a cell line or from tissue; whether the sample is cancer or normal; the cell line name; tissue/organ; the cell line classification; detailed tissue type; Gene Expression Omnibus (GEO) accession numbers for ChIP and control samples.

2. Description of mouse H3K27ac ChIP-seq data sets. Table includes the following information for each H3K27ac ChIP-seq sample: internal dataset identifier; cell lineage; whether the sample is derived from a cell line or from tissue; whether the sample is cancer or normal; the cell line name; tissue/organ; the cell line classification; detailed tissue type; GEO accession numbers for ChIP and control samples.

3. Relative regulatory potentials derived from human H3K27ac ChIP-seq data sets. Table includes the following information: chromosome; position of transcription start site on human reference genome hg38; RefSeq ID; GeneSymbol; strand; median regulatory potential across H3K27ac samples in the human compendium; relative regulatory potentials of samples indexed by sample ID (to be cross-referenced with Supplementary Table S1).

4. Relative regulatory potentials derived from mouse H3K27ac ChIP-seq data sets. Table includes the following information: chromosome; position of transcription start site on mouse reference genome mm10; RefSeq ID; GeneSymbol; strand; median regulatory potential across H3K27ac samples in the mouse compendium; relative regulatory potentials of samples indexed by sample ID (to be cross-referenced with Supplementary Table S2).

5. Median relative regulatory potentials of transcription factors, chromatin regulators and kinases across human neuronal, lymphoblastoid, and embryonic stem cell types.

6. Samples selected by MARGE-express for AR, ESR1, GR, PPARG, POU5F1 and NOTCH examples. Table includes sample IDs (to be cross-referenced with Supplementary Table S1), MARGE-express model coefficients, cell line, cell type, tissue, condition, cancer type and GEO accession numbers.

7. MARGE-express predictions for up-regulated gene sets in AR, ESR1, GR, PPARG, POU5F1 and NOTCH examples. Table includes Gene Symbol, RefSeq ID, MARGE-express prediction values and whether (1) or not (0) the gene is in the input gene set.

8. MARGE-cistrome predictions for AR, ESR1, GR, PPARG, POU5F1 (OCT4) and NOTCH examples. Table includes chromosome, start position of DHS region (hg38), end position of DHS region, DHS identifier, MARGE-cistrome predictions for AR, ESR1, GR, PPARG, POU5F1 (OCT4) and NOTCH, and whether (1) or not (0) DHS regions overlap with ChIP-seq peaks for the respective factors.

9. Samples selected by MARGE-express for LNCaP-abl siRNA knockdown experiments. Table includes sample IDs (to be cross-referenced with Supplementary Table S1), MARGE-express model coefficients, cell line, cell type, tissue, condition, cancer type and GEO accession numbers.

10. MARGE-express predictions for LNCaP-abl siRNA knockdown experiments of AR, E2F1, EZH2, FOXA1, FOXM1, KDM1A, MALAT1, RAD21, and UTX. Table includes RefSeq ID, GeneSymbol, MARGE-express predictions and whether (1) or not (0) the gene is in the input gene sets. MARGE-express predictions based on public H3K27ac data alone and public data combined with in-house H3K27ac ChIP-seq data are provided.

11. MARGE-cistrome predictions for AR, E2F1 and FOXA1 siRNA knockdown experiments in LNCaP-abl cells. Table includes DHS ID, chromosome, center position of DHS region (hg38), MARGE-cistrome predictions for AR, E2F1 and FOXA1 binding, and whether (1) or not (0) DHS regions overlap with ChIP-seq peaks for the respective factors. MARGE-cistrome predictions based on public H3K27ac data alone and public data combined with in-house H3K27ac ChIP-seq data are provided.

Supplementary Figure 1

Panels a-d: genome browser tracks of BRD4 (JQ-), BRD4 (JQ+), H3K27ac (JQ-) and H3K27ac (JQ+), each with a y-axis maximum of 6. a: TMC3 (145 kb window). b: VPREB3 (30 kb window). c: RIPK4 (224 kb window). d: F13A1 (180 kb window).

Supplementary Figure 2

Line plot. x-axis: influence function characteristic distance (log scale; 1kb, 2kb, 4kb, 6kb, 8kb, 10kb, 15kb, 25kb, 50kb). y-axis: performance (AUC; ticks from 0.64 to 0.74). Curves: LY1, HBL1, HEPG2, 90-8TL, LY4, TOLEDO, DHL6.

Supplementary Figure 3

ROC curves, one panel per cell line. x-axis: false positive rate (0.0 to 1.0). y-axis: true positive rate (0.0 to 1.0). AUC values shown in the panel legends:

HepG2: regulatory potential (0.72); regulatory potential with TAD (0.71); relative regulatory potential (0.74); ROSE super-enhancer (0.55)
LY4: regulatory potential (0.70); regulatory potential with TAD (0.70); relative regulatory potential (0.76); ROSE super-enhancer (0.53)
90-8TL: regulatory potential (0.71); regulatory potential with TAD (0.71); relative regulatory potential (0.75); ROSE super-enhancer (0.50)
DHL6: regulatory potential (0.66); regulatory potential with TAD (0.66); relative regulatory potential (0.74); ROSE super-enhancer (0.55)
HBL1: regulatory potential (0.71); regulatory potential with TAD (0.71); relative regulatory potential (0.77); ROSE super-enhancer (0.55)
TOLEDO: regulatory potential (0.67); regulatory potential with TAD (0.68); relative regulatory potential (0.73); ROSE super-enhancer (0.54)

Supplementary Figure 4

ROC curves, one panel per cell line. x-axis: false positive rate (0.0 to 1.0). y-axis: true positive rate (0.0 to 1.0). AUC values shown in the panel legends:

LY1: all reads regulatory potential (0.75); peak reads regulatory potential (0.75)
LY4: all reads regulatory potential (0.70); peak reads regulatory potential (0.70)
TOLEDO: all reads regulatory potential (0.67); peak reads regulatory potential (0.67)
DHL6: all reads regulatory potential (0.66); peak reads regulatory potential (0.66)
HBL1: all reads regulatory potential (0.71); peak reads regulatory potential (0.71)
HEPG2: all reads regulatory potential (0.72); peak reads regulatory potential (0.72)

Supplementary Figure 5

Bar chart. x-axis: gene ontology enrichment, -log10 FDR (0 to 12). Categories: nucleosome organization; chromatin organization; chromatin assembly or disassembly; nucleosome assembly; chromatin assembly; -DNA complex assembly; DNA packaging; chromosome organization; regulation of transcription; transcription.

Supplementary Figure 6

Precision-recall curves, one panel per cell line (LY1, LY4, TOLEDO, DHL6, HBL1, HEPG2). x-axis: recall (0.0 to 1.0). y-axis: precision (range differs between panels). Curves in each panel: regulatory potential; relative regulatory potential; ROSE super-enhancer.

Supplementary Figure 7

Boxplots, one pair of panels per cell line (HEPG2, HBL1, LY4, LY1, DHL6, TOLEDO). x-axis: regulatory potential ventiles or relative regulatory potential ventiles. y-axis: -log10 q-value.

Supplementary Figure 8

Boxplots, one pair of panels per cell line (LY1, LY4, DHL6, HBL1, TOLEDO, HEPG2). x-axis: regulatory potential ventiles or relative regulatory potential ventiles. y-axis: log2 fold change.

Supplementary Figure 9

Heatmaps of Gene Ontology enrichment for ROSE super-enhancer and relative regulatory potential. Color scale: -log10 p-value (0 to 32).

Rows (GO categories): angiogenesis; blood vessel morphogenesis; immune system process; skeletal system development; endocrine system development; pancreas development; endocrine pancreas development; digestion; biological adhesion; cell adhesion; epidermis development; ectoderm development; vasculature development; regulation of heart contraction; regulation of system process; muscle system process; muscle contraction; regulation of nervous system development; tube development; regulation of leukocyte activation; regulation of T cell activation; regulation of lymphocyte activation; response to wounding; inflammatory response; defense response; lymphocyte activation; leukocyte activation; cell activation; immune response; blood vessel development; heart development; muscle organ development; extracellular matrix organization; anterior/posterior pattern formation; regionalization; embryonic skeletal system development; pattern specification process; sexual reproduction; male gamete generation; spermatogenesis; gamete generation.

Columns (samples): aorta; heart; lung; thymus; pancreas; B-cell blood; T-cell blood; umbilical vein; adipose tissue; adrenal gland; monocyte blood; embryonic stem cell; mammary epithelial; skeletal muscle myoblast.

Supplementary Figure 10

Line plot. x-axis: number of H3K27ac samples in regression model (2, 4, 6, 8, 10, 12, 14). y-axis: area under ROC curve (0.70 to 1.00). Curves: ESR1, AR, PPARG, NOTCH, GR, OCT4.

Supplementary Figure 11

Panels a-d: genome browser tracks over 35 kb windows. Tracks in each panel: H3K27ac:LNCaP (DHT+), H3K27ac:LNCaP (DHT-), H3K27ac:T Cell, H3K27ac:Sperm and H3K27ac:Blood (each on a 0 to 6 scale); UDHS; MARGE reweighted (0 to 30 scale); AR Peaks. Gene labels in panels a and b: THRAP3, AIDA, BROX. Gene labels in panels c and d: SRP9, MEAF6.