bioRxiv preprint doi: https://doi.org/10.1101/2020.02.17.952390; this version posted February 17, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Contribution of CTCF binding to transcriptional activity at the HOXA locus in NPM1-mutant AML cells

Reza Ghasemi1, Heidi Struthers1, Elisabeth R. Wilson1, and David H. Spencer1,2*

1 Department of Medicine, Division of Oncology, Section of Stem Cell Biology, Washington University School of Medicine, St. Louis, MO, USA

2 McDonnell Genome Institute, Washington University, St. Louis, MO, USA

Running title: HOXA CTCF binding in NPM1-mutant AML

The authors have no conflicts of interest relating to the work described in this manuscript.

* Corresponding author

David H. Spencer, MD, PhD
Assistant Professor
Division of Oncology
Department of Medicine
Washington University School of Medicine
Campus Box 8007
660 South Euclid Avenue
St. Louis, MO 63110

Abstract

Transcriptional regulation of the HOXA cluster is thought to involve CTCF-mediated loops and the opposing actions of the COMPASS and Polycomb epigenetic complexes. We investigated the role of these mechanisms at the HOXA cluster in AML cells with the common NPM1c mutation, which express both HOXA and HOXB genes. CTCF binding at the HOXA locus is conserved across primary AML samples, regardless of HOXA expression, and defines a continuous chromatin domain marked by COMPASS-associated histone H3 trimethylation in NPM1-mutant primary AML samples. Profiling of the three-dimensional chromatin architecture of NPM1-mutant OCI-AML3 cells identified chromatin loops between the active HOXA9-HOXA11 genes and loci in the SNX10 gene and an intergenic region located 1.4 Mbp upstream of the HOXA locus. Deletion of CTCF binding sites in OCI-AML3 cells reduced these interactions, but resulted in new, CTCF-independent loops with regions in the SKAP2 gene that were marked by enhancer-associated histone modifications in primary AML samples. HOXA expression was maintained in the CTCF deletion mutants, indicating that transcriptional activity at the HOXA locus in NPM1-mutant AML cells does not require long-range CTCF-mediated chromatin interactions, and instead may be driven by intrinsic factors within the HOXA gene cluster.

Introduction

The HOX genes encode developmentally regulated transcription factors that are highly expressed in acute myeloid leukemia (AML) and are important drivers of malignant self-renewal in this disease. Understanding how HOX genes are regulated in AML may therefore provide valuable information about the mechanisms that initiate and promote leukemogenesis. Previous studies have shown that expression of HOX family members in AML is nearly always restricted to specific genes in the HOXA and/or HOXB gene clusters (HOXC and HOXD genes are rarely expressed), and that expression patterns correlate with recurrent AML mutations (1). HOX expression is most closely associated with AMLs with MLL rearrangements, which exclusively express HOXA genes, and AMLs with the recurrent NPM1c mutation, which nearly always express both HOXA and HOXB cluster genes. The high prevalence of NPM1 mutations makes the combined HOXA/HOXB expression pattern the most common HOX phenotype in AML patients. However, the regulatory mechanisms that drive this expression pattern are poorly understood.

Studies of HOX gene regulation in model organisms have established that colinear expression of each HOX cluster is mediated by COMPASS/Trithorax and Polycomb group (PcG) proteins, which promote gene activation and repression and perform methylation of histone H3 at lysine 4 (H3K4me3) and lysine 27 (H3K27me3), respectively (2,3). These regulatory pathways are also involved in HOX gene regulation in AML cells, and are best understood for the HOXA cluster in AMLs with MLL rearrangements. MLL1 (KMT2A) is a component of the COMPASS complex, and MLL fusion proteins bind to the HOXA locus in AML cells and recruit the non-COMPASS histone H3 methyltransferase DOT1L, which is required for HOXA activation and AML development in MLL-rearranged leukemia models (4–7). Regulatory DNA elements that control three-dimensional chromatin architecture also play a role in HOXA gene regulation in AML cells. Specifically, the HOXA and HOXB clusters contain multiple binding sites for the chromatin organizing factor CTCF, and chromatin conformation experiments suggest these binding events mediate local chromatin loops in AML cells with MLL rearrangements (8). In addition, heterozygous deletion of a single CTCF binding site in the HOXA cluster in MLL-rearranged AML cells resulted in altered chromatin structure and reduced HOXA gene expression (9). These studies suggest that MLL fusion proteins directly activate the HOXA locus in ways that require specific CTCF binding events or their associated chromatin structures.

While these mechanistic insights have provided valuable information about HOXA regulation in MLL-rearranged AML, this molecular subtype accounts for <5% of all adult AML patients and only 25% of AMLs that express HOXA genes (1). Although AMLs with NPM1 mutations nearly always express HOXA genes (along with HOXB genes), it is unclear whether HOXA expression in these cells shares the regulatory factors and chromatin structures that appear to be critical for HOXA expression in MLL-rearranged AML cells. In this study, we investigated histone modifications and chromatin domains at the HOXA locus in NPM1-mutated AML samples vs. other AML subtypes, and used an NPM1-mutant AML cell line model to determine whether CTCF binding at multiple sites at the HOXA locus is required to maintain HOXA gene expression and chromatin structure in NPM1-mutated AML cells.

Materials and Methods

Primary AML samples, normal hematopoietic cells, and AML cell lines. Primary AML samples and normal hematopoietic cells were obtained from bone marrow aspirates from AML patients at diagnosis or healthy volunteers following informed consent using protocols approved by the Human Research Protection Office at Washington University and have been described previously (10,11; see Table S1). OCI-AML3 cells were cultured at 0.5-1 x 10^6 cells/mL in MEM-alpha media with 20% FBS and 1% penicillin-streptomycin. The presence of the NPM1c mutation in the parental OCI-AML3 line was verified by targeted sequencing at the beginning of the study and in RNA-seq data from wild type and mutant clones. Kasumi-1, IMS-M2, and MOLM13 cell lines were cultured in RPMI-1640 with 1% penicillin-streptomycin and fetal calf serum (20% for Kasumi-1 and MOLM13, 10% for IMS-M2).

ChIP-seq preparation and analysis. ChIP-seq was performed using the ChIPmentation method (12) with the following ChIP-grade antibodies: CTCF (2899S), H3K27me3 (9733S), and H3K27ac (8173S) from Cell Signaling Technology and H3K4me3 (ab1012) from Abcam. All libraries were sequenced on a NovaSeq 6000 (Illumina, San Diego, CA) to obtain ~50 million 150 bp paired-end reads, and data were analyzed via adapter trimming with trim galore and alignment to the GRCh38 human reference sequence using bwa mem (13). Normalized coverage for visualization and analysis was prepared with the deeptools ‘bamCoverage’ tool (14), and peaks were called with macs2 using default parameters (15) for CTCF and epic2 (16) for all histone marks. Statistical comparisons were performed with DESeq2 (17) using raw fragment counts at identified peak summits. Visualizations were prepared using the Gviz package in R (18).
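To illustrate what "raw fragment counts at identified peak summits" could look like in practice, the following is a minimal sketch in Python. It is not the pipeline used in this study: the function name, the fixed window of 100 bp on either side of each summit, and the half-open interval convention are all assumptions made for illustration, and the input is a plain list of fragment coordinates from a single chromosome rather than an alignment file.

```python
def count_fragments_at_summits(fragments, summits, half_width=100):
    """Count fragments overlapping a window around each peak summit.

    fragments  : list of (start, end) half-open intervals on ONE chromosome
    summits    : list of summit positions on the same chromosome
    half_width : assumed window half-size in bp (hypothetical default)
    """
    counts = []
    for summit in summits:
        win_start = summit - half_width
        win_end = summit + half_width
        # A fragment overlaps the window if it starts before the window
        # ends and ends after the window starts.
        n = sum(1 for start, end in fragments
                if start < win_end and end > win_start)
        counts.append(n)
    return counts


# Toy coordinates, not real data.
fragments = [(100, 300), (250, 400), (1000, 1200)]
print(count_fragments_at_summits(fragments, [280, 1100, 5000], half_width=50))
```

A matrix of such counts (one row per summit, one column per sample) is the kind of raw count table that a tool such as DESeq2 expects as input.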

Targeted deletion of CTCF binding sites via CRISPR-Cas9. Targeted deletions in OCI-AML3 cells were generated using CRISPR/Cas9 with guide RNAs from the UCSC genome browser (19,20) that overlapped CTCF ChIP-seq peaks from this study (see Table S2). Mutagenesis was performed via either an inducible Cas9-expressing OCI-AML3 cell line (Lenti-iCas9-neo vector; Addgene 85400) with lentiviral sgRNA expression (Addgene 70683), or transient transfection of OCI-AML3 cells with Cas9 protein complexed with synthetic tracrRNA/crRNA hybrids (Alt-R system, IDT, Coralville, IA). For the latter, RNAs were complexed with Cas9 protein using the manufacturer’s protocol with 1 million cells and 28 μM of Cas9/RNA for either transfection (CRISPRMAX; Thermo Fisher Scientific, Waltham, MA) or nucleofection (SG Amaxa Cell Line 4D-Nucleofector Kit, Lonza, Basel, Switzerland). Mutation efficiency was assessed in bulk cultures via DNA extraction, PCR with tailed primers (see Table S2), and sequencing to obtain 2x250 bp reads on an Illumina MiSeq instrument. Deletion abundance was assessed using custom scripts to analyze insertions/deletions in minimap2-based alignments (21). Single-cell sorting into 96-well plates via FACS was used for expansion of individual mutant clones. Cells from single wells were screened via direct lysis by proteinase K (P8107S; NEB) in 20 μl of single-cell lysis buffer (10 mM Tris-HCl pH 7.6, 50 mM NaCl, 6.25 mM MgCl2, 0.045% NP40, 0.45% Tween-20), PCR amplification, and gel electrophoresis; clones with evidence for deletions from the amplicon band size were sequenced, and clones with deletions were expanded for analysis.
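The custom scripts used to measure deletion abundance are not described in detail. As a rough sketch of one way such a measurement could be made, the Python example below parses CIGAR strings from amplicon alignments and reports the fraction of reads carrying a deletion that overlaps a target interval. The function names, the 0-based coordinates, and the minimum deletion length of 5 bp are assumptions for illustration and are not taken from the manuscript.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def deletions_in_read(pos, cigar):
    """Return reference intervals (start, end) deleted in one alignment.

    pos is the 0-based leftmost reference position of the alignment.
    """
    ref = pos
    deleted = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "D":
            deleted.append((ref, ref + length))
        # Only these operations advance the reference coordinate.
        if op in "MDN=X":
            ref += length
    return deleted


def deletion_fraction(alignments, target_start, target_end, min_len=5):
    """Fraction of reads with a deletion >= min_len overlapping the target.

    alignments is a list of (pos, cigar) tuples; min_len is a hypothetical
    threshold meant to ignore short sequencing-error indels.
    """
    if not alignments:
        return 0.0
    hits = 0
    for pos, cigar in alignments:
        for start, end in deletions_in_read(pos, cigar):
            if end - start >= min_len and start < target_end and end > target_start:
                hits += 1
                break
    return hits / len(alignments)


# Toy alignments, not real data.
reads = [(100, "50M25D50M"), (100, "125M"), (100, "60M2D63M"), (100, "20S40M10D70M")]
print(deletion_fraction(reads, 140, 180))
```

Counting reads rather than bases keeps the estimate interpretable as an allele frequency in the bulk culture, under the assumption that each read samples one allele.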

RNA analysis. RNA extractions were performed on ~1 million cells using the Quick-RNA MicroPrep Kit (Zymo Research, Irvine, CA). RT-qPCR for HOXA9 used 100 ng of RNA for cDNA synthesis (Applied Biosystems, Foster City, CA) and qPCR in duplicate on a StepOnePlus PCR System (Applied Biosystems) for HOXA9 exons 1-2 with a GUSB internal control (IDT). RNA-seq libraries were generated using 300 ng of RNA with the Kapa Hyper total stranded RNA library kit for Illumina (Roche) following the manufacturer’s instructions. Libraries were sequenced on a NovaSeq 6000 to obtain ~50 million 2x150 bp reads. Reads were trimmed with trim galore, aligned with STAR (22), and transcript-level expression values in TPM were obtained with stringtie (23).
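For readers unfamiliar with the TPM unit, the sketch below shows the standard definition of transcripts per million computed from read counts and transcript lengths. It is only meant to explain the unit: stringtie derives its estimates from assembled transcripts and coverage, so this is not a reimplementation of that tool, and the function name and inputs are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from per-transcript counts and lengths.

    Each count is first divided by transcript length (reads per base),
    then the rates are rescaled so that they sum to one million.
    """
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [rate / total * 1e6 for rate in rates]


# Toy values, not real data: three transcripts of different lengths.
values = tpm([100, 200, 0], [1000, 4000, 500])
print([round(v, 1) for v in values])
```

Because the values are rescaled within each sample, TPM values always sum to one million per sample, which is what makes them comparable across libraries of different depth.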

In situ Hi-C. Hi-C libraries were prepared as described previously (24) using 4-5 million cells as input. Final libraries were first assessed by sequencing ~1 million reads on a MiSeq instrument (using metrics recommended by Rao et al. (24)); passing libraries were sequenced to obtain >400 million 2x150 bp reads on a NovaSeq 6000. Hi-C data were analyzed on GRCh38 with juicer (25; see Table S4). All analyses used contact matrices (mapping quality >30), deduplicated chimeric reads from the ‘merged_pairs.txt’ file, and chromatin loops and contact domains identified using HICCUPS and arrowhead, respectively, with default parameters (25). Chromatin loops from wild type and mutant OCI-AML3 cells were merged using the bedtools ‘pairToPair’ function (26) with 5,000 bp overlap. Pairwise and joint comparisons of chromatin loop intensities were performed with hicCompare (27) and multiHicCompare (28), respectively, using raw contact counts from chromosome 7 obtained from the juicer hic file at 25 kbp resolution. Statistical comparisons used chromosome-wide FDR corrections for multiple hypothesis testing. Visualizations used the HiTC (29), GenomicInteractions and Gviz R packages.
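The loop-merging step can be pictured as matching loops whose two anchors both fall within a tolerance of each other. The Python sketch below is a simplified stand-in for that idea, not a reimplementation of bedtools: the `Loop` type, the function names, and the interpretation of "5,000 bp overlap" as a 5 kb tolerance added to each side of the first loop's anchors are assumptions, and the coordinates are invented.

```python
from typing import NamedTuple


class Loop(NamedTuple):
    chrom: str
    start1: int
    end1: int
    start2: int
    end2: int


def _anchors_match(s1, e1, s2, e2, slop):
    # Extend the first anchor by `slop` on both sides, then test overlap.
    return s1 - slop < e2 and s2 < e1 + slop


def same_loop(a, b, slop=5000):
    """True if both anchors of loops a and b match within the tolerance."""
    return (a.chrom == b.chrom
            and _anchors_match(a.start1, a.end1, b.start1, b.end1, slop)
            and _anchors_match(a.start2, a.end2, b.start2, b.end2, slop))


def merge_loops(wild_type, mutant, slop=5000):
    """Union of two loop lists with one representative per matched loop."""
    merged = list(wild_type)
    for loop in mutant:
        if not any(same_loop(kept, loop, slop) for kept in merged):
            merged.append(loop)
    return merged


# Invented coordinates for illustration only.
wt = [Loop("chr7", 26_300_000, 26_310_000, 27_100_000, 27_110_000)]
mut = [Loop("chr7", 26_313_000, 26_323_000, 27_100_000, 27_110_000),
       Loop("chr7", 26_700_000, 26_710_000, 27_100_000, 27_110_000)]
print(len(merge_loops(wt, mut)))
```

Merging to a shared loop set before comparison ensures that contact counts are compared at the same pair of anchors in every sample, rather than at slightly shifted calls of what is likely the same loop.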

Data availability. All processed sequence data from this study (e.g., bigwig files for ChIP-seq, TPM values for RNA-seq, and Hi-C contact matrices) are available for public download at the following site: https://wustl.box.com/v/ghasemiCTCFHOXA. Raw sequence files for cell lines will be made available upon request.

Results

CTCF defines dynamic chromatin domains at the HOXA locus in primary AML cells

We previously showed that specific regions in the HOXA gene cluster have accessible chromatin in primary AML samples that coincide with CTCF binding sites in other human cell types (1). To confirm the presence of CTCF at these loci in primary AML cells, we performed CTCF ChIP-seq on bone marrow samples from AML patients with the NPM1c insertion mutation, t(9;11) and t(11;19) MLL rearrangements, and t(8;21) creating the RUNX1-RUNX1T1 gene fusion, which displayed the expected HOXA and HOXB, HOXA only, and no HOX expression patterns, respectively (Table S1, Figure S1A). This confirmed CTCF binding in chromatin-accessible regions between HOXA6 and HOXA7 (“CTCF Binding Site CBSA6/7”), and HOXA7 and HOXA9 (“CBSA7/9”), near the transcriptional start site of HOXA10 (“CBSA10”), and in the 5’ UTR of HOXA13 (“CBSA13”) (Figure 1A). Additional CTCF peaks with lower signals were present near the promoters of HOXA4 and HOXA5. All four major CTCF peaks were present in all AML samples regardless of HOXA expression status, and quantification of the CTCF signal demonstrated similar occupancy among the different AML types (Figures 1B-D).

We next determined whether these CTCF binding events defined chromatin domains in primary AML samples by performing ChIP-seq for H3K4me3 and H3K27me3 to measure active and repressed chromatin, respectively. This identified a distinct region of active chromatin in the center of the HOXA cluster that overlapped CBSA6/7 and CBSA7/9, and was conserved in the MLL-rearranged and NPM1-mutant AML samples, and normal CD34+ cells (which also express HOXA and HOXB genes) (Figures 2A, S2A). The H3K4me3 signal was continuous across this interval, including non- sequences, and the interval was also marked with H3K27ac in primary AML samples (Figure S2B). Regions adjacent to this active chromatin domain possessed repressive H3K27me3 marks in all AML samples and in CD34+ cells, which correlated with the expression levels of the overlapping genes (see Figures S1A, S2A). Histone ChIP-seq using AML samples with the RUNX1-RUNX1T1 gene fusion and low HOXA expression showed that the active chromatin domain is dynamic, with little H3K4me3 signal and increased H3K27me3 in this AML type (Figure 2B; blue tracks). This was also observed in ChIP-seq data from FACS-purified normal promyelocytes and mature neutrophils (10) that also do not express HOXA genes (Figure 2B, pink and green tracks; Figure S2A). Interestingly, H3K4me3 was not completely absent in these cells, and similar H3K4me3 retention was observed in the RUNX1-RUNX1T1 AML samples, although to a lesser extent. This suggests that regulation of HOXA genes in both AML cells and normal hematopoietic cells is associated with a poised chromatin state in CTCF-defined chromatin regions.

Targeted deletions at the HOXA locus eliminate CTCF binding but do not affect viability in NPM1-mutant OCI-AML3 cells

We sought to define the sequences that are required for CTCF binding at the HOXA locus, and determine whether loss of these DNA elements has functional consequences in NPM1-mutant AML cells. To this end, we used the OCI-AML3 cell line, which has a canonical NPM1 insertion mutation and expresses MEIS1 and genes in both the HOXA and HOXB clusters (ref. 30; see Figure 3A). This pattern was unique to OCI-AML3 cells and was not observed in cell lines with other mutation-associated HOX expression phenotypes, including MLL-rearranged MOLM13 cells that expressed only HOXA genes, and the RUNX1-RUNX1T1-containing Kasumi-1 cell line, which had low HOX expression. ChIP-seq for CTCF using OCI-AML3 cells identified the four conserved CTCF sites observed in primary AML samples (Figures 3B, S3A) and peaks in the anterior HOXA cluster that were variably present in the primary AML samples (see Figure 1A). ChIP-seq for H3K4me3 and H3K27me3 also demonstrated an active chromatin domain between HOXA9 and HOXA13 (Figure 3B), which was consistent with the patterns of gene expression in this cell line, but different from chromatin domain boundaries in primary AML samples. However, the histone modifications between CBSA7/9 and CBSA10 were shared between OCI-AML3 cells and NPM1-mutated primary AML samples, indicating that regulatory mechanisms in this specific region are conserved between primary AML samples and OCI-AML3 cells.

We next used CRISPR/Cas9-mediated editing to delete three of the conserved CTCF binding sites in OCI-AML3 cells (CBSA6/7, CBSA7/9 and CBSA10). The resulting mutations did not appreciably alter markers of cell maturation in the edited cells (Figure S3B), and the deletion frequency at CBSA7/9 and CBSA10 remained stable after 7 and 14 days; CBSA6/7 deletion was less efficient, and showed a modest decrease in deletion frequency over time (Figure S3C). Cells sorted from bulk cultures were expanded and screened for deletions, which identified at least 5 individual clonal lines with homozygous deletions at each of the three sites (see Table S3), with deletions as small as 25, 10, and 9 bp sufficient to eliminate nearly all CTCF ChIP-seq signal from CBSA6/7, CBSA7/9, and CBSA10, respectively (Figure 3C-H). Additional experiments using single deletion mutants resulted in 10 doubly homozygous mutants and 7 triple mutants with homozygous deletions at all three sites (Figure 3I). None of the mutant OCI-AML3 clones showed overt defects in growth or maturation status, despite complete loss of CTCF binding in the posterior HOXA cluster.

CTCF binding is not required for maintenance of HOXA gene expression or chromatin boundaries in NPM1-mutant AML cells

We selected 45 OCI-AML3 deletion mutants for HOXA9 expression analysis via RT-qPCR to assess whether loss of CTCF binding affected HOXA9 expression. Surprisingly, HOXA9 expression was largely unchanged in all lines with deletions, with little evidence for consistent reduction in expression across the homozygous mutant clones and no trend toward decreased expression between wild type, heterozygous, and homozygous single mutants (Figure 4A) or multiply mutated clones (Figure 4B). Expression analysis via RNA-seq showed modest increases in the expression of the anterior HOXA genes HOXA1-HOXA7 in lines with CBSA7/9 deletions, and subtle increases in the posterior HOXA9-HOXA13 genes when CBSA10 was deleted (Figures 4E, 4D, S4A). However, most mutant lines showed few expression changes. Furthermore, a heterozygous single nucleotide polymorphism (SNP) in the HOXA9 gene showed balanced expression of both alleles in wild type OCI-AML3 cells, and in all mutant lines, except for two with large mutations that involved the posterior HOXA cluster (Figure S4B).
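The manuscript does not state how allelic balance at the HOXA9 SNP was evaluated. One simple way to ask whether RNA-seq reads support both alleles equally is an exact two-sided binomial test against an expected allele fraction of 0.5. The Python sketch below illustrates that approach under this assumption; the function name and the read counts are hypothetical.

```python
from math import comb


def binom_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allelic balance.

    Sums the probability of every outcome that is no more likely than the
    observed one. Intended for modest read depths (roughly up to 1,000
    reads), because the binomial coefficients are converted to floats.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    probs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # The small tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical read counts at a heterozygous SNP.
print(binom_two_sided_p(48, 52))   # near-balanced expression
print(binom_two_sided_p(95, 5))    # strongly skewed toward one allele
```

A large p-value is compatible with balanced expression of the two alleles, whereas a very small one would indicate skewing toward one allele, as might be expected when a large deletion removes or silences one copy.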

ChIP-seq for H3K4me3 and H3K27me3 was also performed on multiple mutants to determine whether loss of CTCF binding altered chromatin boundaries. H3K4me3 signal was reduced specifically at the CBSA7/9 site, but was otherwise intact. H3K27me3 was also still present, but was modestly decreased across the anterior HOXA genes in CBSA7/9 mutants; few changes were evident in other mutant lines, including triple mutants (Figures 4G, 4H, S4C-S4E). Mutants were also analyzed for other histone modifications, including H3K79me2 and H3K27ac, which were intact compared to wild type OCI-AML3 cells (Figure S4F).

CTCF deletions result in compensatory HOXA chromatin loops

Given the role of CTCF in regulating chromatin architecture, we used in situ Hi-C (24) to define the chromatin interactions at the HOXA locus in wild type OCI-AML3 cells, CBSA7/9 single mutants, double mutants with deletions of CBSA6/7 and CBSA7/9 or CBSA7/9 and CBSA10, and two triply mutant lines (see Table S4). Analysis of the chromatin contacts from these cells identified a mean of 9,797 chromatin loops and 5,160 contact domains at 10 kbp resolution (ref. 25; Table S4). The HOXA cluster on chromosome 7p was located at a contact domain boundary (Figure 5A), with centromeric loops connecting HOXA13 to the promoter of TAX1BP1 and intronic sequences of HIBADH and JAZF1. The telomeric loops involved the remaining HOXA genes (HOXA1-HOXA11), which contacted regions in the SKAP2 and SNX10 genes, and an intergenic region 1.4 Mbp upstream with no associated gene annotations. Similar chromatin loops were observed in Hi-C data generated from the MLL-rearranged MOLM13 cell line, and in previously reported data from normal human HSPCs (31) (Figures S5A, S5B), which both express HOXA genes.

We next analyzed Hi-C data from the OCI-AML3 deletion mutants to determine whether loss of CTCF binding altered chromatin architecture at the HOXA gene cluster. Although the general contact domain and chromatin loop structure remained intact, loops involving the HOXA locus were altered in the deletion mutants (Figure 5B). Pairwise comparisons of normalized chromosome 7 interaction frequencies between each mutant line and wild type cells identified decreased long-range interactions and increased interactions with the more proximal SKAP2 gene (Figures 5B and S5C; red loops indicate chromosome-wide FDR<0.05). These findings were confirmed via statistical analysis of joint normalized contact frequencies from all samples, which demonstrated that mutant cells had significantly reduced interaction frequencies between the HOXA cluster and SNX10 and the intergenic region, and increased interactions with two SKAP2 loci (chromosome-wide FDR<0.05; Figure 5C). To define the specific HOXA genes involved in these changes, we mapped the positions of HOXA-SKAP2 interacting reads within the HOXA locus. This showed that SKAP2-HOXA loops involved the anterior HOXA genes and intron 1 of SKAP2 in wild type cells (Figure 5D, top panel, shown in black); however, in mutant cells there were increased interactions between the posterior HOXA genes HOXA9-HOXA13 and intron 11 of SKAP2 (Figure 5D, top panel, shown in red). Interacting reads with SNX10 and the intergenic locus showed the opposite pattern, and were decreased in the posterior HOXA cluster (Figure S5D), indicating that loss of CTCF binding resulted in ‘spreading’ of the proximal SKAP2 interactions to include the posterior HOXA genes (Figure 5E). There were no clear differences between cells with only CBSA7/9 deletions vs. double or triple deletions, suggesting that CBSA7/9 is critical for defining long-distance chromatin interactions with the anterior vs. posterior HOXA locus.
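The chromosome-wide FDR thresholds reported above come from adjusting the p-values of all tested interactions on chromosome 7 together. The manuscript says only "FDR"; the Python sketch below assumes the widely used Benjamini-Hochberg procedure and shows how adjusted values are obtained from a list of raw p-values. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four interactions on one chromosome.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
print([round(q, 4) for q in adj])
print([q < 0.05 for q in adj])
```

Because the adjustment depends on the number of tests, restricting the correction to one chromosome rather than the whole genome changes which interactions pass a fixed threshold such as 0.05.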

Posterior HOXA chromatin loops correlate with expression and involve enhancer loci in primary AML samples

To determine the functional relevance of chromatin loops involving the posterior HOXA genes, we performed in situ Hi-C on the Kasumi-1 cell line with low expression of HOXA genes (see Figure 3A) to assess the correlation between chromatin architecture and posterior HOXA gene expression. Although the interactions in these cells displayed similar overall patterns in the HOXA region, the posterior HOXA genes did not participate in any statistically supported chromatin loops (Figures 6A, 6B), and reads involved in HOXA contacts mapped primarily to the anterior HOXA genes (Figure 6C). Loops to the posterior HOXA cluster were also present in the NPM1-mutant IMS-M2 cell line with high HOXA expression (Figures S6A, S6B), further supporting the observation that expression of these HOXA genes is associated with long-distance chromatin interactions in NPM1-mutant AML cells.

We next analyzed ChIP-seq data for H3K27ac from primary AML samples to determine whether loci that interact with HOXA genes may have enhancer properties. Indeed, both intron 1 and intron 11 of SKAP2 displayed H3K27ac signals in NPM1-mutant AML samples, as well as MLL-rearranged samples that also express HOXA genes (Figure 6D). Multiple loci in the SNX10 gene also possessed this mark, including a non-coding RNA downstream of the 3’ UTR of SNX10 (Figure 6E), but H3K27ac was not present at other regions that formed contacts with the HOXA cluster, including the distal intergenic locus (Figure S6C). Although the enhancer modifications observed at interacting regions had relatively low signal, they were clearly absent in the RUNX1-RUNX1T1 samples, which lack HOXA expression. This provides evidence that these specific regions interact with the HOXA cluster in primary AML samples that express HOXA genes, and may therefore contribute to HOXA gene regulation.

Discussion

HOX transcription factors are drivers of self-renewal in AML cells and are highly expressed in AMLs with NPM1c mutations. In this study, we demonstrated that the chromatin organizing factor CTCF is bound at similar levels to specific sites at the HOXA locus in NPM1-mutant primary AML samples and in other AML subtypes, and that these sites define a dynamic chromatin domain in primary AML samples and normal hematopoietic cells. Targeted deletions in the NPM1-mutant OCI-AML3 cell line eliminated CTCF binding, but surprisingly did not disrupt HOXA gene expression. This was true when multiple binding sites were deleted individually or in combination, and was supported by independent mutant clones that showed little evidence for consistent decreases in HOXA gene expression or changes to the histone modifications we measured. However, loss of CTCF binding did result in clear alterations to the chromatin loops involving the posterior HOXA genes, including HOXA9 and HOXA10. Specifically, long-range loops were diminished, and were replaced by compensatory interactions with regions of the SKAP2 gene. Some of these candidate enhancers have been reported in other studies (32), but have not previously been shown to be active in hematopoietic cells. Further investigation of these sequences, and their associated regulatory proteins, may shed light on factors that promote HOXA gene activation in normal and malignant myeloid cells.

The central observation in this study is that CTCF binding at the HOXA cluster is not absolutely required for maintenance of HOXA expression in NPM1-mutated AML cells. Deletion of CTCF site CBSA7/9 was previously shown to affect HOXA expression in the MLL-rearranged MOLM13 cell line (9). These discrepant findings may be due to fundamental differences in how HOX genes are regulated in MLL-rearranged vs. NPM1-mutant AML cells. Indeed, the HOX expression phenotypes of these AML subtypes are strikingly different: MLL rearrangements are associated with only HOXA gene expression, whereas both HOXA and HOXB genes are expressed in NPM1-mutant AML cells. HOXA and HOXB genes are simultaneously downregulated during normal myeloid maturation (1), which implies that the clusters may be controlled by common factors that may function in different ways in NPM1-mutant vs. MLL-rearranged AML cells. We also observed differences in HOXA cluster expression patterns in MOLM13 vs. OCI-AML3 cells, including higher expression of HOXA11, HOXA13, and the HOTTIP long noncoding RNA in OCI-AML3 cells. HOTTIP has an established role in HOXA gene regulation (33,34), and its higher expression may influence the requirement for certain CTCF-mediated chromatin architectures, thereby making these CTCF binding sites dispensable for steady-state expression of posterior HOXA genes in NPM1-mutant AML cells.

Three-dimensional architecture is an important component of the regulatory control of HOXA genes during development. Our analysis of Hi-C data from OCI-AML3 cells identified long-range loops between the active posterior HOXA genes HOXA9-HOXA13 and sequences with enhancer-associated epigenetic marks in the SNX10 and SKAP2 genes. Additional loops were identified at other loci, but these did not possess active chromatin modifications in primary AML samples. AML cells without HOXA expression did not display these loops, and lacked active histone marks at the interacting loci, which provides evidence that these contacts may be functionally relevant for HOXA gene expression in AML. Interestingly, loss of CTCF binding within the HOXA cluster revealed the plasticity of these interactions, implying that multiple loci may serve as exchangeable HOXA enhancers, which is reminiscent of other well-studied, complex regulatory systems, and has been implicated in developmental HOXA gene regulation (35,36).

363

364 Our data showed that transcriptional activity and active histone modifications remained highly

365 localized to the posterior HOXA cluster in OCI-AML3 cells, even after elimination of key CTCF

366 boundaries. In fact, there was a tendency for deletion mutants to show increased HOXA gene

367 expression levels compared to wild type cells. This suggests the posterior HOXA cluster

368 possesses intrinsic properties that maintain its active state. It is unclear whether this activity is

369 caused by the compensatory interactions we observed, or if these chromatin loops are

370 consequences of persistent gene expression that is mediated by autonomous, sequence-

371 specific factors that recruit transcriptional machinery. The HOXA cluster contains many highly

372 conserved noncoding elements beyond CTCF binding sites. Additional studies will be required

373 to better understand whether these elements provide key signals that promote and maintain

374 HOXA gene expression in NPM1-mutant AML cells.

375

376 Acknowledgements

377 This work was supported by the American Society of Hematology (ASH Scholar Award), the

378 American Cancer Society (Institutional Research Grant), and the National Cancer Institute

379 (K08CA190815) to D.H.S. Primary AML samples were provided by the Genomics of AML

380 Program Project (P01CA101937, T. Ley, PI). Technical assistance was provided by the Siteman

381 Cancer Center Tissue Procurement and Cell Sorting cores (NCI Cancer Center Support Grant

382 P30CA91842).

383

384

385 References

386

387 1. Spencer D, Young M, Lamprecht T, Helton N, Fulton R, O’Laughlin M, et al. Epigenomic

388 analysis of the HOX gene loci reveals mechanisms that may control canonical expression

389 patterns in AML and normal hematopoietic cells. Leukemia. 2015;29(6):1279–89.

390

391 2. Hanson R, Hess J, Yu B, Ernst P, van Lohuizen M, Berns A, et al. Mammalian Trithorax and

392 Polycomb-group homologues are antagonistic regulators of homeotic development. Proc

393 National Acad Sci. 1999;96(25):14372–7.

394

395 3. Gould A. Functions of mammalian Polycomb group and trithorax group related genes. Curr

396 Opin Genet Dev. 1997;7(4):488–94.

397

398 4. Muntean AG, Tan J, Sitwala K, Huang Y, Bronstein J, Connelly JA, et al. The PAF Complex

399 Synergizes with MLL Fusion Proteins at HOX Loci to Promote Leukemogenesis. Cancer Cell.

400 2010;17(6):609–21.

401

402 5. Milne TA, Martin M, Brock HW, Slany RK, Hess JL. Leukemogenic MLL Fusion Proteins Bind

403 across a Broad Region of the Hox a9 Locus, Promoting Transcription and Multiple Histone

404 Modifications. Cancer Res. 2005;65(24):11367–74.

405

406 6. Daigle SR, Olhava EJ, Therkelsen CA, Basavapathruni A, Jin L, Boriack-Sjodin AP, et al.

407 Potent inhibition of DOT1L as treatment of MLL-fusion leukemia. Blood. 2013;122(6):1017–25.

408

409 7. Deshpande AJ, Chen L, Fazio M, Sinha AU, Bernt KM, Banka D, et al. Leukemic

410 transformation by the MLL-AF6 fusion oncogene requires the H3K79 methyltransferase Dot1l.

411 Blood. 2013;121(13):2533–41.

412

413 8. Rousseau M, Ferraiuolo MA, Crutchley JL, Wang X, Miura H, Blanchette M, et al. Classifying

414 leukemia types with chromatin conformation data. Genome Biol. 2014;15(4):R60.

415

416 9. Luo H, Wang F, Zha J, Li H, Yan B, Du Q, et al. CTCF boundary remodels chromatin domain

417 and drives aberrant HOX gene transcription in acute myeloid leukemia. Blood.

418 2018;132(8):837–48.

419

420 10. Spencer DH, Russler-Germain DA, Ketkar S, Helton NM, Lamprecht TL, Fulton RS, et al.

421 CpG Island Hypermethylation Mediated by DNMT3A Is a Consequence of AML Progression.

422 Cell. 2017;168(5):801-816.e13.

423

424 11. Cancer Genome Atlas Research Network. Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid

425 Leukemia. New Engl J Medicine. 2013;368(22):2059–74.

426

427 12. Schmidl C, Rendeiro AF, Sheffield NC, Bock C. ChIPmentation: fast, robust, low-input ChIP-

428 seq for histones and transcription factors. Nat Methods. 2015;12(10):963–5.

429

430 13. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.

431 2013;

432

433 14. Ramírez F, Dündar F, Diehl S, Grüning BA, Manke T. deepTools: a flexible platform for

434 exploring deep-sequencing data. Nucleic Acids Res. 2014;42(W1):W187–91.

435

436 15. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based

437 Analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137.

438

439 16. Stovner E, Sætrom P. epic2 efficiently finds diffuse domains in ChIP-seq data.

440 Bioinformatics. 2019;btz232.

441

442 17. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-

443 seq data with DESeq2. Genome Biol. 2014;15(12):550.

444

445 18. Hahne F, Ivanek R. Visualizing Genomic Data Using Gviz and Bioconductor. Methods Mol Biol.

446 2016;1418:335–51.

447

448 19. Haeussler M, Schönig K, Eckert H, Eschstruth A, Mianné J, Renaud J-B, et al. Evaluation of

449 off-target and on-target scoring algorithms and integration into the guide RNA selection tool

450 CRISPOR. Genome Biol. 2016;17(1):148.

451

452 20. Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, Donovan KF, et al. Optimized

453 sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat

454 Biotechnol. 2016;34(2):184–91.

455

456 21. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics.

457 2018;34(18):3094–100.

458

459 22. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast

460 universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.

461

462 23. Pertea M, Pertea GM, Antonescu CM, Chang T-C, Mendell JT, Salzberg SL. StringTie

463 enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol.

464 2015;33(3):290–5.

465

466 24. Rao S, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, Robinson JT, et al. A 3D Map

467 of the Human Genome at Kilobase Resolution Reveals Principles of Chromatin Looping. Cell.

468 2014;159(7):1665–80.

469

470 25. Durand NC, Shamim MS, Machol I, Rao SS, Huntley MH, Lander ES, et al. Juicer Provides

471 a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 2016;3(1):95–

472 8.

473

474 26. Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features.

475 Bioinformatics. 2010;26(6):841–2.

476

477 27. Stansfield JC, Cresswell KG, Vladimirov VI, Dozmorov MG. HiCcompare: an R-package for

478 joint normalization and comparison of Hi-C datasets. BMC Bioinformatics. 2018;19(1):279.

479

480 28. Stansfield JC, Cresswell KG, Dozmorov MG. multiHiCcompare: joint normalization and

481 comparative analysis of complex Hi-C experiments. Bioinform Oxf Engl. 2019;

482

483 29. Servant N, Lajoie BR, Nora EP, Giorgetti L, Chen C-J, Heard E, et al. HiTC: exploration of

484 high-throughput “C” experiments. Bioinform Oxf Engl. 2012;28(21):2843–4.

485

486 30. Brunetti L, Gundry MC, Sorcini D, Guzman AG, Huang Y-H, Ramabadran R, et al. Mutant

487 NPM1 Maintains the Leukemic State through HOX Expression. Cancer Cell. 2018;34(3):499-

488 512.e9.

489

490 31. Jeong M, Huang X, Zhang X, Su J, Shamim M, Bochkov I, et al. A Cell Type-Specific Class

491 of Chromatin Loops Anchored at Large DNA Methylation Nadirs. bioRxiv. 2017;212928.

492

493 32. Su G, Guo D, Chen J, Liu M, Zheng J, Wang W, et al. A distal enhancer maintaining Hoxa1

494 expression orchestrates retinoic acid-induced early ESCs differentiation. Nucleic Acids Res.

495 2019;47(13):6737–52.

496

497 33. Luo H, Zhu G, Xu J, Lai Q, Yan B, Guo Y, et al. HOTTIP lncRNA Promotes Hematopoietic

498 Stem Cell Self-Renewal Leading to AML-like Disease in Mice. Cancer Cell. 2019;

499

500 34. Wang KC, Yang YW, Liu B, Sanyal A, Corces-Zimmerman R, Chen Y, et al. A long

501 noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature.

502 2011;472(7341):120.

503

504 35. Martin DI, Fiering S, Groudine M. Regulation of β-globin gene expression: straightening out

505 the locus. Curr Opin Genet Dev. 1996;6(4):488–95.

506

507 36. Cao K, Collings CK, Marshall SA, Morgan MA, Rendleman EJ, Wang L, et al.

508 SET1A/COMPASS and shadow enhancers in the regulation of homeotic gene expression. Genes

509 Dev. 2017;31(8):787–801.

510

511

512 Figure 1. CTCF is bound to accessible chromatin sites at the HOXA locus in primary AML

513 samples. A. Chromatin accessibility by ATAC (top tracks highlighted in white; ref. 1) and ChIP-

514 seq for CTCF (bottom tracks highlighted in yellow) from primary AML samples with either t(8;21)

515 creating the RUNX1-RUNX1T1 gene fusion (blue), MLL rearrangements (green), or a normal

516 karyotype and NPM1 mutation (purple). CTCF sites CBSA6/7, CBSA7/9, CBSA10, and CBSA13

517 are indicated by the dashed boxes. B-D. Scatter plots comparing the CTCF peak summit counts

518 between each AML type in log2 normalized read counts. Red points indicate ChIP-seq signal for

519 the four CTCF binding sites highlighted in panel A, which are similar across AML samples.

520 Dashed orange lines indicate a two-fold change between the samples.
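
As an illustration of the comparison in panels B-D, the following minimal Python sketch puts summit counts on a log2 scale and tests whether two samples differ by more than two-fold, which on that scale is an absolute difference greater than 1. The counts-per-million scaling, the pseudocount, and the function names are assumptions made for this example; the exact normalization used for the figure is not specified here.

```python
import math


def log2_normalized(count, library_size, scale=1_000_000, pseudocount=1.0):
    """Scale a raw summit count by library size, then take log2.

    The per-million scaling and the pseudocount are illustrative
    assumptions, not the normalization used in the manuscript.
    """
    return math.log2(count / library_size * scale + pseudocount)


def exceeds_two_fold(log2_a, log2_b):
    """Return True when two log2 signals differ by more than two-fold."""
    return abs(log2_a - log2_b) > 1.0


# Hypothetical summit counts for one CTCF peak in two samples
signal_a = log2_normalized(400, 20_000_000)
signal_b = log2_normalized(450, 20_000_000)
print(exceeds_two_fold(signal_a, signal_b))
```

Points that fall between the two dashed lines of a scatter plot correspond to peaks for which `exceeds_two_fold` returns False.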

521

522 Figure 2. CTCF defines chromatin boundaries in AML samples and normal CD34+ cells

523 with HOXA gene expression. A. Top track shows CTCF ChIP-seq from an NPM1-mutant

524 primary AML sample. Tracks highlighted in yellow show ChIP-seq for H3K4me3 from primary

525 samples with high HOXA expression, including AML samples with MLL rearrangements (green)

526 and NPM1 mutations (purple), and primary CD34+ hematopoietic stem/progenitor cells (HSPCs)

527 purified from normal donor bone marrow samples (gray; GSE104579). Tracks highlighted in

528 blue show H3K27me3 ChIP-seq from the same set of samples. B. H3K4me3 and H3K27me3

529 ChIP-seq from primary samples with no HOXA expression, including AML samples with t(8;21)

530 creating the RUNX1-RUNX1T1 fusion, and normal promyelocytes (CD14-, CD15+, CD16 low)

531 and neutrophils (CD14-, CD15+, CD16 high) from healthy donor individuals. Dashed box

532 indicates the region of dynamic chromatin that correlates with HOXA gene cluster expression.

533

534

535 Figure 3. Targeted deletions eliminate CTCF binding in the NPM1-mutant OCI-AML3 cell

536 line. A. RNA-seq expression of HOXA and HOXB genes in OCI-AML3 cells, which display the

537 canonical mutant NPM1-associated HOXA/HOXB expression phenotype. Also shown are the

538 MLL-rearranged MOLM13 cell line that expresses only HOXA genes, and the RUNX1-

539 RUNX1T1-containing Kasumi-1 cell line that has low HOXA and HOXB gene expression. B.

540 ChIP-seq data from OCI-AML3 cells for CTCF (black), H3K4me3 (highlighted in yellow) and

541 H3K27me3 (highlighted in blue), which show conserved CTCF binding sites and distinct regions

542 of active (H3K4me3) and repressed (H3K27me3) chromatin. C-E. Targeted deletions that

543 disrupt CTCF binding in OCI-AML3 cells at sites CBSA6/7 (in C), CBSA7/9 (in D), and CBSA10

544 (in E). Bottom panels show allele pairs from homozygous or compound heterozygous deletion

545 mutants at each site; top panels show CTCF ChIP-seq signal from these mutant cell lines (multi-

546 colored lines) compared to wild type OCI-AML3 cells (in gray). F-H. Normalized CTCF ChIP-seq

547 for all CTCF peaks from deletion mutants (Y axis) vs. wild type OCI-AML3 cells, showing

548 dramatically reduced CTCF ChIP-seq signal in deletion mutants at all three sites, with the

549 exception of clones A101 and A091.31, which only partially eliminate CTCF signal at site

550 CBSA6/7. I. CTCF ChIP-seq tracks from double (in purple) and triple mutants (in blue),

551 generated via sequential targeted deletion experiments. CTCF ChIP-seq from wild type OCI-

552 AML3 cells is shown in black at the top for reference.

553

554

555 Figure 4. CTCF binding is not required to maintain gene expression or chromatin

556 boundaries in the HOXA gene cluster. A. qPCR for HOXA9 in single mutants with

557 heterozygous or homozygous deletions at CBSA6/7, CBSA7/9, or CBSA10. HOXA9 expression

558 from the Kasumi-1 cell line is shown in blue as a control lacking HOXA9 expression. B. qPCR for

559 HOXA9 in double and triple mutants at the CTCF binding sites indicated. Kasumi-1 cells are

560 included in blue as in panel A. C-F. RNA-seq expression of all HOXA genes in single mutants

561 (C-E) and triple mutants (F). Expression level is shown in log2 transcripts per million (TPM). G.

562 ChIP-seq for H3K4me3 and H3K27me3 in deletion mutants lacking CTCF at site CBSA7/9 (in

563 red; N=2). Mean ChIP-seq signal from wild type OCI-AML3 cells (N=2) is shown in gray. H.

564 Mean ChIP-seq for H3K4me3 and H3K27me3 in triple mutants (N=2), with ChIP-seq from wild

565 type cells shown in gray, as in panel G.
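
For readers unfamiliar with the expression unit in panels C-F, the sketch below shows one common way to compute transcripts per million (TPM) from read counts and transcript lengths, followed by a log2 transform. This is a generic illustration with hypothetical inputs; the pseudocount and function names are assumptions, and it is not a description of the quantification software used in this study.

```python
import math


def tpm(counts, lengths_kb):
    """Convert read counts to transcripts per million.

    Counts are first divided by transcript length (reads per kilobase),
    then rescaled so that the values for a sample sum to one million.
    """
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [rate / total * 1_000_000 for rate in rates]


def log2_tpm(values, pseudocount=1.0):
    """Log2-transform TPM values; the pseudocount (an assumption) keeps zeros finite."""
    return [math.log2(value + pseudocount) for value in values]


# Hypothetical counts and transcript lengths (kb) for three genes
values = tpm([10, 20, 30], [1.0, 2.0, 3.0])
print(log2_tpm(values))
```

Because TPM values sum to the same total in every sample, the log2 values can be compared between wild type and mutant lines on a common axis.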

566

567 Figure 5. CTCF-mediated chromatin architecture at the HOXA locus in NPM1-mutant OCI-

568 AML3 cells. A. Contact matrix, contact domains, and chromatin loops for a 2.5 Mbp region of

569 chromosome 7p that contains the HOXA gene cluster from in situ Hi-C using wild type OCI-

570 AML3 cells. Top panel shows the KR-normalized contact matrix at 5kb resolution. Tracks below

571 the matrix show the contact domains and statistically supported chromatin loops identified using

572 established methods (25). Loops shown in black are anchored within the HOXA gene cluster. B.

573 Contacts and HOXA region loops from in situ Hi-C from wild type OCI-AML3 cells and

574 homozygous (biallelic) single mutants at CBSA7/9, double mutants at CBSA6/7 and CBSA7/9,

575 and triple mutants at CBSA6/7, CBSA7/9, and CBSA10. Loops in red indicate statistically

576 significant differences in pairwise comparisons between each mutant and wild type cells with a

577 chromosome-wide FDR<0.05. C. Normalized interaction frequencies for all datasets for 6

578 telomeric loops. Top panel shows normalized interaction frequencies for individual Hi-C libraries

579 from wild type OCI-AML3 cells (N=2), single mutants at CBSA7/9, CBSA6/7-CBSA7/9 and

580 CBSA7/9-CBSA10 double mutants, and two CBSA6/7-CBSA7/9-CBSA10 triple mutant lines

581 (N=5 total mutant lines). Dashed boxes highlight loops that were statistically different in

582 comparisons of all mutant libraries vs. two wild type OCI-AML3 libraries (chromosome-wide

583 FDR<0.5). Mean interaction frequencies for chromatin loops from wild type and mutant lines are

584 shown graphically in the lower panel in solid and dashed lines, respectively. D. Relative read

585 depth (reads per million) across the HOXA locus for reads that interact between HOXA and

586 SKAP2. Depth for wild type OCI-AML3 cells is shown in black and deletion mutants are shown

587 in red. CTCF ChIP-seq signal from wild type OCI-AML3 cells and HOXA genes are shown in the

588 bottom tracks. CTCF site deletions result in new interactions between the posterior HOXA

589 cluster (genes HOXA9-HOXA13) and the SKAP2 gene. E. Graphical representation of HOXA

590 chromatin loops in wild type OCI-AML3 cells (left) and deletion mutants lacking HOXA CTCF

591 binding sites (right).
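
The relative read depth in panel D is expressed in reads per million. A minimal sketch of that scaling is given below, assuming per-bin counts of interacting reads and a library-wide read total; the bin counts, the choice of denominator, and the function name are hypothetical and are not taken from the analysis pipeline.

```python
def reads_per_million(bin_counts, total_reads):
    """Scale per-bin read counts to reads per million of the library total."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return [count * 1_000_000 / total_reads for count in bin_counts]


# Hypothetical counts of HOXA-SKAP2 interacting reads in three bins
depth = reads_per_million([50, 0, 200], 100_000_000)
print(depth)
```

Scaling by library size in this way allows depth profiles from libraries of different sizes, such as wild type and deletion mutant Hi-C libraries, to be overlaid on one axis.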

592

593 Figure 6. Long-distance interactions with posterior HOXA genes correlate with

594 expression and involve loci with enhancer-associated histone acetylation. A. Chromatin

595 interactions at chromosome 7p from the Kasumi-1 AML cell line that contains t(8;21)/RUNX1-

596 RUNX1T1 and has low expression of HOXA genes. The HOXA cluster resides at the boundary

597 of a topologically associating domain, but genes in the posterior HOXA cluster do not participate

598 in chromatin loops in this cell line. B. Focused view of HOXA chromatin loops in the Kasumi-1

599 and OCI-AML3 cell lines with low and high posterior HOXA expression, respectively, which

600 display the differences in chromatin loop structure and loop anchors. C. Relative read depth in

601 reads per million of Hi-C reads between the HOXA locus and telomeric chromatin loops in

602 Kasumi-1 (in blue) and OCI-AML3 cells (in black). Interacting reads in Kasumi-1 cells are

603 localized to the central HOXA cluster, compared to interacting reads in OCI-AML3 cells that

604 map to the central and posterior HOXA locus. D. Enhancer-associated histone H3 lysine 27

605 acetylation (H3K27ac) at HOXA interacting loci from primary AML samples with and without

606 HOXA gene expression. Primary AML samples include patients with NPM1c (purple; N=2

607 distinct patients) and MLL rearrangements (t(9;11) and t(11;19)) (green; N=2 distinct patients)

608 with high HOXA expression, and samples with t(8;21)/RUNX1-RUNX1T1 (N=2 distinct patients).

609 Regions highlighted in the dashed boxes were shown to interact with the HOXA cluster in the

610 OCI-AML3 cell line model (loop track, top), and display H3K27ac signal suggesting they may

611 represent functional genomic elements. E. High-resolution view of two loop anchor regions in

612 SKAP2 intron 1 and downstream of the SNX10 gene, which possess the enhancer-associated

613 H3K27ac mark in HOXA-expressing AML samples but not in samples with no HOXA

614 expression.

615

Figure 1 [image not reproduced]. Panel A: chromatin accessibility and CTCF ChIP-seq tracks across the HOXA cluster (HOXA1 to HOXA13; 27.09 mb to 27.2 mb) for RUNX1-RUNX1T1, MLLX, and NPM1c samples, with sites CBSA6/7, CBSA7/9, CBSA10, and CBSA13 marked. Panels B-D: scatter plots of normalized CTCF peak signal with HOXA CTCF sites highlighted; axis labels are NPM1c vs. MLLX, RUNX1-RUNX1T1 vs. MLLX, and RUNX1-RUNX1T1 vs. NPM1c.

Figure 2 [image not reproduced]. Panel A: CTCF ChIP-seq, H3K4me3 ChIP-seq, and H3K27me3 ChIP-seq tracks for MLLX, NPM1c, and CD34 samples across the HOXA cluster (27.09 mb to 27.2 mb). Panel B: the same track types for RUNX1-RUNX1T1, Pro, and PMN samples.

Figure 3 [image not reproduced]. Panel A: RNA-seq expression (TPM) of MEIS1, HOXA genes, HOXB genes, and HOTTIP in OCI-AML3, MOLM13, and Kasumi-1 cells. Panel B: CTCF, H3K4me3, and H3K27me3 tracks with sites CBSA6/7, CBSA7/9, and CBSA10 marked. Panels C-E: CTCF signal in wild type OCI-AML3 cells and deletion clones at CBSA6/7, CBSA7/9, and CBSA10. Panels F-H: no labels recoverable. Panel I: CTCF ChIP-seq signal for wild type OCI-AML3 cells, ∆CBSA6/7 ∆CBSA7/9 and ∆CBSA7/9 ∆CBSA10 double mutants, and three ∆CBSA6/7 ∆CBSA7/9 ∆CBSA10 triple mutants.

Figure 4 [image not reproduced]. Panel A: HOXA9 RT-qPCR, single site deletions (relative HOXA9 expression). Panel B: HOXA9 RT-qPCR, multiple site deletions (relative HOXA9 expression). Panels C-F: RNA-seq expression (log2 TPM) of HOXA genes in wild type OCI-AML3 cells and deletion mutants, titled CTCF A6/7 mutants, CTCF A7/9 mutants, CTCF A10 mutants, and CTCF A6/7, A7/9 & A10 mutants. Panels G and H: H3K4me3 and H3K27me3 tracks for wild type OCI-AML3 cells vs. ∆CBSA7/9 (N=2) and vs. ∆CBSA6/7, ∆CBSA7/9, ∆CBSA10 (N=2).

Figure 5 [image not reproduced]. Panel A: normalized interactions, contact domains, all loops, and HOXA loops for the region from 26 mb to 28 mb, with gene annotations. Panel B: interaction frequency (I.F.) and loops for WT, ∆CBSA7/9, ∆CBSA6/7 ∆CBSA7/9, and ∆CBSA6/7 ∆CBSA7/9 ∆CBSA10 cells, with loops marked MUT vs. WT P<0.05. Panel C: normalized interaction frequency for WT OCI-AML3 cells and deletion mutants. Panel D: HOXA-SKAP2 interacting reads, shown as relative read depth (rpm) for WT OCI-AML3 cells and deletion mutants (mean), above CTCF ChIP-seq signal. Panel E: schematic of loops between the anterior and posterior HOXA cluster, SNX10, and SKAP2, with and without CTCF.

Figure 6 [image not reproduced]. Panel A: normalized interactions in Kasumi-1 cells for the region from 26 mb to 28 mb. Panel B: HOXA loops in Kasumi-1 and OCI-AML3 cells, with anchors labeled SKAP2 and SNX10, distal enhancer. Panel C: HOXA interacting reads, shown as relative read depth (rpm) for WT OCI-AML3 and Kasumi-1 cells. Panel D: H3K27ac signal at HOXA proximal loops in NPM1c, MLLX, and RUNX1-RUNX1T1 samples. Panel E: H3K27ac signal at HOXA loop anchors near SNX10 and SKAP2 in NPM1c, MLLX, and RUNX1-RUNX1T1 samples.