bioRxiv preprint doi: https://doi.org/10.1101/2021.03.17.435905; this version posted March 18, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

MLL3/MLL4 Histone Methyltransferase Activity Dependent Chromatin Organization at Enhancers during Embryonic Stem Cell Differentiation

Naoki Kubo1, Rong Hu1, Zhen Ye1, and Bing Ren1,2,3*

1Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA

2Center for Epigenomics, Department of Cellular and Molecular Medicine, Moores Cancer Center and Institute of Genome Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA

3Ludwig Institute for Cancer Research, La Jolla, CA, USA

*Correspondence to: [email protected]

SUMMARY

MLL3 (KMT2C) and MLL4 (KMT2D), the major mono-methyltransferases of histone H3 lysine 4 (H3K4), are required for cellular differentiation and embryonic development in mammals. We previously observed that MLL3/4 promote long-range chromatin interactions at enhancers; however, it is still unclear how their catalytic activities contribute to enhancer-dependent gene activation in mammalian cell differentiation. To address this question, we mapped histone modifications, long-range chromatin contacts, and gene expression in MLL3/4 catalytically deficient mouse embryonic stem (ES) cells undergoing differentiation toward neural precursor cells. We showed that MLL3/4 activities are responsible for deposition of H3K4me1 and formation of long-range enhancer-promoter contacts at a majority of putative enhancers gained during cell differentiation, but are dispensable for most candidate enhancers found in undifferentiated ES cells that persist through differentiation. While transcriptional induction at most genes is unaltered in the MLL3/4 catalytically deficient cells, genes making more contacts with MLL3/4-dependent putative enhancers are disproportionately affected. These results support a model in which MLL3/4 contribute to cellular differentiation through histone-methyltransferase-activity dependent induction of enhancer-promoter contacts and transcriptional activation at a subset of lineage-specific genes.

INTRODUCTION

Spatiotemporal gene expression in mammals is governed primarily by transcriptional enhancers, where binding of sequence-specific transcription factors (TFs) drives local chromatin changes by recruiting chromatin remodeling complexes such as SWI/SNF proteins and chromatin modifiers (Clapier and Cairns, 2009; Euskirchen et al., 2012; Heintzman et al., 2009; Long et al., 2016). Some of the most pronounced histone modifications found at enhancers include mono-methylation of histone H3 lysine 4 (H3K4me1) and acetylation of histone H3 lysine 27 (H3K27ac), which have been broadly used to identify and annotate enhancers in the genome (Andersson et al., 2014; Calo and Wysocka, 2013; Consortium, 2012; Creyghton et al., 2010; Rada-Iglesias et al., 2011; Shen et al., 2012; Shlyueva et al., 2014). Histone H3 lysine 4 mono-methylation at enhancers is catalyzed by the histone methyltransferases MLL3 and MLL4 (MLL3/4) (Herz et al., 2012; Hu et al., 2013; Lee et al., 2013; Wang et al., 2016), while H3K27ac is catalyzed by CBP/p300, recruitment of which could be facilitated by MLL3/4 (Jin et al., 2011; Lai et al., 2017). MLL3/4 play crucial roles in mammalian development. Mll4 knockout in mice leads to embryonic lethality (Ashokkumar et al., 2020; Lee et al., 2013), and development of heart, adipose, muscle, and immune cells is severely impeded after Mll3/4 depletion (Ang et al., 2016; Lee et al., 2013; Placek et al., 2017). Furthermore, mutations in MLL3/4 genes are frequently observed in human cancers and developmental disorders (Ng et al., 2010; Parsons et al., 2011; Pasqualucci et al., 2011; Sze and Shilatifard, 2016; Will and Steidl, 2014). However, the role of MLL3/4 catalytic activity and MLL3/4-dependent H3K4me1 at enhancers is still incompletely defined.

Recent studies demonstrated that catalytic inactivation of MLL3/4 causes loss of H3K4me1 at enhancers along with partial reduction of H3K27ac in mouse embryonic stem cells (ESCs), but with surprisingly minor effects on gene expression (Cao et al., 2018; Dorighi et al., 2017). Additionally, catalytic inactivation of Trr, the homolog of MLL3/4, does not impede Drosophila development (Rickels et al., 2017). These observations raise questions about the role of MLL3/4-dependent H3K4me1 in enhancer-dependent gene activation in general. We previously showed that MLL3/4 regulate chromatin organization at enhancers and modulate enhancer-promoter (E-P) contacts at the gene in mouse embryonic stem cells (Yan et al., 2018). However, the scope of MLL3/4-dependent histone methylation and its impact on E-P contacts and transcriptional programs of cellular differentiation require further investigation. To gain a better understanding of MLL3/4’s role at enhancers, it is essential to precisely determine how MLL3/4-dependent H3K4me1 regulates the dynamics of chromatin contacts between enhancers and promoters and expression of target genes (Deng et al., 2012; Gorkin et al., 2014). Here we used mouse ESCs with catalytically deficient MLL3/4 (hereafter referred to as dCD) (Dorighi et al., 2017; Li et al., 2016; Zhang et al., 2015) to delineate the role of MLL3/4 activities in histone H3K4 methylation at enhancers, E-P contacts, and transcriptional induction during ESC differentiation (Figure 1A, Table S1). As a model for cellular differentiation, we focused on retinoic acid (RA)-induced neural differentiation toward neural precursor cells (NPCs) (Methods) (Bain et al., 1995; Strubing et al., 1995).

RESULTS

MLL3/4 Catalytic Activity Dependent and Independent H3K4me1 at Enhancers during Neural Precursor Cell (NPC) Differentiation

We first analyzed how catalytic inactivation of MLL3/4 methyltransferase altered histone modification genome-wide in MLL3/4 dCD cells during ESC differentiation to NPC. Consistent with previous reports (Dorighi et al., 2017; Hu et al., 2013; Lee et al., 2013; Wang et al., 2016), we observed reduction of H3K4me1 at 19,454 and 25,271 distal elements in ESCs and NPCs, respectively (FDR < 0.05, log2 FC > 0.5) (Figures S1A–S1F). H3K27ac signals at the same regions were partially reduced, and the degree of change was positively correlated with the change in H3K4me1 (Figure S1G). Surprisingly, significantly elevated H3K4me1 signals were also observed around a large number of gene promoters (11,001 and 16,175 loci in ESCs and NPCs, respectively), where MLL3/4 occupancy is not detected (Dorighi et al., 2017; Hu et al., 2013; Lee et al., 2013; Wang et al., 2016) (Figures S1B, S1D, and S1F), possibly due to activities of other methyltransferases that bind around promoter regions (Hu et al., 2017; Hu et al., 2013; Hyun et al., 2017). We next focused on the candidate distal enhancers associated with both H3K27ac and H3K4me1 (distance to transcription start site ≥ 10 kb) and measured their dynamic chromatin states by ChIP-seq upon neural precursor differentiation in wild-type (WT) and MLL3/4 dCD cells. In total, 35,744 and 33,777 candidate enhancers were identified in ESCs and NPCs, respectively. During NPC differentiation, 3,373 candidate enhancers gained H3K4me1 signals (FDR < 0.05, log2 FC > 0.5) along with increased chromatin accessibility as profiled previously by ATAC-seq (Duren et al., 2017; Xu et al., 2017). Interestingly, 90% (N = 3,028) of them failed to acquire H3K4me1 in MLL3/4 dCD cells, suggesting that MLL3/4 catalytic activity plays a key role in the deposition of H3K4me1 at these de novo candidate enhancers during ESC differentiation. Similarly, acquisition of H3K27ac at these distal elements in NPCs was also severely impaired in the dCD cells (Figure 1B, Table S2). On the other hand, the H3K4me1 level was unaffected at over 26,000 candidate enhancers in MLL3/4 dCD NPCs (Figure 1C). These MLL3/4-independent candidate enhancers were generally already associated with H3K4me1 in ESCs and persisted during NPC differentiation. Many of them (14,778 loci) were also annotated as poised enhancers in ESCs and gained H3K27ac signals during NPC differentiation, consistent with a recent report (Dorighi et al., 2017). Interestingly, in the MLL3/4-dependent de novo candidate enhancers, motifs of GATA family TFs that are known to be important for ESC differentiation and embryonic development were highly enriched (Figure 1D, Figures S1H and S1I), suggesting a potential role for GATA family TFs in the activation and recruitment of MLL3/4 at these distal enhancers during ESC differentiation (Fujikura et al., 2002; Jozwik et al., 2016; Tremblay et al., 2018; Wamaitha et al., 2015; Yu et al., 2019). These findings suggest that MLL3/4 catalytic activity plays a crucial role in the chromatin state of a subset of candidate enhancers, especially those that gained H3K4me1 during NPC differentiation, but is dispensable for the maintenance of H3K4me1 at most candidate enhancers in NPCs. This result highlights both MLL3/4-dependent and -independent mechanisms responsible for H3K4me1 histone modification at enhancers.

Catalytic Activity of MLL3/4 is Required for Newly Formed E-P Contacts upon NPC Differentiation

We next investigated the changes of chromatin contacts between candidate enhancers and promoters upon loss of MLL3/4 catalytic activity. We performed PLAC-seq (Fang et al., 2016; Mumbach et al., 2016) using antibodies against the promoter histone mark H3K4me3 to map chromatin contacts anchored at active or poised promoters.
Previous studies showed that H3K4me3 is not affected by loss of MLL3/4 (Dorighi et al., 2017; Lee et al., 2013). We obtained between 280 and 430 million paired-end reads for each replicate (Table S1). To determine the differential chromatin contacts between compared samples, gene promoters with significantly altered levels of H3K4me3 ChIP-seq signals (DESeq2 (Love et al., 2014), p value < 0.01) were filtered out to remove the antibody bias. In the differential analysis between WT and MLL3/4 dCD cells in ESCs and NPCs, we analyzed about 12,000 gene promoters with similar levels of H3K4me3 ChIP-seq signals using a negative binomial model for each distance-stratified 10-kb interval (Kubo et al., 2021; Li et al., 2014; Su et al., 2019) (Figure S2, Methods). This chromatin contact analysis showed that the decreases of E-P contacts upon loss of MLL3/4 catalytic activity were predominantly observed in the NPCs (2,252 reduced contacts, FDR < 0.05), while far fewer changes were observed in the ESCs (43 reduced contacts, FDR < 0.05) (Figure 2A, Table S3). Meanwhile, the comparison between ESCs and NPCs revealed that the majority of E-P contacts induced during NPC differentiation in WT cells (2,545 induced contacts, FDR < 0.05) failed to form in MLL3/4 dCD cells (529 induced contacts, FDR < 0.05) (Figures S3A–S3C). We also identified significant chromatin loops formed between enhancers and promoters using the MAPS algorithm (Juric et al., 2019), and 882 significant E-P contacts (FDR < 0.01) could be detected at the 3,028 MLL3/4-dependent de novo enhancers. As expected, they generally displayed increased E-P contacts upon NPC differentiation in WT cells, but such induced E-P contacts were largely diminished in MLL3/4 dCD cells (69 vs. 17 significantly induced contacts, FDR < 0.05), suggesting that the MLL3/4-dependent H3K4me1 is required for the formation of E-P contacts at these putative enhancers during NPC differentiation (Figure 2B). On the other hand, the induction of E-P contacts at the 345 MLL3/4-independent de novo enhancers upon NPC differentiation was less affected by the loss of MLL3/4 catalytic activity than at the 3,028 MLL3/4-dependent enhancers (Figures 2C, 2D, and S3D). These results support a general role of MLL3/4-dependent H3K4me1 at enhancers in the establishment of chromatin contacts, as we previously reported (Yan et al., 2018). More importantly, we defined the set of candidate enhancers that are dependent on MLL3/4 catalytic activity for H3K4me1 deposition and formation of long-range E-P contacts during cell differentiation.

Meanwhile, we also observed loss of large numbers of promoter-promoter (P-P) contacts in MLL3/4 dCD cells (1,233 contacts in NPCs, FDR < 0.05) (Figure 2A). Interestingly, the changes in P-P contacts upon loss of MLL3/4 catalytic activity did not show any correlation with the changes of H3K4me1 signals, unlike the changes of E-P contacts (Figure S3E).
Instead, the changes of P-P contacts were correlated with the changes of E-P contacts that shared their anchor sites at promoters, suggesting that these P-P contacts might be a consequence of the changes of their nearby E-P contacts (Figures S3F and S3G). Taken together, loss of MLL3/4 catalytic activity caused the failure of H3K4me1 acquisition, especially at de novo distal enhancers in NPCs, resulting in global disruption of newly formed E-P contacts during neural differentiation.

Loss of MLL3/4 Catalytic Activity Delays NPC Differentiation and Impairs Activation of a Small Number of Genes

We next investigated the impact of the loss of MLL3/4 catalytic activity on gene activation during neural differentiation. The MLL3/4 dCD cells exhibited a delay in formation of neuronal axons and remained in ESC-like round-shaped colonies after 5 days of neural induction (Figure 3A). Consistent with this observation, the overall gene expression profiles at each time point also suggested a delay of transcriptional transition in MLL3/4 dCD cells (Figure 3B). While the up-regulation of most NPC marker genes such as Pax6, Sox3, and Map2, and the down-regulation of pluripotency markers such as Pou5f1 and Sox2, were not interrupted in MLL3/4 dCD cells, induction of other NPC marker genes such as Tuj1, NeuN, and Olig2 was significantly delayed (Figure S4A). Indeed, of the 1,303 genes activated during the cell differentiation (FDR < 0.05, FC > 2, RPKM in NPCs > 1), 31.6% or 411 genes (FC > 1.5, FDR < 0.05) were not fully induced in MLL3/4 dCD cells (Figures 3C, S4B–S4D, Table S4). Genes related to organism development and cell differentiation (e.g. Sox11, Hoxb9, Lhx1) were highly enriched among these genes, suggesting that the loss of MLL3/4 catalytic activity impairs NPC differentiation (Figure 3D).

We focused on the induced genes that also displayed chromatin interactions between their promoters and the 3,028 MLL3/4-dependent de novo enhancers. In general, genes interacting with these de novo candidate enhancers tended to be up-regulated upon cell differentiation in WT cells (Figure 3E). However, over 60% of them continued to be induced in MLL3/4 dCD cells during NPC differentiation despite the reduction of H3K4me1 at distal elements (Figures 3F and S4E). Why do some genes depend on MLL3/4 catalytic activity while others do not? We hypothesized that genes could be activated by both MLL3/4-dependent and -independent enhancers during cell differentiation (Kubo et al., 2021; Lagha et al., 2012), and the relative fraction of MLL3/4-dependent and -independent enhancers that contact promoters may determine their dependence on the MLL3/4 catalytic activity.
To test this hypothesis, we analyzed the chromatin contact counts on two groups of candidate enhancers classified based on the dependence of H3K4me1 on MLL3/4 catalytic activity as defined above (Figures 1B, 1C, and 4A). For each gene, we calculated the ratio of total contact counts on the MLL3/4-independent enhancers to that on all candidate enhancers. Supporting our hypothesis, genes with relatively higher input from MLL3/4-independent enhancers were more likely to be unaffected by the loss of MLL3/4 catalytic activity than genes making more contacts with MLL3/4-dependent candidate enhancers (Figure 4B). The MLL3/4-independent genes (692 genes, defined in Figure 3C) were more likely to interact with MLL3/4-independent candidate enhancers than the MLL3/4-dependent genes (228 down-regulated and 98 up-regulated genes, defined in Figure 3C) (Figures 4C, S5). These findings suggest that chromatin contacts with multiple MLL3/4-independent enhancers could sustain gene activation at the majority of genes during neural differentiation in the absence of MLL3/4 catalytic activity.
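The per-gene ratio described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the input layout (one record per promoter-enhancer contact, labeled with the enhancer class), and the toy counts are all hypothetical, chosen only to show the arithmetic of dividing contact counts on MLL3/4-independent enhancers by contact counts on all candidate enhancers.

```python
from collections import defaultdict

ENHANCER_CLASSES = ("dependent", "independent")


def independent_contact_fraction(contacts):
    """Return, per gene, the fraction of contact counts on MLL3/4-independent enhancers.

    `contacts` is an iterable of (gene, enhancer_class, count) records, where
    enhancer_class is "dependent" or "independent" (hypothetical labels).
    """
    totals = defaultdict(lambda: {"dependent": 0, "independent": 0})
    for gene, enhancer_class, count in contacts:
        if enhancer_class not in ENHANCER_CLASSES:
            raise ValueError(f"unknown enhancer class: {enhancer_class!r}")
        totals[gene][enhancer_class] += count

    fractions = {}
    for gene, per_class in totals.items():
        all_counts = per_class["dependent"] + per_class["independent"]
        if all_counts > 0:  # genes without any enhancer contacts are skipped
            fractions[gene] = per_class["independent"] / all_counts
    return fractions


# Toy contact counts (made up for illustration, not data from the study).
toy_contacts = [
    ("geneA", "independent", 30),
    ("geneA", "dependent", 10),
    ("geneB", "dependent", 45),
    ("geneB", "independent", 5),
]
fractions = independent_contact_fraction(toy_contacts)
for gene in sorted(fractions):
    print(gene, round(fractions[gene], 2))
```

In this toy input, geneA draws most of its contact counts from MLL3/4-independent enhancers and geneB from MLL3/4-dependent ones, so under the hypothesis above geneB would be the more likely of the two to lose induction in dCD cells.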

DISCUSSION

In summary, our study found that accumulation of H3K4me1 and H3K27ac at new candidate enhancers established during NPC differentiation was generally dependent on the catalytic activities of MLL3 and MLL4. By contrast, H3K4me1 at a much larger number of candidate enhancers is independent of MLL3/4 catalytic activity and could be catalyzed by other histone methyltransferases. Furthermore, consistent with our previous reports (Dixon et al., 2015; Yan et al., 2018), we observed a severe disruption of newly formed E-P contacts at these new candidate enhancers in MLL3/4 dCD NPCs. Lastly, our study demonstrated that loss of MLL3/4 catalytic activity delays NPC differentiation and activation of nearly 1/3 of lineage-specific genes, and our results support a model in which MLL3/4-dependent H3K4me1 plays a significant role in E-P contacts at candidate enhancers, which in turn contributes to gene activation at a subset of genes that make more contacts with them. Notably, although MLL3/4 are believed to be the major regulators of H3K4me1 in mammalian cells, H3K4me1 at over 75% of candidate distal enhancers in NPCs was independent of MLL3/4 catalytic activity. Future studies are required to assess whether this finding can be generalized to other cell lineages, and to determine the histone methyltransferases that mediate the H3K4me1 at these elements (Crump and Milne, 2019; Hyun et al., 2017). Additionally, further exploration of TFs such as the GATA family members shown in this study would help to unveil the cell type-specific role of MLL3/4 (Jozwik et al., 2016). A recent preprint also focused on the role of MLL3/4 catalytic activity in mouse embryonic development and ESC differentiation and reported altered transcription in subsets of genes (Xie et al., 2020).
Although the impact of MLL3/4 catalytic activity loss on those gene regulation programs apparently differs depending on tissue types and differentiation conditions, our study provides an explanation of the observed MLL3/4-dependent and -independent gene activation by focusing on the E-P contact dynamics. It should be noted that our analysis of E-P contacts does not include chromatin contacts within 10-kb genomic distance due to the limited resolution of the current approach. Our analysis is also not based on acute depletion of MLL3/4 catalytic activity, and some genes might be affected by secondary effects of long-term culturing. Nevertheless, our findings clarify the functional role of MLL3/4 catalytic activity in gene regulation programs and provide new insight into the role of MLL3/4-mediated H3K4me1 in cell differentiation.

METHODS

Contact for Reagent and Resource Sharing

Further information and requests for reagents may be directed to and will be fulfilled by the corresponding author Bing Ren ([email protected]) and the first author Naoki Kubo ([email protected]).

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Cell lines

The mouse R1 ES cell line was used as the parental line for the MLL3/4 catalytically deficient ES cell line, which was reported in a previous study (Dorighi et al., 2017). ESCs were cultured in mouse ES cell medium containing KnockOut Serum Replacement: 85% DMEM, 15% KnockOut Serum Replacement (Gibco), penicillin/streptomycin (Gibco), 1× non-essential amino acids (Gibco), 1× GlutaMax (Gibco), 1000 U/ml LIF (Millipore), 0.4 mM β-mercaptoethanol. The cells were grown on 0.2% gelatin-coated plates with irradiated mouse embryonic fibroblasts (MEFs) (GlobalStem). Cells were maintained by passaging using Accutase (Innovative Cell Technologies) at 37°C and 5% CO2. Medium was changed daily when cells were not passaged. Cells were checked for mycoplasma infection and tested negative.

METHOD DETAILS

Neural progenitor cell differentiation

NPC differentiation was conducted using retinoic acid (RA) induction (Bain et al., 1995; Strubing et al., 1995). ESCs were grown on MEFs and passaged onto 0.2% gelatin-coated plates without MEFs one day before starting differentiation treatment. On day 0, LIF was withdrawn from the culture medium. From day 1, 5 μM RA (Sigma, R2625) was added to the LIF-free medium. Cells were harvested on day 2.5 and day 5. Alkaline phosphatase staining was performed at each time point using the AP Staining kit II (Stemgent, 00-0055).

ChIP-seq library preparation

ChIP-seq experiments for each histone mark were performed as described in ENCODE experiment protocols (“Ren Lab ENCODE Chromatin Immunoprecipitation Protocol” in https://www.encodeproject.org/documents/) with minor modifications. Cells were crosslinked with 1% formaldehyde for 10 minutes. We used 1.0 million cells for each ChIP sample. Shearing of chromatin was performed using the truChIP Chromatin Shearing Reagent Kit (Covaris) according to the manufacturer’s instructions. A Covaris M220 was used for sonication with the following parameters: 10 minutes duration at 10.0% duty factor, 75.0 peak power, 200 cycles per burst at 5-9°C temperature range. For immunoprecipitation, we used 11 μL anti-rabbit or anti-mouse IgG Dynabeads (Life Technologies) and washed them 3 times with cold BSA/PBS (0.5 mg/mL bovine serum albumin in 1x phosphate buffered saline). After washing, 3 μg antibody with 147 μL cold BSA/PBS was added to the beads and incubated for over 2 hours at 4°C. After incubation, beads were washed 3 times with 150 μL cold BSA/PBS and mixed with 100 μL Binding Buffer (1% Triton X-100, 0.1% Sodium Deoxycholate, 1x complete protease inhibitor (Roche)) plus 100 μL 0.2 μg/μL chromatin, followed by overnight incubation on a rotating platform at 4°C. Beads were washed 5 times with 50 mM Hepes pH 8.0, 1% NP-40, 1 mM EDTA, 0.70% Sodium Deoxycholate, 0.5 M LiCl, 1x complete protease inhibitor (Roche) and washed once with 150 μL cold 1x TE, followed by incubation at 65°C for 20 minutes in 150 μL ChIP elution buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 1% SDS). The beads were removed and the samples were incubated at 65°C overnight to reverse crosslinks. The input samples were also processed in parallel with the ChIP samples. Samples were incubated with RNase A (final conc. = 0.2 mg/mL) at 37°C for 1 hour, and Proteinase K (final conc. = 0.4 mg/mL) at 55°C for 1 hour.
The samples were extracted with phenol:chloroform:isoamyl alcohol (25:24:1) and precipitated with ethanol. We used 3-5 ng of starting IP material for preparing Illumina sequencing libraries. The End-It DNA End-Repair Kit (Epicentre) was used to repair DNA fragments to blunt ends. A-tailing of 3’ ends was performed using Klenow Fragment (3'→5' exo-) (New England Biolabs), and then TruSeq Adapters were ligated by Quick T4 DNA Ligase (New England Biolabs). Size selection using AMPure Beads (Beckman Coulter) was performed to obtain 300-500 bp DNA, and PCR amplification (8-10 cycles) was performed. Libraries were sequenced on a HiSeq4000 in 50-bp single-end mode. Two biological replicates were prepared for each sample.

RNA-seq library preparation

Total RNA was extracted using the AllPrep Mini kit (QIAGEN) according to the manufacturer’s instructions, and 1 μg of total RNA was used to prepare each RNA-seq library. The libraries were prepared using the TruSeq Stranded mRNA Library Prep Kit (Illumina). Libraries were sequenced on a HiSeq4000 using 50 bp paired-end reads. Two biological replicates were prepared for each sample.

PLAC-seq library preparation

PLAC-seq experiments were performed as previously described (Fang et al., 2016). Cells were crosslinked with 1% formaldehyde (w/v, methanol-free, ThermoFisher) for 15 minutes. The crosslinked pellets (2.5–3 million cells per sample) were incubated with 300 μL of lysis buffer (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 0.2% Igepal CA630, 33 μL, 1x complete protease inhibitor (Roche)) on ice for 15 minutes, washed with 500 μL cold lysis buffer, and then incubated in 50 μL of 0.5% SDS for 10 minutes at 62°C. After heating, 160 μL of 1.56% Triton X-100 was added and incubated for 15 minutes at 37°C. To digest chromatin, 100 U MboI and 25 μL of 10X NEBuffer2 were added, followed by 2 hours of incubation at 37°C with agitation at 900 rpm. MboI was inactivated by heating at 62°C. Digestion efficiency was confirmed by performing agarose gel electrophoresis of the samples. The digested ends were labeled with biotin by adding 37.5 μL of 0.4 mM biotin-14-dATP (Life Tech), 1.5 μL of 10 mM dCTP, 10 mM dTTP, 10 mM dGTP, and 8 μL of 5 U/μL Klenow (New England Biolabs) and incubating at 37°C for 1 hour with shaking at 900 rpm. Then the samples were mixed with 1x T4 DNA ligase buffer (New England Biolabs), 0.83% Triton X-100, 0.1 mg/mL BSA, 2000 U T4 DNA Ligase (New England Biolabs, M0202), and incubated at room temperature for 2 hours with slow rotation. The ligated cell pellets were resuspended in 125 μL of RIPA buffer with protease inhibitor and incubated on ice for 10 minutes.
The cell lysates were sonicated using a Covaris M220. After spinning, we saved 20 μL of supernatant as input; 100 μL of antibody-coupled beads were added to the remaining supernatant, which was then rotated in a cold room for at least 12 hours. For immunoprecipitation, 300 μL of M280 sheep anti-rabbit IgG beads (ThermoFisher) was washed 4 times with cold BSA/PBS (0.5 mg/mL bovine serum albumin in 1x phosphate buffered saline). After washing, 30 μg anti-H3K4me3 (Millipore, 04-745) with 1 mL cold BSA/PBS was added to the beads and incubated on a rotating platform at 4°C for over 3 hours. After incubation, beads were washed with cold BSA/PBS and resuspended in 600 μL RIPA buffer. The beads were washed with RIPA buffer (3 times), RIPA buffer + 0.16 M NaCl (2 times), LiCl buffer (1 time), and TE buffer (2 times) at 4°C for 3 minutes at 1000 rpm. For reverse crosslinking, 163 μL extraction buffer (135 μL 1x TE, 15 μL 10% SDS, 12 μL 5 M NaCl, 1 μL RNase A (10 mg/mL)) was added and incubated at 37°C for 1 hour at 1000 rpm, and 20 μg of proteinase K was added and incubated at 65°C for 2 hours at 1000 rpm. After reverse crosslinking, DNA was purified using Zymo DNA Clean & Concentrator and eluted with 50 μL of 10 mM Tris (pH 8.0). For biotin enrichment, 25 μL of T1 Streptavidin Beads (Invitrogen) per sample were washed with 400 μL Tween wash buffer (5 mM Tris-HCl pH 8.0, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween-20), and resuspended in 50 μL of 2x Binding buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA, 2 M NaCl). The purified 50 μL DNA sample was added to the 50 μL resuspended beads and incubated at room temperature for 15 minutes with rotation. The beads were washed twice with 500 μL of Tween wash buffer and once with 100 μL Low EDTA TE (supplied with the Swift Biosciences kit). Then beads were resuspended in 40 μL Low EDTA TE. Next, we used the Swift Biosciences kit (Cat. No. 21024) for library construction with a modified protocol. The Repair I Reaction Mix was added to the 40 μL of sample beads and incubated at 37°C for 10 minutes at 800 rpm. The beads were washed twice with 500 μL Tween wash buffer and once with 100 μL Low EDTA TE. The Repair II Reaction Mix was added to the beads, followed by incubation at 20°C for 20 minutes at 800 rpm. The beads were washed in the same way with 500 μL Tween wash buffer and 100 μL Low EDTA TE. Then, 25 μL of the Ligation I Reaction Mix and Reagent Y2 was added to the beads, followed by incubation at 25°C for 15 minutes at 800 rpm. The beads were washed with Tween wash buffer and Low EDTA TE. Then 50 μL of the Ligation II Reaction Mix was added to the beads, followed by incubation at 40°C for 10 minutes at 800 rpm. The beads were washed and resuspended in 21 μL 10 mM Tris-HCl (pH 8.0). The amplification and purification were performed according to the Swift library kit protocols.
Libraries were sequenced on Illumina HiSeq 4000. Two biological replicates were prepared for each sample.

QUANTIFICATION AND STATISTICAL ANALYSIS

ChIP-seq data analysis

Each fastq file was mapped to the mouse genome (mm10) using BWA aln (Li and Durbin, 2009) with the “-q 5 -l 32 -k 2” options. PCR duplicates were removed using Picard MarkDuplicates (https://github.com/broadinstitute/picard), and bigWig files were created using deepTools (Ramirez et al., 2016) with the following parameters: bamCompare --binSize 10 --normalizeUsing RPKM --operation subtract (or log2). deepTools was also used to generate heatmaps.
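As an illustration of the normalization named above, the following Python sketch computes per-bin RPKM values and combines a ChIP track with its input track by subtraction or by log2 ratio. It is not the deepTools implementation; the function names, the toy counts, and the pseudocount of 1 used in the log2 ratio are assumptions made for this example.

```python
import math


def rpkm(bin_counts, bin_size_bp, total_mapped_reads):
    """Reads per kilobase of bin per million mapped reads, one value per bin."""
    # RPKM = count / ((bin_size / 1e3) * (total_reads / 1e6)) = count * 1e9 / (bin_size * total_reads)
    return [c * 1e9 / (bin_size_bp * total_mapped_reads) for c in bin_counts]


def compare_tracks(chip, control, operation="subtract", pseudocount=1.0):
    """Combine two normalized tracks bin by bin.

    operation="subtract" returns chip - control;
    operation="log2" returns log2((chip + pseudocount) / (control + pseudocount)).
    The pseudocount is an assumption of this sketch; it avoids division by zero.
    """
    if len(chip) != len(control):
        raise ValueError("tracks must cover the same bins")
    if operation == "subtract":
        return [a - b for a, b in zip(chip, control)]
    if operation == "log2":
        return [math.log2((a + pseudocount) / (b + pseudocount))
                for a, b in zip(chip, control)]
    raise ValueError("operation must be 'subtract' or 'log2'")


# Toy example: three 10-bp bins from a ChIP sample and its input control.
chip_rpkm = rpkm([10, 0, 4], bin_size_bp=10, total_mapped_reads=1_000_000)
input_rpkm = rpkm([2, 0, 4], bin_size_bp=10, total_mapped_reads=2_000_000)
difference = compare_tracks(chip_rpkm, input_rpkm, operation="subtract")
```

Normalizing each sample to its own sequencing depth before the comparison keeps a deeper input library from dominating the subtracted signal.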

Peaks were called with input control using MACS2 (Zhang et al., 2008) with broad peak calling. The candidate active enhancer regions were defined by the presence of both H3K4me1 and H3K27ac peaks, but not H3K4me3 peaks. DESeq2 (Love et al., 2014) was used for differential peak analysis. We defined “de novo enhancers in NPC” as the enhancer regions whose H3K4me1 signal was significantly increased from ESCs to NPCs (FDR < 0.05, log2 FC > 0.5). The same differential peak analysis was performed between WT and MLL3/4 dCD cells to determine the MLL3/4-dependent and -independent enhancers.

Motif analysis

Enrichment analysis of known DNA binding motifs was performed using HOMER (Heinz et al., 2010). Default parameters with a fragment size of 1000 bp and the “-mask” parameter were used. In the differential motif analysis, regions from the compared MLL3/4-dependent or -independent enhancers were used as background by adding the “-bg” parameter.

RNA-seq data analysis

RNA-seq reads (paired-end) were mapped to the mm10 genome using STAR (Dobin et al., 2013). The mapped reads were counted using HTSeq (Anders et al., 2015), and the output files from two replicates were subsequently analyzed with edgeR (Robinson et al., 2010) to detect differentially expressed genes (FDR < 0.05, FC > 2 or FC > 1.5). RPKM was calculated using an in-house pipeline.

ATAC-seq data analysis

ATAC-seq reads (paired-end) were mapped to the mm10 genome and processed using the ENCODE ATAC-seq pipeline (https://github.com/ENCODE-DCC/atac-seq-pipeline). deepTools (Ramirez et al., 2016) was used to generate bigWig files and heatmaps as described above.

PLAC-seq data analysis

PLAC-seq reads (paired-end) were aligned against the mm10 genome using BWA mem (Li and Durbin, 2009). PCR duplicate reads were removed using Picard MarkDuplicates. Filtered reads were binned at 10 kb size to generate the contact matrix. Individual bins that overlapped H3K4me3 peaks at transcription start sites (TSSs) were used for downstream differential contact analysis. MAPS (Juric et al., 2019) was used for peak calling with default settings at 10 kb resolution. For the differential contact analysis (Kubo et al., 2021), the raw contact counts in 10 kb resolution bins that have the same genomic distance were used as inputs. To minimize the bias from genomic distance, we stratified the inputs into every 10-kb genomic distance from 10 kb to 150 kb, and the rest of the input bins with longer distances were stratified into groups of uniform size, equal to the number of bins at the 140–150 kb distance. Since these inputs followed a negative binomial distribution, edgeR (Robinson et al., 2010) was used to obtain the initial set of differential interactions. Only bins that had more than 20 contact counts in each sample of two replicates were used for the downstream analysis. The significance of these differential interactions can result from differences either in H3K4me3 ChIP coverage or in 3D contact coverage. Therefore, the chromatin contacts from promoter regions with differential ChIP-seq peaks between the samples (p value < 0.01) were removed, and only the chromatin contacts with the same level of H3K4me3 ChIP-seq peaks were processed. We used all bins as inputs, including non-significant interactions that were not identified by the MAPS peak caller, because the majority of short-range interactions were not identified as significant peaks due to their high background, and the changes in these short-range interactions might also be important for gene regulation.
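The distance stratification and count filter described above can be sketched in Python as follows. This is one reading of the procedure, not the authors' pipeline: the function names, the half-open distance intervals, and the stratum labels are assumptions of this example, and the edgeR test applied to each stratum is not shown.

```python
from collections import defaultdict

BIN_BP = 10_000          # 10 kb resolution, as stated in the text
FIXED_MAX_BP = 150_000   # below this distance, one stratum per 10 kb


def passes_count_filter(counts_per_sample, min_count=20):
    """Keep a bin pair only if every sample and replicate has more than min_count contacts."""
    return all(c > min_count for c in counts_per_sample)


def stratify_by_distance(pairs):
    """Group bin pairs into distance strata.

    pairs: iterable of (pair_id, distance_bp).
    Pairs from 10 kb up to (but excluding) 150 kb get one stratum per 10 kb
    (assumed half-open intervals). Longer-range pairs are sorted by distance
    and cut into groups as large as the 140-150 kb stratum.
    """
    fixed = defaultdict(list)
    long_range = []
    for pair_id, dist_bp in pairs:
        if dist_bp < BIN_BP:
            continue  # contacts within a single bin are not used
        if dist_bp < FIXED_MAX_BP:
            fixed[dist_bp // BIN_BP].append(pair_id)
        else:
            long_range.append((dist_bp, pair_id))

    strata = {}
    for k in sorted(fixed):
        lo_kb = k * BIN_BP // 1000
        strata[f"{lo_kb}-{lo_kb + BIN_BP // 1000}kb"] = fixed[k]

    # Size of the 140-150 kb stratum; fall back to one group if it is empty.
    ref_size = len(fixed.get(FIXED_MAX_BP // BIN_BP - 1, [])) or max(len(long_range), 1)
    long_range.sort()
    for start in range(0, len(long_range), ref_size):
        chunk = long_range[start:start + ref_size]
        strata[f"long_{start // ref_size}"] = [pair_id for _, pair_id in chunk]
    return strata
```

Testing each stratum separately compares contacts only with others at a similar genomic distance, which is the stated purpose of the stratification.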
We identified a large number of differentially changed short-range interactions even though many of them were not identified by the peak caller, and we observed that such differentially changed interactions were positively correlated with the changes of H3K4me1/H3K27ac levels at their anchor sites during neural differentiation (Figure S3C), suggesting that these interaction changes might reflect biological changes. We used the significance level with the direction of change (-/+ log10(p-value)) instead of fold change to show the changes of chromatin contacts because fold changes of short-range interactions tend to be marginal even when the changes are statistically significant and biologically meaningful. To visualize the chromatin contacts, we used the WashU Epigenome Browser (Zhou et al., 2013).

In Figure 4, the ratio of chromatin contacts on MLL3/4-independent enhancers was calculated by summing raw contact counts between promoters and all MLL3/4-independent enhancers and dividing by the total contact counts between promoters and all candidate enhancers for each gene in WT NPCs. Contact counts within 10-kb distance (1 bin) were not included.
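A minimal Python sketch of this per-gene ratio is given below. The input layout, the function name, and the class labels are hypothetical; the 10-kb exclusion and the minimum of 50 total contact counts follow the values stated in this section.

```python
def independent_contact_ratio(contacts, min_total=50, bin_size_bp=10_000):
    """Fraction of a promoter's enhancer contacts that go to MLL3/4-independent enhancers.

    contacts: iterable of (distance_bp, raw_count, enhancer_class) for one gene
    promoter in WT NPCs, where enhancer_class is "independent" or "dependent"
    (hypothetical labels for this sketch).
    Returns None when the gene has fewer than min_total contact counts in total.
    """
    total = 0
    independent = 0
    for distance_bp, raw_count, enhancer_class in contacts:
        if distance_bp < bin_size_bp:
            continue  # contacts within one 10-kb bin are not counted
        total += raw_count
        if enhancer_class == "independent":
            independent += raw_count
    if total < min_total:
        return None  # gene excluded from the downstream analysis
    return independent / total


# Toy example: one promoter with three enhancer contacts.
example = [
    (20_000, 30, "independent"),
    (50_000, 70, "dependent"),
    (5_000, 1000, "independent"),  # ignored: closer than one bin
]
ratio = independent_contact_ratio(example)
```

A ratio near 1 means that the promoter contacts mostly MLL3/4-independent enhancers, and a ratio near 0 means that it contacts mostly MLL3/4-dependent ones.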

Genes that had fewer than 50 total contact counts were removed from the downstream analysis.

Odds ratio calculation for CTCF-dependent E-P contacts enrichment

For Figure S5B, all genes were classified based on the distance to the nearest interacting enhancer and the number of enhancers around the TSS (< 200 kb) (categorized into 3x3 bins). The distance to the nearest interacting enhancer is represented by the shortest genomic distance of significant PLAC-seq peaks on enhancers and promoters (p-value < 0.01). Then, we generated 2x2 tables based on whether they are stably-regulated genes or not (FDR < 0.05) and whether they were categorized into the bin or not. Odds ratios and p-values for each 2x2 table were calculated. In Figure S5C and S5D, the same analysis as in panel (B) was performed for differentially down-regulated genes and differentially up-regulated genes.

DATA AND SOFTWARE AVAILABILITY

All datasets generated in this study have been deposited in the Gene Expression Omnibus (GEO) under accession number GSE160892. ATAC-seq datasets were downloaded from GSE84646 and GSE98479 (Duren et al., 2017; Xu et al., 2017).

KEY RESOURCES TABLE

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
| --- | --- | --- |
| Antibodies | | |
| Rabbit polyclonal anti-H3K4me1 | Abcam | ab8895 |
| Rabbit monoclonal anti-H3K4me3 | EMD Millipore | 04-745 |
| Rabbit polyclonal anti-H3K27ac | Active Motif | 39133 |
| Mouse monoclonal anti-H3K27me3 | Active Motif | 61017 |
| Critical Commercial Assays | | |
| TruSeq Stranded mRNA Library Prep Kit | Illumina | RS-122-2101 |
| ThruPLEX DNA-seq 12s kit | Rubicon Genomics | R400428 |
| Accel-NGS 2S Plus DNA Library Kit | Swift Biosciences | 21024 |
| Deposited Data | | |
| Raw and processed sequencing data | This paper | GSE160892 |
| Experimental Models: Cell Lines | | |
| WT R1 mESC | ATCC | Cat#SCRC-1036 |
| MLL3/4 dCD mESC | Dorighi et al., 2017 | N/A |
| Software and Algorithms | | |
| R | R Core Team, 2020 | http://www.R-project.org/ |
| Perl | Perl.org | https://www.perl.org |
| edgeR | Robinson et al., 2012 | https://bioconductor.org/packages/release/bioc/html/edgeR.html |
| MACS | Zhang et al., 2008 | http://liulab.dfci.harvard.edu/MACS/00README.html |
| deepTools | Ramírez et al., 2016 | https://github.com/fidelram/deepTools |
| DESeq2 | Love et al., 2014 | https://bioconductor.org/packages/release/bioc/html/DESeq2.html |
| MAPS | Juric et al., 2019 | https://github.com/ijuric/MAPS |
| Homer | Heinz et al., 2010 | http://homer.ucsd.edu/homer/motif/ |
| ENCODE ATAC-seq pipeline | ENCODE DCC 2017 | https://github.com/ENCODE-DCC/atac-seq-pipeline |
| STAR | Dobin et al., 2013 | https://github.com/alexdobin/STAR |
| bwa | Li et al., 2013 | https://github.com/lh3/bwa |
| WashU Epigenome Browser | Zhou et al., 2013 | http://epigenomegateway.wustl.edu/ |
| igv | Robinson et al., 2011 | http://software.broadinstitute.org/software/igv/ |
| bedtools | Quinlan, 2014 | http://bedtools.readthedocs.io/en/latest/ |
| pgltools | Greenwald et al. 2017 | https://github.com/billgreenwald/pgltools |
| Datasets | | |
| Wild-type mouse ESCs ATAC-seq | Xu et al., 2017 | GSE84646 |
| Wild-type mouse NPCs ATAC-seq | Duren et al., 2017 | GSE98479 |

ACKNOWLEDGMENTS
We would like to thank Drs. Kristel M Dorighi and Joanna Wysocka (Stanford School of Medicine) for sharing the MLL3/4 dCD mouse ES cell line. We would like to give special thanks to Samantha Kuan for operating the sequencing instruments and Robert Morey for helping with experiments. We would like to acknowledge the help of Drs. Ivan Juric, Armen Abnousi, and Ming Hu (Lerner Research Institute, Cleveland Clinic Foundation) for sharing computational pipelines. We would also like to give special thanks to Drs. Bin Li, Miao Yu, Ramya Raviram, Yanxiao Zhang, and Yang Li for sharing helpful computational pipelines and protocols, as well as all the other members of the Ren laboratory. This work was supported by the Ludwig Institute for Cancer Research (B.R.), NIH (1U54DK107977-01) (B.R.), and a Postdoc fellowship from the TOYOBO Biotechnology Foundation (N.K.).

AUTHOR CONTRIBUTIONS
N.K. and B.R. conceived the project. N.K., R.H., and Z.Y. carried out library preparation. N.K. performed data analysis. N.K. and B.R. wrote the manuscript. All authors edited the manuscript.

DECLARATION OF INTERESTS
B.R. is a co-founder of Arima Genomics, Inc. and Epigenome Technologies, Inc.

REFERENCES
Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq--a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166-169.
Andersson, R., Gebhard, C., Miguel-Escalada, I., Hoof, I., Bornholdt, J., Boyd, M., Chen, Y., Zhao, X., Schmidl, C., Suzuki, T., et al. (2014). An atlas of active enhancers across human cell types and tissues. Nature 507, 455-461.
Ang, S.Y., Uebersohn, A., Spencer, C.I., Huang, Y., Lee, J.E., Ge, K., and Bruneau, B.G. (2016). KMT2D regulates specific programs in heart development via histone H3 lysine 4 di-methylation. Development 143, 810-821.
Ashokkumar, D., Zhang, Q., Much, C., Bledau, A.S., Naumann, R., Alexopoulou, D., Dahl, A., Goveas, N., Fu, J., Anastassiadis, K., et al. (2020). MLL4 is required after implantation, whereas MLL3 becomes essential during late gestation. Development 147.
Bain, G., Kitchens, D., Yao, M., Huettner, J.E., and Gottlieb, D.I. (1995). Embryonic stem cells express neuronal properties in vitro. Dev Biol 168, 342-357.
Calo, E., and Wysocka, J. (2013). Modification of enhancer chromatin: what, how, and why? Mol Cell 49, 825-837.
Cao, K., Collings, C.K., Morgan, M.A., Marshall, S.A., Rendleman, E.J., Ozark, P.A., Smith, E.R., and Shilatifard, A. (2018). An Mll4/COMPASS-Lsd1 epigenetic axis governs enhancer function and pluripotency transition in embryonic stem cells. Sci Adv 4, eaap8747.
Clapier, C.R., and Cairns, B.R. (2009). The biology of chromatin remodeling complexes. Annu Rev Biochem 78, 273-304.
Consortium, E.P. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74.
Creyghton, M.P., Cheng, A.W., Welstead, G.G., Kooistra, T., Carey, B.W., Steine, E.J., Hanna, J., Lodato, M.A., Frampton, G.M., Sharp, P.A., et al. (2010). Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A 107, 21931-21936.
Crump, N.T., and Milne, T.A. (2019). Why are so many MLL lysine methyltransferases required for normal mammalian development? Cell Mol Life Sci 76, 2885-2898.

Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P.D., Dean, A., and Blobel, G.A. (2012). Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233-1244.
Dixon, J.R., Jung, I., Selvaraj, S., Shen, Y., Antosiewicz-Bourget, J.E., Lee, A.Y., Ye, Z., Kim, A., Rajagopal, N., Xie, W., et al. (2015). Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331-336.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics (Oxford, England) 29, 15-21.
Dorighi, K.M., Swigut, T., Henriques, T., Bhanu, N.V., Scruggs, B.S., Nady, N., Still, C.D., 2nd, Garcia, B.A., Adelman, K., and Wysocka, J. (2017). Mll3 and Mll4 Facilitate Enhancer RNA Synthesis and Transcription from Promoters Independently of H3K4 Monomethylation. Mol Cell 66, 568-576 e564.
Duren, Z., Chen, X., Jiang, R., Wang, Y., and Wong, W.H. (2017). Modeling gene regulation from paired expression and chromatin accessibility data. Proc Natl Acad Sci U S A 114, E4914-E4923.
Euskirchen, G., Auerbach, R.K., and Snyder, M. (2012). SWI/SNF chromatin-remodeling factors: multiscale analyses and diverse functions. J Biol Chem 287, 30897-30905.
Fang, R., Yu, M., Li, G., Chee, S., Liu, T., Schmitt, A.D., and Ren, B. (2016). Mapping of long-range chromatin interactions by proximity ligation-assisted ChIP-seq. Cell Res 26, 1345-1348.
Fujikura, J., Yamato, E., Yonemura, S., Hosoda, K., Masui, S., Nakao, K., Miyazaki Ji, J., and Niwa, H. (2002). Differentiation of embryonic stem cells is induced by GATA factors. Genes Dev 16, 784-789.
Gorkin, D.U., Leung, D., and Ren, B. (2014). The 3D genome in transcriptional regulation and pluripotency. Cell Stem Cell 14, 762-775.
Heintzman, N.D., Hon, G.C., Hawkins, R.D., Kheradpour, P., Stark, A., Harp, L.F., Ye, Z., Lee, L.K., Stuart, R.K., Ching, C.W., et al. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108-112.
Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X., Murre, C., Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol Cell 38, 576-589.

Herz, H.M., Mohan, M., Garruss, A.S., Liang, K., Takahashi, Y.H., Mickey, K., Voets, O., Verrijzer, C.P., and Shilatifard, A. (2012). Enhancer-associated H3K4 monomethylation by Trithorax-related, the Drosophila homolog of mammalian Mll3/Mll4. Genes Dev 26, 2604-2620.
Hu, D., Gao, X., Cao, K., Morgan, M.A., Mas, G., Smith, E.R., Volk, A.G., Bartom, E.T., Crispino, J.D., Di Croce, L., et al. (2017). Not All H3K4 Methylations Are Created Equal: Mll2/COMPASS Dependency in Primordial Germ Cell Specification. Mol Cell 65, 460-475 e466.
Hu, D., Gao, X., Morgan, M.A., Herz, H.M., Smith, E.R., and Shilatifard, A. (2013). The MLL3/MLL4 branches of the COMPASS family function as major histone H3K4 monomethylases at enhancers. Mol Cell Biol 33, 4745-4754.
Hyun, K., Jeon, J., Park, K., and Kim, J. (2017). Writing, erasing and reading histone lysine methylations. Exp Mol Med 49, e324.
Jin, Q., Yu, L.R., Wang, L., Zhang, Z., Kasper, L.H., Lee, J.E., Wang, C., Brindle, P.K., Dent, S.Y., and Ge, K. (2011). Distinct roles of GCN5/PCAF-mediated H3K9ac and CBP/p300-mediated H3K18/27ac in nuclear transactivation. EMBO J 30, 249-262.
Jozwik, K.M., Chernukhin, I., Serandour, A.A., Nagarajan, S., and Carroll, J.S. (2016). FOXA1 Directs H3K4 Monomethylation at Enhancers via Recruitment of the Methyltransferase MLL3. Cell Rep 17, 2715-2723.
Juric, I., Yu, M., Abnousi, A., Raviram, R., Fang, R., Zhao, Y., Zhang, Y., Qiu, Y., Yang, Y., Li, Y., et al. (2019). MAPS: Model-based analysis of long-range chromatin interactions from PLAC-seq and HiChIP experiments. PLoS Comput Biol 15, e1006982.
Kubo, N., Ishii, H., Xiong, X., Bianco, S., Meitinger, F., Hu, R., Hocker, J.D., Conte, M., Gorkin, D., Yu, M., et al. (2021). Promoter-proximal CTCF binding promotes distal enhancer-dependent gene activation. Nat Struct Mol Biol.
Lagha, M., Bothma, J.P., and Levine, M. (2012). Mechanisms of transcriptional precision in animal development. Trends Genet 28, 409-416.
Lai, B., Lee, J.E., Jang, Y., Wang, L., Peng, W., and Ge, K. (2017). MLL3/MLL4 are required for CBP/p300 binding on enhancers and super-enhancer formation in brown adipogenesis. Nucleic Acids Res 45, 6388-6403.
Lee, J.E., Wang, C., Xu, S., Cho, Y.W., Wang, L., Feng, X., Baldridge, A., Sartorelli, V., Zhuang, L., Peng, W., et al. (2013). H3K4 mono- and di-methyltransferase MLL4 is required for enhancer activation during cell differentiation. Elife 2, e01503.
Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 25, 1754-1760.

Li, Y., Han, J., Zhang, Y., Cao, F., Liu, Z., Li, S., Wu, J., Hu, C., Wang, Y., Shuai, J., et al. (2016). Structural basis for activity regulation of MLL family methyltransferases. Nature 530, 447-452.
Li, Y., Rivera, C.M., Ishii, H., Jin, F., Selvaraj, S., Lee, A.Y., Dixon, J.R., and Ren, B. (2014). CRISPR reveals a distal super-enhancer required for Sox2 expression in mouse embryonic stem cells. PloS one 9, e114485.
Long, H.K., Prescott, S.L., and Wysocka, J. (2016). Ever-Changing Landscapes: Transcriptional Enhancers in Development and Evolution. Cell 167, 1170-1187.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550.
Mumbach, M.R., Rubin, A.J., Flynn, R.A., Dai, C., Khavari, P.A., Greenleaf, W.J., and Chang, H.Y. (2016). HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat Methods 13, 919-922.
Ng, S.B., Bigham, A.W., Buckingham, K.J., Hannibal, M.C., McMillin, M.J., Gildersleeve, H.I., Beck, A.E., Tabor, H.K., Cooper, G.M., Mefford, H.C., et al. (2010). Exome sequencing identifies MLL2 mutations as a cause of Kabuki syndrome. Nat Genet 42, 790-793.
Parsons, D.W., Li, M., Zhang, X., Jones, S., Leary, R.J., Lin, J.C., Boca, S.M., Carter, H., Samayoa, J., Bettegowda, C., et al. (2011). The genetic landscape of the childhood cancer medulloblastoma. Science 331, 435-439.
Pasqualucci, L., Trifonov, V., Fabbri, G., Ma, J., Rossi, D., Chiarenza, A., Wells, V.A., Grunn, A., Messina, M., Elliot, O., et al. (2011). Analysis of the coding genome of diffuse large B-cell lymphoma. Nat Genet 43, 830-837.
Placek, K., Hu, G., Cui, K., Zhang, D., Ding, Y., Lee, J.E., Jang, Y., Wang, C., Konkel, J.E., Song, J., et al. (2017). MLL4 prepares the enhancer landscape for Foxp3 induction via chromatin looping. Nat Immunol 18, 1035-1045.
Rada-Iglesias, A., Bajpai, R., Swigut, T., Brugmann, S.A., Flynn, R.A., and Wysocka, J. (2011). A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279-283.
Ramirez, F., Ryan, D.P., Gruning, B., Bhardwaj, V., Kilpert, F., Richter, A.S., Heyne, S., Dundar, F., and Manke, T. (2016). deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44, W160-165.
Rickels, R., Herz, H.M., Sze, C.C., Cao, K., Morgan, M.A., Collings, C.K., Gause, M., Takahashi, Y.H., Wang, L., Rendleman, E.J., et al. (2017). Histone H3K4 monomethylation catalyzed by Trr and mammalian COMPASS-like proteins at enhancers is dispensable for development and viability. Nat Genet 49, 1647-1653.
Robinson, M.D., McCarthy, D.J., and Smyth, G.K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140.
Shen, Y., Yue, F., McCleary, D.F., Ye, Z., Edsall, L., Kuan, S., Wagner, U., Dixon, J., Lee, L., Lobanenkov, V.V., et al. (2012). A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116-120.
Shlyueva, D., Stampfel, G., and Stark, A. (2014). Transcriptional enhancers: from properties to genome-wide predictions. Nature reviews Genetics 15, 272-286.
Strubing, C., Ahnert-Hilger, G., Shan, J., Wiedenmann, B., Hescheler, J., and Wobus, A.M. (1995). Differentiation of pluripotent embryonic stem cells into the neuronal lineage in vitro gives rise to mature inhibitory and excitatory neurons. Mech Dev 53, 275-287.
Su, G., Guo, D., Chen, J., Liu, M., Zheng, J., Wang, W., Zhao, X., Yin, Q., Zhang, L., Zhao, Z., et al. (2019). A distal enhancer maintaining Hoxa1 expression orchestrates retinoic acid-induced early ESCs differentiation. Nucleic Acids Res 47, 6737-6752.
Sze, C.C., and Shilatifard, A. (2016). MLL3/MLL4/COMPASS Family on Epigenetic Regulation of Enhancer Function and Cancer. Cold Spring Harb Perspect Med 6.
Tremblay, M., Sanchez-Ferras, O., and Bouchard, M. (2018). GATA transcription factors in development and disease. Development 145.
Wamaitha, S.E., del Valle, I., Cho, L.T., Wei, Y., Fogarty, N.M., Blakeley, P., Sherwood, R.I., Ji, H., and Niakan, K.K. (2015). Gata6 potently initiates reprograming of pluripotent and differentiated cells to extraembryonic endoderm stem cells. Genes Dev 29, 1239-1255.
Wang, C., Lee, J.E., Lai, B., Macfarlan, T.S., Xu, S., Zhuang, L., Liu, C., Peng, W., and Ge, K. (2016). Enhancer priming by H3K4 methyltransferase MLL4 controls cell fate transition. Proc Natl Acad Sci U S A 113, 11871-11876.
Will, B., and Steidl, U. (2014). Combinatorial haplo-deficient tumor suppression in 7q-deficient myelodysplastic syndrome and acute myeloid leukemia. Cancer Cell 25, 555-557.
Xie, G., Lee, J.-E., McKernan, K., Park, Y.-K., Jang, Y., Liu, C., Peng, W., and Ge, K. (2020). MLL3/MLL4 methyltransferase activities regulate embryonic stem cell differentiation independent of enhancer H3K4me1. bioRxiv, 2020.2009.2014.296558.
Xu, J., Carter, A.C., Gendrel, A.V., Attia, M., Loftus, J., Greenleaf, W.J., Tibshirani, R., Heard, E., and Chang, H.Y. (2017). Landscape of monoallelic DNA accessibility in mouse embryonic stem cells and neural progenitor cells. Nat Genet 49, 377-386.

Yan, J., Chen, S.A., Local, A., Liu, T., Qiu, Y., Dorighi, K.M., Preissl, S., Rivera, C.M., Wang, C., Ye, Z., et al. (2018). Histone H3 lysine 4 monomethylation modulates long-range chromatin interactions at enhancers. Cell Res 28, 204-220.
Yu, W., Huang, W., Yang, Y., Qiu, R., Zeng, Y., Hou, Y., Sun, G., Shi, H., Leng, S., Feng, D., et al. (2019). GATA3 recruits UTX for gene transcriptional activation to suppress metastasis of breast cancer. Cell Death Dis 10, 832.
Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nusbaum, C., Myers, R.M., Brown, M., Li, W., et al. (2008). Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137.
Zhang, Y., Mittal, A., Reid, J., Reich, S., Gamblin, S.J., and Wilson, J.R. (2015). Evolving Catalytic Properties of the MLL Family SET Domain. Structure 23, 1921-1933.
Zhou, X., Lowdon, R.F., Li, D., Lawson, H.A., Madden, P.A., Costello, J.F., and Wang, T. (2013). Exploring long-range genome interactions using the WashU Epigenome Browser. Nat Methods 10, 375-376.

Figure legends

Figure 1 | Loss of MLL3/4 catalytic activity leads to failure of accumulation of H3K4me1 at de novo enhancers during NPC differentiation of ESC.
(A) Schematic representation of wild-type (WT) and catalytically deficient (dCD) mouse MLL3 and MLL4 proteins (top) and experimental design to explore the changes of histone modifications, gene regulation, and enhancer-promoter (E-P) contacts during neural differentiation in the presence and absence of MLL3/4 catalytic activity (bottom). Tyrosine (Y) residues are mutated to alanine (A) in the SET domain to inactivate MLL3/4 in dCD cells.

(B)(C) Heatmaps showing H3K4me1 (left) and H3K27ac (middle) ChIP-seq and ATAC-seq (right) signals centered at H3K27ac peaks of candidate enhancers identified in WT NPCs. Candidate enhancers were classified based on whether H3K4me1 peak signals were significantly gained only in NPCs (day 5) (de novo enhancers, N=3,373) (B) or not (persistent enhancers, N=30,404) (FDR < 0.05, log2FC > 0.5) (C), and further classified into enhancers that had significantly lower levels of H3K4me1 signal in MLL3/4 dCD NPCs than in WT NPCs (MLL3/4-dependent, N=3,028 and 4,150) and the other enhancers (MLL3/4-independent, N=345 and 26,254).

(D) The top 5 enriched known TF binding motifs at MLL3/4-dependent (left) and -independent enhancers (right) in the group of de novo enhancers. Their enrichment p values and the p values of differential motif analysis between MLL3/4-dependent and -independent enhancers are also indicated (see Figure S1I for the same analysis in the group of persistent candidate enhancers).
See also Figure S1.

Figure 2 | Severe disruption of newly formed E-P contacts during NPC differentiation in MLL3/4 dCD cells.
(A) Scatter plots showing genome-wide changes of chromatin contacts anchored on promoters and enhancers (y-axis) identified in differential interaction analysis between WT cells and MLL3/4 dCD cells in ESCs (left) and NPCs (right). Genomic distances between the two loop anchor sites are plotted on the x-axis. The interaction changes are indicated by significance value (-/+log10(p-value)). The numbers of significantly changed E-P and promoter-promoter (P-P) contacts are also indicated (FDR < 0.05). Red and orange dots; induced chromatin contacts in MLL3/4 dCD cells (FDR < 0.05 and p value < 0.01, respectively). Blue and light-blue dots; reduced chromatin contacts in MLL3/4 dCD cells (FDR < 0.05 and p value < 0.01, respectively).

(B) Volcano plots showing changes of E-P contacts centered at the MLL3/4-dependent de novo enhancers (3028 loci) upon cell differentiation from ESCs towards NPCs in WT (left) and MLL3/4 dCD cells (right). E-P contacts that overlapped significant peaks called by MAPS are plotted (FDR < 0.01). Significantly induced E-P contacts upon cell differentiation (FDR < 0.05) are plotted as red dots, and their numbers are also indicated.

(C) Histogram showing the odds ratio of the decrease in the number of significantly induced E-P contacts upon loss of MLL3/4 catalytic activity. E-P contacts centered at the MLL3/4-dependent de novo enhancers (3028 loci) and the MLL3/4-independent de novo enhancers (345 loci) are shown separately. P values for each odds ratio (Fisher’s exact test) are also indicated.

(D) Heatmaps showing the changes of E-P contacts centered at the MLL3/4-dependent de novo enhancers (3028 loci) and the MLL3/4-independent de novo enhancers (345 loci) upon cell differentiation from ESCs towards NPCs in WT (left column) and MLL3/4 dCD cells (right column). The interaction changes are indicated by significance value (-/+log10(p-value)). Boxplots also show the changes of E-P contacts by fold change (NPC/WT in MLL3/4 dCD). Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (1st quartile-1.5*(3rd quartile-1st quartile)) to (3rd quartile+1.5*(3rd quartile-1st quartile)). *** p value < 0.001, two-tailed t-test.
See also Figures S2 and S3.

Figure 3 | Over 60% of genes interacting with MLL3/4-dependent de novo enhancers are still fully induced in MLL3/4 dCD cells during NPC differentiation.
(A) Microscopic images of cell differentiation from mouse ESCs towards NPCs (day 2.5, 5) in WT and MLL3/4 dCD cells. Alkaline phosphatase staining was performed at each time point.

(B) Principal component analysis of overall gene expression profiles of WT and MLL3/4 dCD cells at each time point of cell differentiation. Two replicates of each sample are shown.

(C) Scatter plots showing gene expression levels of NPC-differentiation induced genes (FDR < 0.05, FC > 2, RPKM in NPCs > 1.0) in WT NPCs (x-axis) and MLL3/4 dCD NPCs (day 5) (y-axis).

Blue and light-blue dots; down-regulated in MLL3/4 dCD NPCs (FC > 2 and FC > 1.5, respectively, FDR < 0.05). Red and orange dots; up-regulated in MLL3/4 dCD NPCs (FC > 2 and FC > 1.5, respectively, FDR < 0.05).

(D) Top three enriched GO terms in genes that failed to be up-regulated upon loss of MLL3/4 catalytic activity (228 genes, FC (WT/dCD) > 2, FDR < 0.05 in NPCs). p values (Fisher's exact test) are also indicated.

(E) Volcano plots showing gene expression changes between WT ESCs and NPCs in genes that have significant PLAC-seq chromatin contacts (MAPS, FDR < 0.01) with the 3028 de novo MLL3/4-dependent candidate enhancers. Significantly up-regulated and down-regulated genes are plotted as red and blue dots, respectively (FDR < 0.05, FC > 2).

(F) Scatter plots showing gene expression levels of NPC-differentiation induced genes that had significant interactions with the 3028 MLL3/4-dependent de novo enhancers in WT NPCs (x-axis) and MLL3/4 dCD NPCs (day 5) (y-axis). Down-regulated genes in MLL3/4 dCD NPCs (FC > 1.5, FDR < 0.05) are plotted as light-blue dots.
See also Figure S4.

Figure 4 | Effects of MLL3/4 catalytic activity loss on enhancer-dependent gene activation during NPC differentiation vary depending on the relative promoter contacts from MLL3/4-dependent candidate enhancers.
(A) Schematic representation of the method of calculating the ratio of chromatin contacts between promoters and MLL3/4-dependent or -independent candidate enhancers. The total contact count between a gene promoter and its MLL3/4-independent enhancers is divided by the total contact count between the same promoter and all candidate enhancers in WT NPCs (contact range ≥ 10 kb). See Methods for details of the calculation.

(B) Boxplots showing changes in gene expression between WT and MLL3/4 dCD cells during NPC differentiation. NPC-differentiation induced genes were classified into four groups based on the ratio of chromatin contacts between promoters and MLL3/4-independent enhancers versus all chromatin contacts anchored at the same promoters. The ratios and the numbers of genes are indicated at the bottom. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (1st quartile - 1.5*(3rd quartile - 1st quartile)) to (3rd quartile + 1.5*(3rd quartile - 1st quartile)). * p value < 0.05, *** p value < 0.001, one-tailed t-test.

(C) Boxplots showing the number of MLL3/4-independent candidate enhancers with significant chromatin contacts as determined by PLAC-seq assays (MAPS, FDR < 0.01) with each group of genes. These NPC-differentiation induced genes were classified based on the differential gene expression analysis in Figure 3C. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (1st quartile - 1.5*(3rd quartile - 1st quartile)) to (3rd quartile + 1.5*(3rd quartile - 1st quartile)). ns p value > 0.05, *** p value < 0.001, one-tailed t-test.
See also Figure S5.
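
As an illustration of the ratio described in Figure 4A, the following minimal Python sketch computes, for a single promoter, the share of enhancer contact counts that fall on MLL3/4-independent enhancers, and then assigns the gene to one of the four ratio groups labelled in the Figure 4B panel (> 0.9, 0.8-0.9, 0.7-0.8, < 0.7). The function names, the example counts, and the treatment of values that fall exactly on a group boundary are assumptions made for illustration only; the actual procedure is described in Methods.

```python
def independent_contact_ratio(dependent_counts, independent_counts):
    """Share of a promoter's enhancer contact counts on MLL3/4-independent enhancers.

    dependent_counts / independent_counts: contact counts between one promoter
    and each of its MLL3/4-dependent / -independent candidate enhancers.
    Returns None when the promoter has no enhancer contacts at all.
    """
    independent_total = sum(independent_counts)
    total = sum(dependent_counts) + independent_total
    if total == 0:
        return None  # ratio is undefined without any contacts
    return independent_total / total


def ratio_group(ratio):
    """Map a ratio to the four groups labelled in the Figure 4B panel.

    Assumption: a value exactly on a boundary goes to the lower group.
    """
    if ratio > 0.9:
        return "> 0.9"
    if ratio > 0.8:
        return "0.8-0.9"
    if ratio > 0.7:
        return "0.7-0.8"
    return "< 0.7"


# Hypothetical promoter: two dependent enhancers (c1, c2) and two independent ones (c3, c4).
c1, c2, c3, c4 = 2, 3, 10, 5
example_ratio = independent_contact_ratio([c1, c2], [c3, c4])  # (c3+c4)/(c1+c2+c3+c4)
example_group = ratio_group(example_ratio)
```

With these hypothetical counts the ratio is 15/20 = 0.75, which places the gene in the 0.7-0.8 group.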

Supplementary figure legends

Figure S1 | Changes of histone ChIP-seq peak levels between WT and MLL3/4 dCD cells. Related to Figure 1.
(A)(B) Scatter plots showing changes of H3K4me1 ChIP-seq peak signals between WT and MLL3/4 dCD cells in ESCs (left) and NPCs (day 5) (right). H3K4me1 ChIP-seq peaks at distal loci (distance to transcription start site (TSS) ≥ 10 kb) (A) and peaks at proximal loci (distance to TSS < 10 kb) (B) are plotted separately. Significantly increased and decreased peaks are plotted as red and blue dots, respectively (FDR < 0.05, |log2FC| > 0.5).

(C)(D) Average enrichments of H3K4me1 ChIP-seq signals around distal enhancers (distance to TSS ≥ 10 kb) (C) and gene promoters (TSSs) (D) in ESCs and NPCs. Blue lines: WT. Red lines: MLL3/4 dCD.

(E) Genome browser snapshots of regions around a candidate enhancer whose H3K4me1 signal was significantly decreased upon loss of MLL3/4 catalytic activity in ESCs and NPCs (left) and an enhancer whose H3K4me1 signal was significantly decreased only in NPCs (right). H3K4me1, H3K27ac, and H3K4me3 ChIP-seq and RNA-seq are shown.

(F) Genome browser snapshots of a region around a gene promoter whose H3K4me1 signal was significantly increased upon loss of MLL3/4 catalytic activity.

(G) Scatter plots showing changes of H3K4me1 peak signals (x-axis) and changes of H3K27ac peak signals (y-axis) upon loss of MLL3/4 catalytic activity in ESCs (left) and NPCs (right). Peaks at distal loci (distance to TSS ≥ 10 kb) are shown. Pearson correlation coefficients and linear trendlines (red line) are also indicated.

(H) Gene expression profiles of Gata family members. RPKM (reads per kilobase of transcript, per million mapped reads) values of RNA-seq (average of two replicates) at each time point in WT and MLL3/4 dCD cells are shown.

(I) The top 5 enriched known TF binding motifs at MLL3/4-dependent (left) and -independent enhancers (right) in the group of persistent candidate enhancers. Their enrichment p values and the p values of the differential motif analysis between MLL3/4-dependent and -independent enhancers are also indicated (see Figure 1D for the same analysis in the group of de novo enhancers in NPCs).

Figure S2 | Differential chromatin contact analysis. Related to Figure 2.
(A) Schematic representation of the method for the differential analysis of H3K4me3 PLAC-seq datasets. The input contact matrix bins were stratified by genomic distance in 10-kb steps from 10 kb to 150 kb. Bins at longer distances were grouped into strata that each contain the same number of input bins as the 140–150 kb stratum. Contact counts were then compared separately within each distance stratum. The two conditions, each with two biological replicates, were compared using the negative binomial model implemented in edgeR (Robinson et al., 2010). Interactions anchored at H3K4me3 ChIP-seq peaks that differed between the compared samples (p value < 0.01) were removed, so that only genes with the same level of H3K4me3 peaks on their promoters were processed in the downstream analysis (see Methods).

(B) Genome browser snapshots of regions around the Hoxa gene cluster (top) and Sox2 (bottom). The arcs show the changes of chromatin contacts on active elements and promoters identified by the differential interaction analysis between WT ESCs and NPCs (Hoxa gene cluster), and by the analysis between WT ESCs and MLL3/4 dCD ESCs (Sox2). The colors of the arcs indicate the degree of interaction change between the conditions (blue to red, -/+log10(p-value)). The promoter regions of these genes and the interacting candidate enhancer regions are shown in green and yellow shadows, respectively. H3K4me1, H3K27ac, H3K4me3, and H3K27me3 ChIP-seq and RNA-seq datasets are also shown.

(C) Genome browser snapshots of regions around the Sox2 gene. The arcs show the chromatin contacts on active elements and promoters in WT and MLL3/4 dCD cells in ESCs and NPCs. The color of the arcs indicates normalized contact counts. The gene promoter and interacting candidate enhancer regions are shown in green and yellow shadows, respectively. H3K4me1, H3K27ac, and H3K4me3 ChIP-seq and RNA-seq in WT and MLL3/4 dCD cells in ESCs and NPCs are also shown.

Figure S3 | Changes of enhancer-promoter and promoter-promoter contacts upon loss of MLL3/4 catalytic activity. Related to Figure 2.
(A) Scatter plots showing genome-wide changes of chromatin contacts anchored on promoters and enhancers (y-axis) identified in the differential interaction analysis. The comparison between WT ESCs and NPCs (left) and the comparison between MLL3/4 dCD ESCs and NPCs (right) are shown. Genomic distances between the two loop anchor sites are plotted on the x-axis. The interaction changes are indicated by significance value (-/+log10(p-value)). The numbers of significantly changed enhancer-promoter (E-P) and promoter-promoter (P-P) contacts are also indicated. Significantly induced and reduced chromatin contacts are shown as red and blue dots, respectively (FDR < 0.05). For details on the differential interaction analysis, see Methods and Figure S2A.

(B) Genome browser snapshots of regions around the Sox11 gene (left) and the Hoxb9 gene (right), which failed to be activated upon NPC differentiation in MLL3/4 dCD cells. The arcs show the changes of chromatin contacts on active elements and promoters identified in the differential interaction analysis between ESCs and NPCs in WT and MLL3/4 dCD cells. The colors of the arcs indicate the degree of interaction change between the conditions (blue to red, -/+log10(p-value)). The promoter regions of these genes are shown in green shadows. H3K4me1, H3K27ac, and H3K4me3 ChIP-seq and RNA-seq in ESCs and NPCs (day 5) in WT and MLL3/4 dCD cells are also shown.

(C) Scatter plots showing changes of H3K4me1 ChIP-seq peak levels on distal element loci of significantly induced (red) and reduced (blue) E-P contacts during neural differentiation in WT cells (left). Changes of H3K4me1 ChIP-seq peak levels in MLL3/4 dCD cells on the same loci are also shown on the right.

(D) Volcano plots showing changes of E-P contacts anchored on the MLL3/4-independent de novo enhancers (345 loci) upon cell differentiation from ESCs towards NPCs in WT (left) and MLL3/4 dCD cells (right). E-P contacts that overlapped with significant peaks called by MAPS are plotted (FDR < 0.01). Significantly induced E-P contacts upon cell differentiation (FDR < 0.05) are plotted as red dots and their numbers are indicated.

(E) Yellow boxplots show changes of H3K4me1 ChIP-seq peak signals at the distal loci of the induced and reduced E-P contacts (p value < 0.01) upon loss of MLL3/4 catalytic activity in ESCs and NPCs. Green boxplots show changes of H3K4me1 ChIP-seq peak signals at the distal promoters of the induced and reduced P-P contacts (p value < 0.01) upon loss of MLL3/4 catalytic activity in ESCs and NPCs. ns p value > 0.05, *** p value < 0.001, two-tailed t-test.

(F) Boxplots showing the changes of nearby E-P contacts that were anchored on the induced and reduced P-P contacts (p value < 0.01) upon loss of MLL3/4 catalytic activity in ESCs and NPCs (schematic representation at the top). Changes of nearby E-P contacts anchored on induced and reduced P-P contacts were compared at each time point. *** p value < 0.001, two-tailed t-test.

(G) Boxplots showing gene expression changes of genes with significantly increased H3K4me1 ChIP-seq peaks around their TSS (≤ 10 kb) upon loss of MLL3/4 catalytic activity in ESCs (left) and NPCs (middle). Gene expression changes of genes with significantly increased H3K4me1 ChIP-seq peaks around their TSS (≤ 10 kb) upon NPC differentiation in WT cells are also plotted (right). *** p value < 0.001, ns p value > 0.05, two-tailed t-test.

Figure S4 | Changes of gene expression profiles upon loss of MLL3/4 catalytic activity. Related to Figure 3.
(A) Gene expression levels of pluripotency marker genes (Pou5f1, Sox2, Nanog) and development-related genes (Hoxb9, Sox11), and gene expression levels of NPC (Pax6, Sox3, Olig2) and neuron (Tuj1, NeuN, Map2) marker genes during neural differentiation. RPKM (reads per kilobase of transcript, per million mapped reads) values of RNA-seq (average of two replicates) at each time point in WT and MLL3/4 dCD cells are shown.

(B) Heatmaps showing the changes of gene expression levels between ESCs and NPCs (day 2.5 and 5) in WT and MLL3/4 dCD cells (4 columns on the left), and the changes between WT and MLL3/4 dCD cells in day 5 NPCs (right column). Genes were classified based on the gene expression changes between ESCs and day 5 NPCs in WT cells (FC > 2, FDR < 0.05) and further classified based on the changes between WT and MLL3/4 dCD cells in day 5 NPCs (FC > 2, FDR < 0.05). The numbers of genes in each group (3 x 3 = 9 groups) are also shown.

(C) Gene expression changes between ESCs and NPCs (day 5) in WT (left) and MLL3/4 dCD cells (right). Differentially up-regulated and down-regulated genes are plotted in red and blue, respectively (FC > 2, FDR < 0.05). The total numbers of up-regulated and down-regulated genes are also indicated.

(D) Gene expression changes between WT and MLL3/4 dCD cells in ESCs (left) and NPCs (day 5) (right). Differentially up-regulated and down-regulated genes are plotted as red and blue dots, respectively (FC > 2, FDR < 0.05). The total numbers of up-regulated and down-regulated genes are indicated.

(E) Histogram (left) showing the fraction of down-regulated genes (FC > 1.5, FDR < 0.05) among the genes shown in Figure 3F (195 genes) and among genes located more than 500 kb away from the MLL3/4-dependent de novo enhancers. * p value < 0.05, Fisher's exact test. Boxplots (right) show fold changes in gene expression for these two groups of genes. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (1st quartile - 1.5*(3rd quartile - 1st quartile)) to (3rd quartile + 1.5*(3rd quartile - 1st quartile)). * p value < 0.05, two-tailed t-test.

Figure S5 | Features of MLL3/4 catalytic activity-dependent and -independent genes. Related to Figure 4.
(A) Boxplots showing the genomic distance to the nearest MLL3/4-independent candidate enhancer for each group of genes. These NPC-differentiation induced genes were classified based on the differential gene expression analysis in Figure 3C. Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (1st quartile - 1.5*(3rd quartile - 1st quartile)) to (3rd quartile + 1.5*(3rd quartile - 1st quartile)). ** p value < 0.01, one-tailed t-test.

(B–D) Enrichment analysis of stably expressed genes that were not differentially expressed in MLL3/4 dCD ESCs (left) and NPCs (right) (defined in Figure S4D) (B). Genes were categorized based on the distance to the nearest interacting enhancer (rows) and the number of enhancers around the TSS (< 200 kb) (columns). Enrichment values are shown by odds ratio (scores in boxes) and p-values (color). The distance to the nearest interacting enhancer is represented by the shortest genomic distance of significant PLAC-seq peaks on enhancers and promoters (p-value < 0.01). Enrichment analyses of the significantly down-regulated and up-regulated genes (FC > 2, FDR < 0.05) in MLL3/4 dCD ESCs and NPCs are shown in panels C and D, respectively. For details on the odds ratio calculation and statistical analysis, see Methods.

(E) Boxplots showing changes of H3K27ac ChIP-seq peak signals (dCD/WT) at all interacting distal enhancers (MAPS, p value < 0.01) of down-regulated genes and up-regulated genes in MLL3/4 dCD ESCs (left) and NPCs (right) (defined in Figure S4D). *** p value < 0.001, two-tailed t-test.
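
The whisker convention used for the boxplots in these legends (the most extreme values lying within 1.5 interquartile ranges of the quartiles) can be sketched in Python as follows. This is an illustrative sketch only: the quartile interpolation method is an assumption, since the legends do not state which plotting software or quantile definition was used, and the function name and example values are hypothetical.

```python
import statistics


def tukey_box_stats(values):
    """Return box limits, median and whisker ends for one boxplot.

    Whiskers are the minimum and maximum observations inside
    [Q1 - 1.5*(Q3 - Q1), Q3 + 1.5*(Q3 - Q1)], as in the legends.
    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Hypothetical fold-change values; 100 lies outside the upper fence.
stats = tukey_box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

In this example the value 100 exceeds the upper fence (7 + 1.5*4 = 13), so the upper whisker ends at 8 and 100 would be drawn, if at all, as an outlier.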

SUPPLEMENTARY TABLES

Table S1.
List of NGS sample information. Related to Figures 1–4.

Table S2.
List of MLL3/4 catalytic activity-dependent and -independent candidate enhancers in NPCs. Related to Figure 1.

Table S3.
List of differentially changed enhancer-promoter contacts. Related to Figure 2.

Table S4.
Gene expression changes during neural differentiation in WT and MLL3/4 dCD cells. Related to Figures 3 and 4.

Figure 1

[Figure image omitted; recoverable panel labels follow.]
(A) Domain schematics of MLL3 and MLL4 (AT hook, PHD, HMG, FYRN, FYRC, SET) with the dCD substitutions MLL3 Y4792A and MLL4 Y5477A. Experimental design: ChIP-seq, ATAC-seq, RNA-seq and PLAC-seq in WT and Mll3/4 dCD cells (ESC, NPC day 2.5, NPC day 5) to follow the dynamics of chromatin states, gene expression, and E-P contacts.
(B) De novo candidate enhancers in NPC (N=3,373): heatmaps of H3K4me1, H3K27ac and ATAC-seq signal in WT and MLL3/4 dCD ESCs and NPCs. MLL3/4-dependent, N=3,028; MLL3/4-independent, N=345.
(C) Persistent candidate enhancers in NPC (N=30,404): same layout as (B). MLL3/4-dependent, N=4,150; MLL3/4-independent, N=26,254.
(D) Top 5 motifs at de novo enhancers.

MLL3/4-dependent de novo enhancers (N=3,028)
Motif    P value (enrichment)    P value (differential, dep./indep.)
GATA2    1.0E-77                 1.0E-19
GATA6    1.0E-76                 1.0E-17
GATA1    1.0E-75                 1.0E-16
GATA4    1.0E-73                 1.0E-16
GATA3    1.0E-66                 1.0E-15

MLL3/4-independent de novo enhancers (N=345)
Motif    P value (enrichment)    P value (differential, indep./dep.)
TEAD     1.0E-19                 1.0E-6
TEAD1    1.0E-17                 1.0E-5
TEAD2    1.0E-16                 1.0E-5
HOXA1    1.0E-16                 1.0E-5
TEAD4    1.0E-15                 1.0E-4

Figure 2

[Figure image omitted; recoverable panel labels follow.]
(A) Changes of E-P/P-P contacts, MLL3/4 dCD versus WT, in ESC (N=153029) and NPC day 5 (N=133753). x-axis: contact range (0 to 1 Mb); y-axis: -/+log10(p value). Contact numbers printed on the plots (E-P/P-P): 14/3 and 43/3 in ESC; 122/17 and 2252/1233 in NPC day 5.
(B) E-P contacts on MLL3/4-dependent de novo enhancers (3,028 loci). x-axis: Log2FC (NPC day 5 / ESC); y-axis: -LogP. Induced E-P contacts (FDR < 0.05) over all E-P contacts with MAPS peaks in NPCs: 69/882 (WT) and 17/882 (MLL3/4 dCD).
(C) Decrease of the number of induced E-P contacts upon MLL3/4 dCD, shown as odds ratio: P value = 6.1e-09 (3,028 loci) and P value = 0.039 (345 loci).
(D) Changes of E-P contacts upon cell differentiation for the 3,028 and 345 loci in WT and MLL3/4 dCD cells. Color scale: -/+log(p value), -5 to 5. Boxplots: FC in dCD cells.

Figure 3

[Figure image omitted; recoverable panel labels follow.]
(A) Microscopy images of WT and Mll3/4 dCD cells at the ESC stage, day 2.5 and day 5 of neural differentiation treatment. Scale bars, 100 μm.
(B) PCA plot (PC1 versus PC2) of WT and Mll3/4 dCD samples (ESCs, day 2.5, day 5), two replicates each.
(C) NPC-differentiation induced genes in WT (N=1303, RPKM in NPCs > 1.0). x-axis: Log2(RPKM), WT NPC; y-axis: Log2(RPKM), MLL3/4 dCD NPC. Up-regulated: 98 genes (7.5%, > 2.0 FC) and 101 genes (7.8%, > 1.5 FC). Down-regulated: 228 genes (17.6%, > 2.0 FC) and 183 genes (14.0%, > 1.5 FC). Changed by less than 50%: 692 genes (53.1%). FDR < 0.05.
(D) MLL3/4-dependent down-regulated genes (FC > 2, FDR < 0.05, N=228). GO terms: multicellular organism development; cell differentiation; nervous system development. x-axis: -Log10(P value).
(E) Genes interacting with MLL3/4-dependent de novo enhancers (3,028 loci; 909 genes in total), of which 195 are NPC-differentiation induced genes. x-axis: Log2FC (NPC / ESC in WT); y-axis: -LogP. Up-regulated and down-regulated genes: FDR < 0.05, FC > 2.
(F) The 195 NPC-differentiation induced genes. x-axis: Log2(RPKM), WT NPC; y-axis: Log2(RPKM), MLL3/4 dCD NPC. Down-regulated (> 1.5 FC, FDR < 0.05): 67/195 genes (34.4%).

Figure 4

[Figure image omitted; recoverable panel labels follow.]
(A) Schematic: contacts c1 to c4 between the promoter of an NPC-differentiation induced gene and MLL3/4-dependent enhancers in NPC (N=3,028+4,150) or MLL3/4-independent enhancers in NPC (N=345+26,254). Ratio of chromatin contacts on MLL3/4-independent enhancers = (c3+c4)/(c1+c2+c3+c4).
(B) Gene expression difference in NPC (day 5), Log2FC (Mll3/4 dCD / WT), for NPC-differentiation induced genes grouped by ratio of chromatin contacts: > 0.9 (N=485), 0.8-0.9 (N=315), 0.7-0.8 (N=179), < 0.7 (N=210).
(C) Number of interacting MLL3/4-independent enhancers (MAPS, FDR < 0.01) for robustly induced genes (692 genes), down-regulated genes (228 genes) and up-regulated genes (98 genes); FC > 2, FDR < 0.05.

Figure S1

[Figure image omitted; recoverable panel labels follow.]
(A) H3K4me1 ChIP-seq peaks at distal loci (distance to TSS ≥ 10kb), ESC and NPC (WT vs Mll3/4 dCD). x-axis: Log2(MeanCounts); y-axis: Log2(Mll3/4 dCD / WT). Increased and decreased peaks: FDR < 0.05, |log2FC| > 0.5.
(B) H3K4me1 ChIP-seq peaks at proximal loci (distance to TSS < 10kb); same axes and thresholds as (A).
(C) H3K4me1 signal at distal enhancers, -5 kb to +5 kb around the enhancer center, in ESC and NPC (day 5); WT and MLL3/4 dCD.
(D) H3K4me1 signal at gene promoters, -5 kb to +5 kb around the TSS, in ESC and NPC (day 5); WT and MLL3/4 dCD.
(E)(F) Genome browser tracks (H3K4me1, H3K27ac, H3K4me3, RNA-seq; WT and dCD; ESC and NPC day 5). Regions shown: chr18 75,896–75,907 kb; chr2 120,122–120,134 kb; chr3 31,093–31,102 kb.
(G) ChIP-seq peaks at distal loci (distance to TSS > 10kb), ESC and NPC day 5. x-axis: H3K4me1 Log2(MLL3/4 dCD / WT); y-axis: H3K27ac Log2(MLL3/4 dCD / WT). r=0.58 in both plots.
(H) RPKM of Gata2, Gata6, Gata1, Gata4 and Gata3 at ESC, day 2.5 and day 5 in WT and MLL3/4 dCD cells.
(I) Top 5 motifs at persistent enhancers.

MLL3/4-dependent persistent enhancers (N=4,150)
Motif    P value (enrichment)    P value (differential, dep./indep.)
SOX2     1.0E-50                 1.0E-13
SOX17    1.0E-45                 1.0E-15
SOX3     1.0E-42                 1.0E-10
SOX21    1.0E-40                 1.0E-8
SOX6     1.0E-40                 1.0E-10

MLL3/4-independent persistent enhancers (N=26,254)
Motif    P value (enrichment)    P value (differential, indep./dep.)
KLF14    1.0E-156                n.s.
FOSL2    1.0E-144                1.0E-4
FOS      1.0E-143                1.0E-14
JUNB     1.0E-143                1.0E-13
FRA2     1.0E-140                1.0E-8

Figure S2

[Figure image omitted; recoverable panel labels follow.]
(A) Schematic: contact counts of Sample A and Sample B are compared within the same distance bins (10–20 kb, 20–30 kb, ..., 140–150 kb, ..., 930–1140 kb, 1140–1500 kb), then merged. Contacts with different levels of H3K4me3 peaks (p value < 0.01) are removed. Example plot, ESC to NPC: x-axis, contact range (0 to 1 Mb); y-axis, changes of E-P contacts, -/+log10(p value); induced and reduced contacts, FDR < 0.05.

Comparison                    TSS with different level of H3K4me3    TSS with same level of H3K4me3
WT, ESC vs NPC                729 genes                              12,226 genes
Mll3/4 dCD, ESC vs NPC        693 genes                              12,250 genes
ESC, WT vs MLL3/4 dCD         135 genes                              12,839 genes
NPC, WT vs MLL3/4 dCD         53 genes                               12,766 genes

(B) Genome browser tracks. Hoxa cluster, chr6 51,500kb–52,800kb: E-P and P-P contact changes, WT ESC to WT NPC. Sox2 and Sox2 SE, chr3 34,500kb–35,400kb: E-P and P-P contact changes, WT ESC to dCD ESC. Color scale: -/+log10(p value), reduced to induced. Tracks: H3K4me1, H3K27ac, H3K27me3, H3K4me3, RNA-seq.
(C) Genome browser tracks around Sox2 and Sox2 SE, chr3 34,500kb–35,400kb: normalized contact counts in WT ESC, MLL3/4 dCD ESC, WT NPC and MLL3/4 dCD NPC, with H3K4me1, H3K27ac, H3K4me3 and RNA-seq tracks.
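
The distance stratification described in the Figure S2A legend can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the half-open interval boundaries, the handling of a final undersized stratum, and the example distances are all hypothetical. Bins are assigned to fixed 10-kb strata from 10 kb to 150 kb, and bins at longer distances are sorted by distance and cut into strata holding as many bins as the 140–150 kb stratum. The subsequent per-stratum comparison with edgeR is not shown.

```python
def stratify_by_distance(distances, step=10_000, near_start=10_000, near_end=150_000):
    """Group bin indices into distance strata.

    distances: genomic distance (bp, integer) of each contact-matrix bin.
    Returns a list of strata, each a list of indices into `distances`.
    Bins closer than `near_start` are dropped (assumption).
    """
    n_fixed = (near_end - near_start) // step  # 14 fixed-width strata by default
    fixed = [[] for _ in range(n_fixed)]
    far = []
    for idx, d in enumerate(distances):
        if near_start <= d < near_end:
            fixed[(d - near_start) // step].append(idx)
        elif d >= near_end:
            far.append(idx)

    # The last fixed-width stratum (140-150 kb) sets the size of the far strata.
    chunk = len(fixed[-1])
    if far and chunk == 0:
        raise ValueError("reference stratum is empty; cannot size long-range strata")

    far.sort(key=lambda i: distances[i])
    # Assumption: a final remainder smaller than `chunk` forms its own stratum.
    far_strata = [far[i:i + chunk] for i in range(0, len(far), chunk)] if far else []
    return fixed + far_strata


# Hypothetical bin distances in bp.
example_distances = [12_000, 15_000, 25_000, 141_000, 149_000,
                     200_000, 160_000, 500_000, 300_000, 5_000]
strata = stratify_by_distance(example_distances)
```

Here the 140–150 kb stratum holds two bins, so the four longer-range bins are split into two strata of two bins each, ordered by increasing distance.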

Figure S3

[Figure image omitted; recoverable panel labels follow.]
(A) Changes of chromatin contacts, NPC day 5 versus ESC, in WT (N=149419) and MLL3/4 dCD (N=131907). x-axis: contact range (0 to 1 Mb); y-axis: changes of E-P/P-P contacts, -/+log10(p value). Contact numbers printed on the plots (E-P/P-P): 2545/1678, 529/203, 987/136 and 1212/265. Induced and reduced contacts: FDR < 0.05.
(B) Genome browser tracks around Sox11 (chr12 27,000kb–27,600kb) and Hoxb9 (chr11 95,900kb–97,000kb): E-P and P-P contact changes, ESC to NPC, in WT and Mll3/4 dCD cells, with H3K4me1, H3K27ac, H3K4me3 and RNA-seq tracks.
(C) H3K4me1 ChIP-seq peaks at distal loci of induced/reduced E-P contacts (non-promoter regions), in WT and MLL3/4 dCD (NPC day 5 / ESC). x-axis: Log2(MeanCounts); y-axis: H3K4me1 changes upon neural differentiation (Log2FC).
(D) E-P contacts on MLL3/4-independent de novo enhancers (345 loci). x-axis: Log2FC (NPC day 5 / ESC); y-axis: -LogP. Induced E-P contacts (FDR < 0.05) over all E-P contacts with MAPS peaks in NPCs: 23/111 (WT) and 11/111 (MLL3/4 dCD).
(E) H3K4me1 peak changes at distal loci upon MLL3/4 dCD, Log2FC (MLL3/4 dCD / WT), for induced and reduced E-P and P-P contacts (p value < 0.01) in ESC and NPC day 5.
(F) Changes of nearby E-P contacts for induced and reduced P-P contacts (p value < 0.01), Log2FC (MLL3/4 dCD / WT), in ESC and NPC day 5.
(G) Gene expression change (Log2FC) of genes with increased H3K4me1 signal at promoters: MLL3/4 dCD / WT (ESC), MLL3/4 dCD / WT (NPC), and WT NPC / ESC.

Figure S4

[Plots not reproduced; recoverable panel titles, axes, and annotations are listed below.]

A. Expression (RPKM) of marker genes in WT and MLL3/4 dCD cells at ESC, Day 2.5, and Day 5.
Pluripotent marker genes: Pou5f1, Sox2, Nanog.
Development related genes: Hoxb9, Sox11.
NPC marker genes: Pax6, Sox3, Olig2.
Neuron marker genes: Tuj1, NeuN, Map2.

B. Gene expression changes (Log2 FC, scale -4 to 4) in WT and MLL3/4 dCD cells for day 2.5 / ESC and day 5 / ESC, alongside MLL3/4 dCD / WT in NPC (day 5). Genes are grouped as down-regulated, stable, or up-regulated (FDR < 0.05, FC > 2, RPKM ≥ 0), with the number of genes annotated for each group.

C. Gene expression change in WT (NPC day 5 / ESC) and MLL3/4 dCD (NPC day 5 / ESC). Axes: Log2 FC versus Log2 CPM. Legend: up-regulated and down-regulated genes (FDR < 0.05, FC > 2), with gene counts annotated.

D. Gene expression change in ESC (MLL3/4 dCD / WT) and NPC day 5 (MLL3/4 dCD / WT). Axes: Log2 FC versus Log2 CPM. Legend: up-regulated and down-regulated genes (FDR < 0.05, FC > 2), with gene counts annotated.

E. Gene expression change, Log2 FC (dCD / WT), and % of down-regulated genes upon MLL3/4 dCD (FDR < 0.05, FC > 1.5), grouped by MAPS peak > 500 kb. Annotated fractions: 102/404 and 67/195; odds ratio = 1.55 (*).
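The odds ratio of 1.55 annotated in panel E can be reproduced from the two fractions shown there (102/404 and 67/195). The following is a minimal sketch, assuming that each fraction is the number of down-regulated genes over the total number of genes in one group and that the reported value is the simple cross-product odds ratio; the function name `odds_ratio` and the labels "group A" and "group B" are hypothetical and do not come from the figure.

```python
def odds_ratio(hits_a, total_a, hits_b, total_b):
    """Cross-product odds ratio of group A relative to group B.

    Each group is given as (hits, total); the non-hits are total - hits.
    """
    miss_a = total_a - hits_a
    miss_b = total_b - hits_b
    if miss_a == 0 or hits_b == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (hits_a * miss_b) / (miss_a * hits_b)


# Assumed mapping: group A = 67 of 195 genes, group B = 102 of 404 genes.
ratio = odds_ratio(67, 195, 102, 404)
print(f"{ratio:.2f}")  # prints 1.55
```

Under these assumptions the odds are 67/128 for one group and 102/302 for the other, and their ratio rounds to the value shown in the panel. The sketch does not reproduce the significance test behind the asterisk, which the figure does not specify.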

Figure S5

[Plots not reproduced; recoverable panel titles, axes, and values are listed below. In panels B-D, values in boxes are odds ratios and the color scale is -Log10(p value). Rows give the distance to the nearest interacting enhancer; columns give the number of enhancers around the TSS (<200 kb).]

A. Distance to the nearest MLL3/4-independent enhancers (kb) for robustly induced, down-regulated, and up-regulated genes (FC > 2, FDR < 0.05); annotated group sizes: 228, 98, and 692 genes. Significance marks: **.

B. Enrichment of stably regulated genes upon MLL3/4 dCD.

ESC          0-2     3-6     7+
80 kb+       0.33    0.77    1.21
40-80 kb     0.72    0.96    1.39
0-40 kb      1.32    1.65    1.76

NPC          0-2     3-6     7+
80 kb+       0.39    0.81    1.14
40-80 kb     0.79    1.16    1.42
0-40 kb      0.89    1.44    1.68

C. Enrichment of down-regulated genes upon MLL3/4 dCD.

ESC          0-2     3-6     7+
80 kb+       1.25    1.12    0.91
40-80 kb     2.70    2.08    0.91
0-40 kb      1.03    0.76    0.70

NPC          0-2     3-6     7+
80 kb+       1.70    1.22    1.05
40-80 kb     2.27    0.84    0.88
0-40 kb      0.56    0.57    0.65

D. Enrichment of up-regulated genes upon MLL3/4 dCD.

ESC          0-2     3-6     7+
80 kb+       3.51    1.34    0.81
40-80 kb     0.98    0.75    0.68
0-40 kb      0.69    0.58    0.55

NPC          0-2     3-6     7+
80 kb+       2.89    1.23    0.80
40-80 kb     0.72    0.89    0.63
0-40 kb      1.45    0.79    0.58

E. H3K27ac ChIP-seq peaks at distal enhancers in ESC and NPC (day 5). Y-axis: Log2 FC of H3K27ac upon MLL3/4 dCD. Groups: enhancers near genes down-regulated or up-regulated in MLL3/4 dCD. Significance marks: ***.