bioRxiv preprint doi: https://doi.org/10.1101/2020.09.14.296558; this version posted September 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

MLL3/MLL4 methyltransferase activities regulate embryonic stem cell differentiation independent of enhancer H3K4me1

Guojia Xie1, Ji-Eun Lee1, Kaitlin McKernan1, Young-Kwon Park1, Younghoon Jang1, Chengyu Liu2, Weiqun Peng3 and Kai Ge1*

1Laboratory of Endocrinology and Receptor Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA
2Transgenic Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA
3Departments of Physics and Anatomy and Cell Biology, The George Washington University, Washington, DC 20052, USA

*To whom correspondence should be addressed. (Email: [email protected])

Highlights

● Simultaneous elimination of MLL3 and MLL4 enzymatic activities leads to early embryonic lethality in mice
● MLL3/4 enzymatic activities are dispensable for ESC differentiation towards the three germ layers
● ESCs lacking MLL3/4 enzymatic activities show cavitation defects during EB differentiation, likely due to impaired VE induction
● MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation in ESC differentiation


Abstract

Enhancers drive cell-type-specific transcription and are marked by H3K4me1. MLL4 (KMT2D), a major H3K4me1 methyltransferase with partial functional redundancy with MLL3 (KMT2C), is critical for enhancer activation and cell-type-specific gene induction during cell differentiation and development. However, the roles of MLL3/4-mediated enhancer H3K4me1 and MLL3/4 enzymatic activities in general in these processes remain unclear. Here, we report that MLL3/4 enzymatic activities are partially redundant during mouse development. Simultaneous elimination of both leads to embryonic lethality around E8.5. Using embryoid body (EB) differentiation as an in vitro model for early embryonic development, we show that Mll3 knockout MLL4 enzyme-dead embryonic stem cells (ESCs) are capable of differentiating towards the three germ layers but display severe cavitation defects, likely due to impaired induction of visceral endoderm. Importantly, MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation and lineage-specific neural differentiation. Together, these results suggest a critical, but enhancer H3K4me1-independent, role of MLL3/4 enzymatic activities in early embryonic development and ESC differentiation.


Introduction

Enhancers are cis-regulatory DNA elements recognized by transcription factors (TFs). They communicate with promoters to regulate cell-type-specific gene expression and cell identity. In-depth epigenomic research has uncovered the chromatin signatures of enhancers. Primed enhancers, which are marked by H3K4me1, become activated with the addition of H3K27ac [1,2]. The activation of enhancers during cell fate transition is enabled by several epigenomic regulators sequentially recruited by lineage-determining TFs, including the histone mono-methyltransferases MLL3 (KMT2C) and MLL4 (KMT2D) followed by the histone acetyltransferases CBP and p300, which catalyze the placement of H3K4me1 and H3K27ac, respectively [3-6]. However, the exact role of these epigenomic regulators themselves versus the histone modifications they catalyze, particularly MLL3/4 versus MLL3/4-catalyzed H3K4me1, in enhancer activation has remained elusive.

MLL3 and MLL4 (MLL3/4) are members of the Set1-like family of mammalian H3K4 methyltransferases that are responsible for catalyzing H3K4me1 [4]. They are the largest known nuclear proteins (4,903 and 5,588 amino acids in mice, respectively), and associate with the WRAD (WDR5, RbBP5, ASH2L, DPY30) subcomplex as well as NCOA6, UTX, PA1, and PTIP in a large multi-subunit complex [7,8]. Enzymatic activities of MLL3/4 are conferred by the C-terminal SET domain, which is also required for maintaining their protein stability [9,10]. Consistent with their critical roles in enhancer activation, MLL3/4 are broadly required for normal development and cell differentiation. Mll4 knockout (KO) in mice leads to lethality around embryonic day (E) 9.5. Deletion of Mll3 has a milder effect, as Mll3 KO mice die at birth, suggesting that MLL4 plays a dominant role in embryonic development [4,7]. Previous work in mice has also shown that MLL3/4 are required for the development of tissues such as adipose, muscle, heart, B cells, T cells and mammary gland [4,11-14]. In humans, MLL3/4 are frequently mutated in developmental diseases and cancers [15]. However, the functional role of MLL3/4 enzymatic activities in development and cell differentiation is poorly understood.

During mouse early embryonic development from implantation to gastrulation, the blastocyst undergoes an orchestrated series of lineage specification events and develops into the three germ layers (ectoderm, mesoderm and definitive endoderm), which contain progenitors of all fetal tissues [16]. Visceral endoderm (VE), one major type of extraembryonic endoderm, is an epithelial layer of cells that surrounds the post-implantation embryo. Anterior VE migration induces early embryonic asymmetry by E6.5, when the primitive streak (PS), the precursor of mesoderm and definitive endoderm, forms on the posterior side of the embryo [17]. Proper VE development is required for correct patterning of the PS and early embryonic development [18,19]. Embryonic stem cells (ESCs) are self-renewing pluripotent cells that are derived from the inner cell mass of blastocysts [20]. ESCs can differentiate and organize into three-dimensional cavitated structures called embryoid bodies (EBs). EB differentiation of ESCs is a valuable in vitro model recapitulating many aspects of early embryonic development, during which the three germ layers form [21]. Upon treatment with defined factors, ESCs can also differentiate homogeneously into specific lineages such as neurons [22].

We reported previously that MLL3/4 are required for enhancer activation and ESC differentiation, but are largely dispensable for ESC identity maintenance [5]. A recent study showed that MLL3/4-catalyzed H3K4me1 is mostly dispensable for maintaining active enhancers in undifferentiated ESCs [9]. These findings motivated us to investigate the roles of MLL3/4 enzymatic activities and MLL3/4-mediated enhancer H3K4me1 during development and cell differentiation. Using MLL3/4 enzyme-dead single and double knockin mice generated by CRISPR/Cas9, we found that MLL3/4 enzymatic activities are partially redundant and are essential for early embryonic development. By knocking in the MLL4 enzyme-dead point mutation in Mll3 KO ESCs, we observed that eliminating MLL3/4 enzymatic activities has little effect on ESC differentiation towards all three germ layers but results in defective cavitation and cardiomyogenesis during EB differentiation. The cavitation defect is likely the consequence of impaired VE induction. Finally, using EB differentiation and neural differentiation as model systems, we demonstrated that MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during ESC differentiation.


Results

MLL3/4 enzymatic activities are essential for early embryonic development in mice

MLL3 and MLL4 are major enhancer H3K4me1 methyltransferases [4]. MLL3 Tyr4792 (Y4792) and MLL4 Tyr5477 (Y5477), located in the catalytic SET domain, are conserved throughout different histone methyltransferases (Figure 1a; Figure 1-S1) and are essential for enzymatic activity [9]. To investigate the roles of MLL3/4 enzymatic activities in mouse development, we introduced the enzyme-dead point mutations Y4792A and Y5477A into the Mll3 and Mll4 gene loci, respectively, by injecting CRISPR/Cas9 components into wild-type zygotes (Figure 1b). After germline transmission, mice heterozygous for Mll3 KI and Mll4 KI (Mll3KI/+ and Mll4KI/+) survived without any discernible phenotypes and were inbred to obtain homozygotes. While Mll3-/- mice display perinatal lethality [4], Mll3KI/KI mice survived to adulthood in reduced numbers (Figure 1c). Adult Mll3KI/KI mice were fertile and showed body weights similar to those of wild-type or Mll3KI/+ littermates (data not shown). Mll4-/- mice die around embryonic day (E) 9.5 [4,7]. In contrast, Mll4KI/KI mice died around birth with no obvious morphological abnormalities (Figure 1d). These results suggest that MLL3 and MLL4 proteins play more crucial roles than their enzymatic activities in mouse development.

Since MLL3 and MLL4 are partially redundant in vitro and in adipose tissue development [4,10], we crossed Mll3 KI and Mll4 KI mice and obtained Mll3/Mll4 double KI strains. Both Mll3KI/+;Mll4KI/+ and Mll3KI/KI;Mll4KI/+ mice survived to adulthood and were fertile. We set up self-crossing of Mll3KI/+;Mll4KI/+ mice but failed to recover homozygous Mll3/Mll4 double KI (Mll3KI/KI;Mll4KI/KI) embryos at E14.5 (Figure 1e). To determine at which stage Mll3KI/KI;Mll4KI/KI embryos die, we set up self-crossing of Mll3KI/KI;Mll4KI/+ mice. Mll3KI/KI;Mll4KI/KI embryos were detected at E8.0 and E8.5 but not at any later stage (Figure 1f). All Mll3KI/KI;Mll4+/+ and the majority of Mll3KI/KI;Mll4KI/+ embryos displayed normal developmental morphology and exhibited characteristics of grooved neural plate and unturned early-somite-stage embryo at E8.0 and E8.5, respectively. However, Mll3KI/KI;Mll4KI/KI embryos displayed delayed growth at E8.0; they morphologically resembled normal embryos around E5.5-6.5 with an egg cylinder-like structure. At E8.5, all Mll3KI/KI;Mll4KI/KI embryos were much smaller and exhibited severe morphological defects (Figure 1g). Together, these data indicate that MLL3 and MLL4 enzymatic activities are partially redundant during embryonic development, and that simultaneous elimination of both results in early embryonic lethality around E8.5.
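The logic of the two crosses described above can be made concrete with a short Mendelian calculation. The sketch below is illustrative only: it assumes the Mll3 and Mll4 loci segregate independently and that all genotypes are equally viable up to the stage examined, and the litter size used in the last step is a hypothetical number, not one taken from the study.

```python
from fractions import Fraction
from itertools import product


def offspring_distribution(parent_a, parent_b):
    """Expected genotype frequencies at one locus.

    Each parent is a 2-tuple of alleles, e.g. ("KI", "+").
    """
    dist = {}
    for a, b in product(parent_a, parent_b):
        genotype = "/".join(sorted((a, b)))
        dist[genotype] = dist.get(genotype, Fraction(0)) + Fraction(1, 4)
    return dist


het = ("KI", "+")
hom = ("KI", "KI")

# Cross 1: Mll3 KI/+ ; Mll4 KI/+ self-cross (independent loci assumed).
double_hom_cross1 = (
    offspring_distribution(het, het)["KI/KI"]
    * offspring_distribution(het, het)["KI/KI"]
)

# Cross 2: Mll3 KI/KI ; Mll4 KI/+ self-cross, which enriches the
# double-homozygous class and so needs fewer embryos per time point.
double_hom_cross2 = (
    offspring_distribution(hom, hom)["KI/KI"]
    * offspring_distribution(het, het)["KI/KI"]
)


def prob_none_recovered(p, n):
    """Chance of seeing zero embryos of a genotype with frequency p among n."""
    return float((1 - p) ** n)


print(double_hom_cross1, double_hom_cross2)  # 1/16 1/4
# Hypothetical example: 40 embryos genotyped from cross 1.
print(prob_none_recovered(double_hom_cross1, 40))
```

Under these assumptions, the second cross raises the expected frequency of double-homozygous embryos from 1/16 to 1/4, which is why it is the more practical design for staging the time of death.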

ESCs lacking MLL3/4 enzymatic activities maintain cell identity

MLL4 is required for early embryonic development in mice while MLL3 is not [4]. To investigate the function of MLL3/4 enzymatic activities in vitro, we took advantage of our previously derived Mll3-/-;Mll4f/f (f/f) ESCs to eliminate the potential redundancy between MLL3 and MLL4 [5]. Using CRISPR/Cas9 gene editing, we knocked in the enzyme-dead point mutation Y5477A on both alleles of Mll4 in f/f cells and generated independent Mll3-/-;Mll4KI/KI (KI, knockin) ES cell lines (Figure 2-S1a). We also re-generated Mll3/Mll4 double knockout (KO) ESCs by transiently transfecting a Cre-expressing plasmid into f/f ESCs [5]. The genotypes of all cell lines were confirmed by genomic PCR (Figure 2-S1b-c), and the Mll4 Y5477A point mutation in KI cell lines was confirmed by sequencing (Figure 2-S1d). Further validation of KI cells found no evidence of potential exonic off-target mutations (Figure 2-S1e-h). MLL4 was expressed at comparable levels in f/f and KI cells but absent in KO cells, and the MLL3/4-associated protein UTX followed a similar pattern (Figure 2a). Other chromatin regulators including CBP, BRG1 and ARID1A did not exhibit observable changes among the panel of cells (Figure 2a). As expected, immunoblotting of histone extracts revealed similarly reduced levels of global H3K4me1 and H3K4me2, but not H3K4me3, in KI and KO ESCs (Figure 2b).

Next, we characterized the morphology and cell growth rate of KI ESCs. When cultured in 2i+LIF medium, neither KO nor KI ESCs formed the regular dome-like colonies seen with f/f ESCs. Specifically, KI colonies were flat, and cells grew homogeneously in a monolayer (Figure 2c). In agreement with the previous report [5], f/f and KO ESCs displayed similar population doubling times. Surprisingly, KI ESCs grew faster and displayed a ~20% decrease in population doubling time (Figure 2d). By RT-qPCR analysis of ESC identity genes, we observed that mRNA levels of Nanog, Oct4, Rex1 and Esrrb were comparable among all cell lines (Figure 2e). Only the Sox2 level decreased about two-fold in both KI and KO ESCs, as reported previously [23]. In addition, both f/f and KI ESCs presented positive alkaline phosphatase (AP) staining (Figure 2c). Together, these data indicate that while MLL3/4 enzymatic activities are dispensable for maintaining ESC identity, KI ESCs show monolayer growth and increased proliferation.
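To give a sense of what a ~20% shorter population doubling time means in practice, the following sketch computes doubling time from two cell counts under a simple exponential-growth assumption. The cell counts and the 36-hour interval are hypothetical values chosen for illustration; they are not measurements from this study, and the text does not specify how doubling time was calculated.

```python
import math


def doubling_time(n_start, n_end, hours):
    """Population doubling time in hours, assuming exponential growth."""
    if n_start <= 0 or n_end <= n_start or hours <= 0:
        raise ValueError("need n_end > n_start > 0 and hours > 0")
    return hours * math.log(2) / math.log(n_end / n_start)


def fold_expansion(doubling_time_h, hours):
    """Fold increase in cell number after `hours` at a given doubling time."""
    return 2 ** (hours / doubling_time_h)


# Hypothetical control culture: 1e5 -> 8e5 cells in 36 h (three doublings).
td_control = doubling_time(1e5, 8e5, 36)

# A 20% decrease in doubling time, as described for KI ESCs.
td_faster = 0.8 * td_control

print(round(td_control, 2), round(td_faster, 2))
print(round(fold_expansion(td_control, 36), 2),
      round(fold_expansion(td_faster, 36), 2))
```

Because growth is exponential, a modest change in doubling time compounds: in this hypothetical example the faster culture expands noticeably more over the same interval.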

ESCs lacking MLL3/4 enzymatic activities are capable of differentiating towards the three germ layers, but show defects in EB cavitation and cardiomyogenesis

We next investigated the roles of MLL3/4 enzymatic activities in embryoid body (EB) differentiation, which is an in vitro model that mimics early embryonic development [21]. ESCs were aggregated in hanging drops in the absence of LIF to form EBs, and the outgrowths were kept in culture for spontaneous differentiation (Figure 3-S1a). During EB differentiation, UTX was expressed at similar levels in f/f and KI cells (Figure 3-S1b), suggesting that the loss of enzymatic activity did not affect the MLL4 level, which dictates the stability of UTX protein [4,24]. As expected, global H3K4me1 and, to a lesser extent, H3K4me2 decreased in KI cells during EB differentiation (Figure 3-S1c).

Consistent with the previous report [5], f/f ESCs formed large cystic EBs during differentiation whereas KO cells displayed severe defects (Figure 3a; Figure 3-S1d). Although EBs derived from KI ESCs were similar in size to f/f EBs throughout differentiation (Figure 3-S1d), none of the KI EBs displayed cystic cavities at day 7 (D7) or day 10 (D10), when these structures were observed in the majority of f/f EBs (Figure 3a). By RT-qPCR analysis, we confirmed that during differentiation of f/f cells, expression of the ESC identity genes Nanog and Oct4 decreased dramatically, whereas germ layer markers for endoderm (Gata4 and Afp), mesoderm (T and Kdr) and ectoderm (Nes and Cdh2) increased markedly (Figure 3b). In contrast to the severe defects observed in KO cells, KI cells did not exhibit appreciable impairment in ESC identity gene decommissioning or germ layer marker induction (Figure 3b). Consistently, NANOG protein decreased to undetectable levels during EB differentiation in both f/f and KI cells (Figure 3-S1b).

In addition to the cavitation defects during EB differentiation, KI ESCs also exhibited impaired capacity for cardiomyogenesis. After 8 days of differentiation, less than 20% of KI EBs contained contracting clusters, while 100% of f/f EBs displayed spontaneous contraction (Figure 3-S1e). Although the percentage of beating EBs derived from KI ESCs gradually rose to nearly 100% after day 12, KI EBs displayed obvious morphological and functional differences compared to f/f EBs, evidenced by decreased beating frequency and smaller contracting clusters (Figure 3-S1f-g; Supplementary Videos). Correlated with the phenotype, RT-qPCR analysis revealed that the induction of the cardiac progenitor marker Tbx5 and the cardiac myofilament genes Myh6, Myl2 and Ttn was profoundly compromised during differentiation of KI ESCs (Figure 3-S1h).

Next, we performed teratoma assays to validate the differentiation capacity of KI ESCs in vivo. Both f/f and KI ESCs developed into typical teratomas in which all three germ layers formed (Figure 3c). However, KI teratomas failed to form the striated muscle-like tissues that were observed in f/f teratomas (Figure 3c). Together, these data indicate that while KI ESCs are generally capable of differentiating towards the three germ layers, they show specific defects in cavitation and cardiomyogenesis.

Loss of MLL3/4 enzymatic activities impairs visceral endoderm induction in EB differentiation

Sequential expression of marker genes during EB differentiation reflects successive developmental stages [21,25,26]. With an extraembryonic endoderm layer established on the surface, D3-D4 EBs transiently express primitive streak (PS) markers and are equivalent to the early gastrulation stage around E6.5. After D4, specialized cell lineages of the three germ layers develop (Figure 4a). Cavitation starts around D4-D5 and gives rise to large and cystic structures similar to the proamniotic cavity in embryos [21]. To further investigate the roles of MLL3/4 enzymatic activities during differentiation and the mechanism underlying the cavitation defects of KI EBs, we performed RNA-seq at ESC (D0) and EB (D4, D10) stages. Principal component analysis (PCA) revealed that KI cells, but not KO cells, followed differentiation trajectories similar to those of f/f cells (Figure 4b).

Unsupervised K-means clustering of differentially expressed genes identified 12 clusters. Each cluster was associated with distinct GO terms (Figure 4c-d; Figure 4-S1a). Among the 12 clusters, only 4 (I, III, IV, IX) displayed evident discrepancies between f/f and KI cells at D4 or D10. Cluster I genes were markedly upregulated at D10 in f/f EBs but showed reduced expression levels in KI EBs. These genes were enriched for muscle and cardiovascular development terms, consistent with the cardiomyogenesis defects of KI EBs (Figure 4-S1a). Cluster III and IX genes were induced from D0 to D4 in f/f EBs, with impaired induction in KI EBs. Interestingly, these genes were enriched for early embryonic development terms such as gastrulation and anterior/posterior pattern specification (Figure 4d), which depend on the development of visceral endoderm (VE) and PS [17]. Strikingly, gene set enrichment analysis (GSEA) using cell-type markers defined previously by single-cell RNA-seq in E6.5 embryos [27] revealed that VE markers were significantly enriched among genes downregulated in KI D4 EBs, while markers of PS or nascent mesoderm (NM), an early derivative of PS, showed no enrichment (Figure 4e; Figure 4-S1b).

We also compared the expression of reported VE and PS genes between f/f and KI D4 EBs [17,28]. All 16 VE genes were expressed at lower levels in KI EBs, with 11 of them showing over two-fold decreases. In contrast, no PS genes changed more than two-fold (Figure 4f; Figure 4-S1c). BMP, Wnt and Nodal signaling pathways play key roles in early embryonic development [16]. Among components of these pathways, Bmp2 and Bmp4, which encode two ligands critical for VE differentiation [29], decreased more than two-fold in KI EBs at D4 (Figure 4-S1d). Together, these data indicate that MLL3/4 enzymatic activities are required for induction of VE, which is known to be required for cavitation during EB differentiation [30].

MLL3/4 enzymatic activities are required for induction of visceral endoderm TFs GATA4/6 in the early phase of EB differentiation

To understand how MLL3/4 methyltransferase activities regulate VE induction, we conducted ChIP-seq of MLL4, H3K4me1 and H3K27ac in ESCs and D4 EBs. We found that loss of enzymatic activity resulted in redistribution of MLL4 genomic occupancy in KI EBs without discernible changes in total binding intensity (Figure 5-S1). We identified 10,767 de novo MLL4+ active enhancers (AEs) that were present in D4 f/f or KI EBs but not in ESCs, and divided them into three groups based on the comparison of MLL4 binding intensities between f/f and KI EBs. Among the 10,767 de novo MLL4+ AEs, 14% (1,504; Group I), 34% (3,610; Group II) and 52% (5,653; Group III) showed increased, unchanged and decreased levels of MLL4 occupancy in KI EBs compared to f/f EBs, respectively (Figure 5a). Among the three groups, only Group III was overrepresented in VE marker-associated AEs (Figure 5b). Motifs of the ESC master TFs NANOG and OCT4 were enriched in Group I; motifs of the T-box factors EOMES and MGA and the Wnt pathway effector TCF4 were enriched in Group II; and motifs of GATA family TFs were enriched in Group III (Figure 5c). Consistent with changes in MLL4 binding, TFs linked to Group I, II or III exhibited increased, unchanged or decreased expression levels in KI EBs compared to f/f EBs, respectively (Figure 5d), supporting the notion that differentially expressed TFs contribute to the redistribution of MLL4 genomic binding.
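The three-way grouping of enhancers by relative MLL4 occupancy can be sketched as a simple fold-change classifier. This is a minimal illustration, not the authors' pipeline: the two-fold cutoff, the pseudocount, and the function name `classify_enhancer` are assumptions introduced here, since the text above does not state the thresholds used. The group counts are the ones reported in the paragraph.

```python
import math


def classify_enhancer(signal_ff, signal_ki, fold=2.0, pseudocount=1.0):
    """Assign an enhancer to a group by the KI / f/f MLL4 signal ratio.

    `fold` and `pseudocount` are illustrative assumptions.
    """
    log2fc = math.log2((signal_ki + pseudocount) / (signal_ff + pseudocount))
    cutoff = math.log2(fold)
    if log2fc >= cutoff:
        return "Group I"    # increased MLL4 occupancy in KI EBs
    if log2fc <= -cutoff:
        return "Group III"  # decreased MLL4 occupancy in KI EBs
    return "Group II"       # unchanged


# Hypothetical normalized signals (f/f, KI) for three enhancers.
examples = [(10.0, 50.0), (40.0, 42.0), (80.0, 10.0)]
labels = [classify_enhancer(ff, ki) for ff, ki in examples]
print(labels)  # ['Group I', 'Group II', 'Group III']

# Group sizes as reported in the text.
reported = {"Group I": 1504, "Group II": 3610, "Group III": 5653}
total = sum(reported.values())
shares = {g: 100 * n / total for g, n in reported.items()}
print(total, {g: round(s, 1) for g, s in shares.items()})
```

The reported counts sum to the stated 10,767 enhancers; the percentages quoted in the text are rounded versions of these shares.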

GATA4 and GATA6 are lineage-determining TFs essential for VE development in mouse embryos [31,32]. Motifs of GATA family TFs including GATA4/6 were highly enriched in de novo MLL4+ AEs associated with VE markers but not PS or NM markers (Figure 5-S2). In addition, GATA6 binding sites in extraembryonic endoderm (GSE69323) [33] colocalized with Group III de novo MLL4+ AEs around representative VE genes, whose expression levels decreased in KI D4 EBs (Figure 5e). These observations indicate that MLL3/4 enzymatic activities are required for induction of the VE transcription factors GATA4/6 in the early phase of EB differentiation.

MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation

We reported previously that MLL3/4 proteins are required for de novo enhancer activation in D4 EBs [5]. To test if this requirement is mediated by MLL3/4-catalyzed H3K4me1, we examined distributions of H3K4me1, H3K27ac and chromatin accessibility on Group II de novo MLL4+ AEs, where f/f and KI EBs displayed similar levels of MLL4 binding. We observed dramatic induction of H3K4me1, H3K27ac and chromatin accessibility from ESC to D4 EB in f/f cells. In KI EBs, despite no gain of H3K4me1, Group II AEs showed induction of H3K27ac and chromatin accessibility comparable to that in f/f EBs (Figure 6a-c; Figure 6-S1). Using ChIP-seq data from KO cells [5], we further confirmed the requirement of MLL3/4 proteins for the activation of Group II de novo MLL4+ AEs. Representative genes near Group II AEs were induced comparably in f/f and KI but not KO EBs (Figure 6-S2). On Group I and III AEs, induction patterns of H3K27ac and chromatin accessibility in f/f and KI EBs were consistent with those of MLL4 but not H3K4me1 (Figure 6a-b). Together, these data indicate that while MLL3/4 proteins are required for enhancer activation during early EB differentiation, MLL3/4-catalyzed H3K4me1 is dispensable.

MLL3/4-catalyzed H3K4me1 is dispensable for activation of super-enhancers during neural differentiation

We also investigated the role of MLL3/4-catalyzed H3K4me1 in enhancer activation during lineage-specific differentiation. Following the retinoic acid (RA)-induced neural differentiation protocol [22] (Figure 7-S1a), we observed that ESCs differentiated homogeneously into neural progenitor cells (NPCs) around day 8 (D8), with characteristic spindle-like shapes and neural rosettes. After day 10, NPCs gradually differentiated into mature glutamatergic neurons until day 16 (D16), when a dense neuritic network formed. Interestingly, no morphological differences could be observed between f/f and KI cells during RA-induced neural differentiation (Figure 7a; Figure 7-S1b). RNA-seq analysis in NPCs and neurons revealed that, among induced genes, only those expressed at comparable levels in f/f and KI cells were functionally associated with nervous system development (Figure 7b-c; Figure 7-S1c-d). The NPC markers Nes, Sox3, Pax6 and Pax7 and the neuron markers NeuN, Tau, Ntrk2 and App were induced comparably in f/f and KI cells (Figure 7-S1e-f). These results indicate that MLL3/4 enzymatic activities are dispensable for cell-identity gene induction during neural differentiation.

Super-enhancers (SEs) are clusters of AEs that are bound by lineage-specific master TFs and drive high-level expression of cell-identity genes [34]. SOX3 is a master TF in NPCs [35]. We identified 254 SEs in f/f NPCs using stitched SOX3 peaks [36] ranked by H3K27ac signal intensity. Of these, 97.6% (248) were MLL4+ and associated with NPC identity genes such as Nes, Notch1 and Ptn (Figure 7d). GO analysis of SE-associated genes identified nervous system development and neuron differentiation as top terms (Figure 7-S1g). While induction of H3K4me1 signals on NPC SEs was severely impaired in KI NPCs, signal intensities of MLL4 and H3K27ac did not show significant differences between f/f and KI NPCs. Expression levels of SE-associated genes were also induced comparably (Figure 7e-g; Figure 7-S1h). Together, these data indicate that MLL3/4-catalyzed H3K4me1 is dispensable for SE activation during neural differentiation.
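The idea of "stitching" nearby TF peaks and ranking the merged regions by H3K27ac signal can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of reference [36]: the 12.5 kb stitching distance is a commonly used default in super-enhancer callers and is assumed here, the peak coordinates and signals are hypothetical, and the cutoff that separates super-enhancers from typical enhancers is deliberately left out.

```python
def stitch_peaks(peaks, max_gap=12_500):
    """Merge peaks on the same chromosome that lie within `max_gap` bp.

    `peaks` is a list of (chrom, start, end, h3k27ac_signal) tuples.
    The signal of a stitched region is the sum of its constituent peaks.
    """
    stitched = []
    for chrom, start, end, signal in sorted(peaks):
        if (stitched and stitched[-1][0] == chrom
                and start - stitched[-1][2] <= max_gap):
            prev = stitched[-1]
            stitched[-1] = (chrom, prev[1], max(prev[2], end), prev[3] + signal)
        else:
            stitched.append((chrom, start, end, signal))
    return stitched


def rank_by_signal(regions):
    """Order stitched regions from highest to lowest H3K27ac signal."""
    return sorted(regions, key=lambda region: region[3], reverse=True)


# Hypothetical SOX3 peaks with H3K27ac signal values.
peaks = [
    ("chr1", 1_000, 2_000, 5.0),
    ("chr1", 9_000, 9_500, 7.0),    # 7 kb from the first peak: stitched
    ("chr1", 40_000, 41_000, 3.0),  # too far away: separate region
    ("chr2", 500, 900, 4.0),
]
ranked = rank_by_signal(stitch_peaks(peaks))
print(ranked[0])  # ('chr1', 1000, 9500, 12.0)

# Share of SEs that were MLL4+, from the counts given in the text.
print(round(100 * 248 / 254, 1))  # 97.6
```

In a full analysis, the top of this ranked list, above a signal cutoff, would be designated as super-enhancers.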


Discussion

Using the CRISPR/Cas9 system to knock enzyme-dead point mutations into MLL3/4 in mice or ESCs, we have examined the roles of MLL3/4 enzymatic activities in embryonic development and ESC differentiation, as well as of MLL3/4-catalyzed H3K4me1 in enhancer activation. We show that MLL3/4 carry both enzymatic activity-dependent and -independent functions during mouse development, and that simultaneous elimination of MLL3/4 enzymatic activities leads to embryonic lethality around E8.5. In Mll3 KO ESCs, inactivating MLL4 enzymatic activity has little effect on the maintenance of cell identity but leads to faster proliferation and monolayer cell growth. During EB differentiation, ESCs lacking MLL3/4 enzymatic activities maintain the capacity to differentiate towards the three germ layers but show defective EB cavitation, which is due, at least in part, to the impaired induction of visceral endoderm. Finally, our data reveal that MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation and super-enhancer activation in lineage-specific neural differentiation (Figure 7h).

In mice, all six Set1-like H3K4 methyltransferases are essential for normal development [37]. MLL3 is required for lung maturation and Mll3-/- embryos exhibit perinatal lethality. Mll4-/- embryos die around E9.5 and display a defect in anterior VE migration, which precedes gastrulation [4,7]. The phenotypes of Mll3KI/KI and Mll4KI/KI mice are much milder. Mll3KI/KI mice survive to adulthood at a reduced frequency and Mll4KI/KI mice die around birth without obvious phenotypes, indicating that MLL3/4 proteins play a more dominant role than their enzymatic activities during mouse development (Figure 1h). Given the partial redundancy between MLL3 and MLL4 during adipogenesis [4,10], we generated homozygous Mll3/Mll4 double KI embryos. Strikingly, Mll3KI/KI;Mll4KI/KI embryos display severe developmental retardation at E8.0 and die around E8.5, indicating that the enzymatic activities of MLL3 and MLL4 are partially redundant and together indispensable for early embryonic development. This is the first in vivo evidence of functional redundancy between enzymatic activities of Set1-like methyltransferases. Interestingly, a previous study reported that Drosophila embryos expressing catalytically deficient Trr, the only homolog of MLL3 and MLL4 in Drosophila, produce fertile adults with only minor abnormalities [38]. Such a distinct dependence of embryonic development in mice and in Drosophila suggests a more critical role of MLL3/4 enzymatic activities in mammals.

297 The cavitation of EBs is a complex process regulated by VE as well as factors involved in autophagy

298 and apoptosis30,39-42. During EB differentiation of ESCs lacking MLL3/4 enzymatic activities, VE induction

12

bioRxiv preprint doi: https://doi.org/10.1101/2020.09.14.296558; this version posted September 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

299 is impaired while autophagy or apoptosis factors known to regulate cavitation show normal levels of

300 expression (Figure 4-S1e), which indicate that the EB cavitation defects after inactivating MLL3/4 enzymatic

301 activities is attributed, at least in part, to impaired induction of VE. During EB differentiation, the damaged

302 VE induction likely contributes to the defective cardiomyogenesis of MLL3/4 enzyme-dead ESCs as well,

303 because VE-secreted factors promote cardiomyogenesis of ESCs43. In addition, the impairment of VE

304 induction may also lead to the early lethality of Mll3KI/KI;Mll4KI/KI embryos. This possibility is supported by at

305 least two observations. First, E8.0 Mll3KI/KI;Mll4KI/KI embryos morphologically resemble normal embryos at

306 E5.5-6.5, indicating that development of Mll3KI/KI;Mll4KI/KI embryos is blocked at the stage when anterior VE

307 migration takes place to establish early embryonic asymmetry¹⁷. Second, MLL3/4 enzyme-dead ESCs are

308 capable of decommissioning ESC identity genes and differentiating towards all three germ layers during

309 EB differentiation and teratoma formation, suggesting that eliminating MLL3/4 enzymatic activities may

310 affect primarily the development of extraembryonic tissues, such as VE. Interestingly, the defect in anterior

311 VE migration has been reported in Mll4 null embryos⁷.

312 In undifferentiated ESCs, MLL3/4 proteins but not MLL3/4-catalyzed H3K4me1 are essential for

313 maintaining AEs and target gene transcription⁹. Using Mll3/Mll4 double KO ESCs, our previous study

314 showed that MLL3/4-regulated p300 recruitment and enhancer activation during EB differentiation even

315 occur on H3K4me1-primed enhancers⁵. In this study, we generated Mll3-/-;Mll4KI/KI ESCs and, for the first

316 time, directly addressed the question of whether MLL3/4-mediated enhancer H3K4me1 is required for

317 enhancer function during cell fate transition. Using two model systems, we found that MLL3/4-catalyzed

318 H3K4me1 is dispensable either for enhancer activation during EB differentiation or for SE activation during

319 lineage-specific neural differentiation. Consistently, MLL3/4 regulate developmental gene induction largely

320 independent of enhancer H3K4me1. Interestingly, during EB differentiation, the enzyme-dead point mutation

321 results in significant redistribution of de novo MLL4+ AEs without discernible changes in total MLL4 binding

322 intensity, which partially explains the transcriptional alteration and phenotypic changes after eliminating

323 MLL3/4 enzymatic activities. These observations suggest that MLL3/4 enzymatic activities regulate ESC

324 differentiation through mechanisms other than MLL3/4-catalyzed H3K4me1 on enhancers. One possibility

325 is that MLL3/4 regulate ESC differentiation through non-histone substrates, such as lineage-determining or

326 signal-activated TFs. Indeed, a recent publication reported that p53 is a potential substrate of Set1-like

327 methyltransferases including MLL3/4⁴⁴. MLL3/4-catalyzed H3K4me1 has been reported to be required for

328 enhancer-promoter interaction and gene expression at the Sox2 locus in ESCs, due to H3K4me1-

329 dependent recruitment of the Cohesin complex to enhancers²³. While our results cast doubt on the role of

330 H3K4me1-mediated enhancer activation during ESC differentiation in general, it could well be that

331 H3K4me1-dependent recruitment functions at specific loci such as Sox2.

332 Materials and Methods

333 Generation of Mll3/Mll4 knockin mice

334 The MLL3 Y4792A and MLL4 Y5477A knockin mouse lines were generated using the CRISPR/Cas9 system

335 modified from Wang et al.⁴⁵. Briefly, sgRNA sequences cutting near genomic loci of MLL3 Y4792 (5’-

336 ACTATGGTCATCGAGTACAT-3’) and MLL4 Y5477 (5’-ACGATGGTCATCGAGTACAT-3’) were

337 synthesized by Thermo Fisher’s custom in vitro transcription service. The Tyrosine to Alanine point

338 mutations were introduced using single-strand Ultramer DNA Oligonucleotide (IDT). Resulting donor

339 oligonucleotides are as follows: MLL3 Y4792A, 5’-

340 TTCCTTCCACTTAGGGACTGGGCCTGTATGCTGCTAGAGACATTGAAAAACACACTATGGTCATCGA

341 GGCTATTGGAACAATTATTCGAAATGAGGTTGCAAACCGGAAGGAGAAGCTTTATGAGTCTCAGGTA

342 CTGTAT-3’; MLL4Y5477A, 5’-

343 CGCGTATCCAGGGCCTCGGCCTCTATGCAGCCAAGGACCTGGAGAAGCACACGATGGTCATCGAG

344 GCTATCGGCACCATCATTCGCAATGAGGTGGCCAATCGGCGGGAGAAAATCTATGAGGAGCAGGTA

345 CTGTGGG-3’, in which the inserted mutations are highlighted in italic and bold. To generate each knockin

346 mouse line, the sgRNA (20 ng/μL) and its corresponding donor oligonucleotide (100 ng/μL) were

347 co-microinjected with Cas9 nickase mRNA (50 ng/μL, Trilink Biotechnologies) into the cytoplasm of zygotes

348 collected from B6D2F1/J mice (JAX #100006). Injected embryos were cultured in M16 medium

349 (MilliporeSigma) overnight in a 37°C incubator with 6% CO2. The next morning, 2-cell-stage embryos were

350 implanted into the oviducts of pseudopregnant surrogate mothers. Offspring were genotyped by PCR and

351 confirmed by Sanger Sequencing. All mouse experiments were performed in accordance with the NIH

352 Guide for the Care and Use of Laboratory Animals and approved by the Animal Care and Use Committee

353 of NIDDK, NIH.
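
As a quick sanity check on the design above, the following sketch confirms that each donor oligonucleotide carries a GCT (Ala) codon at the position where the sgRNA target carries TAC (Tyr). It is not part of the authors' pipeline. The function name `codon_swap`, the shortened donor excerpts, and the codon offset of 15 (reading frame ATG GTC ATC GAG TAC, inferred from the Tyr-to-Ala design) are assumptions made for illustration.

```python
# Illustrative check only; sequences are copied from the text above.
SGRNA = {
    "MLL3 Y4792": "ACTATGGTCATCGAGTACAT",
    "MLL4 Y5477": "ACGATGGTCATCGAGTACAT",
}
# Excerpts of the donor oligonucleotides spanning the edited codon.
DONOR_EXCERPT = {
    "MLL3 Y4792": "AAAAACACACTATGGTCATCGAGGCTATTGGAACAATTATTCG",
    "MLL4 Y5477": "GAGAAGCACACGATGGTCATCGAGGCTATCGGCACCATCATTCG",
}
CODON_TO_AA = {"TAC": "Tyr", "GCT": "Ala"}  # only the two codons needed here


def codon_swap(sgrna: str, donor: str, offset: int = 15) -> tuple[str, str]:
    """Return (wild-type codon, donor codon) at `offset` within the sgRNA target."""
    anchor = sgrna[:offset]  # sequence shared by the genome and the donor
    pos = donor.find(anchor)
    if pos < 0:
        raise ValueError("5' portion of the sgRNA target not found in donor")
    wild_type = sgrna[offset:offset + 3]
    edited = donor[pos + offset:pos + offset + 3]
    return wild_type, edited


for name in SGRNA:
    wt, mut = codon_swap(SGRNA[name], DONOR_EXCERPT[name])
    print(name, CODON_TO_AA[wt], "->", CODON_TO_AA[mut])
```

The same check applies unchanged to the shorter MLL4 donor used for ES cell editing, since it spans the same codon.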

354

355 Generation of Mll4 knockin ES cells

356 The MLL4 Y5477A knockin ES cell lines were generated using the CRISPR/Cas9 system modified from

357 Ran et al.⁴⁶. Briefly, an sgRNA sequence cutting near MLL4 Y5477 (5’-ACGATGGTCATCGAGTACAT-3’)

358 was cloned into lentiCRISPR v2 plasmid (Addgene #52961). The Tyrosine to Alanine point mutation was

359 introduced using single-strand Ultramer DNA Oligonucleotide (IDT). The resulting donor oligonucleotide is

360 as follows: 5’-

361 TCGGCCTCTATGCAGCCAAGGACCTGGAGAAGCACACGATGGTCATCGAGGCTATCGGCACCATCA

362 TTCGCAATGAGGTGGCCAATCGGCGGGAGAAAATCTA-3’, in which the inserted mutations are

363 highlighted in italic and bold. To generate knockin ES cells, the plasmid containing sgRNA (10 μg) and the

364 donor oligonucleotide (400 pmol) were co-transfected into 1 × 10⁶ Mll3-/-;Mll4f/f mouse ES cells using

365 Lipofectamine 3000 (Thermo Fisher). Twenty-four hours later, cells were selected with 1.5 μg/mL puromycin for 3 days.

366 After recovery, 1 × 10⁴ single cells were seeded into a 6 cm dish. Colonies derived from single cells were

367 picked 1 week later and expanded for genotyping. ~1.1 kb PCR products covering the targeted region were

368 subsequently screened by Sanger sequencing to map the expected mutation. Several potential exonic off-

369 target regions were also checked and no indel was detected. Finally, the stability of the mutated protein

370 was confirmed by immunoblotting.

371

372 ES cell culture

373 Mouse ES cells were cultured as described in Koehler and Hashino⁴⁷. Briefly, cells were maintained on

374 dishes coated with 0.1% gelatin (Millipore) in ESC culture medium, a serum-free 2i+LIF medium containing

375 47.75% Advanced DMEM/F-12 (Gibco), 47.75% Neurobasal (Gibco), 0.5% N-2 supplement (Gibco), 1%

376 B-27 supplement minus Vitamin A (Gibco), 1% GlutaMax (Gibco), 1% Penicillin-Streptomycin (Corning), 1%

377 EmbryoMax 2-mercaptoethanol (MilliporeSigma), and supplemented with 1 µM MEK inhibitor PD0325901

378 (Selleckchem), 3 µM GSK3 inhibitor CHIR99021 HCl (Selleckchem) and 1,000 units/mL mouse LIF protein

379 (MilliporeSigma).

380 For Alkaline Phosphatase (AP) staining, 4 × 10³ single cells were seeded into one well of a 24-well

381 plate. 4 days later, AP staining was done using Alkaline Phosphatase Detection Kit (MilliporeSigma)

382 following the manufacturer's instructions.

383

384 ES cell differentiation

385 Embryoid body (EB) differentiation was performed according to the hanging drop method modified from

386 Cao et al.⁴⁸ (Figure 3-S1a). Briefly, cells were suspended at 1 × 10⁵ cells/mL in the indicated differentiation medium

387 and aliquoted into 20 µL drops on the lid of tissue culture dishes. EBs were harvested 2 days later,

388 transferred into ultra-low-attached dishes (Corning) and further cultured in suspension to the indicated time

389 points. The medium used in EB differentiation from day 0 to day 2 is modified from ESC culture medium. It

390 contains no LIF, 1/4 concentration of both inhibitors and is supplemented with 2% ESC-qualified FBS

391 (Gibco). After day 2, the two inhibitors were removed, and medium was changed every other day. The

392 diameter of EB was measured using ImageJ software (NIH).
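
The numbers implied by the hanging-drop protocol can be worked out directly. The short sketch below computes the cells seeded per drop and the quarter-strength inhibitor concentrations used from day 0 to day 2; the function names are illustrative, and the inputs are the values stated in the text.

```python
CELLS_PER_ML = 1e5       # suspension density stated in the protocol
DROP_VOLUME_UL = 20      # hanging-drop volume in microliters
# Inhibitor concentrations (µM) in the ESC culture medium
ESC_INHIBITORS_UM = {"PD0325901": 1.0, "CHIR99021": 3.0}


def cells_per_drop(cells_per_ml: float, drop_ul: float) -> float:
    """Number of cells in one drop of the given volume."""
    return cells_per_ml * drop_ul / 1000.0


def quarter_strength(conc_um: dict[str, float]) -> dict[str, float]:
    """Concentrations after reducing each inhibitor to 1/4 of its ESC-medium level."""
    return {name: conc / 4 for name, conc in conc_um.items()}


print(cells_per_drop(CELLS_PER_ML, DROP_VOLUME_UL))   # 2000.0 cells per drop
print(quarter_strength(ESC_INHIBITORS_UM))            # 0.25 µM and 0.75 µM
```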

393 To monitor the cardiomyogenesis during EB differentiation, day 2 EBs harvested from hanging

394 drops were plated into gelatinized 24-well plates with one EB per well. Spontaneous contracting clusters

395 could be observed as early as day 6. The percentage of beating EBs in each 24-well plate was calculated

396 at indicated time points.

397 The neural differentiation protocol was modified from Bibel et al.²² (Figure 7-S1a). Briefly, neural lineage

398 aggregates were generated following the EB differentiation method in ultra-low-attached dishes described

399 above from day 0 to day 8 except that after day 4, 5 µM retinoic acid (MilliporeSigma) was added to the

400 culture. At day 8, aggregates were dissociated with trypsin and neural progenitor cells (NPCs) were plated

401 on PDL/laminin-coated dishes in N-2 medium (Advanced DMEM/F-12 supplemented with 1% N-2

402 supplement, 1% 2 mM GlutaMax, 1% 2-mercaptoethanol and 50 μg/mL BSA). The N-2 medium was

403 changed after 2 hours and again after 1 day. After 2 days, the medium was replaced by the B-27 medium

404 (Neurobasal medium supplemented with 2% B-27 serum free (Gibco), 1% 2 mM GlutaMax and 1% 2-

405 mercaptoethanol) to induce neurogenesis. The B-27 medium was changed every other day. Neural

406 differentiation was performed until day 16.

407

408 Teratoma formation assay

409 Approximately 2 × 10⁵ ES cells were mixed with 50% volume of Matrigel Matrix (Corning) and injected

410 subcutaneously into two sides of an immunocompromised mouse (NSG mouse, JAX #005557). 25 days

411 later, the teratomas were dissected out, fixed in 10% neutral buffered formalin and subjected to histological

412 analysis with H&E staining following standard protocols.

413

414 Immunoblotting

415 Nuclear proteins were extracted as described previously¹⁰. Briefly, cells were collected and resuspended

416 in cell lysis buffer (10 mM HEPES pH 7.9, 1.5 mM MgCl2, 10 mM KCl and 0.1% NP-40) supplemented with

417 protease inhibitors (Roche) for 20 min to swell cellular membrane. After centrifugation at 2,000xg, nuclei

418 were extracted in nuclear lysis buffer (20 mM HEPES, pH 7.9, 1.5 mM MgCl2, 420 mM NaCl, 0.2 mM EDTA

419 and 25% glycerol) supplemented with protease inhibitors. Nuclear extracts were separated using 3–8%

420 Tris-Acetate gradient gel (Invitrogen).

421 The histone extraction protocol was modified from Shechter et al.⁴⁹. Briefly, cells were collected and

422 resuspended in hypotonic lysis buffer (10 mM Tris-HCl pH 8.0, 1 mM KCl, 1.5 mM MgCl2 and 1 mM DTT)

423 supplemented with protease inhibitors for 10 min to swell cellular membrane. After centrifugation at

424 10,000xg, histone fraction was extracted with 0.4 N HCl and neutralized to pH 7.0. Histone extracts were

425 separated using home-made 12% SDS-PAGE gel.

426 Primary antibodies used in immunoblotting were as follows. Anti-MLL3#3⁸, anti-MLL4#3⁸,

427 and anti-UTX#2⁵⁰ were home-made. Anti-CBP (A700-010) and anti-RBBP5 (A300-109A) were from Bethyl

428 Laboratories. Anti-ARID1A (sc-32761X) and anti-BRG1 (sc-10768 X) were from Santa Cruz Biotechnology.

429 Anti-CBP (7389S) was from Cell Signaling Technology. Anti-H3K4me1 (13-0040) was from EpiCypher.

430 Anti-H3K4me2 (ab7766), anti-H3 (ab1791), and anti-H3K27ac (ab4729) were from Abcam. Anti-H3K4me3

431 (17-614) was from MilliporeSigma.

432

433 RNA isolation and RT-qPCR analysis

434 Total RNA was extracted by TRIzol (Life Technologies) and reverse transcribed using ProtoScript II first-

435 strand cDNA synthesis kit (NEB) following the manufacturer's instructions. Quantitative PCR was performed

436 in duplicate with Luna Universal qPCR Master Mix (NEB) using QuantStudio 5 Real-Time PCR System

437 (Thermo Fisher). RT-qPCR data were normalized using Gapdh and were presented as means ± SD. Primer

438 sequences used in RT-qPCR are listed below.

439 Mll4: forward 5’-GCTATCACCCGTACTGTGTCAACA-3’, reverse 5’-CACACACGATACACTCCACACAA-3’

440 Nanog: forward 5’-CACCCACCCATGCTAGTCTT-3’, reverse 5’-ACCCTCAAACTCCTGGTCCT-3’

441 Oct4: forward 5’-CGGAAGAGAAAGCGAACTAGC-3’, reverse 5’-ATTGGCGATGTGAGTGATCTG-3’

442 Sox2: forward 5’-GGAAAGGGTTCTTGCTGGGT-3’, reverse 5’-ACGAAAACGGTCTTGCCAGT-3’

443 Rex1: forward 5’-GTTCGTCCATCTAAAAAGGG-3’, reverse 5’-TAGTCCATTTCTCTAATGCCC-3’

444 Esrrb: forward 5’-AAAGCCATTGACTAAGATCG-3’, reverse 5’-AATTCACAGAGAGTGGTCAG-3’

445 Gata4: forward 5’-CACCCCAATCTCGATATGTTTGA-3’, reverse 5’-GGTTGATGCCGTTCATCTTGT-3’

446 Afp: forward 5’-TTTCCAGAACCTGCCGAGAG-3’, reverse 5’-GACAGAATGGCTGGGGCATA-3’

447 T: forward 5’-GCTTCAAGGAGCTAACTAACGAG-3’, reverse 5’-CCAGCAAGAAAGAGTACATGGC-3’

448 Kdr: forward 5’-CAAACCTCAATGTGTCTCTTTGC-3’, reverse 5’-AGAGTAAAGCCTATCTCGCTGT-3’

449 Nes: forward 5’-CTCTTGGCTTTCCTGACCCC-3’, reverse 5’-AGGCTGTCACAGGAGTCTCA-3’

450 Cdh2: forward 5’-CAGGGTGGACGTCATTGTAG-3’, reverse 5’-AGGGTCTCCACCACTGATTC-3’

451 Gapdh: forward 5’-AATGTGTCCGTCGTGGATCTGA-3’, reverse 5’-GATGCCTGCTTCACCACCTTCT-3’
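
The text states only that RT-qPCR data were normalized to Gapdh and shown as means ± SD. A minimal sketch of one common way to do this is given below, assuming the 2^-ΔCt relative quantification scheme (an assumption; the exact formula is not given in the text). The Ct values are hypothetical and the function name is illustrative.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_gapdh):
    """Gapdh-normalized expression, 2^-(Ct_target - Ct_Gapdh), per replicate."""
    return [2 ** -(t - g) for t, g in zip(ct_target, ct_gapdh)]


# Hypothetical Ct values for one sample measured in duplicate wells
nanog_ct = [24.0, 24.2]
gapdh_ct = [18.0, 18.0]

rel = relative_expression(nanog_ct, gapdh_ct)
print(f"mean = {mean(rel):.4f}, SD = {stdev(rel):.4f}")
```

With duplicate wells the SD is computed from two values only, so it is best read as a rough indicator of technical spread.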

452

453 RNA-seq library preparation

454 1 µg total RNA was used for mRNA purification with NEBNext Poly(A) mRNA Magnetic Isolation Module

455 (NEB, E7490). From purified mRNA, libraries were constructed using NEBNext Ultra™ II RNA Library Prep

456 Kit for Illumina (NEB, E7770) following the manufacturer’s instructions, and were sequenced on Illumina

457 HiSeq 4000 and NovaSeq 6000.

458

459 ChIP and ChIP-seq library preparation

460 Chromatin immunoprecipitation (ChIP) and ChIP-Seq were performed as described previously⁵. Briefly,

461 cells were cross-linked with 1% formaldehyde for 10 min and quenched by 125 mM glycine for 10 min.

462 Fixed cells were swelled in the lysis buffer containing 5mM PIPES pH 7.5, 85mM KCl, 0.5% NP-40 and

463 protease inhibitors, incubated on ice for 20 min, and centrifuged at 500xg for 5 min at 4°C. Nuclei were

464 resuspended with cold TE buffer (10mM Tris-HCl pH8.0, 1mM EDTA), and subjected to sonication. Sheared

465 chromatin was clarified by centrifugation at 13,000xg for 10 min at 4°C. The supernatant was transferred

466 to a new tube and further supplemented with 150mM NaCl, 1% Triton X-100, 0.1% sodium deoxycholate

467 and protease inhibitors. 2% of the mixture was set aside as input. For each ChIP, chromatin from 1 × 10⁷

468 cells was mixed with 20 ng spike-in chromatin (Active Motif, #53083) and incubated with 4-8 μg primary

469 target antibody and 2 μg spike-in antibody (Active Motif, #61686) overnight at 4°C. Then, ChIP samples

470 were precipitated by 50 μL prewashed Protein A Dynabeads (ThermoFisher). Beads were washed twice

471 with 1 mL RIPA buffer, once with 1 mL RIPA buffer containing 500mM NaCl, twice with 1 mL LiCl buffer

472 and once with 1 mL TE buffer, and then eluted with 100 μL fresh elution buffer (0.1M NaHCO3, 1% SDS).

473 ChIP samples and input were incubated with Proteinase K (NEB) at 65°C overnight to reverse formaldehyde

474 crosslinking, and then purified using QIAquick PCR purification kit (Qiagen).

475 For ChIP-Seq, the entire ChIP DNA and 300 ng input DNA were used to construct libraries using

476 NEBNext Ultra™ II DNA Library Prep kit for Illumina (NEB, E7645) following the manufacturer’s instructions.

477 The final libraries were sequenced on Illumina HiSeq 3000 and HiSeq 4000.

478

479 ATAC-seq library preparation

480 Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-Seq) was

481 performed as described⁵¹. Briefly, for each reaction, 5 × 10⁴ cells were freshly collected, washed and

482 swelled in 50 µL cold lysis buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40, 0.1%

483 Tween-20, and 0.01% digitonin) for 3 min. Nuclei were collected by centrifugation at 500×g for 10 min at

484 4°C, and then resuspended in 50 µL of transposition reaction buffer containing 25 µL 2x Tagment DNA

485 buffer, 2.5 µL transposase (100 nM final, Illumina), 0.5 µL 1% digitonin, 0.5 µL 10% Tween-20, 16.5 µL PBS,

486 and 5 µL H2O. The reaction was mixed for 30 min at 37°C and subjected to purification using the MinElute

487 Reaction Cleanup Kit (Qiagen) according to the manufacturer’s instructions. Purified DNA was amplified for

488 library construction with PCR using Nextera i5 common adaptor and i7 index adaptors (Illumina). The final

489 libraries were sequenced on Illumina HiSeq 4000.
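
The composition of the transposition reaction can be checked with simple arithmetic. The sketch below sums the listed component volumes and derives the final detergent percentages; the dictionary keys and the helper `final_percent` are illustrative names, and the volumes are those stated in the text.

```python
# Component volumes (µL) of one transposition reaction, copied from the text.
REACTION_UL = {
    "2x Tagment DNA buffer": 25.0,
    "transposase": 2.5,
    "1% digitonin": 0.5,
    "10% Tween-20": 0.5,
    "PBS": 16.5,
    "H2O": 5.0,
}


def final_percent(stock_percent: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration (%) after diluting a stock into the total volume."""
    return stock_percent * volume_ul / total_ul


total_ul = sum(REACTION_UL.values())
digitonin_final = final_percent(1.0, REACTION_UL["1% digitonin"], total_ul)
tween_final = final_percent(10.0, REACTION_UL["10% Tween-20"], total_ul)
print(total_ul, digitonin_final, tween_final)
```

The components add up to the stated 50 µL, and the final digitonin and Tween-20 percentages equal those of the lysis buffer.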

490

491 Computational analysis

492 NGS data processing

493 For RNA-seq, raw sequencing data were aligned to the mouse genome mm9 using STAR software⁵²

494 (v2.7.5). Reads on exons defined by UCSC gene annotation system were collected to calculate Transcripts

495 Per Million (TPM) as a measure of gene expression level, and genes with TPM > 3 were regarded as

496 expressed.
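
For readers unfamiliar with the measure, a minimal sketch of the TPM calculation and the TPM > 3 expression filter follows. The gene names, counts, and exon lengths are hypothetical, and the function is an illustration of the standard TPM definition rather than the authors' code.

```python
def tpm(exon_counts: dict[str, float], exon_lengths_bp: dict[str, int]) -> dict[str, float]:
    """Transcripts Per Million from per-gene exonic read counts and exon lengths."""
    # Length-normalized rate: reads per kilobase of exon
    rate = {g: exon_counts[g] / (exon_lengths_bp[g] / 1000) for g in exon_counts}
    total = sum(rate.values())
    if total == 0:
        raise ValueError("no reads assigned to any gene")
    # Rescale so that the values sum to one million
    return {g: r * 1e6 / total for g, r in rate.items()}


# Hypothetical counts and total exon lengths for three genes
counts = {"geneA": 500, "geneB": 1500, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

values = tpm(counts, lengths)
expressed = {g for g, v in values.items() if v > 3}  # cutoff stated in the text
print(values, expressed)
```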

497 For ChIP-seq, raw sequencing data were aligned to the mouse genome mm9 and the Drosophila

498 genome dm6 using Bowtie2⁵³ (v2.3.2). To identify ChIP-enriched regions, SICER⁵⁴ (v1.1) was used. For

499 ChIP-Seq of MLL4, the window size of 50 bp, the gap size of 50 bp, and the false discovery rate (FDR)

500 threshold of 10⁻¹⁰ were used. For ChIP-Seq of histone modifications (H3K4me1 and H3K27ac), the window

501 size of 200 bp, the gap size of 200 bp, and the FDR threshold of 10⁻³ were used. Reads on indicated regions

502 were collected to calculate Reads Per Kilobase Million (RPKM) as a measure of signal intensity.
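
The signal-intensity measure can be written out explicitly. The following sketch implements the standard RPKM formula for a single region; the example numbers are hypothetical.

```python
def rpkm(reads_in_region: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of region per Million mapped reads."""
    return reads_in_region * 1e9 / (region_length_bp * total_mapped_reads)


# Hypothetical example: 200 reads on a 2 kb region in a library of 20 million mapped reads
print(rpkm(200, 2000, 20_000_000))  # 5.0
```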

503 For ATAC-seq, raw ATAC-Seq reads were processed using Kundaje lab’s ataqc pipeline

504 (https://github.com/kundajelab/atac_dnase_pipelines) (v0.3.4), which includes adapter trimming, alignment to

505 the mouse mm9 genome by Bowtie2, and peak calling by MACS2⁵⁵. For downstream analysis, filtered reads

506 that were retained after removing unmapped reads, duplicates and mitochondrial reads were used.

507

508 RNA-seq analysis

509 Differentially expressed genes were determined using DESeq2⁵⁶ (v1.20.0) in R/Bioconductor with a cutoff

510 of 2.5-fold (p < 0.05). PCA graph and scatter plots were drawn using ggplot2⁵⁷ (v3.2.1) in R. Gene Ontology

511 (GO) analysis was done using DAVID⁵⁸ with the whole mouse genome as background.
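
The differential-expression cutoff can be expressed as a simple filter on a results table. The sketch below assumes the 2.5-fold cutoff is applied symmetrically to up- and down-regulated genes and is inclusive of exactly 2.5-fold (both assumptions); the rows are hypothetical and the function name is illustrative.

```python
import math

FOLD_CUTOFF = 2.5   # fold-change cutoff stated in the text
P_CUTOFF = 0.05     # p-value cutoff stated in the text


def is_differential(log2_fold_change: float, p_value: float) -> bool:
    """True if a gene passes both the fold-change and p-value cutoffs."""
    return abs(log2_fold_change) >= math.log2(FOLD_CUTOFF) and p_value < P_CUTOFF


# Hypothetical rows: (gene, log2 fold change KI vs f/f, p-value)
results = [("Gata4", -2.1, 0.001), ("Nes", 0.4, 0.2), ("Afp", -1.0, 0.01)]
de_genes = [gene for gene, lfc, p in results if is_differential(lfc, p)]
print(de_genes)
```

Note that a 2.5-fold change corresponds to a log2 fold change of about 1.32, so a gene changed exactly 2-fold does not pass.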

512 For comparison of gene expression levels during EB differentiation (Figure 4-S1b,e; Figure 5c),

513 duplicates of KI #27 and KI #48 were combined as KI quadruplicates to compare with f/f duplicates. For

514 comparison of gene expression levels during neural differentiation (Figure 7-S1c-d), single data of KI #27

515 and KI #48 were combined as KI duplicates to compare with f/f duplicates.

516 For clustering analysis, we collected genes expressed in either f/f, KI #27 or KI #48 at any of the

517 three time points during EB differentiation. Genes differentially expressed between any two time points in

518 either cell line were clustered into 12 groups (Cluster I-XII) using the K-means clustering (ward.D2). Heat

519 map (Figure 4c) of clustered gene expression in terms of z-score was drawn using pheatmap⁵⁹ (v1.0.12) in

520 R. Line chart (Figure 4d; Figure 4-S1a) was drawn from the average expression of genes in each cluster in

521 terms of z-score.
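
A minimal sketch of the z-score transformation and the per-cluster average profile is shown below. It assumes each gene is scaled across time points using the sample standard deviation, as R's `scale` does (an assumption about the authors' settings); the expression values are hypothetical.

```python
from statistics import mean, stdev


def zscore(values):
    """Scale one gene's expression across time points to mean 0 and SD 1."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # flat genes carry no temporal pattern
    return [(v - mu) / sd for v in values]


def cluster_profile(rows):
    """Average z-score per time point over all genes assigned to one cluster."""
    scaled = [zscore(r) for r in rows]
    return [mean(col) for col in zip(*scaled)]


# Hypothetical expression of two genes in one cluster at three time points
cluster = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
print(cluster_profile(cluster))  # [-1.0, 0.0, 1.0]
```

Because each gene is scaled separately, genes with different absolute expression but the same temporal trend contribute equally to the line chart.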

522 Gene set enrichment analysis (GSEA) was performed using GSEA software⁶⁰ (v4.0.3), and cell-

523 type markers in E6.5 embryos with p < 0.05²⁷ were used as signature databases.

524

525 Genomic distribution of ChIP-seq peaks

526 To define regulatory regions, a combination of genomic coordinates and histone modification ChIP-Seq

527 data were used. Promoter regions were defined as transcription start sites ± 1 kb. Promoter-distal regions

528 were further overlapped with H3K27ac+ regions to define active enhancers (AEs). MLL4+ AEs were defined

529 as AEs that overlap MLL4 peaks by at least 1 bp. All MLL4+ AEs that existed in D4 EBs but not in

530 ESCs were regarded as de novo MLL4+ AEs. We associated AEs with the proximal genes within ±100 kb.
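
The definitions above can be sketched as interval operations. The code below is an illustration under stated assumptions, not the authors' scripts: coordinates are 0-based half-open, an H3K27ac+ region counts as promoter-distal only if it does not overlap any promoter window, and a D4 MLL4+ AE counts as de novo if it overlaps no MLL4+ AE in ESCs. All names and coordinates are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def overlaps(a: Region, b: Region, min_bp: int = 1) -> bool:
    """True if the two regions share at least `min_bp` bases."""
    if a.chrom != b.chrom:
        return False
    return min(a.end, b.end) - max(a.start, b.start) >= min_bp


def promoter(chrom: str, tss: int, flank: int = 1000) -> Region:
    """Promoter window: transcription start site ± 1 kb."""
    return Region(chrom, max(0, tss - flank), tss + flank)


def active_enhancers(h3k27ac, promoters):
    """H3K27ac+ regions that do not overlap any promoter window."""
    return [r for r in h3k27ac if not any(overlaps(r, p) for p in promoters)]


def mll4_positive(aes, mll4_peaks):
    """AEs overlapping an MLL4 peak by at least 1 bp."""
    return [ae for ae in aes if any(overlaps(ae, pk) for pk in mll4_peaks)]


def de_novo(d4_aes, esc_aes):
    """MLL4+ AEs present in D4 EBs that overlap no MLL4+ AE in ESCs."""
    return [ae for ae in d4_aes if not any(overlaps(ae, e) for e in esc_aes)]


# Hypothetical toy data on one chromosome
promoters = [promoter("chr1", 10_000)]
h3k27ac_d4 = [Region("chr1", 9_500, 10_500), Region("chr1", 50_000, 51_000),
              Region("chr1", 80_000, 81_000)]
mll4_d4 = [Region("chr1", 50_900, 51_200), Region("chr1", 80_100, 80_300)]
esc_mll4_aes = [Region("chr1", 80_000, 81_000)]

aes_d4 = active_enhancers(h3k27ac_d4, promoters)
mll4_aes_d4 = mll4_positive(aes_d4, mll4_d4)
print(de_novo(mll4_aes_d4, esc_mll4_aes))
```

The pairwise scans are quadratic and only suitable for small examples; genome-scale data would normally go through a sorted-interval tool.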

531 To compare MLL4 genomic distributions between f/f and KI D4 EBs, the union of de novo MLL4+

532 AEs was divided into three groups based on MLL4 binding intensities in KI cells compared to f/f cells with

533 a cutoff of 2-fold. Heat map matrices were generated using in-house scripts with 50 bp resolution and

534 visualized in R. Average profiles were plotted using the number of ChIP-seq reads (normalized to the size

535 of each library) in 5 bp bins from the center of each regulatory element to 2 kb on both sides.
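
The three-way grouping by a 2-fold cutoff can be sketched as follows. The group labels ("increased", "decreased", "unchanged") and the handling of zero signal are assumptions made for illustration; the text specifies only the 2-fold cutoff on KI versus f/f binding intensity.

```python
def classify_binding(ff_signal: float, ki_signal: float, fold: float = 2.0) -> str:
    """Assign an enhancer to a group by MLL4 binding intensity in KI relative to f/f."""
    if ki_signal > ff_signal and ki_signal >= fold * ff_signal:
        return "increased"
    if ff_signal > ki_signal and ff_signal >= fold * ki_signal:
        return "decreased"
    return "unchanged"


# Hypothetical RPKM values as (f/f, KI) pairs
for ff, ki in [(10.0, 25.0), (10.0, 4.0), (10.0, 15.0)]:
    print(ff, ki, classify_binding(ff, ki))
```

Comparing products rather than taking a ratio avoids division by zero when one condition has no signal.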

536

537 Motif analysis

538 To find enriched TF motifs in given genomic regions identified by ChIP-Seq, we utilized the SeqPos motif

539 tool in the Cistrome toolbox⁶¹. For ChIP-Seq data with more than 5,000 regions, the top 5,000 significant

540 regions were used. For ChIP-Seq data with fewer regions, all regions were used.

541

542 Analysis of super-enhancer

543 We used Rank Ordering of Super-Enhancers⁶² (ROSE) with default parameters to identify super-enhancers

544 (SEs). To identify SEs in NPCs, we stitched SOX3 binding sites (GSE33059)³⁶ in NPC AEs and used

545 H3K27ac signal intensity for ranking. We associated SEs with the proximal genes within ±100 kb.

546

547 Box plots and genomic profile visualization

548 Box plots were drawn with GraphPad Prism 8 software using the Tukey method. In Figure 7e-f, ChIP-seq

549 signal intensities of MLL4, H3K4me1 and H3K27ac on NPC SEs, and expression levels of NPC SE-

550 associated genes in f/f and KI cells were plotted. Wilcoxon signed-rank test (two-sided) was used to

551 determine statistical differences in signal enrichment and gene expression levels between f/f and KI NPCs.

552 To visualize data from ChIP-seq, ATAC-seq and RNA-seq in the Integrative Genomics Viewer⁶³

553 (IGV) (v2.4.4), reads were collected and converted to wiggle (wig) format profiles using an in-house script.

554

555 Acknowledgement

556 We thank Nhien Tran and Ruth Kopyto for assistance in genotyping, Hualong Yan and Todd S. Macfarlan

557 for suggestions on ESC culture and gene editing, NHLBI DNA Sequencing and Genomics Core and UCSD

558 IGM Genomics Center for next-generation sequencing, NIH HPC group for high-performance computing.

559 This work was supported by the Intramural Research Program of NIDDK, NIH to K.G.

560

561 Author Contributions

562 Conceptualization, G.X. and K.G.; Methodology, G.X., J.-E.L., W.P. and C.L.; Investigation, G.X., Y.-K.P.

563 and Y.J.; Software, Formal Analysis, and Data Curation, J.-E.L., G.X. and W.P.; Writing – Original Draft,

564 G.X. and K.M.; Writing – Review & Editing, G.X., J.-E.L., W.P. and K.G.; Project Administration and Funding

565 Acquisition, K.G.

566

567 Declaration of Interests

568 The authors declare no competing interests.

569

570 Data availability

571 All data sets described in the paper have been deposited in NCBI Gene Expression Omnibus under

572 accession number GSE154475.

573

574 Code availability

575 In-house generated computational codes are available upon request.

576

References
1. Calo, E. & Wysocka, J. Modification of enhancer chromatin: what, how, and why? Mol Cell 49, 825-37 (2013).
2. Creyghton, M.P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A 107, 21931-6 (2010).
3. Jin, Q. et al. Distinct roles of GCN5/PCAF-mediated H3K9ac and CBP/p300-mediated H3K18/27ac in transactivation. EMBO J 30, 249-62 (2011).
4. Lee, J.E. et al. H3K4 mono- and di-methyltransferase MLL4 is required for enhancer activation during cell differentiation. Elife 2, e01503 (2013).
5. Wang, C. et al. Enhancer priming by H3K4 methyltransferase MLL4 controls cell fate transition. Proc Natl Acad Sci U S A 113, 11871-11876 (2016).
6. Lai, B. et al. MLL3/MLL4 are required for CBP/p300 binding on enhancers and super-enhancer formation in brown adipogenesis. Nucleic Acids Res 45, 6388-6403 (2017).
7. Ashokkumar, D. et al. MLL4 is required after implantation whereas MLL3 becomes essential during late gestation. Development (2020).
8. Cho, Y.W. et al. PTIP associates with MLL3- and MLL4-containing histone H3 lysine 4 methyltransferase complex. J Biol Chem 282, 20395-406 (2007).
9. Dorighi, K.M. et al. Mll3 and Mll4 Facilitate Enhancer RNA Synthesis and Transcription from Promoters Independently of H3K4 Monomethylation. Mol Cell 66, 568-576 e4 (2017).
10. Jang, Y. et al. H3.3K4M destabilizes enhancer H3K4 methyltransferases MLL3/MLL4 and impairs adipose tissue development. Nucleic Acids Res 47, 607-620 (2019).
11. Zhang, J. et al. Disruption of KMT2D perturbs germinal center B cell development and promotes lymphomagenesis. Nat Med 21, 1190-8 (2015).
12. Ang, S.Y. et al. KMT2D regulates specific programs in heart development via histone H3 lysine 4 di-methylation. Development 143, 810-21 (2016).
13. Zhang, Z. et al. Mammary-Stem-Cell-Based Somatic Mouse Models Reveal Breast Cancer Drivers Causing Cell Fate Dysregulation. Cell Rep 16, 3146-3156 (2016).
14. Placek, K. et al. MLL4 prepares the enhancer landscape for Foxp3 induction via chromatin looping. Nat Immunol 18, 1035-1045 (2017).
15. Froimchuk, E., Jang, Y. & Ge, K. Histone H3 lysine 4 methyltransferase KMT2D. Gene 627, 337-342 (2017).
16. Tam, P.P. & Loebel, D.A. Gene function in mouse embryogenesis: get set for gastrulation. Nat Rev Genet 8, 368-81 (2007).
17. Rossant, J. & Tam, P.P. Blastocyst lineage formation, early embryonic asymmetries and axis patterning in the mouse. Development 136, 701-13 (2009).
18. Stern, C.D. & Downs, K.M. The hypoblast (visceral endoderm): an evo-devo perspective. Development 139, 1059-69 (2012).
19. Stuckey, D.W., Di Gregorio, A., Clements, M. & Rodriguez, T.A. Correct patterning of the primitive streak requires the anterior visceral endoderm. PLoS One 6, e17620 (2011).
20. Keller, G. Embryonic stem cell differentiation: emergence of a new era in biology and medicine. Genes Dev 19, 1129-55 (2005).
21. Leahy, A., Xiong, J.W., Kuhnert, F. & Stuhlmann, H. Use of developmental marker genes to define temporal and spatial patterns of differentiation during embryoid body formation. J Exp Zool 284, 67-81 (1999).
22. Bibel, M., Richter, J., Lacroix, E. & Barde, Y.A. Generation of a defined and uniform population of CNS progenitors and neurons from mouse embryonic stem cells. Nat Protoc 2, 1034-43 (2007).
23. Yan, J. et al. Histone H3 lysine 4 monomethylation modulates long-range chromatin interactions at enhancers. Cell Res 28, 387 (2018).

24. Kato, H. et al. Cancer-derived UTX TPR mutations G137V and D336G impair interaction with MLL3/4 complexes and affect UTX subcellular localization. Oncogene (2020).
25. Hirst, C.E. et al. Transcriptional profiling of mouse and human ES cells identifies SLAIN1, a novel stem cell gene. Dev Biol 293, 90-103 (2006).
26. Gloss, B.S. et al. High resolution temporal transcriptomics of mouse embryoid body development reveals complex expression dynamics of coding and noncoding loci. Sci Rep 7, 6731 (2017).
27. Pijuan-Sala, B. et al. A single-cell molecular map of mouse gastrulation and early organogenesis. Nature 566, 490-495 (2019).
28. Paca, A. et al. BMP signaling induces visceral endoderm differentiation of XEN cells and parietal endoderm. Dev Biol 361, 90-102 (2012).
29. Coucouvanis, E. & Martin, G.R. BMP signaling plays a role in visceral endoderm differentiation and cavitation in the early mouse embryo. Development 126, 535-46 (1999).
30. Coucouvanis, E. & Martin, G.R. Signals for death and survival: a two-step mechanism for cavitation in the vertebrate embryo. Cell 83, 279-87 (1995).
31. Narita, N., Bielinska, M. & Wilson, D.B. Wild-type endoderm abrogates the ventral developmental defects associated with GATA-4 deficiency in the mouse. Dev Biol 189, 270-4 (1997).
32. Morrisey, E.E. et al. GATA6 regulates HNF4 and is required for differentiation of visceral endoderm in the mouse embryo. Genes Dev 12, 3579-90 (1998).
33. Wamaitha, S.E. et al. Gata6 potently initiates reprograming of pluripotent and differentiated cells to extraembryonic endoderm stem cells. Genes Dev 29, 1239-55 (2015).
34. Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934-47 (2013).
35. Bylund, M., Andersson, E., Novitch, B.G. & Muhr, J. Vertebrate neurogenesis is counteracted by Sox1-3 activity. Nat Neurosci 6, 1162-8 (2003).
36. Bergsland, M. et al. Sequentially acting Sox transcription factors in neural lineage development. Genes Dev 25, 2453-64 (2011).
37. Crump, N.T. & Milne, T.A. Why are so many MLL lysine methyltransferases required for normal mammalian development? Cell Mol Life Sci 76, 2885-2898 (2019).
38. Rickels, R. et al. Histone H3K4 monomethylation catalyzed by Trr and mammalian COMPASS-like proteins at enhancers is dispensable for development and viability. Nat Genet 49, 1647-1653 (2017).
39. Qu, X. et al. Autophagy gene-dependent clearance of apoptotic cells during embryonic development. Cell 128, 931-46 (2007).
40. He, X. et al. Rac1 is essential for basement membrane-dependent epiblast survival. Mol Cell Biol 30, 3569-81 (2010).
41. Qi, Y. et al. Bnip3 and AIF cooperate to induce apoptosis and cavitation during epithelial morphogenesis. J Cell Biol 198, 103-14 (2012).
42. Qi, Y. et al. PTEN induces apoptosis and cavitation via HIF-2-dependent Bnip3 upregulation during epithelial lumen formation. Cell Death Differ 22, 875-84 (2015).
43. Wang, J. et al. MicroRNA-125b/Lin28 pathway contributes to the mesendodermal fate decision of embryonic stem cells. Stem Cells Dev 21, 1524-37 (2012).
44. Li, Y. et al. Crystal Structure of MLL2 Complex Guides the Identification of a Methylation Site on P53 Catalyzed by KMT2 Family Methyltransferases. Structure (2020).
45. Wang, H. et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910-8 (2013).
46. Ran, F.A. et al. Genome engineering using the CRISPR-Cas9 system. Nat Protoc 8, 2281-2308 (2013).
47. Koehler, K.R. & Hashino, E. 3D mouse embryonic stem cell culture for generating inner ear organoids. Nat Protoc 9, 1229-44 (2014).


48. Cao, N. et al. In vitro differentiation of rat embryonic stem cells into functional cardiomyocytes. Cell Res 21, 1316-31 (2011).
49. Shechter, D., Dormann, H.L., Allis, C.D. & Hake, S.B. Extraction, purification and analysis of histones. Nat Protoc 2, 1445-57 (2007).
50. Hong, S. et al. Identification of JmjC domain-containing UTX and JMJD3 as histone H3 lysine 27 demethylases. Proc Natl Acad Sci U S A 104, 18439-44 (2007).
51. Corces, M.R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods 14, 959-962 (2017).
52. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21 (2013).
53. Langmead, B. & Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat Methods 9, 357-9 (2012).
54. Zang, C. et al. A clustering approach for identification of enriched domains from histone modification ChIP-Seq data. Bioinformatics 25, 1952-8 (2009).
55. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol 9, R137 (2008).
56. Love, M.I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15, 550 (2014).
57. Wickham, H. & Sievert, C. ggplot2: elegant graphics for data analysis (Springer International Publishing, 2016).
58. Huang da, W., Sherman, B.T. & Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 4, 44-57 (2009).
59. Kolde, R. Pheatmap: pretty heatmaps. R package version 1 (2012).
60. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545-50 (2005).
61. He, H.H. et al. Nucleosome dynamics define transcriptional enhancers. Nat Genet 42, 343-7 (2010).
62. Whyte, W.A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307-19 (2013).
63. Robinson, J.T. et al. Integrative genomics viewer. Nat Biotechnol 29, 24-6 (2011).


Figure legends

Figure 1. MLL3/4 enzymatic activities are essential for early embryonic development in mice
(a) Schematic diagram of mouse MLL3 and MLL4 proteins. To eliminate the methyltransferase activities, tyrosine (Y) residues (highlighted in red) are mutated to alanine (A). PHD: Plant Homeodomain, HMG: High Mobility Group, FYRC/N: FY-Rich C/N-terminal, SET: Su(var)3-9, Enhancer-of-zeste and Trithorax.
(b) Diagram of generating enzyme-dead MLL3/MLL4 knockin (KI) mice by Cas9 nickase. sgRNA: single guide RNA, nCas9: Cas9 nickase, ssDNA: single-strand DNA.
(c) Genotypes of progeny from crosses between Mll3KI/+ mice at postnatal day 21 (P21).
(d) Mll4KI/KI mice died perinatally. Genotypes of progeny from crosses between Mll4KI/+ mice at embryonic day 18.5 (E18.5), P0.5 and P21 are shown. The numbers of dead pups are indicated in parentheses.
(e) Genotypes of progeny from crosses between Mll3KI/+;Mll4KI/+ mice at E14.5.
(f) Mll3KI/KI;Mll4KI/KI mice died around E8.5. Genotypes of progeny from crosses between Mll3KI/KI;Mll4KI/+ mice at E8.0, E8.5, E10.5 and E14.5 are shown. The numbers of morphologically abnormal embryos are indicated in parentheses.
(g) Representative images of E8.0 and E8.5 embryos with indicated genotypes.
(h) Summary of phenotypes of Mll3-/-, Mll3KI/KI, Mll4-/-, Mll4KI/KI, and Mll3KI/KI;Mll4KI/KI mice. Phenotypes of Mll3-/- and Mll4-/- mice have been reported previously (ref. 4).

Figure 1-S1. Alignment of SET domains in different histone methyltransferases
The highly conserved tyrosines (Y), Y4792 in MLL3 and Y5477 in MLL4, are mutated to alanines (A) in mice and/or ESCs in this study.

Figure 2. ESCs lacking MLL3/4 enzymatic activities maintain cell identity
(a-b) Enzyme-dead Y5477A mutation reduces global levels of H3K4me1 and H3K4me2 without affecting MLL4 protein stability. MLL3/4 enzyme-dead (Mll3-/-;Mll4KI/KI, KI) ESCs were generated as described in Figure 2-S1. Nuclear extracts (a) or histone extracts (b) prepared from wild-type (WT), Mll3-/-;Mll4f/f (f/f), Mll3/Mll4 double knockout (KO) and KI ESCs were analyzed by immunoblotting using indicated antibodies. The asterisk indicates a non-specific band.
(c-d) KI ESCs grow in monolayer and proliferate faster than f/f ESCs. (c) Representative phase contrast microscopic (upper) and AP staining (lower) images. Scale bar = 100 μm. (d) Population doubling time. Data are presented as means ± SD (n = 5).
(e) MLL3/4 enzymatic activities are dispensable for maintaining ESC identity. Mll4 and ESC identity genes were analyzed by RT-qPCR (n = 2).

Figure 2-S1. Generation of ESCs lacking MLL3/4 enzymatic activities
(a) Diagram of generating ESCs harboring enzyme-dead mutant (Y5477A) MLL4. Knockin was done in Mll3-/-;Mll4f/f ESCs.
(b) Schematic representation of Mll4 flox, knockout (KO) and knockin (KI) alleles. In flox and KI alleles, two loxP sites were inserted in the intron before exon 16 and the intron after exon 19; in the KI allele, the Y5477A mutation is located in exon 52. The locations of PCR genotyping primers P1, P2, P3, WF, MF and R are indicated by arrows. The MF primer is specific to the KI allele.
(c) PCR genotyping using primer pairs indicated on the right. Sizes of PCR products are indicated on the left.
(d-h) Chromatograms of genomic PCR sequencing from two independent KI ES cell lines. Trace files display sequences around the Y5477A mutation (d) or sequences around potential exonic off-target regions of Dusp4 (e), Setd1b (f), Ccdc80 (g) and Fignl1 (h). In the reference sequences, sgRNA-defined target regions and PAM sites are highlighted in yellow and cyan, respectively.

Figure 3. ESCs lacking MLL3/4 enzymatic activities are capable of differentiating towards the three germ layers but show defects in EB cavitation
(a-b) MLL3/4 enzyme-dead ESCs show defects in cavitation during EB differentiation. f/f, KO and KI ESCs were induced to differentiate into embryoid bodies (EBs). (a) Representative microscopic images at day 2 (D2), day 4 (D4), day 7 (D7) and day 10 (D10) of EB differentiation. Representative cystic cavities are marked with red arrows. Scale bar = 250 μm. (b) RT-qPCR analysis of ESC identity genes (Nanog, Oct4), endoderm markers (Gata4, Afp), mesoderm markers (T, Kdr) and ectoderm markers (Nes, Cdh2) at indicated time points.
(c) MLL3/4 enzyme-dead ESCs are capable of differentiating towards all three germ layers in teratoma assays. Representative histological sections of teratomas are shown. C, cartilage; G, glandular epithelium; M, muscle; N, neural epithelium. Scale bar = 200 μm.

Figure 3-S1. ESCs lacking MLL3/4 enzymatic activities show defects in cardiomyogenesis
(a) Diagram of EB differentiation.
(b-c) Immunoblotting of f/f and KI cells at D0 (Day 0), D4 and D10 of EB differentiation. Nuclear extracts (b) or histone extracts (c) were analyzed using indicated antibodies.
(d) The diameters of EBs at indicated time points are presented as floating bars. Each bar indicates min to max and the horizontal line inside represents the mean. For each time point, more than 20 EBs were measured per group.
(e-h) Loss of MLL3/4 enzymatic activities impairs cardiomyogenesis during EB differentiation. (e) The percentage of beating EBs at indicated time points. The results are from 3 independent experiments with 24 EBs per group. (f) Representative microscopic images of attached EBs at day 12 (D12). Scale bar = 250 μm. (g) The beating frequencies of EBs at D12 are presented as dot plots. Horizontal lines represent mean values. 24 EBs were measured per group. Statistical significance was determined by the two-tailed unpaired t-test. (h) RT-qPCR analysis of cardiac progenitor marker (Tbx5) and cardiac myofilament genes (Myh6, Myl2, Ttn) at indicated time points.

Figure 4. Loss of MLL3/4 enzymatic activities impairs visceral endoderm induction in EB differentiation
f/f and KI ESCs were induced to undergo EB differentiation for 10 days. D4 and D10 EBs were collected for RNA-seq.
(a) Schematic diagram of ESC (D0) and EB (D4, D10) stages during EB differentiation. At D4, EBs are covered by the extraembryonic endoderm layer and contain differentiated cells that transiently express PS markers. From D4 to D10, cavitation takes place and the three germ layers develop.
(b) Principal component analysis (PCA) of RNA-seq data from cells of indicated genotypes at D0, D4 and D10 during EB differentiation.
(c-d) Characterization of the dynamic transcriptome. Differentially expressed genes during EB differentiation from f/f and KI cells were clustered according to their expression patterns. (c) Heat map of clustered gene expression. (d) Temporal profiles of expression levels in terms of z-score (left) and GO analysis (right) of genes in Clusters III and IX. The number of genes in each group is indicated in parentheses. Top 5 significant GO terms are displayed. Terms associated with early embryonic development are highlighted in blue.
(e-f) Impaired expression of visceral endoderm (VE) but not primitive streak (PS) markers in KI D4 EBs. (e) Gene set enrichment analysis (GSEA) of expression profiles in D4 EBs using cell-type markers in E6.5 embryos (ref. 27). NES, normalized enrichment score. (f) Expression fold changes of reported VE and PS genes between KI and f/f D4 EBs.
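The z-score scaling mentioned in panel (d) can be illustrated with a minimal Python sketch. It assumes the common per-gene convention (subtract the gene's mean across samples, divide by its sample standard deviation); the legend does not state which variant was used. The function name `zscore_rows` and the TPM values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev

def zscore_rows(expr):
    """Scale each gene's expression across samples to mean 0 and sample SD 1."""
    scaled = {}
    for gene, values in expr.items():
        mu, sd = mean(values), stdev(values)
        # A gene with constant expression has no variation to scale; report zeros.
        scaled[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return scaled

# Hypothetical TPM values at D0, D4 and D10 (illustrative only, not study data).
toy_tpm = {"geneA": [1.0, 20.0, 5.0], "geneB": [8.0, 8.0, 8.0]}
toy_z = zscore_rows(toy_tpm)
```

Scaling each gene separately puts genes with very different absolute expression on one color scale, so clusters reflect temporal pattern rather than magnitude.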

Figure 4-S1. Loss of MLL3/4 enzymatic activities impairs visceral endoderm induction in EB differentiation
(a) Temporal profiles of expression levels in terms of z-score (left) and GO analysis (right) of genes in indicated Clusters. The number of genes in each group is indicated in parentheses. Top 4 significant GO terms are displayed. Terms associated with cardiomyogenesis are highlighted in blue.
(b) GSEA of expression profiles in D4 EBs using nascent mesoderm (NM) markers in E6.5 embryos (ref. 27). NES, normalized enrichment score.
(c) Expression levels of representative VE genes. Data from RNA-seq are presented as dot plots with median lines (f/f: n = 2; KI: n = 4). Statistical significance was determined by the two-tailed unpaired t-test.
(d) Expression fold changes between KI and f/f D4 EBs for genes encoding components of BMP, Wnt and Nodal signaling pathways.
(e) Expression levels of genes reported to be important for EB cavitation. Data from RNA-seq are presented as dot plots with median lines (f/f: n = 2; KI: n = 4).

Figure 5. MLL3/4 enzymatic activities are required for induction of visceral endoderm TFs GATA4/6 in the early phase of EB differentiation
f/f and KI D4 EBs were collected for ChIP-seq.
(a) MLL4+ AEs in f/f or KI cells at ESC and D4 EB are depicted by the Venn diagram. 10,767 de novo MLL4+ AEs highlighted in red were divided into three groups based on the changes of MLL4 binding intensities from f/f to KI EBs: increased (Group I), unchanged (Group II) and decreased (Group III).
(b) Group III de novo MLL4+ AEs are exclusively overrepresented in AEs associated with VE markers.
(c) Motif analysis of three groups of de novo MLL4+ AEs in D4 EBs.
(d) Expression levels of representative transcription factors highlighted in bold in Figure 5c. Data from RNA-seq are presented as dot plots with median lines (f/f: n = 2; KI: n = 4). Statistical significance was determined by the two-tailed unpaired t-test.
(e) ChIP-seq profiles of MLL4, H3K4me1 and H3K27ac, ATAC-seq and RNA-seq profiles in f/f and KI cells are displayed on representative VE gene loci. Group III de novo MLL4+ AEs are shaded. GATA6 binding sites in extraembryonic endoderm (GSE69323; ref. 36) are presented as red arrows.

Figure 5-S1. Genomic binding profiles of MLL4 in D4 EBs
f/f and KI D4 EBs were collected for ChIP-seq and 34,772 MLL4+ regions were identified in f/f or KI ESCs. MLL4+ distal regions or promoters were split based on the MLL4 binding intensities in KI ESCs compared to f/f ESCs.
(a) Heat maps of 34,772 MLL4+ regions identified by ChIP-seq in f/f or KI D4 EBs.
(b) Enzyme-dead mutation does not affect MLL4 total binding intensities in D4 EBs. Average profiles of MLL4 genomic binding on MLL4+ regions in f/f and KI EBs are shown.

Figure 5-S2. GATA family TFs are highly enriched in de novo MLL4+ AEs associated with visceral endoderm markers
(a-c) Motif analysis of de novo MLL4+ AEs associated with cell-type markers in E6.5 embryos. 89, 95 and 41 de novo MLL4+ AEs were annotated to markers in E6.5 embryos for VE (a), PS (b) and NM (c), respectively. GATA family TFs are highlighted in bold.

Figure 6. MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation
f/f and KI D4 EBs were collected for ChIP-seq and ATAC-seq.
(a-b) Heat maps (a) and average profiles (b) of MLL4 genomic binding, H3K4me1 and H3K27ac enrichment and ATAC-seq signals on the three groups of de novo MLL4+ AEs.
(c) ChIP-seq profiles of MLL4, H3K4me1 and H3K27ac, ATAC-seq and RNA-seq profiles in f/f and KI cells are displayed on the Fgf8 locus. Group II de novo MLL4+ AEs are shaded.

Figure 6-S1. MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation
ChIP-seq profiles of MLL4, H3K4me1 and H3K27ac, ATAC-seq and RNA-seq profiles in f/f and KI cells are displayed on Wnt3, Mixl1 and loci. Group II de novo MLL4+ AEs are shaded.

Figure 6-S2. MLL3/4 proteins, but not MLL3/4-catalyzed H3K4me1, are required for enhancer activation during early EB differentiation
ChIP-seq data were obtained from GSE50534 (ref. 5).
(a-b) Heat maps (a) and average profiles (b) of MLL4 genomic binding, H3K4me1 and H3K27ac enrichment in f/f and KO cells on Group II de novo MLL4+ AEs identified in Figure 5a.
(c-f) ChIP-seq profiles of MLL4, H3K4me1 and H3K27ac in f/f and KO cells (c, e), and RNA-seq profiles in f/f, KO and KI cells (d, f) are displayed on Fgf8 (c-d), Wnt3, Mixl1 and Evx1 (e-f) loci. The Group II de novo MLL4+ AEs are shaded.

Figure 7. MLL3/4-catalyzed H3K4me1 is dispensable for activation of super-enhancers during neural differentiation
f/f and KI ESCs were induced to differentiate into neural progenitor cells (NPCs) and neurons sequentially. D8 NPCs and D16 neurons were collected for RNA-seq. D8 NPCs were collected for ChIP-seq.
(a-c) MLL3/4 enzymatic activities are dispensable for differentiation into NPCs. (a) Representative microscopic images of NPCs (day 8, upper) and neurons (day 16, lower). Scale bar = 25 μm. (b) Expression levels of 2,526 genes induced from ESC to NPC are presented as scatter plots. Green, black and red dots indicate genes expressed at higher, similar and lower levels in KI NPCs compared to f/f NPCs, respectively. Pearson correlation coefficient (R) and P value are shown. A 2.5-fold cutoff was used to identify differentially expressed genes. (c) GO analysis of the three gene groups in (b). Terms associated with neural differentiation are highlighted in purple.
(d-f) MLL3/4-catalyzed H3K4me1 is dispensable for super-enhancer activation during neural differentiation. (d) NPC super-enhancers (SEs) were identified by stitching SOX3 peaks from GSE33059 (ref. 36) in NPC AEs. SEs were ranked by H3K27ac signal intensities. (e) MLL4, H3K4me1 and H3K27ac intensities in ESCs and NPCs on NPC SEs are presented in box plots. (f) Expression levels in ESCs and NPCs of genes associated with NPC SEs are presented in box plots. Statistical significance was determined by the two-sided Wilcoxon signed-rank test. Box plot elements: center line, median; box limits, lower and upper quartiles; whiskers, calculated using the Tukey method; outliers are not shown.
(g) ChIP-seq profiles of MLL4, H3K4me1 and H3K27ac, and RNA-seq profiles in f/f and KI cells are displayed on the Nes locus. NPC SEs are indicated by purple bars and de novo MLL4+ enhancers within SEs are shaded.
(h) Model of the regulation of ESC differentiation by MLL3/4 methyltransferase activities. LDTFs: lineage-determining transcription factors.
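The three-way grouping by a 2.5-fold cutoff described in panel (b) can be sketched in Python as follows. This is only an illustration: the function name `classify_by_fold`, the pseudocount that guards against division by zero, and the example values are assumptions, and the legend does not specify how the fold change was computed.

```python
def classify_by_fold(ki, ff, cutoff=2.5, pseudocount=1.0):
    """Label a gene as 'higher', 'lower' or 'similar' in KI relative to f/f.

    ki and ff are expression values in the two genotypes. The pseudocount is
    an assumption added here so that unexpressed genes do not divide by zero.
    """
    ratio = (ki + pseudocount) / (ff + pseudocount)
    if ratio >= cutoff:
        return "higher"
    if ratio <= 1.0 / cutoff:
        return "lower"
    return "similar"

# Hypothetical expression values (illustrative only, not study data).
labels = {
    "geneA": classify_by_fold(30.0, 10.0),
    "geneB": classify_by_fold(10.0, 30.0),
    "geneC": classify_by_fold(10.0, 12.0),
}
```

A symmetric cutoff (2.5-fold up, 1/2.5-fold down) treats induction and repression equally; genes between the two thresholds fall into the "similar" group.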

Figure 7-S1. MLL3/4-catalyzed H3K4me1 is dispensable for activation of super-enhancers during neural differentiation
(a) Diagram of neural differentiation.
(b) Representative microscopic images of NPCs (day 9). Scale bar = 25 μm. White arrows indicate neural rosettes.
(c-d) MLL3/4 enzymatic activities are dispensable for differentiation into neurons. (c) Expression levels of 3,833 genes induced from ESC to neuron are presented as scatter plots. Green, black and red dots indicate genes expressed at higher, similar and lower levels in KI neurons compared to f/f neurons, respectively. Pearson correlation coefficient (R) and P value are shown. A 2.5-fold cutoff was used to identify differentially expressed genes. (d) GO analysis of the three gene groups in (c). Terms associated with neural differentiation are highlighted in purple.
(e-f) Expression levels of NPC (e) and neuron (f) markers. Data from RNA-seq are presented as dot plots with median lines (f/f: n = 2; KI: n = 2).
(g) GO analysis of genes associated with NPC SEs. Terms associated with neural differentiation are highlighted in purple.
(h) ChIP-seq profiles of MLL4, H3K4me1 and H3K27ac, and RNA-seq profiles in f/f and KI cells are displayed on representative gene loci. NPC SEs are indicated by purple bars and de novo MLL4+ enhancers within SEs are shaded.

Supplementary Information
Supplementary Video 1. Representative day 12 beating EB derived from f/f ESCs during EB differentiation
Supplementary Video 2. Representative day 12 beating EB derived from KI #27 ESCs during EB differentiation
Supplementary Video 3. Representative day 12 beating EB derived from KI #48 ESCs during EB differentiation

(a) MLL3 (4903 aa): Y4792A. MLL4 (5588 aa): Y5477A. Domains shown: AT hook, PHD, HMG, FYRC, FYRN, SET.

(b) sgRNA + nCas9 mRNA + ssDNA; injection into zygotes; mosaic founders; heterozygous mice.

(c) Mll3KI/+ x Mll3KI/+
Genotype     P21
Mll3+/+      30
Mll3KI/+     69
Mll3KI/KI    19
Total        118

(d) Mll4KI/+ x Mll4KI/+
Genotype     E18.5   P0.5    P21
Mll4+/+      19      18      26
Mll4KI/+     46      27      53
Mll4KI/KI    17      1 (1)   0
Total        82      46      59

(e) Mll3KI/+;Mll4KI/+ x Mll3KI/+;Mll4KI/+ (E14.5)
Genotype               Observed number   Expected number
Mll3+/+;Mll4+/+        18                16
Mll3+/+;Mll4KI/+       36                31
Mll3+/+;Mll4KI/KI      13                16
Mll3KI/+;Mll4+/+       46                31
Mll3KI/+;Mll4KI/+      72                61
Mll3KI/+;Mll4KI/KI     11                31
Mll3KI/KI;Mll4+/+      18                16
Mll3KI/KI;Mll4KI/+     35                31
Mll3KI/KI;Mll4KI/KI    0                 16
Total                  249               249

(f) Mll3KI/KI;Mll4KI/+ x Mll3KI/KI;Mll4KI/+
Genotype               E8.0     E8.5     E10.5   E14.5
Mll3KI/KI;Mll4+/+      4        4        8       6
Mll3KI/KI;Mll4KI/+     11 (2)   14 (2)   13      6
Mll3KI/KI;Mll4KI/KI    4 (4)    6 (6)    0       0
Total                  19       24       21      12
Resorbed               3        2        11      0

(g) Images of E8.0 and E8.5 embryos: Mll3KI/KI;Mll4+/+, Mll3KI/KI;Mll4KI/+, Mll3KI/KI;Mll4KI/KI.

(h) Summary
Genotype               Phenotype
Mll3-/-                Perinatal lethal
Mll3KI/KI              Survived to adulthood, fertile
Mll4-/-                Embryonic lethal ~E9.5
Mll4KI/KI              Perinatal lethal
Mll3KI/KI;Mll4KI/KI    Embryonic lethal ~E8.5

Figure 1. MLL3/4 enzymatic activities are essential for early embryonic development in mice
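The expected numbers in panel (e) follow from Mendelian ratios for a double-heterozygote intercross. A minimal Python sketch of that arithmetic is given here, assuming the two loci assort independently (1:2:1 at each locus); the observed counts are transcribed from panel (e), while the helper names `expected_counts` and `chi_square` are hypothetical. The figure lists integer expected values, so small differences from the exact fractions computed here are to be expected.

```python
from fractions import Fraction
from itertools import product

# Observed E14.5 counts from Figure 1e, keyed by (Mll3 genotype, Mll4 genotype).
observed = {
    ("+/+", "+/+"): 18, ("+/+", "KI/+"): 36, ("+/+", "KI/KI"): 13,
    ("KI/+", "+/+"): 46, ("KI/+", "KI/+"): 72, ("KI/+", "KI/KI"): 11,
    ("KI/KI", "+/+"): 18, ("KI/KI", "KI/+"): 35, ("KI/KI", "KI/KI"): 0,
}

# Single-locus Mendelian frequencies for a heterozygote x heterozygote cross.
locus = {"+/+": Fraction(1, 4), "KI/+": Fraction(1, 2), "KI/KI": Fraction(1, 4)}

def expected_counts(total):
    """Expected count per two-locus genotype, assuming independent assortment."""
    return {(a, b): total * locus[a] * locus[b] for a, b in product(locus, locus)}

def chi_square(obs, exp):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over genotype classes."""
    return float(sum((obs[g] - exp[g]) ** 2 / exp[g] for g in exp))

total = sum(observed.values())
expected = expected_counts(total)
stat = chi_square(observed, expected)
```

With nine genotype classes the statistic would be compared against a chi-square distribution with eight degrees of freedom; whether the authors ran such a test is not stated in the legend.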

[Sequence alignment of SET domains from MLL1, MLL2, MLL3, MLL4, SET1A, SET1B, SET7, PRDM9, G9a, SUV39H1, EZH2, NSD2 and SET2. Marked residues: MLL3 Y4792A; MLL4 Y5477A.]

Figure 1-S1. Alignment of SET domains in different histone methyltransferases

[Panels: (a) immunoblots of MLL3, MLL4, CBP, ARID1A, BRG1, UTX and RBBP5 in WT, f/f, KO, KI #27 and KI #48 ESCs; (b) immunoblots of H3K4me1, H3K4me2, H3K4me3 and H3; (c) images of f/f, KO, KI #27 and KI #48 ESCs; (d) population doubling time (hrs) for f/f, KO, KI #27 and KI #48; (e) mRNA level (fold) of Mll4, Nanog, Oct4, Sox2, Rex1 and Esrrb.]

Figure 2. ESCs lacking MLL3/4 enzymatic activities maintain cell identity

(a) Workflow labels: co-transfection of (1) a plasmid expressing gRNA + Cas9 + puroR and (2) a targeting oligo into Mll3-/-;Mll4f/f ESCs; picking single colonies into a 24-well plate; re-seeding single cells; genotyping and PCR sequencing. Intervals are labeled 3 days, 1 week and 1 week.
MLL4 Y5477
Ref. ACACGATGGTCATCGAGTACATCGGCAC
Mut. ACACGATGGTCATCGAGGCTATCGGCAC

(b) Mll4 alleles (flox, KO, KI) with exons 15, 16-19, 20, 50-51, 52 and 53-54, loxP sites, the Y5477A KI site, and primers P1, P2, P3, WF, MF and R.

(c) PCR genotyping of f/f, KO, KI #27 and KI #48
Product size (bp)   Primer pair
400                 P2 + P3
322                 P1 + P3
231                 WF + R
230                 MF + R

(d) Mll4 exon 52 (chromatograms for #27 and #48)
Ref. GGAGAAGCACACGATGGTCATCGAGTACATCGGCACCATCATT
Mut. GGAGAAGCACACGATGGTCATCGAGGCTATCGGCACCATCATT

(e) Dusp4 exon 3 (chromatograms for #27 and #48)
Ref. CTCCTGGTTCATGGAAGCCATCGAGTACATCGGTAGGCTGCCC

(f) Setd1b exon 15 (chromatograms for #27 and #48)
Ref. TGCTGCTGACGAGATGGTCATCGAGTACGTGGGCCAGAACATC

(g) Ccdc80 exon 8 (chromatograms for #27 and #48)
Ref. GTATCCTTCTCCTATGTGGTCGATGGTCATCGTGTATGACTTA

(h) Fignl1 exon 3 (chromatograms for #27 and #48)
Ref. CAATATCGTCCCAATGTACTGGAGGCCCATGGTCCATAATTTC

Figure 2-S1. Generation of ESCs lacking MLL3/4 enzymatic activities

[Panels: (a) images of f/f, KO, KI #27 and KI #48 EBs at D2, D4, D7 and D10; (b) mRNA level (fold) over days 0, 2, 4, 7 and 10 for ESC identity genes (Nanog, Oct4), endoderm (Gata4, Afp), mesoderm (T, Kdr) and ectoderm (Nes, Cdh2) markers in f/f, KO, KI #27 and KI #48; (c) teratoma sections from f/f and KI #27 showing endoderm, mesoderm and ectoderm, labeled G, C, M and N.]

Figure 3. ESCs lacking MLL3/4 enzymatic activities are capable of differentiating towards the three germ layers but show defects in EB cavitation

[Panels: (a) EB differentiation scheme: hanging drop (D0-2), harvest of EBs into ultra-low-attachment dish at D2, static suspension (D2-10), cavitation from D7; (b) immunoblots of UTX, RBBP5 and NANOG at D0, D4 and D10; (c) immunoblots of H3K4me1, H3K4me2, H3K4me3 and H3; (d) EB diameter (µm) at days 2, 4, 7 and 10 for f/f, KO, KI #27 and KI #48; (e) % beating EBs at days 4 to 14 for f/f, KI #27 and KI #48; (f) images of f/f, KI #27 and KI #48 EBs at D12; (g) D12 beating frequency (s-1), p < 0.0001; (h) mRNA level (fold) of Tbx5, Myh6, Myl2 and Ttn over days 0, 2, 4, 7, 10 and 14.]

Figure 3-S1. ESCs lacking MLL3/4 enzymatic activities show defects in cardiomyogenesis

(a) Schematic of normal EB differentiation. Stages: ESC (D0), EB (D4), EB (D10). Cell types: ES cells, primitive streak-like cells, other differentiated cells, extraembryonic endoderm, ectoderm, mesoderm, definitive endoderm.

(b) PCA of RNA-seq: f/f, KO, KI #27 and KI #48 at ESC (D0), EB (D4) and EB (D10).

(c) Heat map of expression z-score (-2.5 to 2.5) for Clusters I to XII in f/f, KI #27 and KI #48 at D0, D4 and D10.

(d) GO analysis
Cluster III (289)
GO term                                                     P value
Epithelium development                                      2.1E-10
Cell surface receptor signaling pathway                     2.2E-10
Gastrulation                                                1.4E-9
Mesenchyme development                                      2.7E-9
Anatomical structure formation involved in morphogenesis    1.1E-8

Cluster IX (144)
GO term                                                     P value
Pattern specification process                               1.4E-13
Regionalization                                             3.5E-13
Anterior/posterior pattern specification                    2.6E-11
Regulation of multicellular organismal development          1.3E-10
Anatomical structure formation involved in morphogenesis    2.4E-8

(e) GSEA with cell-type markers in E6.5 embryos (Pijuan-Sala 2019)
Gene set                  NES      p-value   FDR q-value
Visceral endoderm (85)    -1.325   < 0.001   0.102
Primitive streak (96)     -0.921   0.662     0.835

(f) D4 EBs: log2 TPM (KI / f/f) for KI #27 and KI #48.
Visceral endoderm: Cer1, Hnf1b, Cxcr4, Bmp2, Cyp26a1, Foxa2, Krt8, Lhx1, Hhex, Gata4, Krt18, Cited2, Amot, Dkk1, Gata6, Lefty1.
Primitive streak: Cripto, T, Evx1, Mixl1, Gsc, Eomes, Wnt3, Fgf4, Nodal, Fgf8.

Figure 4. Loss of MLL3/4 enzymatic activities impairs visceral endoderm induction in EB differentiation

[Panels: (a) expression z-score profiles at D0, D4 and D10 in f/f, KI #27 and KI #48 with top GO terms for Cluster I (555), Cluster II (482), Cluster IV (801), Cluster V (1,599), Cluster VI (274), Cluster VII (248), Cluster VIII (634), Cluster X (179), Cluster XI (226) and Cluster XII (754); (b) GSEA with nascent mesoderm (133) markers in E6.5 embryos (Pijuan-Sala 2019): NES -1.139, p-value 0.184, FDR q-value 0.346; (c) TPM of visceral endoderm markers Cer1, Cxcr4, Bmp2, Cyp26a1, Foxa2 and Lhx1 at D0 and D4 in f/f and KI; (d) D4 EBs: log2 TPM (KI / f/f) for BMP, Wnt and Nodal pathway genes in KI #27 and KI #48; (e) TPM of genes required for cavitation, Atg5, Becn1, Rac1, Bnip3, Aifm1 and Pten, at D0, D4 and D10 in f/f and KI.]

Figure 4-S1. Loss of MLL3/4 enzymatic activities impairs visceral endoderm induction in EB differentiation

[Figure 5 panels; plots not recoverable from text extraction]
(a) MLL4+ AEs in f/f or KI cells: ESC (31,800) and D4 EB (18,474); 24,093 ESC-only, 7,707 shared, 10,767 de novo MLL4+ AEs. MLL4 binding in D4 EBs, KI vs f/f: Group I, increased in KI, 1,504 (14%); Group II, unchanged, 3,610 (34%); Group III, decreased in KI, 5,653 (52%).
(b) Distribution of 89 de novo MLL4+ AEs associated with visceral endoderm markers in E6.5 embryos, by Group I, II and III; x axis: Fraction (observed) / Fraction (expected), 0.5 to 1.5 (underrepresented to overrepresented).
(c) Motif analysis: de novo MLL4+ AEs in D4 EBs (motif, P value).
    Group I: ZIC3 8E-26; NANOG 4E-23; ZFP219 3E-21; OCT4 7E-20; ZFP281 1E-19; BCL11A 2E-19; SMAD2/3 6E-18.
    Group II: ZIC3 4E-125; TBX2 1E-61; EOMES 2E-58; MGA 1E-43; POLR2B 3E-36; TCF4 1E-33; CBFA2T2 1E-33.
    Group III: ZIC3 5E-98; CBFA2T2 6E-82; POLR2B 2E-80; GATA4 1E-77; GATA3 4E-60; LMO2 4E-69; GATA6 6E-58.
(d) RNA-seq: TFs linked to de novo MLL4+ AEs in 3 groups, TPM in ESC and D4 EB for f/f and KI.
    Group I TFs: Nanog (p = 3.0E-02), Oct4 (p = 2.1E-03), Zfp281 (p = 2.4E-03).
    Group II TFs: Eomes (n.s.), Mga (n.s.), Tcf4 (n.s.).
    Group III TFs: Gata4 (p = 8.3E-03), Gata3 (p = 2.8E-02), Gata6 (p = 2.3E-02).
(e) Genome browser tracks of MLL4, H3K4me1, H3K27ac, ATAC-seq and RNA-seq in f/f and KI ESCs and D4 EBs at Cer1 (8 kb), Gata4 (36 kb) and Gata6 (40 kb); GATA6 binding site marked.

Figure 5. MLL3/4 enzymatic activities are required for induction of visceral endoderm TFs GATA4/6 in the early phase of EB differentiation

[Figure 5-S1 panels; plots not recoverable from text extraction]
(a) MLL4+ regions in f/f or KI EBs (34,772): MLL4 signal in f/f and KI, -5 to +5 kb around MLL4 binding sites, for distal (27,863) and promoter (6,909) regions. Subgroup sizes: distal 5,869 (17%), 8,365 (24%) and 13,629 (39%); promoter 2,143 (6.2%), 2,583 (7.4%) and 2,183 (6.3%).
(b) 34,772 MLL4+ regions in EBs: average MLL4 profile in f/f and KI; y axis: normalized read count (0 to 20); x axis: distance from MLL4 binding sites (kb).

Figure 5-S1. Genomic binding profiles of MLL4 in D4 EBs

[Figure 5-S2 panels; motif logos not recoverable from text extraction]
(a) Motif analysis: 89 de novo MLL4+ AEs associated with visceral endoderm markers in E6.5 embryos (motif, P value): GATA6 1E-7; EOMES 1E-6; GATA3 1E-6; GATA2 1E-6; GATA1 1E-5; ZFP281 1E-5; GATA4 1E-5.
(b) Motif analysis: 95 de novo MLL4+ AEs associated with primitive streak markers in E6.5 embryos (motif, P value): LEF 1E-5; TBX6 1E-4; EOMES 1E-4; ZIC3 1E-4; TCF7 1E-4; FOXO3 1E-4; TCF3 1E-3.
(c) Motif analysis: 41 de novo MLL4+ AEs associated with nascent mesoderm markers in E6.5 embryos (motif, P value): ZIC2 1E-3; LHX1 1E-3; PHOX2A 1E-3.

Figure 5-S2. GATA family TFs are highly enriched in de novo MLL4+ AEs associated with visceral endoderm markers

[Figure 6 panels; plots not recoverable from text extraction]
(a) De novo MLL4+ AEs in D4 EBs (10,767): MLL4, H3K4me1, H3K27ac and ATAC-seq signal in ESC and D4 EB for f/f and KI, -5 to +5 kb around MLL4 binding sites, grouped by MLL4 binding in KI vs f/f: Group I, increased (1,504; 14%); Group II, unchanged (3,610; 34%); Group III, decreased (5,653; 52%).
(b) Average profiles on de novo MLL4+ AEs in D4 EBs for Groups I, II and III: MLL4, H3K4me1, H3K27ac and ATAC-seq in ESC f/f, ESC KI, D4 EB f/f and D4 EB KI; y axis: normalized read count; x axis: distance from MLL4 binding sites (-2 to +2 kb).
(c) Genome browser tracks of MLL4, H3K4me1, H3K27ac, ATAC-seq and RNA-seq in f/f and KI ESCs and D4 EBs at Fgf8 (19 kb).

Figure 6. MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation

[Figure 6-S1; plots not recoverable from text extraction]
Genome browser tracks of MLL4, H3K4me1, H3K27ac, ATAC-seq and RNA-seq in f/f and KI ESCs and D4 EBs at Wnt3 (61 kb), Mixl1 (20 kb) and Evx1 (29 kb).

Figure 6-S1. MLL3/4-catalyzed H3K4me1 is dispensable for enhancer activation during early EB differentiation

[Figure 6-S2 panels; plots not recoverable from text extraction]
(a) Group II de novo MLL4+ AEs in D4 EBs (3,610): MLL4, H3K4me1 and H3K27ac signal in ESC and D4 EB for f/f and KO, -5 to +5 kb around MLL4 binding sites.
(b) Average profiles on Group II de novo MLL4+ AEs in D4 EBs: MLL4, H3K4me1 and H3K27ac in ESC f/f, ESC KO, D4 EB f/f and D4 EB KO; y axis: normalized read count; x axis: distance from MLL4 binding sites (-2 to +2 kb).
(c) Genome browser tracks of MLL4, H3K4me1 and H3K27ac in f/f and KO ESCs and D4 EBs at Fgf8 (19 kb).
(d) RNA-seq tracks at Fgf8 in f/f, KO and KI ESCs and D4 EBs.
(e) Genome browser tracks of MLL4, H3K4me1 and H3K27ac in f/f and KO ESCs and D4 EBs at Wnt3 (61 kb), Mixl1 (20 kb) and Evx1 (29 kb).
(f) RNA-seq tracks at Wnt3, Mixl1 and Evx1 in f/f, KO and KI ESCs and D4 EBs.

Figure 6-S2. MLL3/4 proteins, but not MLL3/4-catalyzed H3K4me1, are required for enhancer activation during early EB differentiation

[Figure 7 panels; images and plots not recoverable from text extraction]
(a) Images of f/f, KI #27 and KI #48 cultures: NPCs (D8) and neurons (D16).
(b) RNA-seq: 2,526 genes induced from ESC to NPC; log2 TPM (KI) plotted against log2 TPM (f/f), with genes classed as up-regulated in KI, mild or no change, or down-regulated in KI.
(c) GO analysis: 2,526 genes induced from ESC to NPC (GO term, P value).
    Up-regulated in KI (42): Regionalization 8.7E-10; Pattern specification process 1.8E-8; Anterior/posterior pattern specification 2.6E-8; Embryonic morphogenesis 3.1E-8; Organ morphogenesis 1.1E-7.
    Mild or no change (2,283): Nervous system development 6.5E-79; Neurogenesis 4.1E-64; Generation of neurons 5.7E-64; Neuron differentiation 1.4E-59; Neuron development 6.6E-45.
    Down-regulated in KI (201): Blood vessel development 1.0E-15; Vasculature development 3.3E-15; Blood vessel morphogenesis 4.1E-15; Organ morphogenesis 4.5E-15; Circulatory system development 8.6E-14.
(d) NPC SEs: stitched SOX3 peaks ranked by H3K27ac signal (y axis: H3K27ac signal at enhancer); 254 super-enhancers, with associated genes and ranks: Nes (5), Notch1 (32), Ptn (33), Fgfr3 (79), Sox11 (86), Metrn (96), Pax7 (122).
(e) ChIP-seq: NPC SEs, f/f vs KI: MLL4, H3K4me1 and H3K27ac signal (RPKM) in ESC and NPC.
(f) RNA-seq: genes associated with NPC SEs (TPM) in ESC and NPC, f/f vs KI.
    P values across panels e and f, in the order extracted: p = 0.108, p < 2.2E-16, p = 0.511, p = 0.149.
(g) Genome browser tracks of MLL4, H3K4me1, H3K27ac and RNA-seq in f/f and KI ESCs and NPCs at Nes (35 kb); NPC SEs marked.
(h) Model: EB and neural differentiation of Mll3-/-;Mll4KI/KI ESCs. On: development of the three germ layers; neural differentiation. Off: visceral endoderm induction (GATA4/6); cavitation. Components shown: LDTFs, p300, MLL4*, TSS. Key: MLL4*, enzyme-dead mutant; me, H3K4me1; ac, H3K27ac.

Figure 7. MLL3/4-catalyzed H3K4me1 is dispensable for activation of super-enhancers during neural differentiation

[Figure 7-S1 panels; images and plots not recoverable from text extraction]
(a) Neural differentiation scheme: hanging drop (D0-2), static suspension (D2-8), PDL/laminin dish (D8-16). Steps as labeled: D0, - LIF, + 1/4 of 2i, PBS; D2, - LIF, - 2i, harvest EBs into ultra-low-attached dish; D4, - LIF, - 2i, + 5 μM RA; D8, NPCs, N-2 medium + BSA, dissociate NPCs; D10, neurogenesis, B-27 medium; D16, neurons.
(b) Images of f/f, KI #27 and KI #48 NPCs (D9).
(c) RNA-seq: 3,833 genes induced from ESC to neuron; log2 TPM (KI) plotted against log2 TPM (f/f), with genes classed as up-regulated in KI, mild or no change, or down-regulated in KI.
(d) GO analysis: 3,833 genes induced from ESC to neuron (GO term, P value).
    Up-regulated in KI (35): no significant term.
    Mild or no change (3,431): Nervous system development 5.7E-115; Generation of neurons 2.1E-91; Neurogenesis 8.8E-91; Neuron differentiation 3.8E-89; Neuron development 6.5E-85.
    Down-regulated in KI (367): Circulatory system development 1.5E-20; Cardiovascular system development 1.5E-20; Vasculature development 3.9E-20; Blood vessel development 6.0E-20; Organ morphogenesis 2.3E-18.
(e) NPC markers, TPM in f/f and KI: Nes, Sox3, Pax6, Pax7.
(f) Neuron markers, TPM in f/f and KI: NeuN, Tau, Ntrk2, App.
(g) GO analysis: genes associated with NPC SEs (GO term, P value): Nervous system development 5.30E-19; Generation of neurons 1.80E-18; Neurogenesis 1.80E-18; Central nervous system development 8.80E-16; Positive regulation of RNA metabolic process 1.00E-15; Neuron differentiation 2.30E-15.
(h) Genome browser tracks of MLL4, H3K4me1, H3K27ac and RNA-seq in f/f and KI ESCs and NPCs at Ptn (97 kb) and Pax7 (51 kb); NPC SEs marked.

Figure 7-S1. MLL3/4-catalyzed H3K4me1 is dispensable for activation of super-enhancers during neural differentiation