ARTICLE

https://doi.org/10.1038/s41467-020-16989-w OPEN Identification of distinct loci for de novo DNA methylation by DNMT3A and DNMT3B during mammalian development

Masaki Yagi1,11,12, Mio Kabata2,3,12, Akito Tanaka2, Tomoyo Ukai1, Sho Ohta1, Kazuhiko Nakabayashi4, ✉ Masahito Shimizu3, Kenichiro Hata4, Alexander Meissner 5,6,7, Takuya Yamamoto 2,8,9,10 & ✉ Yasuhiro Yamada 1,8 1234567890():,; De novo establishment of DNA methylation is accomplished by DNMT3A and DNMT3B. Here, we analyze de novo DNA methylation in mouse embryonic fibroblasts (2i-MEFs) derived from DNA-hypomethylated 2i/L ES cells with genetic ablation of Dnmt3a or Dnmt3b. We identify 355 and 333 uniquely unmethylated in Dnmt3a and Dnmt3b knockout (KO) 2i-MEFs, respectively. We find that Dnmt3a is exclusively required for de novo methylation at both TSS regions and bodies of Polycomb group (PcG) target developmental genes, while Dnmt3b has a dominant role on the X . Consistent with this, tissue-specific DNA methylation at PcG target genes is substantially reduced in Dnmt3a KO embryos. Finally, we find that human patients with DNMT3 mutations exhibit reduced DNA methylation at regions that are hypomethylated in Dnmt3 KO 2i-MEFs. In conclusion, here we report a set of unique de novo DNA methylation target sites for both DNMT3 enzymes during mam- malian development that overlap with hypomethylated sites in human patients.

1 Division of Stem Cell Pathology, Center for Experimental Medicine and Systems Biology, Institute of Medical Science, University of Tokyo, Tokyo 108-8639, Japan. 2 Department of Life Science Frontiers, Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto 606-8507, Japan. 3 Department of Gastroenterology/Internal Medicine, Gifu University Graduate School of Medicine, Gifu 501-1194, Japan. 4 Department of Maternal-Fetal Biology, National Research Institute for Child Health and Development, Tokyo 157-8535, Japan. 5 Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany. 6 Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA. 7 Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA. 8 AMED-CREST, AMED 1-7-1 Otemachi, Tokyo 100-0004, Japan. 9 Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto 606-8501, Japan. 10 Medical-risk Avoidance based on iPS Cells Team, RIKEN Center for Advanced Intelligence Project (AIP), Kyoto 606-8507, Japan. 11Present address: Department of Molecular Biology, Massachusetts General Hospital, Boston, MA ✉ 02114, USA. 12These authors contributed equally: Masaki Yagi, Mio Kabata. email: [email protected]; [email protected]

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

NA methylation plays critical roles during mammalian 2i/L ES cells represent a powerful tool for de novo methylation Ddevelopment and ensures stable gene repression, genomic during early embryonic development. integrity, inactivation (XCI), and pre- In this study, to identify the unique targets of de novo servation of genomic imprinting1–3. Three conserved DNA methylation by the DNMT3 enzymes, we used female mouse 2i/L methyltransferases, DNMT3A, DNMT3B, and DNMT1, are ES cells with targeted deletions in either Dnmt3a or Dnmt3b responsible for de novo establishment and maintenance of DNA (Fig. 1a). Dnmt3 KO 2i/L ES cells grow well and are methylation patterns4. The significance of DNA methyl- morphologically normal under 2i/L culture conditions16.In transferases for mammalian development is highlighted by the addition, they efficiently contribute to chimeric embryos as fact that genetic deletions of any of these enzymes have lethal efficiently as wild-type (WT) 2i/L ES cells, indicating that Dnmt3 phenotypes in mice5,6. Moreover, dysregulation of DNMTs is KO 2i/L ES cells retain pluripotency and the ability to linked to several human diseases, including acute myeloid leu- differentiate into somatic cells in vivo16. To compare the targets kemia (AML) and immunodeficiency, centromere instability, and of each DNA methyltransferase, we established mouse embryonic facial anomalies (ICF) syndrome, indicating that DNA methyl- fibroblasts (MEFs) from chimeric mice derived from female transferases are essential for normal mammalian development Dnmt3 KO 2i/L ES cells (Dnmt3 KO 2i-MEFs), followed by and proper maintenance of somatic cells7,8. antibiotic selection, and assessed DNA methylation profiles by Previous studies revealed the epigenetic mechanisms that whole-genome bisulfite sequencing (WGBS) and target-captured establish the strict methylome integrity required for normal MethylC-seq analysis (Supplementary Fig. 1b). Considering that development. After epigenetic reprogramming in pre- the 5 hmC level is negligible in ESCs18, we regarded both 5 hmC implantation embryos, the DNA methyltransferases DNMT3A and 5 mC as 5 mC in this study. and DNMT3B, in conjunction with their nonenzymatic cofactor CpG methylation on all was lower in both DNMT3L, mediate global de novo DNA methylation during Dnmt3a and Dnmt3b KO 2i-MEFs than in WT 2i-MEFs (Fig. 1e embryonic and extraembryonic development5,9,10. DNMT3A and and Supplementary Fig. 1c). Extending the analysis to various DNMT3L establish genomic imprinting in gametes11–13, whereas genomic features, including transposable elements, we found that DNMT3B and DNMT3L are required for unique DNA methy- both Dnmt3a and Dnmt3b KO decreased DNA methylation levels lation patterns in placental progenitors10. The genomic enrich- on all elements (Fig. 1f, g). Consistent with a previous study19, ment patterns of DNMT3A differ from those of DNMT3B in Dnmt3b KO 2i-MEFs exhibited a clear reduction in DNA mouse embryonic stem (ES) cells, suggesting that each de novo methylation on transposable elements relative to WT or Dnmt3a methyltransferase has unique targets during development14,15. KO 2i-MEFs (Fig. 1g). As shown in our previous study16, 2i- Indeed, individual genetic ablation of these methyltransferases MEFs exhibited global loss of genomic imprinting regardless of results in lethal phenotypes at different developmental stages5. Dnmt3 status (Supplementary Fig. 1d). When we divided all CpG However, it remains unclear whether the differential targeting of sites into three categories according to DNA methylation levels DNMT3 enzymes is responsible for -specific de novo DNA [hypermethylated (>0.8), intermediate (0.2–0.8), and hypomethy- methylation during embryonic development. lated (<0.2) sites], we found that hypermethylated CpG sites were To comprehensively identify the target sites for de novo DNA less abundant in Dnmt3b KO 2i-MEFs, whereas Dnmt3a KO 2i- methylation by the DNMT3 enzymes, in the present study, we MEFs exhibited relatively modest reduction (Fig. 1h). The took advantage of female mouse ES cells established in the pre- reduced methylation level in Dnmt3b KO 2i-MEFs was observed sence of MEK and Gsk3 inhibitors, which lack most DNA at regions marked by various histone marks, such as H3K36me3 methylation (2i/L ES cells)16. In combination with genetic abla- and H3K9me3 (Supplementary Fig. 2a)14. We also found that tion of Dnmt3a or Dnmt3b, here we report unique de novo DNA Dnmt3b KO epiblasts exhibit reduced CpG methylation levels at methylation target sites for both DNMT3 enzymes during de novo methylated regions during peri-implantation (from ICM embryonic development. to post-implantation epiblasts) relative to WT or Dnmt3a KO epiblasts (Supplementary Fig. 2b). Together, these results suggest that DNMT3B de novo DNA methylation activity is predominant Results in most genomic regions during early development. DNA hypomethylated ESCs for de novo methylation analysis. Consistent with a previous report20, methylation levels at CGIs Given the early embryonic lethal phenotype of Dnmt3b knockout on the X chromosome were markedly lower in Dnmt3b KO 2i- (KO) mice, ES cells could be a powerful tool for tracking and MEFs than in Dnmt3a KO 2i-MEFs (Fig. 1i and Supplementary comparing the de novo methylation activity of DNMT3s during Fig. 2c) (3a KO vs WT: p = 3.6 × 10−1, 3b KO vs WT: p = 1.2 × embryonic development (Fig. 1a). It should be noted, however, 10−15; two-sided, Kolmogorov–Smirnov test), suggesting that that mouse ES cells cultured in serum and leukemia inhibitory Dnmt3b is largely responsible for de novo CGI methylation on factor (LIF) (S/L ES cells) exhibit global DNA hypermethylation the X chromosome. Non-CGI sites were also less methylated in relative to inner cell mass (ICM) cells, which are the in vivo Dnmt3b KO 2i-MEFs, which was more pronounced on the X counterpart of ES cells17. Indeed, de novo methylated regions at chromosome than on autosomes (Fig. 1i and Supplementary post-implantation epiblasts in vivo are already methylated in S/L Fig. 2c–e) (3a KO vs WT: p < 2.2 × 10−16, 3b KO vs WT: p < 2.2 × ES cells (Fig. 1b, c). Moreover, Polycomb group (PcG) target 10−16; two-sided, Kolmogorov–Smirnov test). In sharp contrast, genes, which often include key developmental Dnmt3a KO 2i-MEFs did not exhibit an apparent reduction in genes, are highly methylated in S/L ES cells, but still hypo- non-CGI methylation on the X chromosome (Fig. 1i and methylated in both ICM and epiblast (Fig. 1d and Supplementary Supplementary Fig. 2c). Collectively, these results suggest that Fig. 1a), precluding the use of S/L ES cells for analysis of de novo DNMT3B is the enzyme predominantly responsible for global de methylation at early developmental stages. In a previous study, we novo methylation on the X chromosome during XCI. showed that mouse ES cells established under 2i/L (MEK inhi- bitor, Gsk3 inhibitor, and LIF) culture conditions (2i/L ES cells) exhibit a substantial reduction in global DNA methylation levels16. Target genes of de novo methylation by DNMT3A and Female 2i/L ES cells lack DNA methylation at most sites, including DNMT3B. We next sought to identify, in an unbiased manner, the PcG target genes (Fig. 1b–d and Supplementary Fig. 1a). unique target loci of DNMT3A- and DNMT3B-mediated de novo Collectively, these findings indicated that hypomethylated female DNA methylation during embryonic development. To this end,

2 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w ARTICLE

abCpG site: n = 861,755 DNA hypomethylated ESCs 1.0 2i/L ESC (XX) 0.8

2i-MEFs 0.6 Wild type (WT) DNA methylation Methyl-seq, WGBS 0.4

RNA-seq CpG methylation 0.2 Dnmt3a KO 0.0 ICM Epiblast 2i-ESC S-ESC 2i-MEF In vivo In vitro Dnmt3b KO

c 51,900 kb 52,100 kb 52,300 kb 52,500 kb d PcG target genes Gene: n = 206

ICM 0.8

Epiblast 0.6 2i/L ESC 0.4 S/L ESC DNA methylation (%) [0–100]

2i-MEF CpG methylation 0.2

Skap2 Halr1 Hoxa1 Hoxa9Hoxa13 Evx1os 1700094M24Rik Hibadh Hoxa3 Hoxa11 Hoxaas3 0.0 CGI ICM Epiblast 2i-ESC S-ESC 2i-MEF In vivo In vitro

e f CpG island TSSnonTSS TSS Exon Intron WGBS 0 chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chrX chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 0.1 –0.1

–0.1 –0.2

–0.3 –0.3 –0.4

to wild type MEFs (log2) 2i-MEF –0.5 –0.5 to wild type MEFs (log2) 2i-MEF Relative CpG methylation level Dnmt3a KO Dnmt3b KO

Relative CpG methylation level Dnmt3a KO –0.6 Dnmt3b KO –0.7

gh100 i CGIs on Non-CGIs on X chromosome X chromosome IAP LINE SINE 80 0 WGBS WGBS 1.0 1.0 –0.05 60

0.8 –0.1 40 0.8 Percentage –0.15 0.6 20 0.6 –0.2 0 0.4 0.4 CpG methylation –0.25 WT 3a KO 3b KO CpG methylation –0.3 0.2 0.2

to wild type MEFs (log2) Dnmt

Relative CpG methylation level 2i-MEF –0.35 0.0 0.0 2i-MEF Hypermethylated (>0.8) WT 3a KO 3b KO WT 3a KO 3b KO Dnmt3a KO Intermediate (0.2-0.8) Dnmt3b KO Hypomethylated (<0.2) Dnmt Dnmt 1-kb tiles: ex1: n = 1,746,441 2i-MEF 2i-MEF ex2: n = 1,480,021 we first identified differentially methylated tiles (DMT; non- (4308 and 7357 DMTs in Dnmt3a and Dnmt3b KO 2i-MEFs, overlapping 1k tiles) that differed significantly between WT 2i- respectively) (Fig. 2a and Supplementary Data 1). A small subset MEFs and each KO 2i-MEFs by using WGBS data. Consistent of DMTs (n = 143) exhibited a modest reduction in DNA with previous reports5,21, DNA methylation levels at most regions methylation in both Dnmt3a KO and Dnmt3b KO 2i-MEFs, did not significantly differ between Dnmt3a and Dnmt3b KO 2i- suggesting that these loci are sufficiently methylated only when MEFs, but we identified thousands of DMTs in either of the KO DNMT3A and DNMT3B are both present (Fig. 2a and

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

Fig. 1 DNA hypomethylated ES cells for de novo methylation analysis. a Schematic of experimental design. Either Dnmt3a or Dnmt3b was disrupted by CRISPR/Cas9 in female 2i/L ES cells16. b CpG methylation levels at loci that were differentially methylated between ICM and epiblast (see the methods section for details) in 2i/L ES cells and S/L ES cells. WGBS data were used for the analysis. White dots indicate median methylation levels. Black bars and the lines stretched from the bar represent IQR and the lower/upper adjacent values (±1.5 IQR), respectively. ICM and epiblast data were obtained from GSE8423610. Data of 2i/L ES cells, S/L ES cells, and 2i-MEFs are from GSE8416516. c CpG methylation levels at a representative genomic region including the Hoxa cluster, as determined by WGBS. Each bar indicates a CpG site, and the height of the bar represents methylation percentage (0–100%). Locations of genes and CpG islands (CGIs) are indicated below. Data are from GSE8423610 and GSE8416516. d CpG methylation levels at PcG target developmental genes in ICM, epiblast, 2i/L ES cells, S/L ES cells, and 2i-MEFs, as determined by WGBS. Data were obtained from GSE8423610 and GSE8416516. e Relative

CpG methylation level [log2(fold change)] at each chromosome in Dnmt3 KO 2i-MEFs vs. wild-type 2i-MEFs, as determined by WGBS. Data from two independent experiments are shown. f Relative CpG methylation levels [log2(fold change)] at CGIs, promoters [Transcription Start Site (TSS) ± 1000 bp], exons, and introns in Dnmt3 KO 2i-MEFs vs. wild-type 2i-MEFs, as determined by MethylC-seq. g Relative CpG methylation levels [log2(fold change)] at transposable elements (IAPs, LINEs, and SINEs) in Dnmt3 KO 2i-MEFs vs. wild-type 2i-MEFs by WGBS. h: Fraction of hypermethylated (>0.8), intermediate (0.2–0.8), and hypomethylated (<0.2) CpG sites in wild-type and each type of KO 2i-MEFs. WGBS data (CpG sites at ≥5× coverage) were used for this analysis. i CpG methylation levels at CGIs and non-CGIs on the X chromosome in WT, Dnmt3a KO, and Dnmt3b KO 2i-MEFs, as determined by WGBS. White dots indicate median methylation levels.

Supplementary Data 1). To uncover the specific target genes of Data file). Based on these findings, we concluded that Dnmt3a each methyltransferase, we used MethylC-seq data that are confers de novo DNA methylation at PcG target developmental enriched and have high coverage reads in gene regions. Notably, genes upon ES cell differentiation. we identified 355 and 333 specific target genes of DNMT3A and To more thoroughly examine DNMT3A-mediated de novo DNMT3B, respectively (Fig. 2b, c and Supplementary Fig. 3a, b) methylation at PcG target developmental genes, we divided the (Supplementary Data 2–5). (GO) analyses region of each PcG target gene into the Transcription Start Site revealed that DNMT3B-specific genes are involved in synapto- (TSS) region (TSS ± 1000 bp) and gene body [TSS to Transcrip- nemal complex assembly (e.g., Tex12, Syde1, and Meioc), plasma tion End Site (TES)] and analyzed the methylation patterns membrane (e.g., Pcdha/b family), and cell proliferation in fore- (Fig. 2e). The TSS regions of a subset of PcG target developmental brain (e.g., Wnt7a, Wnt7b, Wnt3a, and Zeb2) (Supplementary genes exhibited lower levels of de novo methylation in Dnmt3a Fig. 3c, and Supplementary Data 4 and 5). Our analysis identified KO 2i-MEFs than in WT 2i-MEFs (Fig. 2e) [3a KO vs WT (mean the previously reported DNMT3B-specific targets, Pcdha/b family methylation levels, TSS): p < 2.2 × 10−16, 3b KO vs WT (mean genes22, confirming that our strategy can successfully identify the methylation levels, TSS): p = 5.4 × 10−14; two-sided, Wilcoxon target genes of the de novo methyltransferase (Supplementary signed-rank test]. In contrast to the repressive role of DNA Fig. 3a). By contrast, DNMT3A target genes included many methylation at TSS regions, gene body methylation is generally transcription factors that are essential for normal development, associated with active transcription24. Consistent with this, gene including genes of the Hox, Gata, and Cbx families (Fig. 2c and bodies of more highly expressed PcG target developmental genes Supplementary Data 3 and 4). Consistent with this, GO analyses tended to have higher methylation levels in 2i-MEFs (Fig. 2e). In indicated that DNMT3A target genes are involved in sequence- addition, DNMT3B binds to gene bodies of actively transcribed specific DNA binding (e.g., Nr6a1, Gata2, Hoxc9, and Pou5f1), genes marked by H3K36me3, and is responsible for de novo DNA transcription (e.g., Zfp46, Barx1, and Osr1), and pharyngeal sys- methylation at the gene bodies14,25. Indeed, H3K36me3-marked tem development (e.g., Gata3, Six1, and Tbx1) (Supplementary regions were less methylated in Dnmt3b KO 2i-MEFs than in Fig. 3d). Because the DNMT3A target genes included many Dnmt3a KO 2i-MEFs (Supplementary Fig. 2a). Importantly, developmental genes that are also known as PcG target tran- however, gene bodies of PcG target genes were less methylated in scription factor genes23, we next hypothesized that DNMT3A Dnmt3a KO 2i-MEFs than in Dnmt3b KO 2i-MEFs (Fig. 2e regulates de novo DNA methylation at PcG target developmental and Supplementary Fig. 5a) [3a KO vs WT (Gene body): p < 2.2 × genes during embryonic development. Notably, most of the PcG 10−16, 3b KO vs WT (Gene body): p = 8.9 × 10−3; two-sided, target genes that acquire de novo DNA methylation during Wilcoxon signed-rank test]. The reduced methylation at gene embryonic development were strikingly hypomethylated in bodies in Dnmt3a KO 2i-MEFs was similarly observed at those Dnmt3a KO 2i-MEFs (Fig. 2d and Supplementary Fig. 3e) [3a KO distant from TSS (Supplementary Fig. 5b), indicating that the vs WT (mean methylation levels): p = 5.6 × 10−48,3bKOvsWT observed changes are not always associated with close proximity (mean methylation level): p = 8.2×10−16, Wilcoxon signed-rank of CpG islands linked to promoters. Collectively, these results test]. For example, our data showed that all clusters suggest that, in sharp contrast to other genes, gene bodies of PcG (Hoxa, Hoxb, Hoxc, and Hoxd) located on different chromo- target developmental genes are preferentially methylated by somes, as well as other PcG target transcription factors, were DNMT3A, consistent with a recent study demonstrating that specifically methylated by DNMT3A (Fig. 2c, Supplementary DNMT3A1, a longer isoform of DNMT3A, preferentially Fig. 3b, and Supplementary Data 3). The reduced DNA methy- localizes to the methylated shores of bivalent CGIs in ES cells15. lation in Dnmt3a KO 2i-MEFs was observed in both short and Given that DNMT3A, together with DNMT3L, establishes long genes (Supplementary Fig. 4a). To confirm DNMT3A- genomic imprinting in gametes, we examined the involvement mediated de novo methylation, we next examined the effect of of DNMT3L in DNMT3A-mediated de novo methylation at PcG ectopic Dnmt3a expression on reduced methylation at PcG target target genes. However, methylation levels of PcG target genes developmental genes in Dnmt3a KO cells. In these experiments, were comparable between Dnmt3l KO and WT E14.5 MEFs we introduced a doxycycline (Dox)-inducible Dnmt3a1 expres- (Supplementary Fig. 5c), demonstrating that DNMT3L is sion vector into Dnmt3a KO 2i/L ES cells, and then differentiated dispensable for DNMT3A-mediated methylation at PcG target them into epiblast-like cells (EpiLCs) (Supplementary Fig. 4b). genes during embryonic development. Forced expression of Dnmt3a1 in Dnmt3a KO cells increased de novo DNA methylation at Hoxa1 and Meis1 to a level similar to Differential binding patterns of DNMT3A1 and DNMT3B1. that observed in WT cells (Supplementary Fig. 4c, d and Source To obtain mechanistic insights into the locus specificity of

4 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w ARTICLE

a Both DNMT3A specific DNMT3B specific n = 143 n = 4308 n = 7357 Rep 1 Wild type Rep 2 Rep 1 Dnmt3a KO Rep 2 Rep 1 Dnmt3b KO Rep 2

DNA methylation (%) 0 255075100

b Tdrd1 Cdkn1c chr19:56,825,469-56,827,143 chr7:143,458,663-143,461,197 Rep 1 Wild type Rep 2 Rep 1 Dnmt3a KO Rep 2 Rep 1 [0–100] Dnmt3b KO Rep 2 DNA methylation (%) Tdrd1 Cdkn1c CGI

d Dnmt

WT 3a KO 3b KO Grhl2 Ar Nfatc1 Zfp467 Grhl2 Tox Tshz1 Cbx4 Bcl11a Zeb2 Ar Irf2 c chr6:52,148,644-52,278,054 Npas2 Zfpm1 Bach2 Pcdh1 Nfatc1 Ebf1 Epas1 Rep 1 Hlx Hmga2 Lef1 Nfix Zfp467 Wild type Meis1 Dpf3 Eomes Rep 2 Ebf3 Mkx Tox Zfpm2 Hand1 Dach1 Ankrd33b Rep 1 Ebf2 Tshz1 Fli1 Lmx1a Hey2 Hoxa4 Dnmt3a KO Hoxa1 Cbx4 Hoxa3 Rep 2 Hoxb3 Hoxa2

[0–100] Hoxa5 Foxf2 Bcl11a Gata4 Hoxb7 Hoxa6 Rep 1 Tbx2 Trp73 Zeb2 Hoxc4 En1 Dnmt3b KO Pax8 Tbx4 Rep 2 Tbx5 Irf2 Vax2

DNA methylation (%) Hoxb6 Nr2f2 Egr3 Sox7 Hsf4 Npas2 Creb1 Pax3 Vsx2 Hoxa1 Hoxa2 Hoxa3 Hoxaas3 Mira Hoxa9 Hoxa10 Hoxa11 Hoxa13 Dlx3 Barx1 Zfpm1 Hoxb1 Irf5 Atf3 Wt1 CGI Dmrt3 Bach2 Irx5 Gata3 Tbx15 Sowahb Hoxa11 Pcdh1 Atoh8 Msx2 Otx1 chr11:96,267,573-96,369,327 Pou3f4 Ebf1 Dlx2 Lhx6 Foxf1 Foxc1 Epas1 Gbx2 Rep 1 Hoxc6 Hoxc10 Tbx20 Onecut3 Wild type Cbx8 Hlx Msx1 Vsx1 Rep 2 Hoxb13 Onecut2 Pax2 Hmga2 Pax7 Irx2 Hoxc5 Rep 1 Hoxc9 Hoxd9 Lef1 Sim1 Hes5 Gata6 Dnmt3a KO Barx2 Mafb Nfix Rep 2 Shox2 Pax1 [0–100] Onecut1 Alx3 Dmbx1 Meis1 Tbr1 Rep 1 Gsc Pax6 Irx1 Hoxa10 Dpf3 Dmrta2 Dnmt3b KO Six2 Pax9 Rep 2 Sox9 Pitx3

DNA methylation (%) Eomes Pitx1 Twist2 DNA methylation (%) Isl2 Foxd1 Evx2 Lhx2 Ebf3 Hoxb9 Hoxb8 Hoxb5os Mir10a Hoxb3 Hoxb3 Hoxb2 Gata5 Nkx2−2 Hoxb4 Dlx1 Zic1 Mkx Tlx1 Zic4 CGI Ovol1 Hoxd10 Foxq1 Zfpm2 Vax1 Tbx18 Nkx2−5 Pdx1 Foxc2 Hand1 Bhlhe41 chr6:48,437,519-48,439,826 Difference in methylation level (WT-KO) (%) Tal1 Gsc2 Six1 Dmrt2 Dach1 Nkx6−1 Olig3 Hoxb2 Rep 1 Tcf21 Nhlh2 Ankrd33b Hoxd11 Cdx2 Wild type Phox2b Tfap2d Rax Ebf2 Rep 2 Pou3f3 Hmx1 Otp En2 Sox18 Six3 Fli1 Rep 1 Alx1 Hoxc12 Foxe1 Dmrta1 Cited2 Lmx1a Dnmt3a KO Atoh1 Vgll2 Rep 2 Tfap2b Nr2e1

[0–100] Bhlhe23 Hey2 Cebpa Lhx5 Six6 Hoxb8 Rep 1 Gsx2 Hoxa4 Dbx1 Mnx1 Foxb2 Dnmt3b KO Bsx Pou4f2 Hoxa1 Rep 2 Nkx2−3

Zfp503 (WT-KO) (%) Fev DNA methylation (%) Nkx2−4 Uncx Hoxa3 Sox21 Olig2 DNA methylation (%) Irx3 Hes7 Zfp467 Helt Hoxb3 Pou4f3

Hoxd13 Difference in methylation level Nkx2−9 Pou3f2 Neurod2 Tlx3 Hoxa2 CGI Foxl2 Neurog1 Gsx1 Barhl2 Foxd4 Hoxa5 Neurog3 Neurog2 Id3 Ascl1 Mafa Foxf2 Jun 0 100 n = 206 −25 25

e Expression 5 level 0 n = 200 TSS Gene body Wild type TSS Gene body TSS Gene body Dnmt3a KO TSS Gene body TSS Gene body Dnmt3b KO TSS Gene body fa x7 un e1 ol1 Id3 Irf2 Irf5 Hlx xc9 Tox xc6 Isl2 xc5 x J xb4 Fli1 Otp xd4 x21 x18 ax8 o ax6 ax7 Irx1 Irx3 Irx2 Fev Irx5 Bsx Nfix Wt1 En1 En2 Gsc Rax Helt Tal1 Myc Mkx Zic1 atc1 Tlx1 Tlx3 Zic4 Six1 Alx1 Lef1 Alx3 Six3 Six2 Six6 Dlx1 Dlx3 Dlx2 r Ebf1 Ebf3 Tbr1 Hsf4 Ebf2 oxc2 o Otx1 Dpf3 Egr3 o f oxq1 o Lhx2 xd10 Lhx6 Lhx5 Ma Vax1 Vax2 Pitx1 Mafb Vsx1 Evx1 Tbx4 Pitx3 Tbx2 Vsx2 Evx2 Vgll2 Pax1 Pax3 Tbx5 Pax9 Sox9 Pax2 P P Zeb2 P Pdx1 Sim1 S Uncx Hey2 Cbx4 Gsx1 Dbx1 Olig3 Cbx8 Gsc2 Cdx2 Olig2 Gsx2 Hes5 Hes7 Msx1 Foxl2 Msx2 Gbx2 Foxf1 Nr2f2 Mnx1 Ascl1 Foxf2 Grhl2 Tcf21 T Ov Hmx1 Atoh1 Barx1 Atoh8 Barx2 Nhlh2 Foxc1 Meis1 F F Creb1 Foxd1 Tshz1 Fo F Nr2e1 Gata3 Foxb2 Gata6 Gata5 Dmrt3 Dmrt2 Tbx18 Tbx20 Tbx15 Ho So Shox2 Ho S Ho Zfpm1 Pcdh1 Epas1 Zfpm2 Bach2 Hoxa4 Hoxd9 H Hoxa3 N Hoxa1 Hoxb1 Hoxb3 Hoxb2 Hoxa2 Hoxb8 Hoxb6 Hoxa6 Hoxa5 Cited2 Twist2 Dach1 Hoxb7 Npas2 Barhl2 owahb Cebpa Hand1 Lmx1a Bcl11a Tfap2b Zfp503 Eomes Pou3f4 Pou4f3 Pou3f3 Pou3f2 Pou4f2 Zfp467 Dmbx1 Hmga2 Dmrta2 Phox2b Hoxc10 Hoxc12 Hoxa11 Hoxd11 Hoxd13 Nkx2−4 Nkx2−9 Hoxb13 Nkx6−1 Nkx2−3 Hoxa10 Nkx2−2 Ho Nkx2−5 S Bhlhe41 Bhlhe23 Onecut1 Onecut3 Onecut2 Neurog1 Neurog3 Neurod2 Neurog2 Ankrd33b

0100DNA methylation (%) Id3 Irf2 Jun Irx1 Irx3 Irx2 Irx5 Wt1 En1 Nfix Myc Zic1 Six1 Alx1 Ebf1 Pitx1 Mafb Sox9 Zeb2 Cbx4 Cbx8 Nr2f2 Atoh8 Meis1 Foxc1 Tshz1 Creb1 −25 25 Gata6 Dmrt2 Foxc2 Zfpm1 Pcdh1 Tbx15 Nfatc1 Hoxc6 Zfpm2 Shox2 Hoxc5

Hoxb8 Difference in methylation level Cebpa Twist2 Hoxb2 Cited2 Hoxa5 Hoxb7 Zfp503 Zfp467 Hmga2 Hoxc10 Hoxa10 (WT-KO) (%)

DNMT3 enzymes for de novo methylation, we examined geno- target genes of DNMT3B-mediated de novo methylation mic enrichment patterns of DNMT3A and DNMT3B by ana- (Fig. 3a), suggesting the existence of distinct mechanisms gov- lyzing ChIP-seq datasets14,15. Consistent with previous reports erning the locus-specific activity of de novo methyltransferases. demonstrating that DNMT3B is responsible for gene body Considering that DNMT3A mediates de novo methylation at PcG methylation of actively transcribed genes14,25, we confirmed target developmental genes, we next analyzed the binding pat- enrichment of DNMT3B1, but not DNMT3A1, in gene bodies terns of the DNMT3 enzymes on the PcG target genes. We (Fig. 3a). Notably, DNMT3A, and especially DNMT3A1, was observed prominent enrichment of DNMT3A1 binding on PcG enriched at DNMT3A target genes (Fig. 3a and Supplementary target developmental genes, including enrichment in gene bodies Fig. 6a). By contrast, enrichment of DNMT3B1 was not altered at (Fig. 3b). Remarkably, DNMT3B1 binding was reduced on the

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

Fig. 2 Target genes of de novo methylation by DNMT3A and DNMT3B. a Heatmap of CpG methylation levels within 1 K tiled regions in which CpG methylation levels differed between the WT and each Dnmt3 KO. Differentially methylated 1 K tiled regions are identified using the Bioconductor DSS package with methylation status of nonoverlapping 1 K tiles that have ≥10 read coverages at CpG sites within a tile across all samples (see the methods section in detail). b Representative loci of DNMT3B-specific target sites in 2i-MEFs. Each bar indicates a CpG site, and the height of the bar represents methylation percentage, as determined by MethylC-seq (0–100%). Locations of genes and CGIs are indicated below. c Representative loci of DNMT3A- specific target sites in 2i-MEFs. Essential developmental genes are frequently unmethylated in Dnmt3a KO 2i-MEFs. Each bar indicates a CpG site, and bar height represents methylation percentage by MethylC-seq (0–100%). Locations of genes and CGIs are indicated below. d The left panel shows mean DNA methylation levels at CpG sites within each PcG target developmental gene23 [from TSS to Transcription End Site (TES)] in WT 2i-MEFs from two independent experiments. The right panel shows methylation difference between WT and each Dnmt3 KO 2i-MEFs from two independent experiments. Genes are sorted by the mean methylation levels in the WT. Color scales indicate DNA methylation levels and their differences, respectively. e The upper panel shows mean DNA methylation levels at CpG sites within TSS ± 1000 bp and gene body (TSS-TES) of each PcG target gene in WT 2i-MEFs from two independent experiments. The lower panel shows methylation difference between WT and each type of Dnmt3 KO 2i-MEFs from two independent experiments. Color scale indicates DNA methylation levels and their difference, respectively. PcG genes are ordered based on expression levels (FPKM values) in WT 2i-MEFs (top panel).

PcG target genes23 (Fig. 3b), indicating that DNMT3B1 is pre- a massive erasure of genomic imprints and harbor an impaired cluded from these loci. Among DNMT3A target genes, further autonomous developmental potential16, which raised the possi- enrichment of DNMT3A1 was observed on PcG target develop- bility that the observed changes in DNA methylation are not mental genes (Fig. 3c), and reduced binding of DNMT3B1 was directly associated with DNMT3 de novo DNA methylation detected exclusively at DNMT3A target genes that were also PcG activity. To investigate DNMT3A-mediated de novo methylation target genes (Fig. 3c), highlighting unique binding patterns of during embryonic development in vivo, we obtained various tis- DNMT3 enzymes at PcG target loci. Together, these results sues (brain, limb, heart, lung, and liver) from Dnmt3a WT and suggest that the preferred physical interactions of DNMT3A and null embryos at E14.5 (Fig. 5a, b). We then performed MethylC- DNMT3B are associated with the locus-specific activity of the seq analysis to compare DNA methylation patterns at PcG target enzymes at PcG target developmental genes. developmental genes in different tissues (Supplementary Fig. 1b). Notably in this regard, PcG target genes in different tissues Dnmt3a exhibited distinct DNA methylation levels in WT embryos. For Derepression of PcG target genes in KO 2i-MEFs. example, the Hoxb cluster is methylated in the heart and liver, but Given that DNA methylation at CGIs linked to TSS regions are remains unmethylated in the brain (Fig. 5c). Notably, genetic associated with gene repression, we next investigated the func- ablation of Dnmt3a resulted in a reduction in DNA methylation tional implications of DNMT3-mediated de novo methylation on at the Hoxb cluster in the heart and liver (Fig. 5c). Moreover, gene expression during embryonic development. Because Dnmt3b other PcG target developmental genes such as Chromo, T-box, KO 2i-MEFs exhibited global CGI hypomethylation at the X fi fi and Hox families also exhibited tissue-speci c methylation pat- chromosome, we rst performed RNA-seq analysis to compare terns in WT organs, but were hypomethylated in Dnmt3a KO global gene expression profiles of X-linked genes between WT and – fi tissues (Fig. 5d e and Supplementary Fig. 7). Conventional both types of Dnmt KO 2i-MEFs. Despite the signi cant reduction bisulfite sequencing analysis confirmed that the tissue-specific in CGI methylation in Dnmt3b KO 2i-MEFs, expression levels of methylation at Meis1 is mediated by DNMT3A (Supplementary X-linked genes were comparable among WT, Dnmt3a KO, Fig. 8a). Consistent with the notion that PcG target develop- and Dnmt3b KO (3a KO: p = 7.6 × 10−1, 3b KO: p = 8.2 × 10−1) – mental genes are regulated by cooperative epigenetic mechan- (Fig. 4a), suggesting that DNA methylation independent isms, most PcG target genes were expressed at similar levels in mechanisms ensure dosage compensation of X-linked genes in 26 WT and Dnmt3a KO tissues (Fig. 5f and Supplementary Fig. 8b). Dnmt3b KO 2i-MEFs . We also examined expression of PcG Notably, gene bodies of transcriptionally active PcG target target developmental genes in WT and Dnmt KO 2i-MEFs. developmental genes were highly methylated in both the heart Although reduced CGI methylation at PcG target developmental and liver, but were less methylated in Dnmt3a KO tissues (Sup- genes was exclusively detectable in Dnmt3a KO 2i-MEFs, plementary Fig. 8c), affirming that de novo methylation at gene expression levels of most PcG target developmental genes were not bodies of PcG target developmental genes is mediated by altered in Dnmt3a KO 2i-MEFs (3a KO: p = 7.3 × 10−1, 3b KO: p = −1 DNMT3A in vivo. Taken together, these results underscore the 6.5 × 10 ) (Fig. 4b). However, when we focused on the TSS essential role of DNMT3A in establishing tissue-specific DNA regions of PcG target developmental genes expressed at low levels, methylation patterns at PcG target developmental genes during which were methylated in WT 2i-MEFs (>15%), we found that a embryonic development. small subset of the PcG target genes (e.g., Pax8) was modestly derepressed in Dnmt3a KO 2i-MEFs, accompanied by loss of DNMT3 DNA methylation at TSS regions (Fig. 4c–e and Supplementary Reduced DNA methylation in patients with muta- Fig. 6b). To further investigate DNMT3A-mediated repression of tions. Finally, we investigated the unique functions of DNMT3A PcG target developmental genes, we also obtained MEFs from and DNMT3B in humans. Somatic inactivating mutations of DNMT3A and DNMT3B have been reported in AML27 and ICF Dnmt3a WT and null embryos by crossing Dnmt3a heterozygous 7 KO mice. Importantly, derepression of Pax8 was confirmed in syndrome patients , respectively. To investigate the possible link between an impaired DNMT3 function and altered Dnmt3a KO MEFs (Supplementary Fig. 6c). Collectively, these fi data indicate that DNMT3A-mediated DNA methylation plays a DNA methylation in human diseases, we rst analyzed DNA role in stable silencing of a subset of PcG target developmental methylation status at mouse DNMT3B target genes in ICF genes. patients harboring a DNMT3B mutation. Human homologs of a subset of mouse DNMT3B target genes were similarly hypo- methylated in DNMT3B mutant ICF patients relative to con- Tissue-specific methylation at PcG target genes by DNMT3A. trol patients (Fig. 6a). Importantly, in addition to the PCDH Our previous study demonstrated that female 2i/L ES cells exhibit family of genes, previously reported to be hypomethylated in

6 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w ARTICLE

a DNMT3A1 binding DNMT3B1 binding 0.14 0.16 0.12 0.14

0.10 0.12

0.10 0.08

0.08 0.06

0.06 Read count per million mapped reads Read count per million mapped reads 0.04 −1000 TSS 33% 66% TES 1000 −1000 TSS 33% 66% TES 1000 Genomic region (5’ −> 3’) Genomic region (5’ −> 3’)

DNMT3A target genes (n = 258) DNMT3A target genes (n = 258) DNMT3B target genes (n = 217) DNMT3B target genes (n = 217) All genes (n = 19,693) All genes (n = 19,693)

b DNMT3A1 binding DNMT3B1 binding 0.16

0.14 0.10

0.12 0.08

0.10 0.06 0.08

0.06 0.04 Read count per million mapped reads Read count per million mapped reads −1000 TSS 33% 66% TES 1000 −1000 TSS 33% 66% TES 1000 Genomic region (5’ −> 3’) Genomic region (5’ −> 3’) PcG target genes (n = 206) PcG target genes (n = 206) All genes ( n = 19,693) All genes ( n = 19,693)

c DNMT3A1 binding DNMT3B1 binding 0.25 0.14

0.20 0.12

0.10 0.15 0.08

0.10 0.06

0.04

Read count per million mapped reads 0.05 Read count per million mapped reads −1000 TSS 33% 66% TES 1000 −1000 TSS 33% 66% TES 1000 Genomic region (5’ −> 3’) Genomic region (5’ −> 3’) DNMT3A target PcG genes (n = 32) DNMT3A target PcG genes (n = 32) DNMT3A target non-PcG genes (n = 226) DNMT3A target non-PcG genes (n = 226) All genes (n = 19,693) All genes (n = 19,693)

Fig. 3 Differential binding patterns of DNMT3A1 and DNMT3B1. a DNMT3A1 and DNMT3B1 binding to target genes for de novo methylation. ChIP-seq data were obtained from GSE57413 and GSE9652914,15. Shading represents the standard error of the mean (SEM). b DNMT3A1 and DNMT3B1 binding to PcG target developmental genes. Note that DNMT3A1 preferentially binds to PcG target genes, whereas DNMT3B1 is excluded from these genes. Shading represents SEM. c DNMT3A1 and DNMT3B1 binding to DNMT3A target genes that are also PcG target genes. Further enrichment of DNMT3A1 was observed on PcG target genes. Note that reduced binding of DNMT3B1 was exclusively detected at PcG target genes among DNMT3A target genes. Shading represents SEM.

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

abX linked genes (n = 320) PcG target genes (n = 206) 4

6 3

4 2

2 1

0 0 wild type MEFs (log2) −2 wild type MEFs (log2) −1 Expression level relative to Expression level relative to

−4 −2 3a KO 3b KO 3a KO 3b KO

Dnmt Dnmt 2i-MEF 2i-MEF

c 18 140 d Pax8 50 120 p = 0.0083 p = 0.0084

12 100 40 80 60 30 6 40 20 20 FPKM (expression level) FPKM (expression level)

0 0 Relative expression 10 Ar Nfix Tbx4 Barx1 Pax8 Nr2f2 Hoxa2 Hoxa5Hoxb4Hoxa6 Hoxb3 Pou3f4 Hoxc4 Hoxb2 0 Wild type 3a KO 3b KO 2i-MEF Dnmt Wild type Dnmt3a KO Dnmt3b KO 2i-MEF

e Hoxa2 Pax8 chr6:52,164,013-52,165,391 chr2:24,472,057-24,475,701 Rep 1 Wild type Rep 2 Rep 1 Dnmt3a KO Rep 2 Rep 1 [0–100] Dnmt3b KO Rep 2 DNA methylation (%) Hoxa2 Pax8 CGI

Fig. 4 Derepression of PcG target genes in Dnmt3a KO 2i-MEFs. a Relative gene expression levels (FPKM + 1) of X-linked genes (n = 320) in Dnmt3 KO 2i-MEFs (n = 1) vs. WT 2i-MEFs (n = 1), as determined by RNA-seq. Note that expression levels of most X-linked genes were comparable, regardless of Dnmt3 status. Solid lines in each box indicate the median. The bottom and top of the boxes are lower and upper quartiles, respectively. Whiskers extend to ±1.5 IQR. b Relative gene expression levels (FPKM + 1) of PcG target developmental genes (n = 206) in Dnmt3 KO 2i-MEFs (n = 1) vs. WT 2i-MEFs (n = 1), as determined by RNA-seq. Note that most PcG target genes were expressed at similar levels. Only a small subset of genes was modestly upregulated in Dnmt3 KO 2i-MEFs. c Expression level (FPKM) of PcG target genes, as determined by RNA-seq. Genes methylated more than 15% at TSS ± 1000 bp in WT 2i-MEFs are shown. A small subset of genes was modestly derepressed in Dnmt3a KO 2i-MEFs relative to the other cell types. d qRT-PCR analysis for Pax8 in 2i-MEFs. Note that Pax8 was derepressed exclusively in Dnmt3a KO 2i-MEFs. Data are presented as means ± SD. The mean expression level of 2i-MEFs was defined as 1. Statistical analysis was performed by Student’s t test (WT vs 3a KO p = 0.0083: WT vs 3b KO p = 0.0084; two-sided, n = 3 biologically independent samples). e DNA methylation status at promoters of representative PcG target genes that exhibited modest derepression upon Dnmt3a depletion (e.g., Hoxa2 and Pax8). Data from two independent experiments are shown. Each bar indicates a CG site, and bar height represents methylation percentage (0–100%). Locations of genes and CGIs are indicated below.

ICF patients, we identified additional genes that were hypo- developmental genes, previous studies reported that some PcG methylated in both Dnmt3b KO 2i-MEFs and ICF patients (e.g., target developmental genes, such as HOX genes and MEIS1, are TMEM151A) (Fig. 6a,b). Taken together, our results provide dysregulated in AML and Dnmt3a null mouse hematopoietic candidate genes that may be involved in the pathogenesis of ICF stem cells28–30. To explore this further, we examined a correlation syndrome. between the presence of the DNMT3A mutations and DNA We then examined the relationship between the DNMT3A methylation levels at PcG target developmental genes in AML mutations and DNA methylation levels at PcG target develop- using public datasets (GSE62298). Notably, among 30 differen- mental genes in AML. Consistent with the assumption that tially methylated PcG target developmental genes, 29 (e.g., DNMT3A-inactivating mutations impair repression of PcG target HOXA6, HOXB3, and MEIS1) were hypomethylated in DNMT3A

8 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w ARTICLE

ab E14.5 Wild type Dnmt3a KO Brain DNA methylation Limb Dnmt3a +/– ( ) Methyl-seq Heart Gene expression Lung RNA-seq Dnmt3a +/+ Dnmt3a –/– Liver Dnmt3a +/– ( )

c 96,280 kb 96,300 kb 96,320 kb 96,340 kb 96,360 kb

Wild type

Brain Dnmt3a KO Wild type

Limb Dnmt3a KO Wild type

Heart Dnmt3a KO Wild type

Lung Dnmt3a KO DNA methylation (%) [0–100] Wild type

Liver Dnmt3a KO

Hoxb9 Hoxb8 Hoxb7 Hoxb5os Hoxb5 Mir10a Hoxb4 Hoxb3 Hoxb3 Hoxb2 Hoxb1 CGI

d e Heart Hes7 DNA methylation Difference (WT) (WT-KO) Wild type

Dnmt3a KO (%) [0–100] Brain Limb Heart Lung Liver Brain Limb Heart Lung Liver Chromo Cbx4 DNA methylation domain Cbx8 Hes7 CGI Eomes Tbr1 Tbx2 Hoxb3 T-box Tbx4 family Tbx5 Wild type Tbx15 Tbx18 Tbx20

Dnmt3a KO (%) [0–100] Hoxa1 DNA methylation Hoxa2 Hoxa3 Hoxb3 Hoxa4 CGI Hoxa5 Hoxa6 Hoxa10 f Hoxa11 Heart PcG target genes (n = 206) Hoxb1 Hoxb2 Hoxb3 Hoxb4 Hoxb6 6 Hox Hoxb7 genes Hoxb8 Hoxb13 KO Hoxc4 4 Hoxc5 Hoxc6 Hoxc9 Hoxc10 Dnmt3a Hoxc12 Log2(FPKM + 1) 2 Hoxd9 Hoxd10 Hoxd11 Hoxd13 0 cor:0.989

DNA methylation (%) 050 0246

−25Difference in methylation level 25 Wild type (WT-KO) (%) Log2(FPKM + 1) mutant AML (R882: major mutation) relative to DNMT3A WT region specificity of de novo methylation by DNMT3A and AML, suggesting that inactivating DNMT3A mutation causes DNMT3B is essentially shared between mice and humans. defects in epigenetic repression of PcG target developmental genes (Fig. 6c, d). These results are also consistent with those of a recent study demonstrating that a gain-of-function mutation at Discussion DNMT3A causes microcephalic dwarfism and hypermethylation Although de novo DNA methylation plays a pivotal role during of PcG-regulated genes31. Collectively, these data suggest that the embryonic development, important unsolved questions include

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

Fig. 5 Tissue-specific methylation at PcG target genes by DNMT3A. a Schematic of experimental design. Various organs (brain, limb, heart, lung, and liver) were obtained from Dnmt3a WT and KO embryos at E14.5 for subsequent analyses. b Representative image of Dnmt3a WT and KO embryos at E14.5. No obvious phenotypic changes were observed in Dnmt3a KO embryos. Scale bars, 10 mm. c CpG methylation levels at the Hoxb cluster in Dnmt3a WT and KO organs (brain, limb, heart, lung, and liver), as determined by MethylC-seq. Each bar indicates a CpG site, and bar height represents methylation percentage (0–100%). Locations of genes and CGIs are indicated below. d The left panel shows mean DNA methylation levels at CpG sites within each representative PcG target gene in various organs, as determined by MethylC-seq. The right panel shows methylation differences between Dnmt3a WT and KO organs. Color scales indicate DNA methylation levels and their differences, respectively. e DNA methylation status at promoters of Hes7 and Hoxb3. Each bar indicates a CG site, and bar height represents methylation percentage, as determined by MethylC-seq (0–100%). Locations of genes and CGIs are indicated below. f Expression levels (FPKM + 1) of PcG target genes in hearts of Dnmt3a WT and KO embryos, as determined by RNA-seq.

precise roles of each DNMT3 enzyme. In this study, we utilized maintained on feeders (mitomycin C-treated MEFs) in KO DMEM (GIBCO) the hypomethylated ES cells as starting materials in a combina- containing 2 mM L-glutamine, 100× NEAA, P/S, 15% FBS, 0.11 mM β- tion with genetic disruption of Dnmt3 family methyltransferases, mercaptoethanol (GIBCO), and 1000 U/ml human recombinant LIF (Wako). To fi establish S/L ES cells, a blastocyst was placed on feeders in 24-well plates (passage and successfully unveiled the distinct region-speci city of de novo number 0: p0). After expansion of the ICM, the cells were passaged into 24-well methylation activity of these enzymes. Consistent with a previous plates (p1), followed by a second passage into six-well plates (p2) and a third report, most regions are similarly methylated in WT, Dnmt3a passage into 6 cm dishes (p3). For derivation of 2i/L ES cells, a blastocyst was KO, and Dnmt3b KO cells after differentiation, indicating the placed on feeders in 24-well plates in KO DMEM containing 2 mM L-glutamine, 100× NEAA, P/S, 15% FBS, 0.11 mM β-mercaptoethanol, 1000 U/ml LIF, 1 μM redundant function of DNMT3A and DNMT3B. However, we PD0325901 (STEMGENT), and 3 μM CHIR99021 (STEMGENT) (p0). After found that DNMT3B has a higher activity of de novo methylation passage into 24-well plates (p1), the cells were maintained without feeders in than DNMT3A, which is remarkably pronounced at CGIs on X DMEM/F12 (GIBCO) and Neurobasal (GIBCO) supplemented with the B27 chromosome. In contrast, de novo methylation at PcG target (GIBCO) and N2 cell supplement (GIBCO), P/S, 2 mM Glutamax (GIBCO), 1000 U/ml LIF, 1 μM PD0325901, and 3 μM CHIR99021. Cells were passaged into six- developmental genes requires DNMT3A, which is different from well plates (p2) and expanded into 6-cm dishes (p3). the mechanism in the extraembryonic lineage. Furthermore, we showed that tissue-specific de novo methylation at PcG target Genetic deletions of Dnmt3a and Dnmt3b in ES cells. Knockout of Dnmt3a or developmental genes is mediated by DNMT3A and is mechan- Dnmt3b for 2i/L ES cells was performed using CRISPR/Cas916. 2i/L ES cells at p3 istically associated with stable repression of a small subset of the were used to establish KO lines. sgRNA sequences were as follows: Dnmt3a, PcG target genes. Notably, no gene was identified to be hypo- ACCGCCTCCTGCATGATGCG; Dnmt3b: AATGCACTCCTCATACCCGC. methylated in Dnmt3a KO epiblasts when compared with WT epiblasts. Indeed, most of identified DNMT3A target genes are Establishment of 2i-MEFs by blastocyst injection. Female ICR mice were treated unmethylated in epiblasts (Supplementary Fig. 9a, b), suggesting with pregnant mare serum gonadotropin (PMSG; 7.5 IU) and human chorionic that de novo DNA methylation at these loci occurs after gonadotropin (hCG; 7.5 IU) by intraperitoneal injection. Embryos were rinsed with M2 medium (Sigma) and cultured in KSOM medium until development into implantation. Unexpectedly, we found that DNMT3A confers blastocysts. WT, Dnmt3a KO, and Dnmt3b KO 2i/L ES cells were trypsinized, and gene body methylation at PcG target developmental genes, which 7–8 cells per embryo were injected into the blastocoels of E3.5 blastocysts. Injected is in contrast to other genes being methylated by DNMT3B. blastocysts were transferred into the uteri of pseudopregnant ICR mice. A Fully Moreover, we showed that human diseases harboring inactivating Automated Fluorescence Stereo Microscope Leica M205 FA (Leica MICRO- SYSTEMS) was used to observe GFP-labeled embryos. MEFs were isolated from mutations at DNMT3 family genes exhibit reduced methylation at E14.5 embryos and cultured in DMEM (Nacalai Tesque) with 2 mM L-glutamine hypomethylated regions in mouse Dnmt3a and Dnmt3b KO cells. (Nacalai Tesque), 100× nonessential amino acids (NEAA) (Nacalai Tesque), 100 U/ Here we showed that de novo DNA methylation on the X ml penicillin, 100 μg/ml streptomycin (P/S) (Nacalai Tesque), and 10% fetal bovine chromosome mainly depends on DNMT3B but not on DNMT3A serum (FBS) (GIBCO). To remove host cells, GFP-labeled MEFs were collected by fl μ while de novo methylation at PcG target developmental genes is uorescence activated cell sorting (Aria II, BD) after selection with 350 g/ml G418 disulfate aqueous solution (Nacalai Tesque) for 7 days. mediated by DNMT3A. It should be noted that the inactive X chromosome is heavily targeted by Polycomb complexes32, sug- Establishment of Dnmt3a or Dnmt3l WT and KO MEFs. Dnmt3a WT and KO gesting that distinct mechanisms are involved in de novo embryos were obtained by crossing Dnmt3a heterozygous KO (B6;129S4-Dnmt3a methylation at PcG target genes. Notably, more pronounced < tm1Enl>) mice5. Dnmt3l WT and KO embryos were obtained by crossing reduction in DNA methylation was observed at LINEs in Dnmt3l heterozygous KO mice12. MEFs were isolated from E14.5 embryos. Dnmt3b KO than Dnmt3a KO 2i-MEFs. Given that greater occurrence of LINEs on the X chromosome than on autosomes, EpiLC induction. Culture medium for EpiLCs induction33 was DMEM/F12 + the enriched LINEs may account for the preferential reduction in Neurobasal containing B27 and N2 cell supplements, 20 ng/ml recombinant DNA methylation at the X chromosome in Dnmt3b KO 2i-MEFs. human activin A (R&D Systems), 12 ng/ml bFGF (Wako), 100 U/ml penicillin, μ In summary, taking advantage of DNA-hypomethylated 2i/L 100 g/ml streptomycin, and 1% KO serum replacement (GIBCO). ES cells, we identified a set of unique de novo target sites for DNA methylation by both DNMT3 enzymes during mammalian Animals. All animal studies were conducted in compliance with ethical regulations fi in The University of Tokyo and Kyoto University and were approved by Institute development. Our ndings have important implications regard- of Medical Sciences, The University of Tokyo and Center for iPS Cell Research and ing the roles of de novo methyltransferases in epigenetic regula- Application (CiRA), Kyoto University. MSM/Ms were obtained from RIKEN Bio tion of mammalian development and diseases. Resource Center34,35. B6;129S4-Dnmt3a < tm1Enl> mice5 were obtained from RIKEN Bio Resource Center. C57BL/6, 129×1/SvJ, ICR mice, and pseudopregnant ICR mice were purchased from SLC. Noon of the day when a plug was observed Methods was designated as embryonic day (E) 0.5. Data reporting. No statistical methods were used to predetermine sample size. No experiments were randomized. Investigators were not blinded to allocation during the experiments. Cell lines. All cell lines used for this study were derived in our laboratory and tested for mycoplasma contamination. Establishment and culture of ES cells. Zygotes with an F1 genetic background (129 × 1/SvJ and MSM/Ms) were obtained by in vitro fertilization. S/L ES and 2i/L Sex determination. The sex of cell lines and mouse embryos was determined by ES cells were established in a previous study16. Cells were maintained at 37 °C in an PCR using primers for Sry and Uty, both of which are encoded on the Y chro- fl atmosphere containing 5% CO2. Brie y, S/L ES cells were established and mosome. The primers used were as follows: Sry-Fw, CGTGGTGAGAGGCACAA

10 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w ARTICLE

a Differentially methylated genes among mouse Dnmt3b target genes (n = 60) 3B WT ICF patients 3B Mut MEI1 JPH4 GJA3 LY6K TNXB MAEL REM1 CDH5 UMOD ZNRF4 SYCP1 TDRD1 MEIOC PANX2 NRXN1 TRPC7 MLANA KCNG4 SOGA3 SRRM4 CMTM1 TRIM42 KLHL26 ZBTB22 SLC8A2 PCDHA1 PCDHB4 PCDHA4 PCDHB8 PCDHA8 PCDHB9 PCDHA9 PCDHB3 PCDHA3 DMRTB1 PCDHA6 PCDHB6 PCDHA2 PCDHB2 PCDHA5 PCDHA7 PCDHB7 DCDC2C KBTBD13 PCDHB11 PCDHA11 CCDC185 NEURL1B PCDHB13 PCDHB16 PCDHA12 PCDHB12 PCDHB15 CACNA1E PCDHGA1 SLC25A31 PCDHGA2 MAPK8IP2

TMEM151A DNA methylation HIST1H2BA

0.2 0.4 0.6 0.8 Value b chr5:141,090,911-141,253,251 chr11:7,088,910-7,089,165 3B WT 3B WT [0–100] [0–100] ICF patients 3B Mut 3B Mut DNA methylation (%) DNA methylation (%)

PCDHB2 PCDHB6 PCDHB9 PCDHB18P PCDHB4 PCDHB7 PCDHB12 RBMXL2

chr3:139,020,116-139,020,691 chr11:66,295,177-66,295,638 3B WT 3B WT [0–100] [0–100] ICF patients 3B Mut 3B Mut DNA methylation (%) DNA methylation (%)

PRR23B TMEM151A

cdDifferentially methylated genes among PcG target genes (n = 30) HOXA6 p = 0.0006 0.8

0.6

0.4

0.2 DNA methylation 0.0 Mut WT DNMT3A HOXB3 p = 0.0325 0.8

0.6

0.4

0.2 DNA methylation 0.0 Mut WT DNMT3A MEIS1 p = 0.0166 0.8

0.6 ID3 TLX1 TLX3 TBR1 TBX2 LHX2 VSX2 CBX8 CDX2

MNX1 0.4 MAFB MEIS1 GATA3 FOXC1 ATOH8 FOXC2 HOXB4 HOXA3 HOXB3 HOXA6 HOXB6 HOXB2 HOXA2 HOXA5 HOXC6 EOMES NKX2−3 POU3F3 HOXA11 HOXD13 0.2 DNA methylation DNA methylation DNMT3A Mut 0.0 DNMT3A WT Mut WT 0.2 0.4 0.6 0.8 Value DNMT3A

GTT; Sry-Rv, AGGCAACTGCAGGCTGTAAA; Uty-Fw, GAGTTCTTCTTGCG DNA/RNA extraction and cDNA synthesis. DNA was extracted using the TTCACCATCTG; and Uty-Rv, CTATCTAATCCACAAAGCGCCTTCTTC. PureLink Genomic DNA Mini Kit (Invitrogen), and RNA was extracted using the RNeasy plus Mini Kit (QIAGEN) or NucleoSpin RNA Plus (MACHEREY- NAGEL). DNA and RNA were quantified on a Nanodrop 2000c (Thermo Fisher Scientific) and Qubit (Thermo Fisher Scientific). Reverse transcription was per- GFP labeling of ES cells. ES cells (p3) were labeled by piggyBac transposon car- formed using the PrimeScript™ II First strand cDNA Synthesis Kit (Takara). rying CAG-EGFP-IRES-Neo36. Plasmid and transposase (2.5 µg each) were trans- fected into ESCs using Xfect mESC Transfection Reagent (Clontech). Transfected μ cells were selected with 350 g/ml G418 disulfate aqueous solution. GFP positive Quantitative RT-PCR. Quantitative PCR was performed using GoTaq qPCR colonies were picked and established as GFP-labeled clones (p5). Master Mix and CXR Reference Dye (Promega) on a StepOnePlus Real-Time PCR

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

Fig. 6 Reduced DNA methylation in patients with DNMT3 mutations. a Heatmap of DNA methylation levels at differentially methylated genes in mouse DNMT3B target genes in ICF syndrome patients-derived lines [DNMT3B wildtype (WT): two patients, DNMT3B mutant (Mut): two patients], as determined by reduced representation bisulfite sequencing (RRBS). Genes differentially methylated (|WT-Mut| >= 0.1) between WT and DNMT3B Mut are shown (n = 60). Note that most of differentially methylated genes (n = 57) exhibited reduced DNA methylation in DNMT3B-mutated ICF patients. DNA methylation datasets were obtained from GSE9574749. b DNA methylation status at representative loci of mouse DNMT3B target genes in ICF patients. DNMT3B target genes in mice were hypomethylated similarly in DNMT3B-mutated ICF patients. Each bar indicates a CG site, and bar height represents methylation percentage, as determined by RRBS (0–100%). Locations of genes are indicated below. c Heatmap of DNA methylation levels at PcG target developmental genes of 68 AML patients [DNMT3A wildtype (WT): 54 patients, DNMT3A mutant (Mut): 14 patients] by Infinium assay obtained from GSE6229829. Thirty PcG target genes whose average methylation levels within gene bodies differed (p < 0.05; two-sided, Mann–Whitney U-test) between DNMT3A WT (white) and DNMT3A Mut (gray) are shown. Color scale indicates DNA methylation levels. d Box plots of DNA methylation levels within gene bodies of HOXA6, HOXB3, and MEIS1 in DNMT3A WT (54 patients) and DNMT3A Mut (14 patients) AML. Bold black lines indicate the median methylation levels. The bottom and top of the box are lower and upper quartiles, respectively. Whiskers extend to ±1.5 IQR. (HOXA6, p = 0.0006: HOXB3, p = 0.0325: MEIS1, p = 0.0166; two-sided, Wilcoxon signed-rank test).

system (Applied Biosystems). Transcript levels were normalized against the cor- Library preparation for target-captured bisulfite sequencing. DNA (3 μg, as responding level of β-actin. Primers used were as follows: Pax8-Fw, ATGCCTCA determined by Qubit) was fragmented by sonication. Subsequently, library pre- CAACTCGATCAGA; Pax8-Rv, ACAATGCGTTGACGTACAACTT; β-Actin-Fw, paration was performed using the SureSelectXT Mouse Methyl-Seq Reagent Kit GCCAACCGTGAAAAGATGAC; and β-Actin-Rv, (Agilent Technologies). The Methyl-Seq Kit could enrich 109 Mb of mouse TCCGGAGTCCATCACAATG. genomic regions including CpG islands, Gencode promoters, DNase I hypersen- sitive sites, and tissue-specific DMRs. DNA was bisulfite-treated using the EZ DNA Methylation-Gold Kit™. Sequencing libraries were assessed by Bioanalyzer and Ezh2 inhibitor (GSK126) treatment. Cells were seeded at a density of 2 × 105 cells quantified using the KAPA Library Quantification Kit. The libraries were on six-well plates the day before GSK126 treatment. The cells were then cultured in then sequenced on a HiSeq 2500 (Illumina) as 2× 100 bp or 2× 101 bp paired- medium containing 10 μM GSK126 (Funakoshi) for 2 days. end reads.

Library preparation for RNA sequencing. RNA (200 ng) was prepared for library Western blot. Cells were dissolved in 500 μl of RIPA lysis buffer containing construction. High-quality RNA (RNA Integrity Number value ≥8, as determined protease inhibitor cocktail (Nacalai Tesque), dithiothreitol (Wako), and phos- by Bioanalyzer) was used for library preparation. RNA-seq libraries were generated phatase inhibitor cocktail (Nacalai Tesque). For histone extraction, cells were using the TruSeq Stranded mRNA LT sample prep kit (Illumina). PolyA- washed with PBS and treated with trypsin for 5 min. Cells (1 × 106) were lysed with containing mRNA was purified by magnetic beads conjugated to poly-T oligos, and extraction buffer on ice. The suspension was centrifuged at 12,000 rpm for 10 s, and RNA was fragmented and primed for cDNA synthesis. Cleaved RNA fragments the supernatant was removed. Next, 0.2 M H2SO4 was added to the pellet, and the were reverse transcribed into first-strand cDNA using transcriptase and random suspension was then centrifuged at 12,000 rpm for 20 min. The resultant super- primers. Second-strand cDNA was synthesized by incorporation of dUTP, and ds natant was mixed with 100% trichloroacetic acid and centrifuged at 12000 rpm for cDNA was separated using AMPure XP beads (Beckman Coulter). A single “A” 20 min; the supernatant was removed. Acetone was added to the pellet and lysed. nucleotide was added to the 3′ ends of the blunt fragments, and adapters with index The suspension was then centrifuged at 12,000 rpm for 20 min, and the super- were ligated to the ends of ds cDNAs. ds cDNA fragments were amplified by PCR natant was removed. Finally, the pellet was lysed with 1× SDS sample buffer. using PCR primer Cocktail (Illumina). The number of PCR cycles was minimized were denatured with 2× SDS at 95 °C for 5 min. A total of 30 μgof (11–15 cycles) to avoid skewing the representation of the libraries. RNA-seq denatured was run on a 10% SDS/PAGE gel and transferred to Amersham libraries were sequenced on a NextSeq500 (Illumina) as 75 bp single reads. Hybond-P PVDF Membrane (GE Healthcare) using PowerPac HC (Bio-Rad). Membranes were blocked in blocking buffer [1× TBS with 0.05% Tween-20 (TBST) containing 4% nonfat milk (Nacalai Tesque)], and then incubated overnight at 4 °C DNA methylation data analyses. For WGBS and MethylC-seq analyses of 129/ with primary antibodies diluted in blocking buffer. Primary antibodies were as MSM genetic background cells and tissues, SNP data for MSM/Ms were obtained follows: anti-Dnmt3a (Novus Biological, NB120-13888, 64B1446; dilution, 1:500; from the NIG Mouse Genome Database (MSMv4HQ, http://molossinus.lab.nig.ac. Santa Cruz Biotechnology, sc-20703; dilution, 1:200) and anti-β-actin (Santa Cruz jp/msmdb/index.jsp), and the MSM/Ms mouse genome was reconstructed from Biotechnology, sc-47778; dilution, 1:1000). Blots were rinsed with TBST. Mem- mm10 using these SNPs. Information about indels was not used in this study. Bases branes were incubated for 90 min at room temperature with HRP-conjugated with low-quality scores and the adapters in all sequenced reads were trimmed with secondary antibodies. Secondary antibodies used were ECL Anti-mouse IgG, HRP- cutadapt-1.1437. Trimmed reads were mapped independently to both the B6 linked whole antibody from sheep (GE Healthcare, NA931V; dilution, 1:5000), and (mm10) and MSM/Ms mouse genomes using the Bismark software-v0.18.238 with ECL Anti-rabbit IgG, HRP-linked whole antibody from donkey (GE Healthcare, bowtie2 (version 2.3.0)39. Reads uniquely mapped to the same chromosome and NA934V; dilution, 1:5000). After rinsing with TBST, Pierce ECL plus Western positions of both B6 and MSM/Ms genomes were used for further analyses. B6 Blotting Substrate (Thermo Scientific) was used for visualization, and bands were (129)-derived and MSM/Ms-derived sequenced reads were determined based on detected on a LAS4000 (GE Healthcare). MSM/Ms SNP data. The Y chromosome and mitochondrial genome were omitted from the MSM/Ms reference genome due to the lack of SNP data. SNPs in CpG sites were excluded, and the patterns of bisulfite conversion were taken into Bisulfite sequencing. DNA (200 ng) was bisulfite-treated using the EZ DNA account in the determination of parental alleles. Methylated cytosines were ™ extracted from reads using the Bismark methylation extractor with the following Methylation-Gold Kit (ZYMO RESEARCH). PCR was performed with EX Taq HS – – – – (Takara). PCR products were cloned into pCR4-TOPO vector (Invitrogen). Sub- options: ignore 10 ignore_r2 10 ignore_3prime 5 ignore_3prime_r2 5. For cloned colonies were sequenced using M13 reverse primer on an ABI 3500xL MethylC-seq analysis of B6 genetic background tissues, trimmed reads were (Applied Biosystems). Primers were as follows: Hoxa1-Fw, AAGTTTGGGG- mapped to the mouse genome (mm10) using Bismark software-v0.18.2 with TAATTTGGATAAAGG; Hoxa1-Rv, CCAACCCAAAAAAAAACCTAAACTAC; bowtie2 (version 2.3.0) and default settings, and uniquely mapped reads were used Zfp467-Fw, AGGTTATTTGTTTGTGTTTAGTGTG; Zfp467-Rv, TCCAAATATA for extraction of methylated cytosines using the Bismark methylation extractor with options–ignore 10–ignore_3prime 5 (single-end reads) or–ignore AATCTCCCAAATCTATC; Meis1-Fw, AGATAGTGAGGGTTTGAAATTTAGT – – – AAA; and Meis1-Rv, CCACCCCTAACTCCAAATAAAAAT. 10 ignore_r2 10 ignore_3prime 5 ignore_3prime_r2 5. For MethylC-seq analysis, because probe sequences of the SureSelectXT were designed for the original top strand, only the original bottom data were used for this analysis. The Integrative Genomics Viewer (version 2.8.2)40 was used to visualize CpG methylated Library preparation for whole-genome bisulfite sequencing . DNA (500 ng, as status (CpG sites at ≥5× coverage). Genomic locations for each genetic element determined by Qubit) was fragmented by sonication (Covaris) and ligated with were described previously41. UCSC LiftOver42 (http://genome.ucsc.edu/)was methylated adapters supplied by TruSeq™ DNA, Sample Prep Kit-v2 (Illumina). fi ™ used to convert the coordinates of the mm9 assembly to those of the mm10 Subsequently, DNA was bisul te-treated using EZ DNA Methylation-Gold Kit . assembly. Final library amplification was performed using Pfu Turbo Cx (Agilent Technol- ogies). Sequencing libraries were assessed on a Bioanalyzer (Agilent Technologies) and quantified using the KAPA Library Quantification Kit (KAPA BIOSYSTEMS). Identification of differentially methylated loci and regions. The R/Bioconductor The libraries were then sequenced on a HiSeq 2500 (Illumina) as 2× 100 bp or 2× package DSS43 was used to identify differentially methylated loci and regions. For 101 bp paired-end reads. identification of epiblast-specific methylated loci (Fig. 1b), the DMLtest function in

12 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w ARTICLE the DSS package (FDR < 0.05, Epi > ICM) was used with WGBS data (CpG sites at 9. Smith, Z. D. et al. A unique regulatory phase of DNA methylation in the early ≥5× coverage) obtained from GSE84236 to determine significantly differentially mammalian embryo. Nature 484, 339–344 (2012). methylated loci. For identification of DNMT3A and DNMT3B target regions 10. Smith, Z. D. et al. Epigenetic restriction of extraembryonic lineages mirrors globally (Fig. 2a and Supplementary Data 2), the DMLtest function in the DSS the somatic transition to cancer. Nature 549, 543–547 (2017). package (FDR < 0.01 and difference in methylation levels (WT–KO) > 0.4) was 11. Bourc’his, D., Xu, G. L., Lin, C. S., Bollman, B. & Bestor, T. H. Dnmt3L and the used with 1k tiled CpG methylation WGBS data (CpG sites at ≥10× coverage) to establishment of maternal genomic imprints. Science 294, 2536–2539 (2001). determine significantly differentially methylated tiles (DMTs; Supplementary 12. Hata, K., Okano, M., Lei, H. & Li, E. Dnmt3L cooperates with the Dnmt3 Data 1). For identification of DNMT3A and DNMT3B target genes (Fig. 2b, c, family of de novo DNA methyltransferases to establish maternal imprints in – Supplementary Fig. 3a, b and Supplementary Data 1 5), the DMLtest and callDMR mice. Development 129, 1983–1993 (2002). = = = functions in the DSS package (delta 0.2, p.threshold 0.00001, minCG 10) 13. Kaneda, M. et al. Essential role for de novo DNA methyltransferase Dnmt3a in ≥ were used with MethylC-seq data (CpG sites at 50× coverage) to determine paternal and maternal imprinting. Nature 429, 900–903 (2004). fi signi cantly differentially methylated regions (DMRs, each KO vs WT, Supple- 14. Baubec, T. et al. Genomic profiling of DNA methyltransferases reveals a role fi mentary Data 2 and 4). The genes included in the DMRs (WT > KO) were de ned for DNMT3B in genic methylation. Nature 520, 243–247 (2015). as target genes (Supplementary Data 3 and 5). Gene locations were obtained from fi fi 15. Manzo, M. et al. Isoform-speci c localization of DNMT3A regulates the GENCODE annotation le (vM15). DNA methylation fidelity at bivalent CpG islands. EMBO J. 36, 3421–3434 (2017). RNA-seq data analyses. After trimming of adaptor sequences and low-quality 16. Yagi, M. et al. Derivation of ground-state female ES cells maintaining gamete- bases with cutadapt-1.1537, sequenced reads were mapped to the mouse reference derived DNA methylation. Nature 548, 224–227 (2017). genome (mm10) using tophat-2.1.144 with the GENCODE version M15 annota- 17. Hackett, J. A. & Surani, M. A. Regulatory principles of pluripotency: from the tion.gtf file and the aligner Bowtie2-2.3.239. Uniquely mapped reads (MAPQ ≥ 20) ground state up. Cell Stem Cell 15, 416–430 (2014). were used for further analyses. The expression level of each gene was calculated as 18. Shirane, K. et al. Global landscape and regulatory principles of DNA fragments per kilobase per million mapped reads (FPKM) using cufflinks-2.2.145. methylation reprogramming for germ cell specification by mouse pluripotent stem cells. Dev. Cell 39,87–103 (2016). ChIP-seq data analysis. Raw ChIP-seq data (Dnmt3a and Dnmt3b binding data) 19. Auclair, G., Guibert, S., Bender, A. & Weber, M. Ontogeny of CpG island in ES cells were obtained from GSE5741314. Adaptors and low-quality bases at the methylation and specificity of DNMT3 methyltransferases during embryonic 3′ ends of reads were trimmed using cutadapt-1.1537. Untrimmed and trimmed development in the mouse. Genome Biol. 15, 10.1186/s13059-014-0545-5 reads of 20 bp or more were mapped to the mouse genome (mm10) with BWA (2014). 0.7.1246. Duplicate reads, chimeric reads, and reads with low mapping quality 20. Gendrel, A. V. et al. Smchd1-dependent and -independent pathways (MAPQ < 20) were removed using Picard (http://broadinstitute.github.io/picard/) determine developmental dynamics of CpG island methylation on the inactive and SAMtools47. Reads mapped to blacklisted regions (https://sites.google.com/ X chromosome. Dev. Cell 23, 265–279 (2012). site/anshulkundaje/projects/blacklists) were excluded from further analysis. Aver- 21. Liao, J. et al. Targeted disruption of DNMT1, DNMT3A and DNMT3B in age ChIP-seq plots were drawn using ngsplot-2.4748. human embryonic stem cells. Nat. Genet. 47, 469–478 (2015). 22. Toyoda, S. et al. Developmental epigenetic modification regulates stochastic Reporting summary. Further information on research design is available in expression of clustered protocadherin genes, generating single neuron – the Nature Research Life Sciences Reporting Summary linked to this article. diversity. Neuron 82,94 108 (2014). 23. Boyer, L. A. et al. Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature 441, 349–353 (2006). Data availability 24. Jones, P. A. Functions of DNA methylation: islands, start sites, gene bodies All data analyzed by WGBS, MethylC-seq, and RNA-seq were deposited in the Gene and beyond. Nat. Rev. Genet. 13, 484–492 (2012). Expression Omnibus (GEO) under accession number GSE111172. MethylC-seq data 25. Neri, F. et al. Intragenic DNA methylation prevents spurious transcription from Dnmt3a WT tissue were deposited under accession number GSE111173. The initiation. Nature 543,72–77 (2017). publicly available datasets used in this study are: GSE84236, GSE95747, GSE62298, 26. Csankovszki, G., Nagy, A. & Jaenisch, R. Synergism of Xist RNA, DNA GSE57413, GSE96529, GSE95747, GSE62298. All other relevant data supporting the key methylation, and histone hypoacetylation in maintaining X chromosome findings of this study are available within the article and its Supplementary Information inactivation. J. Cell Biol. 153, 773–784 (2001). files or from the corresponding authors upon reasonable request. The source data 27. Brunetti, L., Gundry, M. C. & Goodell, M. A. DNMT3A in Leukemia. Cold underlying Supplementary Fig 4c are provided as a Source Data file. A reporting Spring Harb Perspect Med 7, 10.1101/cshperspect.a030320 (2017). summary for this Article is available as a Supplementary Information file. 28. Challen, G. A. et al. Dnmt3a is essential for hematopoietic stem cell differentiation. Nat. Genet. 44,23–31 (2011). Received: 1 August 2019; Accepted: 2 June 2020; 29. Ferreira, H. J. et al. DNMT3A mutations mediate the epigenetic reactivation of the leukemogenic factor MEIS1 in acute myeloid leukemia. Oncogene 35, 3079–3082 (2016). 30. Tan, Y. T. et al. Deregulation of HOX genes by DNMT3A and MLL mutations converges on BMI1. Leukemia 30, 1609–1612 (2016). 31. Heyn, P. et al. Gain-of-function DNMT3A mutations cause microcephalic dwarfism and hypermethylation of Polycomb-regulated regions. Nat Genet. References 10.1038/s41588-018-0274-x (2018). 1. Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian 32. Plath, K. et al. Role of histone H3 lysine 27 methylation in X inactivation. – development. Nat. Rev. Genet. 14, 204–220 (2013). Science 300, 131 135 (2003). 2. Bird, A. DNA methylation patterns and epigenetic memory. Genes Dev. 16, 33. Hayashi, K., Ohta, H., Kurimoto, K., Aramaki, S. & Saitou, M. Reconstitution fi 6–21 (2002). of the mouse germ cell speci cation pathway in culture by pluripotent stem – 3. Jaenisch, R. & Bird, A. Epigenetic regulation of gene expression: how the cells. Cell 146, 519 532 (2011). genome integrates intrinsic and environmental signals. Nat. Genet. 33(Suppl), 34. Takada, T. et al. The ancestor of extant Japanese fancy mice contributed to – 245–254 (2003). the mosaic genomes of classical inbred strains. Genome Res 23, 1329 1338 4. Jackson, M. et al. Severe global DNA hypomethylation blocks differentiation (2013). and induces histone hyperacetylation in embryonic stem cells. Mol. Cell Biol. 35. Takada, T., Yoshiki, A., Obata, Y., Yamazaki, Y. & Shiroishi, T. NIG_MoG: a fi 24, 8862–8871 (2004). mouse genome navigator for exploring intersubspeci c genetic – 5. Okano, M., Bell, D. W., Haber, D. A. & Li, E. DNA methyltransferases polymorphisms. Mamm. Genome 26, 331 337 (2015). Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian 36. Kim, S. I. et al. Inducible transgene expression in human iPS cells using – development. Cell 99, 247–257 (1999). versatile all-in-one piggyBac transposons. Methods Mol. Biol. 1357, 111 131 6. Li, E., Bestor, T. H. & Jaenisch, R. Targeted mutation of the DNA (2016). methyltransferase gene results in embryonic lethality. Cell 69, 915–926 (1992). 37. Martin, M. Cutadapt removes adapter sequences from high-throughput – 7. Jin, B. et al. DNA methyltransferase 3B (DNMT3B) mutations in ICF sequencing reads. 2011 17, 10 12, 10.14806/ej.17.1.200 (2011). fl syndrome lead to altered epigenetic modifications and aberrant expression of 38. Krueger, F. & Andrews, S. R. Bismark: a exible aligner and methylation caller fi – genes regulating development, neurogenesis and immune function. Hum. Mol. for Bisul te-Seq applications. Bioinformatics 27, 1571 1572 (2011). Genet. 17, 690–709 (2008). 39. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. – 8. Shah, M. Y. & Licht, J. D. DNMT3A mutations in acute myeloid leukemia. Methods 9, 357 359 (2012). – Nat. Genet. 43, 289–290 (2011). 40. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29,24 26 (2011).

NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications 13 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16989-w

41. Illingworth, R. S. et al. Orphan CpG islands identify numerous conserved Methyl-seq libraries. A.T. performed blastocyst injections. K.N. and K.H. provided promoters in the mammalian genome. Plos Genet. 6, e1001134 (2010). Dnmt3l WT and KO MEFs. M.K., S.O., M.S., A.M., and T.Y. analyzed ChIP-seq, WGBS, 42. Rosenbloom, K. R. et al. The UCSC Genome Browser database: 2015 update. Methyl-seq, and RNA-seq data in mouse and human. Nucleic Acids Res 43, D670–D681 (2015). 43. Wu, H. et al. Detection of differentially methylated regions from whole- Competing interests genome bisulfite sequencing data without replicates. Nucleic Acids Res 43, e141 (2015). The authors declare no competing interests. 44. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013). Additional information 45. Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals Supplementary information is available for this paper at https://doi.org/10.1038/s41467- unannotated transcripts and isoform switching during cell differentiation. 020-16989-w. Nat. Biotechnol. 28, 511–515 (2010). 46. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows- Correspondence and requests for materials should be addressed to T.Y. or Y.Y. Wheeler transform. Bioinformatics 25, 1754–1760 (2009). 47. Li, H. et al. The sequence alignment/Map format and SAMtools. Peer review information Nature Communications thanks Hongshan Guo and the other, Bioinformatics 25, 2078–2079 (2009). anonymous, reviewer(s) for their contribution to the peer review of this work. Peer 48. Shen, L., Shao, N., Liu, X. & Nestler, E. ngs.plot: quick mining and reviewer reports are available visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics 15, 284 (2014). Reprints and permission information is available at http://www.nature.com/reprints 49. Gatto, S. et al. ICF-specific DNMT3B dysfunction interferes with intragenic regulation of mRNA transcription and . Nucleic Acids Res. Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in 45, 5739–5756 (2017). published maps and institutional affiliations.

Acknowledgements Open Access This article is licensed under a Creative Commons We are grateful to S. Sakurai and D. Seki for technical assistance. Y. Y. was supported in Attribution 4.0 International License, which permits use, sharing, part by P-CREATE (JP18cm0106203h0003), Japan Agency for Medical Research and adaptation, distribution and reproduction in any medium or format, as long as you give Development (AMED); AMED-CREST (JP19gm1110004h9903), AMED; JSPS KAKENHI (18H04026, 20H05384); the Princess Takamatsu Cancer Research Fund; the appropriate credit to the original author(s) and the source, provide a link to the Creative Takeda Science Foundation; the Mochida Foundation; and the Naito Foundation. T.Y. Commons license, and indicate if changes were made. The images or other third party ’ was supported by AMED-CREST (19gm1110004s0203); JSPS KAKENHI 19H03418; material in this article are included in the article s Creative Commons license, unless Core Center for iPS Cell Research, Research Center Network for Realization of Regen- indicated otherwise in a credit line to the material. If material is not included in the ’ erative Medicine, AMED; and the iPS Cell Research Fund. M. Y. was supported by JSPS article s Creative Commons license and your intended use is not permitted by statutory KAKENHI 15J05792. The ASHBi is supported by World Premier International Research regulation or exceeds the permitted use, you will need to obtain permission directly from Center Initiative (WPI), MEXT, Japan. the copyright holder. To view a copy of this license, visit http://creativecommons.org/ licenses/by/4.0/. Author contributions M.Y., T.Y., and Y.Y. designed and conceived the study and wrote the manuscript. © The Author(s) 2020 M.Y. and T.U. generated cell lines, performed experiments, and generated WGBS and

14 NATURE COMMUNICATIONS | (2020) 11:3199 | https://doi.org/10.1038/s41467-020-16989-w | www.nature.com/naturecommunications